Opened 5 months ago

Closed 4 months ago

Last modified 4 months ago

#26412 closed defect (fixed)

KeyError in can_exit_to caused by lru_cache

Reported by: juga
Owned by: atagar
Priority: Medium
Milestone:
Component: Core Tor/Stem
Version:
Severity: Normal
Keywords: tor-bwauth
Cc: pastly
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by juga)

As pastly reported in https://github.com/pastly/simple-bw-scanner/issues/198,
stem is giving this traceback:

  File "/home/pastly/src/simple-bw-scanner/venv-editable/lib/python3.5/site-packages/stem/exit_policy.py", line 285, in can_exit_to
    if not self.is_exiting_allowed():
KeyError: (<stem.exit_policy.MicroExitPolicy object at 0x7f8de6438ba8>,)

Apparently caused by a Python bug with lru_cache.

pastly is already trying to find out which Python versions fail, so that stem's own lru_cache can be used for those versions.

This ticket blocks #26848

Edit: blocks ticket

Change History (12)

comment:1 Changed 5 months ago by juga

Keywords: tor-bwauth added

comment:2 Changed 5 months ago by pastly

I'm running a little toy script that kinda sorta behaves like sbws in hopes that I can reproduce this. I have been doing so for 24 hours or so now.

I'm also running sbws 0.4.2-dev through a very limited Tor client with BandwidthRate 500 KBytes set, and have been doing so for 24 hours.

I'm doing all of this with Python 3.4.8, 3.5.3, and 3.6.5.

I will do this for a few more days. If I don't get these errors, I will switch sbws to the version it was at when these KeyErrors were happening. Of note, at that time we were getting exit policies from micro descriptors but now we get them from server descriptors.

comment:3 Changed 5 months ago by atagar

Status: new → needs_information

Thanks juga, thanks pastly! Gonna swap this ticket's status. Feel free to toggle it back when there's something for us to do on the Stem front.

comment:4 Changed 5 months ago by pastly

This bug doesn't reveal itself when descriptors are used. Just microdescriptors. Therefore sbws no longer has this bug.

I switched to sbws 0.4.0 which hits this bug.

Python 3.5.2 and 3.5.3 hit it essentially immediately. Python 3.5.4 does not hit it within 2 hours.

I'll check if anything in 3.4.X or 3.6.X is affected.

comment:5 Changed 4 months ago by pastly

Status: needs_information → new

sbws version 0.4.0
stem version 1.6.0-dev at 0192b29a4784465e5f69f11ced584a54644e4a90

See below for tested versions.

matt@play:~/src$ grep -c KeyError dotsbws-3*/debug.scanner.log
dotsbws-345/debug.scanner.log:0
dotsbws-346/debug.scanner.log:0
dotsbws-347/debug.scanner.log:0
dotsbws-348/debug.scanner.log:0
dotsbws-350/debug.scanner.log:228
dotsbws-351/debug.scanner.log:8175308
dotsbws-352/debug.scanner.log:234
dotsbws-353/debug.scanner.log:69750
dotsbws-354/debug.scanner.log:0
dotsbws-362/debug.scanner.log:0
dotsbws-363/debug.scanner.log:0
dotsbws-364/debug.scanner.log:0
dotsbws-365/debug.scanner.log:0

The huge difference in the numbers is because I ran these for vastly different amounts of time. 3.5.1 is not 3 orders of magnitude more buggy than 3.5.0. 3.5.4 ran for 24 hours and didn't display the issue; 3.5.0 ran for 2 minutes and displayed the issue.

Example of the unhelpful traceback that the above grep is searching for:

[2018-07-10 10:42:47,961] [sbws.lib.relaylist] [ERROR] Got that KeyError in stem again...: (<stem.exit_policy.ExitPolicy object at 0x7f9fe6f656d8>, <object object at 0x7fa01e24b080>, ('port', 80))
Traceback (most recent call last):
  File "/home/matt/src/simple-bw-scanner/sbws/lib/relaylist.py", line 167, in exits_can_exit_to
    if policy is not None and policy.can_exit_to(port=port):
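
The log line above shows the KeyError being caught and logged around the policy.can_exit_to(port=port) call. A minimal sketch of that kind of defensive wrapper follows. The function name safe_can_exit_to and the choice to treat a failed lookup as "cannot exit" are assumptions for illustration, not necessarily what sbws does.

```python
import logging

log = logging.getLogger(__name__)


def safe_can_exit_to(policy, port):
    """Return True only if the policy positively allows exiting to the port.

    Hypothetical helper: a KeyError raised inside the policy lookup is
    logged and treated as "cannot exit" rather than propagated.
    """
    if policy is None:
        return False
    try:
        # Same call shape as in the traceback: can_exit_to(port=port).
        return bool(policy.can_exit_to(port=port))
    except KeyError as error:
        log.error("KeyError from the exit policy lookup: %r", error)
        return False
```

This only hides the symptom. A relay whose lookup fails is skipped as an exit even if its policy would have allowed the port.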

atagar: based on these findings I recommend using stem's lru_cache for python 3.5.0-3.5.3. Or ... figuring out why getting exit policies from microdescriptors doesn't play nicely with python's lru_cache in only a couple versions of python.
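
The recommendation above could be sketched roughly as follows: use functools.lru_cache everywhere except on the interpreter releases that showed the KeyError, and fall back to a pure-Python cache there. This is an illustration under assumptions, not the code from stem. The names builtin_lru_cache_is_trusted and fallback_lru_cache are hypothetical, and the fallback is a simplified stand-in that only handles positional, hashable arguments.

```python
import collections
import functools
import sys
import threading

# Releases that logged KeyErrors in the counts above (3.5.0 through 3.5.3).
AFFECTED_MIN = (3, 5, 0)
AFFECTED_MAX = (3, 5, 3)


def builtin_lru_cache_is_trusted(version_info=None):
    """Return False on the affected releases, True where functools has lru_cache."""
    version = tuple((version_info or sys.version_info)[:3])
    if AFFECTED_MIN <= version <= AFFECTED_MAX:
        return False
    return hasattr(functools, "lru_cache")


def fallback_lru_cache(maxsize=128):
    """Hypothetical pure-Python LRU cache (positional, hashable args only)."""
    def decorator(func):
        cache = collections.OrderedDict()
        lock = threading.Lock()

        @functools.wraps(func)
        def wrapper(*args):
            with lock:
                if args in cache:
                    cache.move_to_end(args)  # mark as most recently used
                    return cache[args]
            result = func(*args)
            with lock:
                cache[args] = result
                cache.move_to_end(args)
                while len(cache) > maxsize:
                    cache.popitem(last=False)  # evict least recently used
            return result
        return wrapper
    return decorator


# Pick the implementation once, at import time.
if builtin_lru_cache_is_trusted():
    lru_cache = functools.lru_cache
else:
    lru_cache = fallback_lru_cache
```

Callers would then decorate with @lru_cache(maxsize=...) as usual and never need to know which implementation was chosen.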

Switching back to new because I no longer think this needs_information.

comment:6 Changed 4 months ago by juga

In case it is useful, here are the traceback lines, including the ones where the exception is raised in stem (Python 3.5.3):

  File "/path/simple-bw-scanner/sbws/lib/relaylist.py", line 120, in can_exit_to
    return self.exit_policy.can_exit_to(host, port)
  File "/path/.virtualenvs/simplebwscanner3/lib/python3.5/site-packages/stem-1.6.0.dev0-py3.5.egg/stem/exit_policy.py", line 294, in can_exit_to
    if rule.is_match(address, port, strict):
  File "/path/.virtualenvs/simplebwscanner3/lib/python3.5/site-packages/stem-1.6.0.dev0-py3.5.egg/stem/exit_policy.py", line 783, in is_match
    comparison_addr_bin &= self._get_mask_bin()
KeyError: (<stem.exit_policy.ExitPolicyRule object at 0x7f3743f70a90>,)
[2018-07-14 14:55:23,007] [sbws.core.scanner] [ERROR] Unhandled exception caught while measuring Unnamed: <class 'KeyError'> (<stem.exit_policy.ExitPolicyRule object at 0x7f3743f70a90>,

comment:7 Changed 4 months ago by juga

Description: modified (diff)

comment:8 Changed 4 months ago by atagar

Resolution: fixed
Status: new → closed

Fantastic, thanks pastly! Thanks juga! Fix pushed...

https://gitweb.torproject.org/stem.git/commit/?id=0b7f195

Feel free to reopen if ya need anything more.

comment:9 Changed 4 months ago by juga

Thanks, a new stem release would be awesome

comment:10 Changed 4 months ago by atagar

Hi juga. As previously mentioned I'm willing to cut a single release for sbws if/when pastly would like it. If now's the time then great, but please have him chime in that he's certain this is when he wants it.

comment:11 Changed 4 months ago by pastly

This is when pastly would like a new stem release. We depend on the new timeout feature and can't expect anyone but ourselves to git clone stem.

We haven't needed anything new from stem in a while and I don't think that will change. *knocks on wood*

comment:12 Changed 4 months ago by atagar

Sounds good! Filed the following, I'll take care of it in a bit.

https://trac.torproject.org/projects/tor/ticket/26914
