Opened 8 weeks ago

Closed 4 weeks ago

Last modified 4 weeks ago

#28961 closed defect (fixed)

exception in descriptor/remote.py downloading large server descriptors via orport

Reported by: starlight Owned by: atagar
Priority: Medium Milestone:
Component: Core Tor/Stem Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

download_descriptor.py works ok when descriptors are short:

bash-4.1$ python3 download_descriptor.py --orport 45.59.127.53:443 -f A0ECF41498288C3BFA33005B7C92BC791B06AB77
Downloading server descriptor from 45.59.127.53:443...

router erosi 45.59.127.53 443 0 80
identity-ed25519
-----BEGIN ED25519 CERT-----
AQQABo/EAV5E1oAtsNKuEr2iE+FhLa5WSYx+pK1VVAUrFANbbu6yAQAgBABnngvA
.
.
.

not so when they are long:

bash-4.1$ python3 download_descriptor.py --orport 199.249.230.82:443 -f A0DB820FEC87C0405F7BF05DEE5E4ADED2BB9904
Downloading server descriptor from 199.249.230.82:443...

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 560, in _download_descriptors
    self.content, self.reply_headers = _download_from_orport(endpoint, self.compression, self.resource)
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 969, in _download_from_orport
    response = b''.join([cell.data for cell in circ.send(RelayCommand.DATA, request, stream_id = 1)])
  File "/usr/local/lib/python3.7/site-packages/stem/client/__init__.py", line 262, in send
    raise stem.ProtocolError('Circuit response should be a series of RELAY cells, but received an unexpected size for a response: %i' % len(reply))
stem.ProtocolError: Circuit response should be a series of RELAY cells, but received an unexpected size for a response: 4048

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 560, in _download_descriptors
    self.content, self.reply_headers = _download_from_orport(endpoint, self.compression, self.resource)
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 969, in _download_from_orport
    response = b''.join([cell.data for cell in circ.send(RelayCommand.DATA, request, stream_id = 1)])
  File "/usr/local/lib/python3.7/site-packages/stem/client/__init__.py", line 262, in send
    raise stem.ProtocolError('Circuit response should be a series of RELAY cells, but received an unexpected size for a response: %i' % len(reply))
stem.ProtocolError: Circuit response should be a series of RELAY cells, but received an unexpected size for a response: 4048

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "download_descriptor.py", line 133, in <module>
    main()
  File "download_descriptor.py", line 115, in main
    endpoints = [args.download_from],
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 484, in run
    return list(self._run(suppress))
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 495, in _run
    raise self.error
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 560, in _download_descriptors
    self.content, self.reply_headers = _download_from_orport(endpoint, self.compression, self.resource)
  File "/usr/local/lib/python3.7/site-packages/stem/descriptor/remote.py", line 969, in _download_from_orport
    response = b''.join([cell.data for cell in circ.send(RelayCommand.DATA, request, stream_id = 1)])
  File "/usr/local/lib/python3.7/site-packages/stem/client/__init__.py", line 262, in send
    raise stem.ProtocolError('Circuit response should be a series of RELAY cells, but received an unexpected size for a response: %i' % len(reply))
stem.ProtocolError: Circuit response should be a series of RELAY cells, but received an unexpected size for a response: 4048
]}}

unencrypted no problem:

{{{
bash-4.1$ python3 download_descriptor.py --dirport 199.249.230.82:80 -f A0DB820FEC87C0405F7BF05DEE5E4ADED2BB9904                        
Downloading server descriptor from 199.249.230.82:80...                                                                                          

router QuintexAirVPN29 199.249.230.82 443 0 80
identity-ed25519                              
-----BEGIN ED25519 CERT-----                  
AQQABo5iATaZrxEunmfnvKy6Y+5N5M7B7+8ofuFMdtlmcBDX+/QFAQAgBAAmRN42
.
.
.
}}}

versions:

libffi-3.2.1
Python-3.7.1
python-cryptography-2.4.2
openssl-1.0.2
stem-20181216-git0EB8FDA3

https://stem.torproject.org/tutorials/examples/download_descriptor.html as of date ticket opened

Child Tickets

Attachments (1)

stem_timeout_stacktrace.txt (4.4 KB) - added by starlight 5 weeks ago.

Download all attachments as: .zip

Change History (12)

comment:1 Changed 8 weeks ago by atagar

Thanks starlight! This is a really good catch. Been digging in for a while and it turns out I completely buggered up our handling of cells that exceed a certain buffer size. Unfortunately the mistake I made is down in our socket wrapper and is gonna take me a few days to sort out.

comment:2 in reply to:  1 Changed 8 weeks ago by starlight

Replying to atagar:

This is a really good catch

I was cursed at birth to find obscure software defects.

handling of cells that exceed a certain buffer size

figured it was something like that

a few days to sort out

Took me from April till now for me to come around to trying out the feature, no emergency, take your time :-)

comment:3 Changed 7 weeks ago by atagar

Resolution: fixed
Status: newclosed

Thanks again starlight, fix pushed...

https://gitweb.torproject.org/stem.git/commit/?id=e2d8575

Honestly I'm not really happy with this solution, but seems to do the trick. Patches welcome if there's a smarter approach.

comment:4 Changed 5 weeks ago by starlight

stem-20190120-git8C61FC85

now large descriptors are truncated

also noticed that stem does not handle connection timeouts with grace, blows up with a huge stack trace instead of catching the exception

comment:5 Changed 5 weeks ago by atagar

Hi starlight, do you have repro instructions or something for me to dig into?

comment:6 Changed 5 weeks ago by starlight

didn't change anything; just git fetch, git pull, python3 setup.py install

then test using the first command example above

comment:7 Changed 5 weeks ago by starlight

fetch from the DIR port and compare

curl -sv http://199.249.230.82:80/tor/extra/fp/A0DB820FEC87C0405F7BF05DEE5E4ADED2BB9904.z | openssl zlib -d
curl -sv http://199.249.230.82:80/tor/server/fp/A0DB820FEC87C0405F7BF05DEE5E4ADED2BB9904.z | openssl zlib -d

comment:8 Changed 5 weeks ago by starlight

the timeout happened randomly while testing, a bit more trouble to reproduce but this will do so after some time

 A=-1; while (( ++A < 1000 )); do B=$(printf "%03d" $A); TORSOCKS_ISOLATE_PID=1 torsocks python3 download_descriptor.py -t extrainfo -f A0DB820FEC87C0405F7BF05DEE5E4ADED2BB9904 --orport 199.249.230.82:443 >xxzz$B 2>&1; done

Changed 5 weeks ago by starlight

Attachment: stem_timeout_stacktrace.txt added

comment:9 Changed 4 weeks ago by atagar

Resolution: fixed
Status: closedreopened

Thanks starlight! Great catches. Made a small fix for the timeouts. As for cropped responses you are absolutely correct, thanks for catching that!

Reproed locally and in hacking things up I can trivially fix it with...

--- a/stem/client/__init__.py
+++ b/stem/client/__init__.py
@@ -289,6 +289,15 @@ class Circuit(object):
       # updates when handled successfully.
 
       reply = self.relay._orport.recv()
+
+      while True:
+        more = self.relay._orport.recv(timeout = 1)
+
+        if more is None:
+          break
+
+        reply += more
+
       reply_cells = []
 
       while reply:

But that's just a hack and not valid solution. The root issue I need to sort out is 'how can I tell when we've received the full reply to our request'. There's probably a 'size' field or something else I'm forgetting to consult so this will need some more digging on my part.

comment:10 Changed 4 weeks ago by atagar

Resolution: fixed
Status: reopenedclosed

Hi starlight, fix pushed. I ended up rewriting a good chunk of the code involved.

https://gitweb.torproject.org/stem.git/commit/?id=4be6695a38c96f5fcba58d08e7f86ea1f6647c2e

Thanks again for the catch! If you run into any further issues please let me know.

comment:11 in reply to:  10 Changed 4 weeks ago by starlight

Replying to atagar:

Hi starlight, fix pushed. I ended up rewriting a good chunk of the code involved.

https://gitweb.torproject.org/stem.git/commit/?id=4be6695a38c96f5fcba58d08e7f86ea1f6647c2e

nice, works great

Thanks again for the catch!

welcome, thank you for the fixes the you good humor about it -- many devs are grouches when it come to bugs

If you run into any further issues please let me know.

I did notice compression is disabled by default, opened an enhancement request ticket.

Note: See TracTickets for help on using tickets.