Opened 5 years ago

Closed 4 years ago

#8755 closed defect (fixed)

Controller handles content as unicode in python 3.x

Reported by: aj00200 Owned by: atagar
Priority: Medium Milestone:
Component: Core Tor/Stem Version:
Severity: Keywords: controller
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

I am getting some errors when using Stem with Python 3 when iterating though all values return from controller.get_server_descriptors(). The error happens in /usr/local/lib/python3.2/dist-packages/stem/descriptor/server_descriptor.py . I have fixed some of these errors (see attached file) but I think there are more problems with the way the "digest" variable is created. For example, if self.digest() returns 0222F10950D2177B99288E9D133FAB1B0848AE70, the digest variable is given the value

b'{q#a3l\xb9\xa7,\xb5\xf0EX\xa9\xca\xd6\xe1\xae!\x10'

but the proper encoding would be

b'\x02"\xf1\tP\xd2\x17{\x99(\x8e\x9d\x13?\xab\x1b\x08H\xaep'

I should also note that I have not tested this patch with Python 2.x yet.

This is the error which still happens with some digests:

Traceback (most recent call last):
File "./xxx.py", line 32, in <module>
for each in server_descriptors:
File "/usr/local/lib/python3.2/dist-packages/stem/control.py", line 1121, in get_server_descriptors
raise exc
File "/usr/local/lib/python3.2/dist-packages/stem/control.py", line 1117, in get_server_descriptors
for desc in stem.descriptor.server_descriptor._parse_file(io.BytesIO(str_tools._to_bytes(desc_content))):
File "/usr/local/lib/python3.2/dist-packages/stem/descriptor/server_descriptor.py", line 137, in _parse_file
yield RelayDescriptor(descriptor_text, validate, annotations)
File "/usr/local/lib/python3.2/dist-packages/stem/descriptor/server_descriptor.py", line 646, in __init__
self._validate_content()
File "/usr/local/lib/python3.2/dist-packages/stem/descriptor/server_descriptor.py", line 697, in _validate_content
self._verify_digest(key_as_bytes)
File "/usr/local/lib/python3.2/dist-packages/stem/descriptor/server_descriptor.py", line 768, in _verify_digest
raise ValueError("Decrypted digest does not match local digest")
ValueError: Decrypted digest does not match local digest

Feel free to contact me if you have any questions.

Child Tickets

Attachments (1)

server_descriptor.py.diff (1.3 KB) - added by aj00200 5 years ago.

Download all attachments as: .zip

Change History (15)

Changed 5 years ago by aj00200

Attachment: server_descriptor.py.diff added

comment:1 Changed 4 years ago by atagar

Hi aj00200. Sorry about the delay and thanks for the patch! Everything looks great, except that the last bit (which swaps the decode() call to long_to_bytes()) evidently breaks our integ tests for python 2.

The trouble seems to be that long_to_bytes() strips off leading zeros which causes the comparison to fail...

# digest() value of the descriptor in our integ tests
>>> digest = "00BB5385C0DF28DC6765AC465D0CC7BC6A41AD33"

# old version (encoding as hex to make it printable)
>>> digest.decode('hex').encode('hex')
'00bb5385c0df28dc6765ac465d0cc7bc6a41ad33'

# new version
>>> long_to_bytes(int(digest, 16)).encode('hex')
'bb5385c0df28dc6765ac465d0cc7bc6a41ad33'

This could be fixed by either encoding digest/local_digest and doing a zfill, or by keeping the old version. Was the long_to_bytes() change necessary to make this work with python 3 or was it a refactoring improvement?

Cheers! -Damian

comment:2 Changed 4 years ago by aj00200

Hello,

The switch was necessary for Python 3 compatibility as Python 3 does not have a decode method on strings.

I gave this some thought and I found a way to do that with the codecs module. Example:

>>> d = codecs.getdecoder('hex_codec')
>>> d(b'00021011ffab3a4b')
(b'\x00\x02\x10\x11\xff\xab:K', 16)

The only issue with this is that Python 3 still requires a bytes object to be passed. This can be accomplished with the following code:

>>> d('00bb5385c0df28dc6765ac465d0cc7bc6a41ad33'.encode('ascii'))
(b'\x00\xbbS\x85\xc0\xdf(\xdcge\xacF]\x0c\xc7\xbcjA\xad3', 40)

I have tested this code with Python 2.7 and Python 3.2. If you would like to use this solution I can code it and submit a new patch if you like. Let me know.

comment:3 Changed 4 years ago by atagar

On reflection it's actually a little more elegant for us to translate the digest to hex rather than the local_digest to bytes. As an added bonus this means that we can include them in exceptions we raise to give our users a little more information.

Pushed an adaptation of your patch that passes the python 2.x and 3.x test for me but that said, I only have the python 2.x version of pycrypto installed so this isn't really checking that we've solved this bug. Would you mind giving it a whirl?

https://gitweb.torproject.org/stem.git/commitdiff/92af5b8d50e3dac52e2380cd44ff706c88d9332e

comment:4 Changed 4 years ago by aj00200

Thank you for fixing this. Your last commit ( https://gitweb.torproject.org/stem.git/commit/a1149f5489065cf7445f83137f3747a4866351e1 ) fixed the main issue. There is still the issue where some digests consistently fail to pass the _verify_digest() check. This error only happens with Python 3:

File "/usr/local/lib/python3.2/dist-packages/stem/descriptor/server_descriptor.py", line 765, in _verify_digest

raise ValueError("Decrypted digest does not match local digest (calculated: %s, local: %s)" % (digest, local_digest))

ValueError: Decrypted digest does not match local digest (calculated: 6514CFCCF92A61907730E3E6F4EF13B8FE706712, local: 2DF931970EA4A05A53EF5B1A7371D13D5253DBFC)

At present, I do not know what is causing this issue. I will attempt to look closer, but it might take me a while to work though the crypto code involved.

comment:5 Changed 4 years ago by atagar

Your last commit fixed the main issue.

Great! Thanks again for spotting it.

There is still the issue where some digests consistently fail to pass the _verify_digest() check.

The descriptor you mentioned on irc as causing these problems doesn't appear to be online...

https://atlas.torproject.org/#search/2DF931970EA4A05A53EF5B1A7371D13D5253DBFC

Would you mind attaching its server descriptor content?

comment:6 Changed 4 years ago by aj00200

Here is the Atlas link: https://atlas.torproject.org/#details/0B9821545C48E496AEED9ECC0DB506C49FF8158D

Searching for both the self.digest() value and local_digest on Atlas return no results so perhaps both are incorrect.

comment:7 Changed 4 years ago by atagar

Ah ha, wide characters in the contact line as expected. Well, damn - I hoped that I squashed all those issues last month. :(

comment:8 Changed 4 years ago by atagar

Summary: Python 3 CompatibilityDigest validation fails with python 3

comment:9 Changed 4 years ago by atagar

Keywords: controller added

comment:10 Changed 4 years ago by atagar

Status: newneeds_information

Hi aj00200. Sorry about the delay on this. From prior experience I know that investigating this will be a bit of a time sink and I've been distracted of late with GSoC applications.

To repro this I'll need a python3 copy of python-crypto (like what you evidently have). How did you install it? I'm not spotting a package in the ubuntu repos that clearly fit the bill and the project's site make no mention of python3.

comment:11 Changed 4 years ago by aj00200

Hello,

Installation simply involves downloading the source code: https://www.dlitz.net/software/pycrypto/

Extract that file and run: sudo python3 setup.py install

However, if you have not previously built a Python 3 module, you may need some packages such as python3-dev and gcc which you can find in the Ubuntu repositories. I cannot remember exactly which packages are required, but the error messages are generally clear enough (or easy
enough to run through a search engine.

And by all means, take your time with this bug.

comment:12 Changed 4 years ago by atagar

Status: needs_informationnew

Baka! I was confusing pycrypto and python-crypto. Issue reproed...

>>> from stem.control import Controller
>>> control = Controller.from_port(port = 9051)
>>> control.authenticate()
>>> desc = control.get_server_descriptor('0B9821545C48E496AEED9ECC0DB506C49FF8158D')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "stem/control.py", line 1107, in get_server_descriptor
    raise exc
  File "stem/control.py", line 1104, in get_server_descriptor
    return stem.descriptor.server_descriptor.RelayDescriptor(str_tools._to_bytes(desc_content))
  File "stem/descriptor/server_descriptor.py", line 646, in __init__
    self._validate_content()
  File "stem/descriptor/server_descriptor.py", line 697, in _validate_content
    self._verify_digest(key_as_bytes)
  File "stem/descriptor/server_descriptor.py", line 765, in _verify_digest
    raise ValueError("Decrypted digest does not match local digest (calculated: %s, local: %s)" % (digest, local_digest))
ValueError: Decrypted digest does not match local digest (calculated: F0CE398F63E2A1A2B391DD92D3859C70C5AFB21E, local: 5A0D6EE34269CBDD0034758BC67F2C40144B5CE4)

comment:13 Changed 4 years ago by atagar

Summary: Digest validation fails with python 3Controller handles content as unicode in python 3.x

I see the issue. The problem isn't with the descriptor code (if you use parse_file() or the DescriptorReader then it works fine). The problem is that stem's Controller normalizes the content that it receives from the socket to be unicode rather than bytes. This in turn mangles the contact line, causing the validation to fail.

In particular, we need to drop the str_tools._to_unicode() call from...

https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/socket.py#l470

... then address the rest of the call stack (msg() in the BaseController, and all of its callers). Ick.

I pushed some fixes for other things uncovered by using the python 3.x version of pycrypto, but the Controller issue will take a fair bit more work to solve.

comment:14 Changed 4 years ago by atagar

Resolution: fixed
Status: newclosed

Hi aj00200. Just pushed some commits that fix this, avoiding the 'bytes => unicode => bytes' conversion that was corrupting the content. Thanks for the catch and feel free to reopen if you run into any further issues!

Note: See TracTickets for help on using tickets.