Opened 5 years ago

Closed 4 years ago

#5810 closed enhancement (implemented)

Implement verification of server descriptor

Reported by: reganeet Owned by: reganeet
Priority: Medium Milestone:
Component: Core Tor/Stem Version:
Severity: Keywords: descriptors
Cc: atagar karsten Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

We need to implement is_valid() method of stem.descriptor.server_descriptor.RelayDescriptor [1] , to do some verifications on the descriptor:

1) a contained fingerprint is actually a hash of the signing key and

2) a router signature was created using the signing key.

There's already Java code for doing this in metrics-tasks [2]. However, the Java code is a standalone test, while stem's implementation is self-contained within the descriptor.

We need some ssl library to read the pem-format keys in descriptors, and M2Crypto seems to be the best choice [3]. The problem with M2Crypto is that it requires SSL_v2 support from openssl, which is considered unsafe thus excluded from recent Ubuntu releases, and possibly Debian [4]. I don't know how many people run Tor in Ubuntu, and whether we should let users responsible for having a complete openssl library. It seems quite hard to work this around on Ubuntu [5].

[1]:https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/server_descriptor.py#l624

[2]: https://gitweb.torproject.org/metrics-tasks.git/blob/HEAD:/task-2768/VerifyDescriptors.java

[3]: http://stackoverflow.com/a/606702/994146

[4]: http://stackoverflow.com/a/8219807/994146

[5]: https://discussions.nessus.org/thread/3174

Child Tickets

Change History (29)

comment:1 Changed 5 years ago by atagar

Hi Beck. Looks like the first step will be to come up with a counterpart for Karsten's determineKeyHash() function...

https://gitweb.torproject.org/metrics-tasks.git/blob/HEAD:/task-2768/VerifyDescriptors.java#l269

From the dir-spec...

A fingerprint (a HASH_LEN-byte of asn1 encoded public key, encoded in
hex, with a single space after every 4 characters) for this router's
identity key. A descriptor is considered invalid (and MUST be
rejected) if the fingerprint line does not match the public key.

I didn't realize that there was a 'MUST' clause here. We should check is_valid() in the server descriptor constructor when validate is True, and raise a ValueError if it's invalid. Note that this will break a few integ tests since I've messed with some of the data in the descriptor data directory to make the tests more interesting...

https://gitweb.torproject.org/stem.git/tree/HEAD:/test/integ/descriptor/data

We should swap out the bad test data with real instances when we come across it.

The problem with M2Crypto is that it requires SSL_v2 support from openssl, which is considered unsafe thus excluded from recent Ubuntu releases, and possibly Debian [4].

Do we need the ssl v2 support? As the post mentioned the module itself is available on Ubuntu...

atagar@morrigan:~$ lsb_release -sd
Ubuntu 11.04

atagar@morrigan:~$ sudo apt-get install m2crypto
Note, selecting 'python-m2crypto' instead of 'm2crypto'
The following NEW packages will be installed:
  python-m2crypto
0 upgraded, 1 newly installed, 0 to remove and 108 not upgraded.
Need to get 277 kB of archives.
...

I don't know how many people run Tor in Ubuntu

Lots, including me. :P

Cheers! -Damian

comment:2 Changed 5 years ago by reganeet

Thanks Damian. It turns out the M2Crypto package downloaded from their website does not work with Ubuntu, but the distribution in Ubuntu's repository is good.

However, after playing with it for several hours, I found out that M2Crypto only support PEM format keys in X.509 standard but not in PKCS, and the public keys in server descriptors are encoded in PKCS. They have slightly different headers: X.509 keys starts with "-----BEGIN PUBLIC KEY-----", while PKCS keys starts with "-----BEGIN RSA PUBLIC KEY-----". The content is also represented in different ways, so simply changing the header won't work [1].

>> from M2Crypto import RSA, BIO
>> bio = BIO.MemoryBuffer(descriptor.signing_key)
>> rsa = RSA.load_pub_key_bio(bio)
M2Crypto.RSA.RSAError: no start line

I'm looking for a substitute of M2Crypto now.

[1] http://www.cryptosys.net/pki/rsakeyformats.html

comment:3 Changed 5 years ago by reganeet

The first verification is done by using python-rsa, but I have some difficulties implementing the second part. According to the Java code, what we should do is:

  1. Read the signing key from the descriptor;
  2. Get the signature from the descriptor, filter out the header and footer, and do a base64 decode to get the signature bytes;
  3. Decrypt the signature bytes with the signing key and remove the PKCS1 padding to get the original message;
  4. Encode the message in hex and compare it to the digest of the descriptor.

I'm done the first two parts and checked they were correct. The hard part is 3. I'm trying to do it with the verify() method in python-rsa [1]. However, I always get a
ValidationError. I'm diving into the code of verify() method and trying to decrypt the signature bytes step by step, and it seems that the decrypted message does not start with the signature marker of PKCS1 padding, which should be '\x00\x01'.

Going to dive deeper tomorrow...

[1] https://bitbucket.org/sybren/python-rsa/src/5d834ee3e7e5/rsa/pkcs1.py

comment:4 Changed 5 years ago by atagar

The first verification is done by using python-rsa

Fantastic! Should we move forward with code reviewing and merging this part? It should probably have a unit test or two.

comment:5 Changed 5 years ago by reganeet

Fantastic! Should we move forward with code reviewing and merging this part? It should probably have a unit test or two.

Sure. Do you want me to write the unit tests?
I've also renamed is_valid() to validate(), and instead of returning a boolean, it raises a ValueError if the validation fails, since we MUST perform this validation before accepting a descriptor. Do you think this makes sense?

I cracked the Java crypto library code today, and printed out everything in the decryption process to find what was wrong. It seems that python-rsa uses a different way (maybe non-standard) to transform between octet strings and integers and incompatible with our signature. Things got much more subtle here, and I'm afraid I can't go on. If anyone else would like to continue this work, I'm more than happy to share what I've got with him.

comment:6 Changed 5 years ago by atagar

Sure. Do you want me to write the unit tests?

Yup. Let me know if you want any help with it, ideally it'll be for the happy case and all of the edge cases you can think of (minimal descriptor content, malformed signature, etc). Really try to break your change via the tests, that's how we best get rigorous testing. :)

I've also renamed is_valid() to validate(), and instead of returning a boolean, it raises a ValueError if the validation fails, since we MUST perform this validation before accepting a descriptor. Do you think this makes sense?

I don't think that's necessary, the constructor can simply do...

if validate and not self.is_valid():
  raise ValueError("yo, something bad is going on")

The constructor has a validate flag in case they want to accept malformed data so also calling this function 'validate' would be confusing. However, maybe we should come up with a more descriptive name than is_valid()? Would is_signature_valid() be better?

If anyone else would like to continue this work, I'm more than happy to share what I've got with him.

Yup, please add all of the details you think would be helpful for future people trying to figure this out to this ticket. Also, lets add that task to the dev wiki...
https://trac.torproject.org/projects/tor/wiki/doc/stem

Thanks! -Damian

comment:7 Changed 5 years ago by reganeet

The constructor has a validate flag in case they want to accept malformed data so also calling this function 'validate' would be confusing.

Right, I didn't think of that.

However, maybe we should come up with a more descriptive name than is_valid()? Would is_signature_valid() be better?

Well, I prefer is_valid() for its conciseness. :)

Yup, please add all of the details you think would be helpful for future people trying to figure this out to this ticket.

Nick's conversation with me contains some useful details:
https://lists.torproject.org/pipermail/tor-dev/2012-May/003544.html
The key is to find a crypto library that implements the pkcs1 signature verification process (part 3). IMHO, pyCrypto and M2Crypto don't have it, and python-rsa verify() function only works with the signature created by its sign() function.

Also, lets add that task to the dev wiki...

Done.

comment:8 Changed 5 years ago by reganeet

Hi Damian,

I've written the unit test and pushed into my repo:
https://github.com/beckbaibai/stem

In the test file for server descriptor, I added the correct fingerprint into RELAY_DESCRIPTOR_ATTR, and test_minimal_relay_descriptor() will now check if it has been read correctly. Another test case, test_fingerprint_malformed(), is added to make sure the construction will fail if the fingerprint does not match the signing key.

Notice that after the second part of the validation is implemented, testing would become more complicated, because the signature has to match the hash of the content, and signed with descriptor's signing key (the private one).

I think I've merged my branch with yours in the right way...but please double check since I'm still not comfortable with git merge.

Thanks!
Beck

comment:9 Changed 5 years ago by atagar

I've written the unit test and pushed into my repo...

Fantastic! Don't worry about merging. When doing development what you'll generally want to do is...

# switch to your master branch and make sure it's up to date
git checkout master
git pull origin

# make a new branch where you implement your feature
git checkout -b my_spiffy_new_feature
<do your work>

# when you're done again make sure your master branch its up to date
git fetch origin

# this will update your feature branch to be off of stem's current master
git rebase origin/master

# pushes into your repository
git push my_remote

Note that these commands will vary based on what you're calling your remotes. If you're new to git then I'd suggest...
http://progit.org/book/

Personally I found it to be very well written. There's no need to do a merge (though it doesn't hurt). I can then easily cherry-pick or merge your changes.

All this said though we have a couple issue that needs to be addressed before I can look at your branch...

+"""M
+Parsing for Tor server descriptors, which contains the infrequently changing
M
+information about a Tor relay (contact information, exit policy, public keys,M
...

See the M at the end of the lines? Those are Windows newlines. Whatever editor you're using are introducing them. This causes a couple issues...

  • A simple diff claims that every line is changed. I can tell diff to ignore whitespace, but it's still annoying.
  • Mixed newlines can confuse a lot of tools. They're very bad to introduce into code.

So we need to both strip these out and figure out a method for you to edit files without introducing them. This is a common gotcha so I'm sure google will have some good suggestions.

For my part I should make the whitespace checker look for these. Also, we need to make non-builtin dependencies optional. Otherwise users will get...

atagar@morrigan:~/Desktop/stem$ ./run_tests.py --unit
Traceback (most recent call last):
  File "./run_tests.py", line 23, in <module>
    import test.unit.descriptor.server_descriptor
  File "/home/atagar/Desktop/stem/test/unit/descriptor/server_descriptor.py", line 9, in <module>
    import stem.descriptor.server_descriptor
  File "/home/atagar/Desktop/stem/stem/descriptor/server_descriptor.py", line 35, in <module>
    import rsa
ImportError: No module named rsa

I'm not quite sure how we should handle this, logging a warning once if they try to verify without the module or something else...

Cheers! -Damian

comment:10 Changed 5 years ago by atagar

Uggg, stupid trac. Forgot that it misinterprets the carrot character. I meant...

+"""^M
+Parsing for Tor server descriptors, which contains the infrequently changing^M
+information about a Tor relay (contact information, exit policy, public keys,^M
...

comment:11 Changed 5 years ago by reganeet

So we need to both strip these out and figure out a method for you to edit files without introducing them.

That's weird, since I think I've removed all the CRs, and in my emacs/vim there is no CR remaining in server_descriptors.py.

I'm not quite sure how we should handle this, logging a warning once if they try to verify without the module or something else...

What about in RelayDescriptor.init:

if validate: 
  try:
    if not self.is_valid():
      raise ValueError(self.nickname + " is not a valid server descriptor!")
  except ImportError:
    print("Unable to validate. Please install python-rsa module first.")

comment:12 Changed 5 years ago by atagar

Pushed an adaptation of your change - thanks for the patch!
https://gitweb.torproject.org/stem.git/commitdiff/40a971105ff162d0caa9e0a1c5fa3c0f7d3c2b72

Changes included...

  • Making the rsa module import optional. Not only is it not a builtin, but it also isn't packaged for most platforms. Hence when we make a deb and rpm we won't be able to have a stated dependency on it, which in turn means that this won't function for most users. However, as the commit message mentions I looked around for alternatives and found them lacking so guess we'll need to go with this and hope that the python-rsa folks someday get their act together and package their library.
  • The fingerprint attribute is optional (in the wild descriptors may not have it), so accounted for this in the change.

That's weird, since I think I've removed all the CRs, and in my emacs/vim there is no CR remaining in server_descriptors.py.

If you run "git show d793809" then you should see the issue, where the whole file appears to be deleted and then re-added with the altered newlines. I ended up needing to run 'git diff --ignore-space-at-eol HEAD' and then re-apply your change by hand to the current master. Not the end of the world, but it took a bit of work.

Again, thanks for the patch! Leaving this ticket open for the remaining validation implementation.

comment:13 Changed 4 years ago by atagar

  • Keywords descriptors added

comment:14 Changed 4 years ago by eoinof

I took a look at the problems in this ticket..

I've made an initial fix here.
https://github.com/eoinof/stem/commits/stem-trac-5810

It makes the code work 'properly' but does not fix the now
broken unit tests that relied on invalid relay descriptor data.
I implemented the signature verification both with the
python-rsa library[default] and also with the python-crypto library
though I had to write some custom code in both cases as neither
library appears to do exactly what was needed.

I can look into 'fixing' the unit tests insofar as necessary.
This is an example of the kind of hard coding that no longer passes.

Traceback:
File "/home/eoin/stem/test/integ/descriptor/server_descriptor.py", line 89, in test_metrics_descriptor
self.assertEquals("2C7B27BEAB04B4E2459D89CA6D5CD1CC5F95A689", desc.digest())

AssertionError: '2C7B27BEAB04B4E2459D89CA6D5CD1CC5F95A689' != ",{'\xbe\xab\x04\xb4\xe2E\x9d\x89\xcam\\\xd1\xcc_\x95\xa6\x89"

comment:15 Changed 4 years ago by atagar

Hi Eoin. As you mentioned this breaks the unit tests pretty badly. You're completely right that mocking.get_relay_server_descriptor() provides invalid data according to these integrity checks, but it does so somewhat on purpose. The get_relay_server_descriptor() function aims to...

  • provide a *minimal* server descriptor by default that only has mandatory arguments
  • allow the caller to get a custom descriptor by providing additional entries

It would be a pita to then make sure that our content always matches our signature. Luckily it's also not necessary - we can mock is_valid() or other validation functions to always say "the descriptor is ok" in the unit tests.

I'm not entirely clear what's happening in 0d433b5 but if this lets us check descriptor validity without the python-rsa module then that would be fantastic! Is the Crypto.Util module a builtin and available in python 2.5? If so then do you think we can drop the python-rsa usage?

comment:16 Changed 4 years ago by eoinof

My original motivation here was to get rid of the python-rsa usage.. so that's certainly an option.
I think quite a few of the tests can be fixed by turning off validation, and the others can probably be fixed by replacing the hardcoded values in test functions with references to the mocking code:

e.g. to fix test/unit/tutorial.py

-        expected_line = "%i. speedyexit (102.13 KB/s)" % count
+        expected_line = "%i. speedyexit (%s/s)" % (count, str_tools.get_size_label(int(mocking.ACTUAL_OBSERVED_BANDWIDTH) ,2))                       

comment:17 Changed 4 years ago by eoinof

I'll take a look at what exactly the regular python crypto libs provide as well.

comment:18 Changed 4 years ago by atagar

My original motivation here was to get rid of the python-rsa usage.. so that's certainly an option.

Fantastic! Just to be sure, we're going with a derivation of your commit 0d433b5 but dropping the other two, right?

e.g. to fix test/unit/tutorial.py

On a side note the tutorial unit tests I wrote are awful. My aim was to test the code snippets on...
https://stem.readthedocs.org/en/latest/tutorial.html

... and in a roundabout way they do. However, the tests look almost nothing like the examples, and the heavy mocking makes those tests pretty hard to read. If you have a better idea for testing the examples then I'd welcome it. :)

comment:19 Changed 4 years ago by eoinof

Of the changes I proposed

The python-crypto lib is not builtin but it is included in ubuntu & debian by default.

I've added more info on the signature verification in the code comments this time, and you can read more about it here:
http://www.ietf.org/rfc/rfc2313.txt

https://github.com/eoinof/stem/commit/269587c646e7841d0c6fee914e2a8deabbe89fe5
I've ignored the verification failure so all unit tests will still pass.

comment:20 follow-up: Changed 4 years ago by atagar

  • Status changed from new to needs_revision

This whitespace problem is a bugfix. [I've split it out from the updated descriptor]

Merged.

I've added more info on the signature verification in the code comments this time, and you can read more about it here:
...

Wow, looks great! First, are you ok for your stem patches (past and future) to be in the public domain? Here's the reason why I need to ask...

https://trac.torproject.org/projects/tor/wiki/doc/stem#CopyrightforPatches

Besides that just a few questions and suggestions...

+from Crypto.Util import asn1
+from Crypto.Util.number import bytes_to_long, long_to_bytes

If these are only available by default on debian based distros then we should have this in a try/catch for an ImportError. Actually, since this is replacing the rsa module maybe we should make a method for this like the stem.prereq.is_rsa_available()?

+ #Ensure the digest of the descriptor has been calculated

Minor thing but please have a space between the # and the start of the comment. It makes it easier to read. The convention that I've been using for comments is to do properly puncuated sentences if it spans mupltiple sentences...

# Configuration options that are fetched by a special key. The keys are
# lowercase to make case insensitive lookups easier.

... and do all lowercase without punctuation if it's a single sentence...

# unchangeable GETINFO parameters

+ #Validate the descriptor if required.
+ if validate and not self.is_valid():
+ log.error("Descriptor info not valid")
+ raise ValueError("Invalid data")

This provides precious little information to the caller. Rather than having is_valid() that returns a boolean maybe it should be a validate_content() method that raises a ValueError? Then your present log.warn() calls could instead raise a ValueError with a useful message.

key_as_string = .join(self.signing_key.split('\n')[1:4])

This took me a couple minutes to figure out. Maybe rename it and add a comment? Also, we should end on -1 in case the size of the key content ever changes.

# strips off the '-----BEGIN RSA PUBLIC KEY-----' header and corresponding footer

key_content = ''.join(self.signing_key.split('\n')[1:-1])

#TODO - what is the purpose of allowing a NULL fingerprint ?

Because the tor spec says that it isn't mandatory...

https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt#l411

I'm not entirely clear but maybe it's optional due to being first introduced in tor verison 0.1.0.6? If so then that reason no longer makes sense (0.1.0.6 has been dead for years). Wanna file a tor ticket to ask Nick if this should be made a mandatory field?

+ #FIXME - stopgap measure until tests are fixed.
+ #return False

I think that you misunderstood my suggestion about the unit tests. The unit tests are trying to check various aspects of how server descriptors are parsed and validated. To do this they need to be able to craft custom descriptors and those custom descriptors will not be valid according to _verify_descriptor(). Personally I don't think that this is a bug.

To account for this the *unit tests* should mock _verify_descriptor() to simply return that the descriptor is valid. This does not require a hack in the RelayDescriptor class. That is to say, the unit tests should have...

def setUp(self):
  mocking.mock(stem.descriptor.server_descriptor.RelayDescriptor._verify_descriptor, mocking.return_true())

def tearDown(self):
  mocking.revert_mocking()

+ log.warn("unable to calculate digest for descriptor")
+ #TODO should we raise here ?

Sure, feel free. We will only hit that condition if we were constructed from grossly invalid content and the user set 'validate' to False when we were constructed. At this point the caller deserves an exception. ;)

def digest(self):
...

It looks like you're making two changes here...

  1. We're stripping content prior to the "router " prefix. This is a good point, I forgot about annotations or that callers might allow for invalid content.
  1. Rather than returning the hexdigest() you're returning the digest(). Is this desirable?

+ ##################
+ ##PROPER MESSING!!
+ ##################

I'm not sure what this means.

Cheers! -Damian

comment:21 in reply to: ↑ 20 Changed 4 years ago by eoinof

Replying to atagar:

Wow, looks great! First, are you ok for your stem patches (past and future) to be in the public domain? Here's the reason why I need to ask...

https://trac.torproject.org/projects/tor/wiki/doc/stem#CopyrightforPatches

Yeah, I was aware of that, I have no problem for them to be in the public domain.

Besides that just a few questions and suggestions...

+from Crypto.Util import asn1
+from Crypto.Util.number import bytes_to_long, long_to_bytes

If these are only available by default on debian based distros then we should have this in a try/catch for an ImportError. Actually, since this is replacing the rsa module maybe we should make a method for this like the stem.prereq.is_rsa_available()?

I'm not sure exactly what platforms include the lib so I'll the put the prereq check back in.

+ #Ensure the digest of the descriptor has been calculated

Minor thing but please have a space between the # and the start of the comment. It makes it easier to read. The convention that I've been using for comments is to do properly puncuated sentences if it spans mupltiple sentences...

# Configuration options that are fetched by a special key. The keys are
# lowercase to make case insensitive lookups easier.

... and do all lowercase without punctuation if it's a single sentence...

# unchangeable GETINFO parameters

I'll update the comments to follow this convention.

+ #Validate the descriptor if required.
+ if validate and not self.is_valid():
+ log.error("Descriptor info not valid")
+ raise ValueError("Invalid data")

This provides precious little information to the caller. Rather than having is_valid() that returns a boolean maybe it should be a validate_content() method that raises a ValueError? Then your present log.warn() calls could instead raise a ValueError with a useful message.

I think I based this structure on the previous comments in the ticket. If you prefer exceptions I can change the code to use them. Do you guys mind custom exceptions ? I'd be inclined to leave the log messages in as well, but I'll stick with whatever your preferred style is for the code..

key_as_string = .join(self.signing_key.split('\n')[1:4])

This took me a couple minutes to figure out. Maybe rename it and add a comment? Also, we should end on -1 in case the size of the key content ever changes.

# strips off the '-----BEGIN RSA PUBLIC KEY-----' header and corresponding footer

key_content = ''.join(self.signing_key.split('\n')[1:-1])

Ok, It's sometimes hard to know what parts need explaining :)

#TODO - what is the purpose of allowing a NULL fingerprint ?

Because the tor spec says that it isn't mandatory...

https://gitweb.torproject.org/torspec.git/blob/HEAD:/dir-spec.txt#l411

I'm not entirely clear but maybe it's optional due to being first introduced in tor verison 0.1.0.6? If so then that reason no longer makes sense (0.1.0.6 has been dead for years). Wanna file a tor ticket to ask Nick if this should be made a mandatory field?

Ah, ok.. I'll remove the TODO so.. I'll look into filing a tor ticket about the fingerprint once I've finished all the stuff here.

+ #FIXME - stopgap measure until tests are fixed.
+ #return False

I think that you misunderstood my suggestion about the unit tests. The unit tests are trying to check various aspects of how server descriptors are parsed and validated. To do this they need to be able to craft custom descriptors and those custom descriptors will not be valid according to _verify_descriptor(). Personally I don't think that this is a bug.

To account for this the *unit tests* should mock _verify_descriptor() to simply return that the descriptor is valid. This does not require a hack in the RelayDescriptor class. That is to say, the unit tests should have...

def setUp(self):
  mocking.mock(stem.descriptor.server_descriptor.RelayDescriptor._verify_descriptor, mocking.return_true())

def tearDown(self):
  mocking.revert_mocking()

I was thinking about adding some code to create valid signed descriptors. This would mean the unit tests would work, and would be exercising more of the code base. The only changes necessary would be in cases where the unit test got a descriptor, and then changed some of it's contents.. in this case you'd need to add a new function call sign_descriptor(..)

+ log.warn("unable to calculate digest for descriptor")
+ #TODO should we raise here ?

Sure, feel free. We will only hit that condition if we were constructed from grossly invalid content and the user set 'validate' to False when we were constructed. At this point the caller deserves an exception. ;)

Ok, I'll change this.

def digest(self):
...

It looks like you're making two changes here...

  1. We're stripping content prior to the "router " prefix. This is a good point, I forgot about annotations or that callers might allow for invalid content.
  1. Rather than returning the hexdigest() you're returning the digest(). Is this desirable?

I'd prefer not to return anything from the function. I'm not sure that the digest needs to be publicly accessible? The reason I changed the code was that the decrypted signature[digest] is not in hex so I decided that rather than converting both the local digest, and the decrypted digest to hex I'd just leave them as they are.

+ ##################
+ ##PROPER MESSING!!
+ ##################

In an ideal world I'd much prefer to use a crypto lib, rather than rolling my own. So I added this to note my disapproval.. I'll remove it.

comment:22 Changed 4 years ago by atagar

Great. Please swap this ticket back to 'Needs code/patch review' when you've pushed the revisions.

I think I based this structure on the previous comments in the ticket. If you prefer exceptions I can change the code to use them.

Gotcha. Since this is being used for validation it makes more sense for us to raise a ValueError, like the other validation issues do.

Do you guys mind custom exceptions ?

Custom exceptions are fine if there's a good reason for it (see stem/init.py and stem/connection.py for some examples of places that we have custom exceptions).

However, in this case the ServerDescriptor is documented as raising a ValueError when the content is invalid. You can add a custom exception for these issues if you want, but please have it subclass ValueError.

I'd be inclined to leave the log messages in as well, but I'll stick with whatever your preferred style is for the code..

Our other validation errors (... of which there's a ton) simply raise an exception so we should probably do that here too. I can see some advantages to logging the issue as well, but if we did that then it should probably be done for all validation issues.

I was thinking about adding some code to create valid signed descriptors.

Ah ha. Agreed, that would definitely be better.

I'd prefer not to return anything from the function. I'm not sure that the digest needs to be publicly accessible?

As I understand it this function is chiefly so that we can determine if the descriptor's digest matches what the network status says that it should be...

https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/server_descriptor.py#l257
https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/router_status_entry.py#l248

However, in that case the ServerDescriptor's digest() method and RouterStatusEntryV3's digest attribute would ideally have the same encoding. It looks like ServerDescriptor was using a hex encoding and RouterStatusEntryV3 was simply storing the base64 value. Here's a router status entry example...

r caerSidi p1aag7VwarGxqctS7/fS0y5FU+s wyCIDl1crtM3lkLDEFFpR8+V8oM 2012-11-17 05:17:27 71.35.143.   230 9001 0
s Fast Named Running Stable Valid 
v Tor 0.2.2.35
w Bandwidth=39
p reject 1-65535

My vote would be for both of them to provide the hex value, though I could be persuaded otherwise.

Cheers! -Damian

comment:23 Changed 4 years ago by eoinof

  • Status changed from needs_revision to needs_review

I've made the changes we discussed and created a new commit
https://github.com/eoinof/stem/commit/5668385ffd024f9cc2aa910a4ed4d02edc2375fe

Some things to note:
Since the RelayDescriptor now raises an exception with invalid data there is
some fragile code that may hang. I saw this in some of the threaded calls.
I'll file a separate ticket about it. It will only occur if the data
in descriptor_archive.tar.bz2 were invalid..

The digest function returns a hex value, as these are better if being
passed around.. etc.. but it is not uppercase!

I removed some minor unit tests that no longer made sense.

comment:24 Changed 4 years ago by eoinof

Actually I'll add the info on the hang here since the code has not
yet been added to Tor.
In stem/descriptor/init__.py:121
_parse_metrics_file()

    yield stem.descriptor.server_descriptor.RelayDescriptor(descriptor_file.read())

will hang if construction of the RelayDescriptor raises an exception.. as happens
with the above patch.

This will only occur if the data is invalid.

comment:25 Changed 4 years ago by atagar

  • Status changed from needs_review to needs_revision

It will only occur if the data
in descriptor_archive.tar.bz2 were invalid..

Oops. Iirc I messed with descriptor data at a few points to either make the tests more interesting or obscure people's email address. Feel free to replace fabrications with real descriptor data. You can probably find some in your 'stem/test/data/cached-consensus' file. If not then let me know and I'll fill in the gaps.

... will hang if construction of the RelayDescriptor raises an exception.. as happens
with the above patch.

Ewww. Any idea why? I'd expect the exception to be propagated through to the above parse_file() caller, and iirc there's tests for this through the 'stem.descriptor.reader'. Does this hang when parsing invalid data without your changes or is it a new regression?

========================================

If you replace 'self._digest' with self.digest() at...

+      # The local digest is stored in hex so need to encode the decrypted digest
+      digest_hex = digest.encode('hex')
+      if digest_hex != self._digest:
+        log.warn("Decrypted digest does not match local digest")
+        raise ValueError("Decrypted digest does not match local digest")

Then you can drop the digest() call and comment from...

+    # validate the descriptor if required
+    if validate:
+      # ensure the digest of the descriptor has been calculated
+      self.digest()
+      self._validate_content()

========================================

+  def digest(self):
+    # Digest is calculated from everything in the
+    # descriptor except the router-signature.
+    raw_descriptor = str(self)
+    start_token = "router "

This isn't how a lazy loading method work (self._digest is essentially never used and you're always recalculating it). You're saving self._digest then returning that so any reason not to do something like the following?

def digest(self):
  if self._digest is None:
    ... present code...

  return self._digest

========================================

+      log.warn("unable to calculate digest for descriptor")
+      raise ValueError("unable to calculate digest for descriptor")

I still think that we should drop all of these logging calls. Maybe do the logging in parse_file() Instead so you log for all validation failures rather than just these new ones?

========================================

+    :raises a ValueError if signature does not match content,

Missing parentheses and extra comma. Probably better as ":raises: ValueError if signature does not match the content".

========================================

+    if not self.signature:
+      log.warn("Signature missing")
+      raise ValueError("Signature missing")

This check isn't really necessary now that this method is private. By this point we know that we have a valid descriptor object (except for this check), so the signature attribute must exist. That said, it's fine to leave in if you want.

========================================

+ key_as_string = .join(self.signing_key.split('\n')[1:4])

Ah, didn't catch that you're doing a '\n' split and join so the key content all resides on a single line. Is that important? If so then please note it in the above comment.

========================================

+    # if we have a fingerprint then check that our fingerprint is a hash of
+    # our signing key
+    if self.fingerprint:

If you do end up filing a ticket with tor to make the fingerprint mandatory then please add a link to it in the comment here. Ticket links can be truncated, for instance "https://trac.torproject.org/5810" works for this ticket (no need to include the "projects/tor/ticket/" boilerplate).

========================================

+      log.notice("No fingerprint for this descriptor")

This log message doesn't contain enough information to be useful. If, say, 1% of descriptors are missing fingerprints then making a 'GETINFO desc/all-recent' call would result in dozens of these messages without saying *what* is missing a fingerprint. We should either drop this to an INFO level message and include more information or drop the log message entirely. Personally I favor the later until Nick says if fingerprints should be mandatory or not.

========================================

+      log.info("Descriptor verified.")

We definitely don't want this. A single 'GETINFO desc/all-recent' would flood our INFO level logs with this message, which really isn't helpful.

========================================

+    if not stem.prereq.is_crypto_available():
+      return
+    else:

No need for the else indentation block here.

========================================

+      decrypted_int = pow(sig_as_long, public_exponent ,modulus)

Typo with placement of a space.

========================================

+  if IS_CRYPTO_AVAILABLE == None:

Oops, my bad. I don't always remember but None checks should use identity comparison...
http://www.python.org/dev/peps/pep-0290/#testing-for-none

========================================

-    self.assertEquals("2C7B27BEAB04B4E2459D89CA6D5CD1CC5F95A689", desc.digest())
+    self.assertEquals("2C7B27BEAB04B4E2459D89CA6D5CD1CC5F95A689", desc.digest().upper())

I'm a little inclined toward having digest() provide an uppercase value since every other hex thing I'm aware of in tor (fingerprints for instance) are uppercase. Is there any reason you can think of to make it lowercase?

========================================

+def sign_descriptor_content(desc_content):
...

Yikes! Very, very nice. Mind adding some pydocs?

comment:26 Changed 4 years ago by eoinof

  • Status changed from needs_revision to needs_review

Hey, we're getting there..
Thanks for the code reviews..

I only saw the hang problem when I was testing with invalid data.
The existing data is fine. The problem seems to be related to how
the threads handle an exception. The RelayDescriptor.init() raising an exception
is new behaviour I guess..
http://stackoverflow.com/questions/2829329/catch-a-threads-exception-in-the-caller-thread-in-python
I think this can be addressed as a separate problem.

My thinking with not storing the digest was that since
many of the descriptors attributes are publicly accessible
the code would behave better in the case where someone did this

a_desc = RelayDescriptor(desc)
a_desc.contact = "hello"
dig = a_desc.digest()

but I'm not sure that this is plausible usage.. also for this to work you'd need
to use digest() everywhere in the code, and there would be no point in storing
self._digest at all
So I've switched to the caching approach.

I've removed the logging calls, except the one that prints info on
the fingerprint mismatch.
I'm a big fan of logging but I'll defer to the norms of the project :)
It's quite clear that warnings & errors are not that widely used as you don't
see anything when running the unit tests with log level WARN/ERR.

My thinking on leaving the digest as lowercase was that this is how it
is stored in the signature. However I can change the code to convert it
to lowercase & uppercase where necessary.

I moved the key string splitting code to a separate function as it's done more than once.

See this checkin.
https://github.com/eoinof/stem/commit/d58abea3042a909464826e16e2b19bae10c29be4

ps: some other "== None" occurences..

stem/descriptor/server_descriptor.py~: if self._scrubbing_issues == None:
stem/descriptor/server_descriptor.py: if self._scrubbing_issues == None:
stem/connection.py: if controller == None:

comment:27 Changed 4 years ago by atagar

  • Status changed from needs_review to new

Hey, we're getting there..
Thanks for the code reviews..

My pleasure. Thanks for these patches! This has been a task that I've been dreading. ;)

Pushed, with some very minor whitespace tweaks...

https://gitweb.torproject.org/stem.git/commitdiff/e0095fbe54759c45cbf6d1b120d2b17b47a0ec21
https://gitweb.torproject.org/stem.git/commitdiff/5da6b9790da266f96258c7c6d6a439ca2ef06529
https://gitweb.torproject.org/stem.git/commitdiff/d82a70a4fb874ca295c1644e3c77f24afddcbf06

I only saw the hang problem when I was testing with invalid data.
The existing data is fine. The problem seems to be related to how
the threads handle an exception. The RelayDescriptor.init() raising an exception
is new behaviour I guess..

http://stackoverflow.com/questions/2829329/catch-a-threads-exception-in-the-caller-thread-in-python

I think this can be addressed as a separate problem.

The reference to that link puzzled me for a sec but now I think that I see what you mean - you think that the DescriptorReader worker thread is the issue? If so then this looks like a likely culprit...

https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/reader.py#l486

The earlier parse_file() call catches ValueErrors but this one doesn't. Hence invalid descriptor in an archive might cause problems. Does adding the following fix it?

except ValueError, exc:
  self._notify_skip_listeners(target, ParsingFailure(exc))

Do you have a repro for this? I just ran the integ tests but they passed.

I've removed the logging calls, except the one that prints info on
the fingerprint mismatch.
I'm a big fan of logging but I'll defer to the norms of the project :)

Thanks. Part of the trouble with logging in the descriptor parsing is that the validation is actually partly used ot figure out if a file *is* a specific descriptor type. Ie, if it can be parsed as a server descriptor then it is a server descriptor. Hence the parser seeing invalid data doesn't necessarily indicate a problem.

I love logging too, but it loses its usefulness if there's many false positives.

It's quite clear that warnings & errors are not that widely used as you don't
see anything when running the unit tests with log level WARN/ERR.

That's only because I haven't encountered many things yet that seem to warrant them. I'm trying to keep with tor's pattern of logging which is...

ERROR = Something critical is broken, and impairing functionality. The user should be worried.
WARN = Something's wrong but we're carrying on. The user should know.
NOTICE = Information that isn't necessarily a problem, but the user should be informed.
INFO = High level runtime information.
DEBUG = Low level runtime information.
TRACE = Request/reply logging. Tor doesn't have this, but it's a handy level to have.

I'm generally happy for things to be at INFO level or lower, but things NOTICE or above should warrant end user's attention. Hence I'm far pickier on those. If you find something that should have additional logging then please do add it. I haven't payed as much attention to logging as I probably should.

ps: some other "== None" occurences..

Shame on me. I'm sure there's quite a few "!= None" occurrences too - I haven't corrected those yet.

comment:28 Changed 4 years ago by eoinof

My repro for the hang was to just temporarily put a

raise ValueError("test")

at the start of the _verify_descriptor function!

The extra except does indeed prevent the hang..
And the integ tests run to completion with no other 'hangs'.

Eoin.

comment:29 Changed 4 years ago by atagar

  • Resolution set to implemented
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.