Opened 5 years ago

Closed 4 years ago

#11243 closed defect (implemented)

Don't fetch any descriptor which we already fetched and found to be ill-formed

Reported by: nickm Owned by:
Priority: High Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay 026-triaged-1 nickm-patch
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

It's hard to add code that makes a previously valid descriptor invalid (as we'd like to do for #7484 and #9286), since doing so can put us in a loop where we download the descriptor, get it, reject it, and download it again.

Instead, we should record that the descriptor with some given hash is simply invalid. That's easy for microdescriptors, but a bit harder for router descriptors, since the hash doesn't include the signature itself. So we need to check the signature in that case before rejecting.

Child Tickets

Change History (22)

comment:1 Changed 5 years ago by arma

Is #11149 another case where this ticket applies?

comment:2 Changed 5 years ago by nickm

I don't believe so. I think that code applies when we get uploaded descriptor; not when we download one. (I'm just assuming, since it's been in the codebase for so long without exploding.)

comment:3 Changed 5 years ago by nickm

Keywords: 026-triaged-1 added

comment:4 Changed 5 years ago by nickm

Keywords: nickm-patch added

I have a possible fix implemented in my branch "bug11243". It needs unit tests before I believe in it, but it's not totally crazy.

comment:5 Changed 5 years ago by andrea

I see no bug11243 branch in your repository, nickm. Did you forget to push, perhaps?

comment:6 Changed 5 years ago by nickm

Oops. Pushed now. Still needs tests.

comment:7 Changed 5 years ago by nickm

Status: newneeds_review

Now it has unit tests! And the branch in question now also has unit tests for parsing a bunch of stuff whose parsing we didn't test before.

comment:8 Changed 5 years ago by nickm

Priority: normalmajor

comment:9 Changed 5 years ago by nickm

rl1987 says 185868241b7130b2f1b24cf8ed2e50c998d1ca4c and c062758486dd75b3fb3030324cd76a2a206676bd both added a changes file.

comment:10 Changed 5 years ago by nickm

asn says this looks okay to him.

comment:11 Changed 5 years ago by andrea

Begin code review for nickm's bug11243 branch:

c062758486dd75b3fb3030324cd76a2a206676bd:

  • /* XXXX Change this to digest256_len */ ?
  • Relevant changes are microdesc.c, routerlist.c, routerparse.{c,h}, the rest just carries along and doesn't use the extra parameter.
  • Theory here is router_parse_entry_from_string() and extrainfo_parse_entry_from_string() use the can_dl_again_out pointer to pass out an indication of whether the descriptor should be ignored because immutably flawed - i.e., something covered by the hash is invalid, or it's possible for a future download to yield something different and succeed.
  • Then microdescs_parse_from_string() and router_parse_list_from_string() use that flag to assemble lists of known-immutably-invalid descriptor hashes to be blacklisted from future download attempts.
  • In router_parse_list_from_string(), it'd be possible for a suitably formed string to result in the same hash being added to invalid_digests_out multiple times. Are the callers of this code aware of that?
  • The same comments as for router_parse_list_from_string() apply to microdescs_parse_from_string().
  • In router_parse_entry_from_string(), we parse things that would make the descriptor immutably invalid first, having can_dl_again set to zero, and then set can_dl_again to one before the one check that would possibly reject the descriptor but accept one with the same hash: the signature verification. This seems to be correct.
  • The changes to extrainfo_parse_entry_from_string() parallel those to router_parse_entry_from_string() and appear to be correct.
  • In both router_parse_entry_from_string() and extrainfo_parse_entry_from_string(), correctness depends on all of the reasons for rejecting an input before setting can_dl_again = 1 be functions only of the input itself as covered by the hash, and with no dependencies on any other state of the router. Seems unlikely we'd want to ever change the parsing to be any other way, but perhaps a comment warning future editors of those functions would be advisable.
  • Use of digestmap_t in microdescs_add_to_cache() looks like it will handle repeated hashes, so that worry is assuaged. These magic 1/2/3 constants seem a bit ugly though. Can we get some #defines or something?

a99a7fdfdf5e02ef1460f5a49960e77f2ecc6c8b:

  • This looks okay

d64041a1ba4ff765a3fe65db8819f20dce3cf8e0:

  • Good catch

69d7a56972e2982c7c5817ab2139f3e83cba27d5:

  • This is just a whitespace fixup

24b24e4a97117c295b310ccf0f8da327e9d7e8a3:

  • Yay, we get new unit tests even for code we already had!

cbde09755392a705b56b4c04290bb2885cef55a7:

  • This looks fine.

cf4cdd034174600ddedb5145d5eaa7e7601082e9:

  • Looks good.

dc145f2ec77ed812d24d7c4e4f639e6200533647:

  • Looks good.

87b32019e93f12dc37f8352750928e5f183ddfd2:

  • Looks good.

185868241b7130b2f1b24cf8ed2e50c998d1ca4c:

  • Change description sounds fine, but isn't it redundant with the one before?

End code review.

comment:12 in reply to:  11 Changed 5 years ago by nickm

Replying to andrea:

Begin code review for nickm's bug11243 branch:

c062758486dd75b3fb3030324cd76a2a206676bd:

  • /* XXXX Change this to digest256_len */ ?

This is a good idea; I've opened a ticket for it as #13399.

[...]

  • In router_parse_list_from_string(), it'd be possible for a suitably formed string to result in the same hash being added to invalid_digests_out multiple times. Are the callers of this code aware of that?
  • The same comments as for router_parse_list_from_string() apply to microdescs_parse_from_string().

Hm. Interesting. I should check these out. I believe the answer is "yes that's okay", but I should check. (TODO 1)

[...]

  • In both router_parse_entry_from_string() and extrainfo_parse_entry_from_string(), correctness depends on all of the reasons for rejecting an input before setting can_dl_again = 1 be functions only of the input itself as covered by the hash, and with no dependencies on any other state of the router. Seems unlikely we'd want to ever change the parsing to be any other way, but perhaps a comment warning future editors of those functions would be advisable.

ok. (TODO 2)

  • Use of digestmap_t in microdescs_add_to_cache() looks like it will handle repeated hashes, so that worry is assuaged. These magic 1/2/3 constants seem a bit ugly though. Can we get some #defines or something?

Will do. (TODO 3)

[...]

24b24e4a97117c295b310ccf0f8da327e9d7e8a3:

  • Yay, we get new unit tests even for code we already had!

It seemed important and worthwhile. Also it was fun.

[...]

  • Change description sounds fine, but isn't it redundant with the one before?

Fixed.

comment:13 Changed 5 years ago by nickm

re TODO 1: Added a comment; confirmed as harmless. (ca53b357922b50bcabf5dfdbc9ed56771c73e8cc (Note that parse-list functions may add duplicate 'invalid' entries)

re TODO 2: Fixed in e74290d8be59612bfe321e66bd10e9def656ea0f (Bugfixes on bug11243 fix for the not-added cases and tests) among other stuff.

re TODO 3: Fixed in 770ca94a602d98811e375986623db21e972c8f7e (Use symbolic constants for statuses in microdescs_add_to_cache).

comment:14 Changed 5 years ago by nickm

oh hey. That wasn't re TODO 2. It was just annother issue, re my unit tests not passing. Adding another comment.

comment:15 Changed 5 years ago by nickm

Okay. Probably better now. I'm going to have another quick look at ca53b357922b50bcabf5dfdbc9ed56771c73e8cc to make sure it looks right.

comment:16 Changed 5 years ago by nickm

err, no. Another look at e74290d8be59612bfe321e66bd10e9def656ea0f. :/

comment:17 Changed 5 years ago by nickm

Okay, I've reviewed that again, and made bug11243_squashed, and merged it to master. Thanks for the reviews, everyone!

comment:18 Changed 5 years ago by nickm

Leaving in needs-review till somebody has looked at 223d354e344ad3d07766b7c19e1841342c270ad2

comment:19 Changed 4 years ago by arma

Running moria1 on master (1ce57267), it chokes on startup trying to read its cached descriptors:

(gdb) where
#0  0x00007f2d201ab625 in raise () from /lib64/libc.so.6
#1  0x00007f2d201ace05 in abort () from /lib64/libc.so.6
#2  0x00007f2d219590b8 in crash_handler (sig=<value optimized out>, 
    si=<value optimized out>, ctx_=<value optimized out>)
    at src/common/backtrace.c:143
#3  <signal handler called>
#4  extrainfo_insert (ei=0x7f2d27382500, msg=<value optimized out>, 
    from_cache=1, from_fetch=<value optimized out>) at src/or/routerlist.c:2963
#5  router_add_extrainfo_to_routerlist (ei=0x7f2d27382500, 
    msg=<value optimized out>, from_cache=1, from_fetch=<value optimized out>)
    at src/or/routerlist.c:3508
#6  0x00007f2d218b5194 in router_load_extrainfo_from_string (
    s=0x7f2d1c0fc84f <Address 0x7f2d1c0fc84f out of bounds>, 
    eos=<value optimized out>, saved_location=<value optimized out>, 
    requested_fingerprints=0x0, descriptor_digests=0)
    at src/or/routerlist.c:3980
#7  0x00007f2d218b58a6 in router_reload_router_list_impl (store=0x7f2d23caed48)
    at src/or/routerlist.c:1196
#8  0x00007f2d218b5a6e in router_reload_router_list ()
    at src/or/routerlist.c:1243
#9  0x00007f2d2186fddb in do_main_loop () at src/or/main.c:1972
#10 0x00007f2d21871825 in tor_main (argc=<value optimized out>, 
    argv=<value optimized out>) at src/or/main.c:2997
#11 0x00007f2d20197d5d in __libc_start_main () from /lib64/libc.so.6
#12 0x00007f2d2186e019 in _start ()
(gdb) up
#4  extrainfo_insert (ei=0x7f2d27382500, msg=<value optimized out>,
    from_cache=1, from_fetch=<value optimized out>) at src/or/routerlist.c:2963
2963        r = (sd->extrainfo_is_bogus) ? ROUTER_BAD_EI : ROUTER_NOT_IN_CONSENSUS;
(gdb) print sd
$1 = (signed_descriptor_t *) 0x0

Looks like it was introduced in git commit 223d354e3:

   if (routerinfo_incompatible_with_extrainfo(ri, ei, sd, NULL)) {
+    r = (sd->extrainfo_is_bogus) ? ROUTER_BAD_EI : ROUTER_NOT_IN_CONSENSUS;
     goto done;
   }

where it assumes that sd will be defined.

...which is how I ran across this ticket.

comment:20 Changed 4 years ago by nickm

Thanks for the thorough report, Roger! That made it easy to fix. I just added commit f5fc7e3306f03c8a304dad89ada2280466d54c9a, which should make it better.

Still hoping for a more thorough review of 223d354e344ad3d07766b7c19e1841342c270ad2

comment:21 Changed 4 years ago by asn

Reviewed 223d354e344ad3d07766b7c19e1841342c270ad2 and it looks good, I think.

Should WRA_NEVER_DOWNLOADABLE() also include ROUTER_WAS_NOT_NEW? IIUC you get that if you have already have a newer version of this extrainfo descriptor. Do we want to keep on asking in that case? Or did I misunderstand?

comment:22 Changed 4 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

If we have a newer version of the ei descriptor or a newer one, then we won't try to download the old one again, until/unless we are holding an ri that lists it. I think.

Thanks for the review! Opening a new ticket (#13644) for confusing names.

Note: See TracTickets for help on using tickets.