It's hard to add code that makes a previously valid descriptor invalid (as we'd like to do for #7484 (moved) and #9286 (moved)), since doing so can put us in a loop where we download the descriptor, get it, reject it, and download it again.
Instead, we should record that the descriptor with some given hash is simply invalid. That's easy for microdescriptors, but a bit harder for router descriptors, since the hash doesn't include the signature itself. So we need to check the signature in that case before rejecting.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items
0
Link issues together to show that they're related.
Learn more.
I don't believe so. I think that code applies when we get uploaded descriptor; not when we download one. (I'm just assuming, since it's been in the codebase for so long without exploding.)
Relevant changes are microdesc.c, routerlist.c, routerparse.{c,h}, the
rest just carries along and doesn't use the extra parameter.
Theory here is router_parse_entry_from_string() and
extrainfo_parse_entry_from_string() use the can_dl_again_out pointer
to pass out an indication of whether the descriptor should be ignored
because immutably flawed - i.e., something covered by the hash is invalid,
or it's possible for a future download to yield something different and
succeed.
Then microdescs_parse_from_string() and router_parse_list_from_string()
use that flag to assemble lists of known-immutably-invalid descriptor
hashes to be blacklisted from future download attempts.
In router_parse_list_from_string(), it'd be possible for a suitably formed
string to result in the same hash being added to invalid_digests_out
multiple times. Are the callers of this code aware of that?
The same comments as for router_parse_list_from_string() apply to
microdescs_parse_from_string().
In router_parse_entry_from_string(), we parse things that would make the
descriptor immutably invalid first, having can_dl_again set to zero, and
then set can_dl_again to one before the one check that would possibly
reject the descriptor but accept one with the same hash: the signature
verification. This seems to be correct.
The changes to extrainfo_parse_entry_from_string() parallel those to
router_parse_entry_from_string() and appear to be correct.
In both router_parse_entry_from_string() and
extrainfo_parse_entry_from_string(), correctness depends on all of the
reasons for rejecting an input before setting can_dl_again = 1 be functions
only of the input itself as covered by the hash, and with no dependencies
on any other state of the router. Seems unlikely we'd want to ever change
the parsing to be any other way, but perhaps a comment warning future
editors of those functions would be advisable.
Use of digestmap_t in microdescs_add_to_cache() looks like it will handle
repeated hashes, so that worry is assuaged. These magic 1/2/3 constants
seem a bit ugly though. Can we get some #defines or something?
a99a7fdfdf5e02ef1460f5a49960e77f2ecc6c8b:
This looks okay
d64041a1ba4ff765a3fe65db8819f20dce3cf8e0:
Good catch
69d7a56972e2982c7c5817ab2139f3e83cba27d5:
This is just a whitespace fixup
24b24e4a97117c295b310ccf0f8da327e9d7e8a3:
Yay, we get new unit tests even for code we already had!
cbde09755392a705b56b4c04290bb2885cef55a7:
This looks fine.
cf4cdd034174600ddedb5145d5eaa7e7601082e9:
Looks good.
dc145f2ec77ed812d24d7c4e4f639e6200533647:
Looks good.
87b32019e93f12dc37f8352750928e5f183ddfd2:
Looks good.
185868241b7130b2f1b24cf8ed2e50c998d1ca4c:
Change description sounds fine, but isn't it redundant with the one
before?
This is a good idea; I've opened a ticket for it as #13399 (moved).
[...]
In router_parse_list_from_string(), it'd be possible for a suitably formed
string to result in the same hash being added to invalid_digests_out
multiple times. Are the callers of this code aware of that?
The same comments as for router_parse_list_from_string() apply to
microdescs_parse_from_string().
Hm. Interesting. I should check these out. I believe the answer is "yes that's okay", but I should check. (TODO 1)
[...]
In both router_parse_entry_from_string() and
extrainfo_parse_entry_from_string(), correctness depends on all of the
reasons for rejecting an input before setting can_dl_again = 1 be functions
only of the input itself as covered by the hash, and with no dependencies
on any other state of the router. Seems unlikely we'd want to ever change
the parsing to be any other way, but perhaps a comment warning future
editors of those functions would be advisable.
ok. (TODO 2)
Use of digestmap_t in microdescs_add_to_cache() looks like it will handle
repeated hashes, so that worry is assuaged. These magic 1/2/3 constants
seem a bit ugly though. Can we get some #defines or something?
Will do. (TODO 3)
[...]
24b24e4a97117c295b310ccf0f8da327e9d7e8a3:
Yay, we get new unit tests even for code we already had!
It seemed important and worthwhile. Also it was fun.
[...]
Change description sounds fine, but isn't it redundant with the one
before?
re TODO 1: Added a comment; confirmed as harmless. (ca53b357922b50bcabf5dfdbc9ed56771c73e8cc (Note that parse-list functions may add duplicate 'invalid' entries)
re TODO 2: Fixed in e74290d8be59612bfe321e66bd10e9def656ea0f (Bugfixes on bug11243 fix for the not-added cases and tests) among other stuff.
re TODO 3: Fixed in 770ca94a602d98811e375986623db21e972c8f7e (Use symbolic constants for statuses in microdescs_add_to_cache).
Running moria1 on master (1ce57267), it chokes on startup trying to read its cached descriptors:
(gdb) where#0 0x00007f2d201ab625 in raise () from /lib64/libc.so.6#1 0x00007f2d201ace05 in abort () from /lib64/libc.so.6#2 0x00007f2d219590b8 in crash_handler (sig=<value optimized out>, si=<value optimized out>, ctx_=<value optimized out>) at src/common/backtrace.c:143#3 <signal handler called>#4 extrainfo_insert (ei=0x7f2d27382500, msg=<value optimized out>, from_cache=1, from_fetch=<value optimized out>) at src/or/routerlist.c:2963#5 router_add_extrainfo_to_routerlist (ei=0x7f2d27382500, msg=<value optimized out>, from_cache=1, from_fetch=<value optimized out>) at src/or/routerlist.c:3508#6 0x00007f2d218b5194 in router_load_extrainfo_from_string ( s=0x7f2d1c0fc84f <Address 0x7f2d1c0fc84f out of bounds>, eos=<value optimized out>, saved_location=<value optimized out>, requested_fingerprints=0x0, descriptor_digests=0) at src/or/routerlist.c:3980#7 0x00007f2d218b58a6 in router_reload_router_list_impl (store=0x7f2d23caed48) at src/or/routerlist.c:1196#8 0x00007f2d218b5a6e in router_reload_router_list () at src/or/routerlist.c:1243#9 0x00007f2d2186fddb in do_main_loop () at src/or/main.c:1972#10 0x00007f2d21871825 in tor_main (argc=<value optimized out>, argv=<value optimized out>) at src/or/main.c:2997#11 0x00007f2d20197d5d in __libc_start_main () from /lib64/libc.so.6#12 0x00007f2d2186e019 in _start ()
Thanks for the thorough report, Roger! That made it easy to fix. I just added commit f5fc7e3306f03c8a304dad89ada2280466d54c9a, which should make it better.
Still hoping for a more thorough review of 223d354e344ad3d07766b7c19e1841342c270ad2
Reviewed 223d354e344ad3d07766b7c19e1841342c270ad2 and it looks good, I think.
Should WRA_NEVER_DOWNLOADABLE() also include ROUTER_WAS_NOT_NEW? IIUC you get that if you have already have a newer version of this extrainfo descriptor. Do we want to keep on asking in that case? Or did I misunderstand?
If we have a newer version of the ei descriptor or a newer one, then we won't try to download the old one again, until/unless we are holding an ri that lists it. I think.
Thanks for the review! Opening a new ticket (#13644 (moved)) for confusing names.
Trac: Resolution: N/Ato implemented Status: needs_review to closed