Opened 6 months ago

Closed 6 months ago

#23457 closed defect (fixed)

prop224: Service descriptor uploads race condition

Reported by: dgoulet
Owned by:
Priority: Medium
Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-hs, prop224
Cc:
Actual Points:
Parent ID:
Points:
Reviewer: asn
Sponsor:



The service receives a new consensus, and a microdescriptor fetch follows right after. The HS subsystem is notified that new directory information has arrived, so it considers re-uploading its descriptors:

Sep 07 23:43:01.000 [info] A consensus needs 5 good signatures from recognized authorities for us to accept it. This one has 8 (dannenberg tor26 longclaw maatuska moria1 dizum gabelmoo Faravahar).
Sep 07 23:43:02.000 [info] hs_service_dir_info_changed(): New dirinfo arrived: consider reuploading descriptor
Sep 07 23:43:02.000 [info] launch_descriptor_downloads(): Launching 3 requests for 148 microdescs, 50 at a time

... and an upload is scheduled immediately (at now()) for all 6 HSDirs. So far so good. Then the microdescriptors arrive and a second upload is triggered because the service's hashring changed. Remember that a relay is only considered part of the hashring once we have its microdescriptor (md):

Sep 07 23:43:02.000 [info] handle_response_fetch_microdesc(): Received answer to microdescriptor request (status 200, body size 19779) from server ''
Sep 07 23:43:02.000 [info] hs_service_dir_info_changed(): New dirinfo arrived: consider reuploading descriptor

Within two seconds, two uploads were initiated with two different revision counters, say 1 and 2. To spare you more text and logs: upload 2 reached the HSDir before upload 1 finished, so by the time 1 arrived the HSDir already held revision 2. Since 1 < 2, the HSDir rejects it and responds with a 400 malformed descriptor, like so:

Sep 07 23:43:05.000 [warn] Uploading hidden service descriptor: http status 400 ("Invalid HS descriptor. Rejected.") response from dirserver '<RELAY>'. Malformed hidden service descriptor?
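
The rejection can be reproduced with a toy model. This is only a sketch, assuming the HSDir remembers the highest revision counter it has stored for a descriptor and refuses anything that is not newer; `ToyHSDir`, `handle_upload` and the `"desc"` key are hypothetical names, not Tor's code:

```python
class ToyHSDir:
    """Toy stand-in for an HSDir cache; not Tor's actual implementation."""

    def __init__(self):
        # descriptor key -> highest revision counter stored so far
        self.cached_rev = {}

    def handle_upload(self, desc_key, rev_counter):
        """Return an HTTP-like status code for an uploaded descriptor."""
        current = self.cached_rev.get(desc_key)
        if current is not None and rev_counter <= current:
            return 400  # not newer than what is cached: rejected
        self.cached_rev[desc_key] = rev_counter
        return 200


hsdir = ToyHSDir()
# Upload 2 (launched second) completes first; upload 1 lands afterwards.
statuses = [hsdir.handle_upload("desc", rev) for rev in (2, 1)]
print(statuses)
```

The late arrival of revision 1 is the only upload that fails, and the cache still ends up holding revision 2, which matches the "benign but noisy" outcome seen in the logs.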

The consequence is benign: the HSDir ends up with the correct descriptor and clients are able to reach the service. But we get this annoying warning in the logs, which we can easily prevent, and ultimately more load on the network.

The fix is to cancel all in-progress uploads (for the specific descriptor) right before trying to upload a new one, because the new one will always have a higher revision counter.
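
A minimal sketch of that idea, assuming the service tracks its pending uploads per descriptor; `ToyUploader`, `cancel_uploads` and the HSDir names are hypothetical and only illustrate the ordering, they are not the code in the branch:

```python
class ToyUploader:
    """Toy model of a service that cancels stale uploads before a new one."""

    def __init__(self):
        self.next_rev = 1
        # descriptor id -> list of (hsdir, revision counter) still in flight
        self.in_flight = {}
        self.cancelled = []

    def cancel_uploads(self, desc_id):
        """Drop every pending upload for this descriptor."""
        self.cancelled.extend(self.in_flight.pop(desc_id, []))

    def upload(self, desc_id, hsdirs):
        """Cancel older uploads first, then launch one with a higher counter."""
        self.cancel_uploads(desc_id)
        rev = self.next_rev
        self.next_rev += 1
        self.in_flight[desc_id] = [(hsdir, rev) for hsdir in hsdirs]
        return rev


uploader = ToyUploader()
uploader.upload("current", ["hsdir_a", "hsdir_b"])  # triggered by the consensus
uploader.upload("current", ["hsdir_a", "hsdir_c"])  # triggered by the microdescs
pending = uploader.in_flight["current"]
print(pending, uploader.cancelled)
```

After the second call only revision 2 is still in flight, so an older revision can no longer arrive at an HSDir after a newer one.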

We could instead schedule a new descriptor upload only once all requested microdescriptors have arrived, but that would probably introduce a reachability issue: a client could query the correct HSDir and fail to find the service, because the service is still waiting on all mds before uploading its descriptor to the new hashring...

Change History (4)

comment:1 Changed 6 months ago by dgoulet

Reviewer: asn
Status: new → needs_review

See branch: bug23457_032_01

It contains the solution where we close all active directory connections for the descriptor before upload.

comment:2 Changed 6 months ago by asn

Status: needs_review → merge_ready

OK, please see my branch bug23457_032_01 from my repo for the fix here.

comment:3 Changed 6 months ago by dgoulet


comment:4 Changed 6 months ago by nickm

Resolution: fixed
Status: merge_ready → closed

okay; merged to master.
