Opened 7 weeks ago

Closed 43 hours ago

#23985 closed defect (fixed)

If less than 15 missing mds, Tor will delay md download for 10 mins

Reported by: asn Owned by: nickm
Priority: High Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version: Tor: 0.3.0.6
Severity: Normal Keywords: tor-guard, tor-bridge, tor-client, 030-backport, 031-backport
Cc: catalyst, isis, bmeson, mrphs Actual Points:
Parent ID: #21969 Points:
Reviewer: Sponsor:

Description

In launch_descriptor_downloads() if we are missing between 1 to 15 mds (MAX_DL_TO_DELAY), Tor will delay the md download for 10 mins (or until we are missing >= 16 mds). See TestingClientMaxIntervalWithoutRequest for the 10 min delay.

This is bad when comboed with #21969 since if one of the 15 missing mds is for one of your top two primary guards, tor will hang for 10 mins with missing descriptor for primary guards before bootstrapping.

The probability of this happening is small (about 0.004 I think for 6.4k mds in total) but given the amount of clients we have this is bound to happen for some of them.

Child Tickets

Change History (16)

comment:1 Changed 5 weeks ago by asn

Perhaps the fix here is to disable this delay functionality if the missing descriptors are delaying bootstrap (e.g. if missing descs are primary guard descs)?

comment:2 in reply to:  1 ; Changed 5 weeks ago by nickm

Replying to asn:

Perhaps the fix here is to disable this delay functionality if the missing descriptors are delaying bootstrap (e.g. if missing descs are primary guard descs)?

Before we do that, let's consider the purpose of waiting here: we don't want anybody to be able to force us to use a secondary guard by denying us descriptors for any primary guards.

comment:3 in reply to:  2 Changed 5 weeks ago by asn

Replying to nickm:

Replying to asn:

Perhaps the fix here is to disable this delay functionality if the missing descriptors are delaying bootstrap (e.g. if missing descs are primary guard descs)?

Before we do that, let's consider the purpose of waiting here: we don't want anybody to be able to force us to use a secondary guard by denying us descriptors for any primary guards.

Hm. I meant to suggest we disable the TestingClientMaxIntervalWithoutRequest delay functionality if we are missing descs of primary guards. Aka not wait 10 mins before fetching the primary guard descriptors.

I think perhaps you understood that I suggested we disable the "waiting for primary guard descriptors" functionality, which was certainly not my intention. Sorry for the non-clear wording above.

comment:4 Changed 5 weeks ago by nickm

Hm. I meant to suggest we disable the TestingClientMaxIntervalWithoutRequest delay functionality if we are missing descs of primary guards. Aka not wait 10 mins before fetching the primary guard descriptors.

Ah! You mean something like

diff --git a/src/or/routerlist.c b/src/or/routerlist.c
index f04e2ca160331b..f587bfadcef1a1 100644
--- a/src/or/routerlist.c
+++ b/src/or/routerlist.c
@@ -5010,6 +5010,11 @@ launch_descriptor_downloads(int purpose,
       log_debug(LD_DIR,
                 "There are enough downloadable %ss to launch requests.",
                 descname);
+    } else if (! router_have_minimum_dir_info()) {
+      log_debug(LD_DIR,
+                "We are only missing %d %ss, but we'll fetch anyway, since "
+                "we don't yet have enough directory info.",
+                n_downloadable, descname);
     } else {
 
       /* should delay */

comment:5 Changed 5 weeks ago by nickm

Priority: MediumHigh

comment:6 Changed 5 weeks ago by asn

Yes exactly.

Perhaps we can also put the whole if (!directory_fetches_dir_info_early(options)) block into a function called static int should_delay_descriptor_downloads() to make it a bit cleaner too.

comment:7 Changed 5 weeks ago by asn

Sebastian points out that this MAX_DL_TO_DELAY delay functionality might also be performed by dirservers which makes the problem worse, since it means that dirservers will wait even longer before they get to 100% mds.

comment:8 Changed 5 weeks ago by nickm

I think for the purposes of possible backporting, we should go with the minimal change here for now. See bug23985_029.

comment:9 Changed 5 weeks ago by nickm

Keywords: 030-backport 031-backport added
Owner: set to nickm
Status: newaccepted

comment:10 Changed 5 weeks ago by nickm

Status: acceptedneeds_review

comment:11 Changed 5 weeks ago by arma

bug23985_029 looks good.

comment:12 in reply to:  7 Changed 5 weeks ago by arma

Replying to asn:

Sebastian points out that this MAX_DL_TO_DELAY delay functionality might also be performed by dirservers which makes the problem worse, since it means that dirservers will wait even longer before they get to 100% mds.

That n_downloadable >= MAX_DL_TO_DELAY should only be checked if !directory_fetches_dir_info_early().

And in theory (there might be bugs), dir mirrors fetch dir info early.

comment:13 Changed 5 weeks ago by Sebastian

great, I haven't looked at any code just asked about the behaviour. I have no indication that it happens otherwise

comment:14 Changed 5 weeks ago by asn

Status: needs_reviewmerge_ready

ACKing the branch here. Let's do it.

comment:15 Changed 5 weeks ago by nickm

Milestone: Tor: 0.3.2.x-finalTor: 0.3.1.x-final

merged bug23985_029 to 0.3.2 and forward; marking for possible backport.

comment:16 Changed 43 hours ago by nickm

Resolution: fixed
Status: merge_readyclosed

merged to 0.2.9 and forward!

Note: See TracTickets for help on using tickets.