One option is to add another bootstrap phase for people who need to fetch bridge descriptors. Somewhere in the 18% range I guess.
Another option is to fold this bridge descriptor fetching into the 15% phase -- that is, treat it as not having a circuit open until after we've got our bridge descriptor. Kind of clunky.
A third option is to leave it alone, since it's not really a big problem in most cases. I only noticed because #11965 (moved) sure seemed (based on bootstrap level 20%) like it was about failing to fetch a consensus, when it was really about not liking the bridge descriptor that Tor had quietly fetched instead of the consensus it said it was asking for.
I picked 'unspecified' for the milestone because I'm fine with option 3 for at least the short and medium term, unless somebody wants to step up and make this right.
Adding another bootstrap phase for fetching bridge descriptors seems like a reasonable idea to me. Probably not hard to implement either.
I'd second that. I think arma's 18% option wouldn't be so difficult, and arma is correct that failing to fetch the consensus vs. failing to fetch bridge descriptors are two different bootstrapping problems.
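To make the 18% option concrete, here is a rough sketch of where the new phase would sit. The percentages and summary strings for the existing phases are the ones tor prints in its bootstrap log lines. Only BOOTSTRAP_STATUS_CONN_DIR, BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC and BOOTSTRAP_STATUS_REQUESTING_STATUS are names used in this ticket; everything prefixed with SKETCH_ or sketch_ is a placeholder for illustration, not tor's actual definition.

```c
#include <string.h>

/* Sketch only: bootstrap phases around the proposed 18% step. */
typedef enum {
  SKETCH_STARTING = 0,
  BOOTSTRAP_STATUS_CONN_DIR = 5,
  SKETCH_HANDSHAKE_DIR = 10,
  SKETCH_ONEHOP_CREATE = 15,
  BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC = 18, /* proposed new phase */
  BOOTSTRAP_STATUS_REQUESTING_STATUS = 20,
  SKETCH_LOADING_STATUS = 25
} sketch_bootstrap_status_t;

/* Map a phase to the human-readable summary shown in the log. */
const char *
sketch_bootstrap_summary(sketch_bootstrap_status_t s)
{
  switch (s) {
    case SKETCH_STARTING:
      return "Starting";
    case BOOTSTRAP_STATUS_CONN_DIR:
      return "Connecting to directory server";
    case SKETCH_HANDSHAKE_DIR:
      return "Finishing handshake with directory server";
    case SKETCH_ONEHOP_CREATE:
      return "Establishing an encrypted directory connection";
    case BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC:
      return "Asking for bridge descriptors";
    case BOOTSTRAP_STATUS_REQUESTING_STATUS:
      return "Asking for networkstatus consensus";
    case SKETCH_LOADING_STATUS:
      return "Loading networkstatus consensus";
  }
  return "Unknown";
}
```

With this layout, a client that gets stuck at 18% is stuck on the bridge descriptor, and one that gets stuck at 20% is stuck on the consensus, which is the distinction the ticket is asking for.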
I'm concerned about all the things that look at this number. Can we defer merging this to 0.2.9?
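To illustrate the kind of breakage this concern is about, here is a hedged sketch of two hypothetical consumers of the progress number. Neither is taken from a real controller; the function names are made up, and the event line in the usage note only assumes the general "PROGRESS=N" shape of bootstrap status events.

```c
#include <stdio.h>
#include <string.h>

/* Extract N from a line containing "PROGRESS=N"; return -1 if absent. */
int
sketch_parse_progress(const char *line)
{
  const char *p = strstr(line, "PROGRESS=");
  int pct;
  if (!p || sscanf(p, "PROGRESS=%d", &pct) != 1)
    return -1;
  return pct;
}

/* Fragile consumer: only recognizes the exact values it was written for. */
const char *
sketch_stage_exact(int pct)
{
  switch (pct) {
    case 15: return "opening directory circuit";
    case 20: return "fetching consensus";
    default: return "unknown";
  }
}

/* More tolerant consumer: treats the number as a threshold, so a new
 * intermediate value falls into the nearest earlier stage. */
const char *
sketch_stage_threshold(int pct)
{
  if (pct < 0)
    return "unknown";
  if (pct >= 20)
    return "fetching consensus";
  if (pct >= 15)
    return "opening directory circuit";
  return "starting";
}
```

Given a line such as `650 STATUS_CLIENT NOTICE BOOTSTRAP PROGRESS=18 TAG=hypothetical_tag SUMMARY="Asking for bridge descriptors"`, the exact-match consumer reports "unknown" while the threshold consumer still reports a sensible stage. Anything written in the first style would need auditing before a new value ships.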
That's fine by me. I was wondering why it hadn't been bumped to 0.2.9, and why it received the TorCoreTeam201605 tag. (I had assumed you or arma wanted it prioritised for some reason.)
Trac: Milestone: Tor: 0.2.8.x-final to Tor: 0.2.9.x-final
One question. When we trigger the BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC control event, we do it in the circuit building function instead of the "requesting bridge desc" function which I assume is directory_get_from_dirserver.
For instance, BOOTSTRAP_STATUS_REQUESTING_DESCRIPTORS is used when we realize we need more descriptors rather than when the circuit is built for that request. What if that onehop circuit never gets built? We won't know that it was because we requested a bridge descriptor.
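A toy sketch of the difference between the two placements, under the assumption that the only thing that varies is whether the one-hop circuit gets built. The function names and the SKETCH_ constants are invented for illustration (18 is just the value floated earlier in the ticket); this does not model tor's real call graph.

```c
#include <stdbool.h>

enum {
  SKETCH_EARLIER_PHASE = 0,
  SKETCH_REQUESTING_BRIDGE_DESC = 18
};

/* Placement A: report the phase when we decide to request the bridge
 * descriptor, before any circuit exists. Returns the last phase reported. */
int
sketch_report_at_request(bool circuit_builds)
{
  int last_phase = SKETCH_REQUESTING_BRIDGE_DESC; /* reported up front */
  (void)circuit_builds; /* the circuit outcome cannot hide the phase */
  return last_phase;
}

/* Placement B: report the phase from the circuit-building path, so it is
 * only seen once the circuit actually opens. */
int
sketch_report_at_circuit(bool circuit_builds)
{
  int last_phase = SKETCH_EARLIER_PHASE;
  if (circuit_builds)
    last_phase = SKETCH_REQUESTING_BRIDGE_DESC;
  return last_phase;
}
```

With placement B and a circuit that never builds, the last reported phase is still the earlier one, so the log gives no hint that a bridge descriptor request was what we were waiting on.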
> One question. When we trigger the BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC control event, we do it in the circuit building function instead of the "requesting bridge desc" function which I assume is directory_get_from_dirserver.
So... what happens is that we just build a circuit to our bridge, without knowing its key, and then we ask the bridge for the consensus. The bridge responds with its descriptor. Then, usually, because the scheduled events run, we call fetch_bridge_descriptors() and the bridge gives us the descriptor again. Then we decide "oh, I actually have the bridge descriptor, now I can ask for the consensus again." When we ask for the consensus again, that is the first time that any of the code in directory.c is called, and never before that (because of this fumbling around that happens with fetching the bridge descriptor).
However, if we were to trigger BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC in fetch_bridge_descriptors, then it seems like we would want to put this event at something like 3% bootstrapped, because "15%" is "establishing an encrypted directory connection" where the "directory" in question is actually the bridge (for which we already have its descriptor at this point, because of the stupid fumbling), and the point at which we get the descriptor is before BOOTSTRAP_STATUS_CONN_DIR=5 (5%).
> For instance, BOOTSTRAP_STATUS_REQUESTING_DESCRIPTORS is used when we realize we need more descriptors rather than when the circuit is built for that request. What if that onehop circuit never gets built? We won't know that it was because we requested a bridge descriptor.
Ah. That is true.
Okay, there's an alternate version of the patch in my bug11966_v2 branch and it logs like this:
May 31 15:11:45.000 [notice] Bootstrapped 0%: Starting
May 31 15:11:45.000 [notice] Delaying directory fetches: No running bridges
May 31 15:11:46.000 [notice] Bootstrapped 3%: Asking for bridge descriptors
May 31 15:11:46.000 [notice] Bootstrapped 5%: Connecting to directory server
May 31 15:11:46.000 [notice] Bootstrapped 10%: Finishing handshake with directory server
May 31 15:11:46.000 [notice] Learned fingerprint 8F347C5673390E46642B06A7B1F4088B59437AD0 for bridge 127.0.0.1:5009.
May 31 15:11:46.000 [notice] Bootstrapped 15%: Establishing an encrypted directory connection
May 31 15:11:46.000 [notice] new bridge descriptor 'test009br' (fresh): $8F347C5673390E46642B06A7B1F4088B59437AD0~test009br at 127.0.0.1
May 31 15:11:46.000 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
May 31 15:11:47.000 [notice] Bootstrapped 20%: Asking for networkstatus consensus
May 31 15:11:47.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
(FWIW, I still like the 18% method better but I don't really care either way.)
Ok! Thanks for the explanation, now it's clearer. :)
Yeah I think the 18% step alone is fine. Our circuit is opened and we are about to request the bridge descriptor followed by 20% which is getting the consensus using that bridge. Sounds good.
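One property worth keeping in mind when picking the number: the sketch below assumes bootstrap progress is only ever reported forwards, so a phase reported after a higher one has already been seen is dropped. That is an assumption about the reporting logic stated here for illustration, and the struct and function names are invented.

```c
#include <stddef.h>

/* Sketch of forward-only progress tracking. */
typedef struct {
  int percent;        /* highest phase reported so far, -1 if none */
  int events_emitted; /* how many phase changes were actually reported */
} sketch_bootstrap_t;

void
sketch_bootstrap_init(sketch_bootstrap_t *b)
{
  b->percent = -1;
  b->events_emitted = 0;
}

/* Report a phase; return 1 if it was emitted, 0 if it was ignored
 * because we had already reached it or something later. */
int
sketch_bootstrap_report(sketch_bootstrap_t *b, int pct)
{
  if (pct <= b->percent)
    return 0;
  b->percent = pct;
  b->events_emitted++;
  return 1;
}

/* Replay a sequence of phases and count how many were emitted. */
size_t
sketch_replay(const int *phases, size_t n)
{
  sketch_bootstrap_t b;
  size_t i, emitted = 0;
  sketch_bootstrap_init(&b);
  for (i = 0; i < n; i++)
    emitted += (size_t)sketch_bootstrap_report(&b, phases[i]);
  return emitted;
}
```

Under that assumption, the flow 0, 5, 10, 15, 18, 20, 25 emits every phase, while an 18 that arrives after 20 has already been reported would be silently dropped. So the 18% step only helps if it is triggered before the consensus request phase.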
A) This wants a control-spec.txt patch too (Section 5.5 at the bottom).
B) + BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC=3,
Doesn't this want to be =18? Oh, I see there is discussion above on this topic. I need to read that discussion more, but I hope to not buy it. If two of us like the 18% approach more, but there are bugs, we should explore and fix those bugs.
C) The change in circuitbuild.c now no longer triggers BOOTSTRAP_STATUS_REQUESTING_STATUS in some cases? And it's not clear that we will necessarily get back into circuit_send_next_onion_skin() later, to trigger it? So does that mean there are now cases where we accidentally skip this bootstrap phase?
> A) This wants a control-spec.txt patch too (Section 5.5 at the bottom).
Indeed
> B) + BOOTSTRAP_STATUS_REQUESTING_BRIDGE_DESC=3,
> Doesn't this want to be =18? Oh, I see there is discussion above on this topic. I need to read that discussion more, but I hope to not buy it. If two of us like the 18% approach more, but there are bugs, we should explore and fix those bugs.
v2 branch is not ideal. The right branch is bug11966. (same for C).
@isis, before this can be merged upstream, a control-spec patch is needed. imo bug11966 is ready for merge but still let's make sure C) (from arma review) is actually addressed.
Trac: Status: merge_ready to needs_revision Keywords: TorCoreTeam201605 deleted, TorCoreTeam201606 added
> @isis, before this can be merged upstream, a control-spec patch is needed. imo bug11966 is ready for merge but still let's make sure C) (from arma review) is actually addressed.
Yeah, (C) only applies to bug11966_v2.
There's a control-spec.txt patch in my torspec.git repo, in the bug11966 branch.
If there's someone who is primarily Sponsor8 funded who'd like to work on this, lmk. Otherwise I can revisit my patch and probably call it SponsorM-can.
These tickets were marked as removed, and nobody has said that they can fix them. Let's remember to look at 033-removed-20180320 as we re-evaluate our triage process, to see whether we're triaging out unnecessarily, and to evaluate whether we're deferring anything unnecessarily. But for now, we can't do these: we need to fix the 033-must stuff now.
Trac: Milestone: Tor: 0.3.3.x-final to Tor: unspecified
If we do this, I think it falls under #25502 (moved) ? Apparently there is code here that we might be able to use.
The bug11966 branch seems to be about bootstrap reporting, so this probably belongs under #28018 (moved).
Deferring 51 tickets from 0.4.0.x-final. Tagging them with 040-deferred-20190220 for visibility. These are the tickets that did not get 040-must, 040-can, or tor-ci.