I believe this is different from all the other instances of this bug (#11965 (moved) and friends), because the client never recovers (I am using a pluggable transport that is experimental, but the symptoms don't point at my code at first glance).
Client debug log:
May 15 19:36:24.000 [debug] connection_dir_client_reached_eof(): Received response from directory server '127.0.0.1:52810': 404 "Not found" (purpose: 6)
May 15 19:36:24.000 [info] connection_dir_client_reached_eof(): Received server info (size 0) from server '127.0.0.1:52810'
May 15 19:36:24.000 [info] connection_dir_client_reached_eof(): Received http status code 404 ("Not found") from server '127.0.0.1:52810' while fetching "/tor/server/authority.z". I'll try again soon.
May 15 19:36:24.000 [debug] conn_close_if_marked(): Cleaning up connection (fd -1).
May 15 19:36:24.000 [debug] connection_remove(): removing socket -1 (type Directory), n_conns now 3
The bridge is fully bootstrapped at this point according to the logs. Bridge functionality should be fully working once the bridge bootstraps to 100%, right? This does seem to happen most often after I restart both the client and the bridge to pick up a new build of the PT binary...
The only notable config option besides the PT is "PublishServerDescriptor 0" (A cursory search for authority.z brings up #9366 (moved)).
Your bridge may have bootstrapped to 100%, but that doesn't mean it could learn its address. The lines you quoted indicated that your bridge returned 404 when asked for its bridge descriptor. That is, it hasn't generated its descriptor yet. Perhaps if you set its address explicitly? What are the torrc lines for bridge and client?
> Your bridge may have bootstrapped to 100%, but that doesn't mean it could learn its address. The lines you quoted indicated that your bridge returned 404 when asked for its bridge descriptor. That is, it hasn't generated its descriptor yet. Perhaps if you set its address explicitly? What are the torrc lines for bridge and client?
This is indeed possible: I do not have Address set in the bridge-side torrc. (Would specifying the loopback address break anything here?)
I assume a configuration like this isn't something that's seen in the wild (probably only PT developers do this sort of thing?). FWIW, I haven't seen the client actually retry fetching the bridge descriptor when I've triggered this in the past, and I've waited more than 60 seconds because I thought it might be the other bugs.
If indeed it hasn't generated its descriptor yet, then just using it as a vanilla bridge on its orport should fail too. That should remove some components from your situation.
What does the bridge say about its reachability testing? Or does it not even get to that because it thinks it has no publicly routable address?
Tor won't retry fetching the bridge descriptor just 60 seconds later -- it failed, so it's unlikely to succeed again so soon after:
{{{
V(TestingBridgeDownloadSchedule, CSV_INTERVAL, "3600, 900, 900, 3600"),
}}}
> If indeed it hasn't generated its descriptor yet, then just using it as a vanilla bridge on its orport should fail too. That should remove some components from your situation.
I have observed this in the past when testing (ORPort not working).
> What does the bridge say about its reachability testing? Or does it not even get to that because it thinks it has no publicly routable address?
Haven't tried letting it go that far (and it would fail anyway since I don't forward the port).
I've been doing my development with Address set and haven't run into this again, so I believe your diagnosis is correct.
In light of that, I'm not sure there's a bug here. It might be nice to have a warning when fetching the bridge descriptor fails, but bridges in the wild presumably have a real address and won't trigger this issue in the first place.
Sorry for taking up your time, and please feel free to close this if all of this behaviour is ok.
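For anyone reproducing this locally, here is a minimal sketch of a bridge-side torrc with the address set explicitly, as discussed above. Every value is a placeholder rather than something taken from this ticket: 192.0.2.1 is a documentation-only address, and the port, the transport name `mypt`, and the binary path are hypothetical.

```
# Bridge-side torrc sketch (all values are placeholders, not from this ticket)
BridgeRelay 1
ORPort 9001
ExtORPort auto
# Set the address explicitly instead of leaving tor to work it out
Address 192.0.2.1
# Keep a test bridge from publishing to the bridge authority
PublishServerDescriptor 0
# Hypothetical transport name and path for an experimental PT
ServerTransportPlugin mypt exec /path/to/pt-binary
```

Whether a loopback or other private address is accepted for `Address` is not settled in this thread, so the sketch deliberately does not use one.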
Typically, when things get stuck at ~20% in my testing, there is a port-blocking issue. As in, the local network is restricting ports to a whitelisted set.
Just a note as I'm searching trac, you get the same symptom (bootstrapping stops at 20%) if you are a client with UseBridges set and connect to a bridge that has neither BridgeRelay nor DirPort set. info logging is:
[notice] Bootstrapped 20%: Asking for networkstatus consensus
[info] internal circ (length 1): $0000000000000000000000000000000000000000(open)
[info] connection_ap_handshake_send_begin(): Sending relay cell 1 to begin stream 21318.
[info] connection_ap_handshake_send_begin(): Address/port sent, ap socket -1, n_circ_id 3101216939
[info] connection_ap_process_end_not_open(): Edge got end (not a directory) before we're connected. Marking for close.
[info] internal circ (length 1): $0000000000000000000000000000000000000000(open)
[info] stream_end_reason_to_socks5_response(): Reason for ending (526) not recognized; sending generic socks error.
[info] connection_free_(): Freeing linked Socks connection [waiting for connect response] with 57 bytes on inbuf, 0 on outbuf.
[info] connection_dir_client_reached_eof(): 'fetch' response not all here, but we're at eof. Closing.
[info] connection_dir_request_failed(): Giving up on serverdesc/extrainfo fetch from directory server at '0.0.2.0'; retrying
[info] connection_free_(): Freeing linked Directory connection [client reading] with 0 bytes on inbuf, 0 on outbuf.
[info] compute_weighted_bandwidths(): Empty routerlist passed in to consensus weight node selection for rule weight as guard
[info] smartlist_choose_node_by_bandwidth(): Empty routerlist passed in to old node selection for rule weight as guard
[info] should_delay_dir_fetches(): Delaying dir fetches (no running bridges known)
[info] compute_weighted_bandwidths(): Empty routerlist passed in to consensus weight node selection for rule weight as guard
[info] smartlist_choose_node_by_bandwidth(): Empty routerlist passed in to old node selection for rule weight as guard
[info] should_delay_dir_fetches(): Delaying dir fetches (no running bridges known)
I guess that #12538 (moved) will make it work to use any old relay as a bridge.
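To make the case above concrete, a sketch of the two sides follows. The address and port are placeholders (192.0.2.1 is a documentation-only address), and the assumption, taken from the note above, is that setting BridgeRelay on the relay side is what avoids the 20% stall.

```
# Client-side torrc sketch (placeholder address and port)
UseBridges 1
Bridge 192.0.2.1:9001

# Relay-side torrc sketch: without BridgeRelay (or DirPort) set here,
# the client above reportedly stalls at 20%
ORPort 9001
BridgeRelay 1
```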
This is a strange problem that I faced today as well. Here is some information about my setup:
OpenVZ container with virtual network interface (venet)
OS: Debian 7 Wheezy
Network configuration:
lo (local loopback)
venet0 - inet addr 127.0.0.2 and inet6 addr /128 scope global
venet0:0 - inet addr <public v4 address>
Trying to build a bridge with obfs3 and obfs4 pluggable transports on Tor 0.2.5.8-rc and obfs4proxy installed from Debian sid repo.
I wanted the bridge to listen on both v4 and v6, so I put this in torrc:
ORPort [::]:443
the same for obfs3 and obfs4 listen addr [::]:port
(this opened ORPort on both versions of IP. Checked via a remote server port checker and ports open, everything working fine). OR and obfuscated ports open on v4 and v6.
When I try to connect to the bridge via IPv4, it gets stuck at loading network status. It takes forever and never progresses. I tried connecting to it as a normal bridge, an obfs3 bridge and an obfs4 bridge, with the same result: network status does not load, as if the bridge is not working or cannot build Tor circuits.
Added "Address " in torrc. Restarted Tor daemon. The same, nothing fixed.
Finally, I removed ORPort [::]:443 from torrc and put in two entries instead, as follows:
ORPort :443
ORPort :443
I left the obfuscated listening ports untouched but also removed the "Address <v4 address>" entry, then restarted the Tor daemon.
And... it worked. It connected to the Tor network just fine. Connected via regular bridge, obfs3 and obfs4, all working fine. Ports are open on v6 too.
I might add the fact that these are private bridges configured not to send anything to the bridge authorities (maybe this is somehow relevant) [PublishServerDescriptor 0]
Since all this happened within somewhere between 30 minutes and 1 hour, I am not sure whether my modification of torrc was the fix or the bridge actually learned its own IP and started to work. I don't understand why a bridge would take this long to become functional (the bridge had finished bootstrapping when I tried to connect to it).
This workaround gives false information to the bridge authority, so it must be used with PublishServerDescriptor 0.
Also, this workaround makes this issue a duplicate of #4847 (moved).
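For reference, the two-entry ORPort workaround described above might look like the sketch below. The original addresses were not given, so this assumes one explicit IPv4 entry and one explicit IPv6 entry, using documentation-only placeholder addresses (192.0.2.1 and 2001:db8::1) that must be replaced.

```
# Sketch of the two-ORPort workaround (placeholder addresses, an assumption)
BridgeRelay 1
ORPort 192.0.2.1:443
ORPort [2001:db8::1]:443
# Per the comment above, only use this on a bridge that does not publish
PublishServerDescriptor 0
```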