Opened 11 months ago

Closed 9 months ago

#28717 closed defect (fixed)

Tor stuck in 25% Loading networkstatus consensus

Reported by: loskiq Owned by:
Priority: Medium Milestone: Tor: 0.3.4.x-final
Component: Core Tor/Tor Version: Tor: 0.3.4.9
Severity: Normal Keywords:
Cc: loskiq@… Actual Points:
Parent ID: Points:
Reviewer: nickm Sponsor:

Description

I installed the Tor Bridge. When launched, it gets stuck:

Dec 04 10:42:33.943 [notice] Tor 0.3.4.9 running on Linux with Libevent 2.1.8-stable, OpenSSL LibreSSL 2.8.2, Zlib 1.2.8, Liblzma N/A, and Libzstd N/A.
Dec 04 10:42:33.944 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Dec 04 10:42:33.944 [notice] Read configuration file "/etc/tor/torrc".
Dec 04 10:42:33.947 [notice] Scheduler type KIST has been enabled.
Dec 04 10:42:33.947 [notice] Opening Socks listener on 127.0.0.1:9050
Dec 04 10:42:33.000 [notice] Parsing GEOIP IPv4 file /usr/local/share/tor/geoip.
Dec 04 10:42:34.000 [notice] Parsing GEOIP IPv6 file /usr/local/share/tor/geoip6.
Dec 04 10:42:34.000 [notice] We were built to run on a 64-bit CPU, with OpenSSL 1.0.1 or later, but with a version of OpenSSL that apparently lacks accelerated support for the NIST P-224 and P-256 groups. Building openssl with such support (using the enable-ec_nistp_64_gcc_128 option when configuring it) would make ECDH much faster.
Dec 04 10:42:34.000 [notice] Bootstrapped 0%: Starting
Dec 04 10:42:34.000 [notice] Starting with guard context "bridges"
Dec 04 10:42:34.000 [notice] new bridge descriptor 'huy' (cached): $3E0CFCEE7183970DCC70ABC2D10518BC288BF0DE~huy at 79.103.124.21
Dec 04 10:42:34.000 [notice] Delaying directory fetches: Pluggable transport proxies still configuring
Dec 04 10:42:35.000 [notice] Bootstrapped 5%: Connecting to directory server
Dec 04 10:42:35.000 [notice] Bootstrapped 10%: Finishing handshake with directory server
Dec 04 10:42:35.000 [notice] Bootstrapped 15%: Establishing an encrypted directory connection
Dec 04 10:42:35.000 [notice] Bootstrapped 20%: Asking for networkstatus consensus
Dec 04 10:42:35.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
Dec 04 10:47:39.000 [notice] Delaying directory fetches: No running bridges

My ISP does not block Tor. This problem does not depend on the country, because I tested the Tor client with my bridge in Russia, Germany and the Netherlands. The same problem occurs without using obfs4proxy. I think that in this case it has nothing to do with the time zone.

Child Tickets

Attachments (8)

torrc_of_the_client.txt (336 bytes) - added by loskiq 11 months ago.
torrc of the client
torrc_of_the_bridge.txt (241 bytes) - added by loskiq 11 months ago.
torrc of the bridge
log_of_the_bridge.txt (3.4 KB) - added by loskiq 11 months ago.
log of the bridge
tor.log (30.5 KB) - added by loskiq 11 months ago.
log on the client
tor_bridge.log (46.4 KB) - added by loskiq 11 months ago.
log on bridge
tor_client.log (48.7 KB) - added by loskiq 11 months ago.
log on client
obfs4proxy_server.log (349 bytes) - added by loskiq 11 months ago.
obfs4proxy log of server
obfs4proxy_client.log (279 bytes) - added by loskiq 11 months ago.
obfs4proxy log of client

Download all attachments as: .zip

Change History (40)

comment:1 Changed 11 months ago by loskiq

Cc: loskiq@… added
Component: - Select a componentCore Tor/Tor

comment:2 Changed 11 months ago by loskiq

P.S. The attached logs belongs to the Tor client, not to the bridge

Last edited 11 months ago by loskiq (previous) (diff)

comment:3 Changed 11 months ago by teor

Milestone: Tor: unspecified

What do the logs on the bridge say?
Has the bridge bootstrapped?

Can you attach your bridge and client torrcs?

Changed 11 months ago by loskiq

Attachment: torrc_of_the_client.txt added

torrc of the client

Changed 11 months ago by loskiq

Attachment: torrc_of_the_bridge.txt added

torrc of the bridge

Changed 11 months ago by loskiq

Attachment: log_of_the_bridge.txt added

log of the bridge

comment:4 in reply to:  3 Changed 11 months ago by loskiq

Attached. Yes, bridge is bootstrapped

Replying to teor:

What do the logs on the bridge say?
Has the bridge bootstrapped?

Can you attach your bridge and client torrcs?

comment:5 Changed 11 months ago by cypherpunks3

Last edited 11 months ago by cypherpunks3 (previous) (diff)

comment:6 in reply to:  5 Changed 11 months ago by loskiq

This ip address belongs to my bridge. I wanted to hide it and therefore I changed the address in the logs, but missed this line.

Replying to cypherpunks3:

Guessed our IP address as 54.93.104.200 (source: 79.137.112.4).

This.

54.93.104.200 works for me

comment:7 Changed 11 months ago by teor

It looks like your obfs4 is only listening on IPv6:

Dec 04 08:12:04.000 [notice] Registered server transport 'obfs4' at '[::]:40635'

But your client is configured to connect via IPv4:

Bridge obfs4 79.103.124.21:40635 3E0CFCEE7183970DCC70ABC2D10518BC288BF0DE cert=ZzN5WrKqUZCkHYlb8gh0Ew1B5tMgO+oP60jxfar1r3A8yMH/syZ0T3td4x13VbEj1+G4EQ iat-mode=0

Unless you changed the IPv4 address to an IPv6 address when you redacted it.
(Please don't change logs without telling us what you changed, it makes debugging much harder.)

Can you check if port 40635 is open on IPv4, IPv6, or both?
If it's open on IPv4, tor has a logging bug.

If it's not, you can use ServerTransportListenAddress to set the correct address.
And tor has a default IP version bug.

Also, you probably don't need this torrc option, it doesn't do anything on most systems:
HardwareAccel 1

comment:8 in reply to:  7 Changed 11 months ago by loskiq

I checked port 40635 on IPv4 via nmap, and it is open. I just changed the IP address of my bridge. Sorry. The correct IP address of my bridge is 54.93.104.200, not 79.103.124.21

loskiq@loskiq-work:~$ nmap 54.93.104.200 -p 40635

Starting Nmap 7.40 ( https://nmap.org ) at 2018-12-05 17:33 MSK
Nmap scan report for ec2-54-93-104-200.eu-central-1.compute.amazonaws.com (54.93.104.200)
Host is up (0.070s latency).
PORT      STATE SERVICE
40635/tcp open  unknown

Nmap done: 1 IP address (1 host up) scanned in 0.28 seconds

And I removed the option HardwareAccel. Thanks for this.

Replying to teor:

comment:9 Changed 11 months ago by teor

Ok, tor has a logging bug.

And it's still not clear why your client can't connect to your bridge.
Can you connect to the obfs4 port from your client's IP address?
Is the bridge line correct?
Can you collect info-level logs on the client, and post them here.
(You can redact if you want to.)

Changed 11 months ago by loskiq

Attachment: tor.log added

log on the client

comment:10 Changed 11 months ago by teor

The client can connect to your bridge, but your bridge has no descriptor:

Dec 05 17:57:29.000 [info] handle_response_fetch_desc(): Received server info (body size 0) from server '54.93.104.200:40635'
Dec 05 17:57:29.000 [info] handle_response_fetch_desc(): Received http status code 404 ("Servers unavailable") from server '54.93.104.200:40635' while fetching "/tor/server/authority.z". I'll try again soon.

That's really strange, because your bridge has a descriptor:

Dec 04 09:07:05.000 [notice] Guessed our IP address as 54.93.104.200 (source: 79.137.112.4).
Dec 04 09:07:05.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.

Can you please post the logs from your bridge, *after* you changed the IP address and restarted the bridge?

Can you post info-level logs from your client and bridge, that contain the lines that your client and bridge log when the client tries to connect?

Changed 11 months ago by loskiq

Attachment: tor_bridge.log added

log on bridge

Changed 11 months ago by loskiq

Attachment: tor_client.log added

log on client

comment:11 in reply to:  description ; Changed 11 months ago by dcf

Replying to loskiq:

Dec 04 10:47:39.000 [notice] Delaying directory fetches: No running bridges

I have seen this error before while testing PTs. I think it is a tor bug. I don't know the exact cause, but after a few bridge failures, tor will cache the fact that it thinks all bridges are down, and refuse even to try connecting to them. See:

My usual workaround is to delete the <datadir>/state file and restart tor. You can also try adding a line like this to your torrc:

DataDirectory tmp-datadir

If that works, then the problem is likely the one I described.

comment:12 in reply to:  11 Changed 11 months ago by loskiq

Thank you for your answer, but unfortunately it did not help me. First I deleted <datadir>/state and restarted bridge, then I changed DataDirectory to /tmp, restarted bridge and wrote the necessary changes to torrc of client. Client still unable to connect and stuck in 25%.

Replying to dcf:

comment:13 Changed 11 months ago by loskiq

By the way, with the same configuration, the client successfully works with other bridges.

comment:14 in reply to:  13 ; Changed 11 months ago by dcf

Replying to loskiq:

By the way, with the same configuration, the client successfully works with other bridges.

I see. This means my guess was wrong.

I don't have any other good ideas, except to turn on obfs4proxy logging. In the client torrc, use

ClientTransportPlugin obfs4 exec /usr/local/bin/obfs4proxy --enableLogging --logLevel=DEBUG

In the bridge torrc, use

ServerTransportPlugin obfs4 exec /usr/local/bin/obfs4proxy --enableLogging --logLevel=DEBUG

In both cases, the log file will appear in <datadir>/pt_state/obfs4proxy.log.

Changed 11 months ago by loskiq

Attachment: obfs4proxy_server.log added

obfs4proxy log of server

Changed 11 months ago by loskiq

Attachment: obfs4proxy_client.log added

obfs4proxy log of client

comment:15 in reply to:  14 Changed 11 months ago by loskiq

It seems with obfs4proxy all right

Replying to dcf:

comment:16 Changed 11 months ago by loskiq

I have just tested working with another obfs4 bridge received from bridges@…. The problem persists. Stuck in 25%. But there is a bridge with which the client works correctly.

The bridges I checked:

Not worked

obfs4 154.16.245.120:443 072AF4D16146012D6E8DFA8518B169345D8CEA51 cert=/nZrKfIZrw39ij9PLfoY0Uq+CCyQIYpP0BefZJbe2yf9cbJtXynlsWLwV77pVbKK4xDvHw iat-mode=0

Worked

obfs4 93.190.138.248:41248 577059FF2CADFB6AEFBCC78FFAB4DBEC1AF8A57B cert=d5ijyl75isX32Vx9yQkSDZicYPiamD413fnyKf/5TQxBTytobCVrzX7ATtBKZGOwIQ4IYg iat-mode=0

I think this problem is related to bridges, not clients...

comment:17 Changed 11 months ago by loskiq

I just installed the bridge version 0.2.9.17, and the client immediately started working. I believe that the problem of 25% is related to the new version of the bridge.

comment:18 in reply to:  17 ; Changed 11 months ago by dcf

Replying to loskiq:

I just installed the bridge version 0.2.9.17, and the client immediately started working. I believe that the problem of 25% is related to the new version of the bridge.

Oh! How surprising. I see from attachment:tor_bridge.log that formerly you were using 0.3.4.9. So downgrading from 0.3.4.9 to 0.2.9.17 on the bridge (keeping the client at 0.3.4.9) made the bridge start working.

This also matches with the bridges you posted in comment:16.

154.16.245.120:443 not working 0.3.4.9 relay search (archive)
93.190.138.248:41248 working 0.2.9.17 relay search (archive)

I'll ask and see if any core-tor developer have an idea about what it wrong. If you have time, you can help by testing versions between 0.2.9.17 and 0.3.4.9 to see which versions work.

comment:19 Changed 11 months ago by cypherpunks3

Last edited 11 months ago by cypherpunks3 (previous) (diff)

comment:20 in reply to:  18 Changed 11 months ago by loskiq

Replying to dcf:

If you have time, you can help by testing versions between 0.2.9.17 and 0.3.4.9 to see which versions work.

I tested different versions of Tor, and here is the result:

Tor version 0.2.9.17. working
Tor version 0.3.2.10 (git-31cc63deb69db819). working
Tor version 0.3.3.10 (git-2e94df92caee0fca). working
Tor version 0.3.3.6 (git-7dd0813e783ae16e). working
Tor version 0.3.3.7 (git-035a35178c92da94). working
Tor version 0.3.3.9 (git-45028085ea188baf). working
Tor version 0.3.4.1-alpha (git-deb8970a29ef7427). not working
Tor version 0.3.4.2-alpha (git-bc951e83aac770d1). not working
Tor version 0.3.4.6-rc (git-6045c26d8442913e). not working
Tor version 0.3.4.7-rc (git-8465a8d84647c349). not working
Tor version 0.3.4.8 (git-da95b91355248ad8). not working
Tor version 0.3.4.9 (git-4ac3ccf2863b86e7). not working

comment:21 Changed 11 months ago by cypherpunks3

Last edited 11 months ago by cypherpunks3 (previous) (diff)

comment:22 Changed 11 months ago by cypherpunks3

Last edited 11 months ago by cypherpunks3 (previous) (diff)

comment:23 Changed 11 months ago by cypherpunks3

Dieses problem wurde bei linked-verbindungen aufgespürt. Dies war nie richtig funktioniert aber bei älteren versionen und normal geladenen relais.

comment:24 Changed 11 months ago by cypherpunks3

diff --git a/src/or/main.c b/src/or/main.c
index bc01e07..dd1f0d6 100644
--- a/src/or/main.c
+++ b/src/or/main.c
@@ -404,6 +404,9 @@ connection_unlink(connection_t *conn)
   connection_free(conn);
 }
 
+/** Event that invokes schedule_active_linked_connections_cb. */
+static mainloop_event_t *schedule_active_linked_connections_event = NULL;
+
 /**
  * Callback: used to activate read events for all linked connections, so
  * libevent knows to call their read callbacks.  This callback run as a
@@ -420,11 +423,10 @@ schedule_active_linked_connections_cb(mainloop_event_t *ev
    * so that libevent knows to run their callbacks. */
   SMARTLIST_FOREACH(active_linked_connection_lst, connection_t *, conn,
                     event_active(conn->read_event, EV_READ, 1));
+  if (smartlist_len(active_linked_connection_lst)) //QQQ: vvv safe?
+    mainloop_event_activate(schedule_active_linked_connections_event);
 }
 
-/** Event that invokes schedule_active_linked_connections_cb. */
-static mainloop_event_t *schedule_active_linked_connections_event = NULL;
-
 /** Initialize the global connection list, closeable connection list,
  * and active connection list. */
 STATIC void

comment:25 Changed 11 months ago by dgoulet

Hmmm... I'm able to reproduce on 154.16.245.120:443 but it is working on my bridge listed in TB that is running 0.3.4.9 (Lisbeth, https://metrics.torproject.org/rs.html#details/D9C805C955CB124D188C0D44F271E9BE57DE2109). Bridge line:

Bridge obfs4 192.95.36.142:443 CDF2E852BF539B82BD10E27E9115A31734E378C2 cert=qUVQ0srL1JI/vO6V6m/24anYXiJD3QP2HgzUKQtQ7GRqqUvs7P+tG43RtAqdhLOALP7DJQ iat-mode=0

I think there is an issue with the bridge 154.16.245.120 itself, some sort of heavy throttling maybe. In the debug logs of the tor client, I see bursts of cells (no clear patterns but between 30 and 60 cells every 2 to 5 minutes) and then it get stuck waiting for more, tor just sits there idle waiting for I'm guessing the download to complete... I bet if I let it sit there long enough, the download would finish.

comment:26 Changed 10 months ago by arma

Status: newneeds_review

an irc user reports being happy with this patch on #28912 (which sounds from the ticket titles like the same issue)

comment:27 Changed 10 months ago by arma

I am running this patch on moria1 now.

(I wonder if weasel's surprisingly varying bootstrap times for tor clients has to do with directory authorities, or fallbackdirs, now running substantially on Tor 0.3.4.x or later.)

comment:28 Changed 10 months ago by cypherpunks3

See old code for "called_loop_once = smartlist_len(active_linked_connection_lst) ? 1 : 0;"
Stuck at:

diff --git a/src/or/connection.c a/src/or/connection.c
index 0a2a635..0e051a5 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -3428,6 +3428,8 @@ connection_handle_read_impl(connection_t *conn)
 
     if (!buf_datalen(linked->outbuf) && conn->active_on_link)
       connection_stop_reading_from_linked_conn(conn);
+    /* Now. Now. If code still reading from connection then code */
+    /* must to reactivate event. It's linked connection. */
   }
   /* If we hit the EOF, call connection_reached_eof(). */
   if (!conn->marked_for_close &&
Last edited 10 months ago by cypherpunks3 (previous) (diff)

comment:29 Changed 10 months ago by dgoulet

See https://trac.torproject.org/projects/tor/ticket/28912#comment:10

We've used the mainloop patch for now instead of going with reactivating the linked connection if it still has data to send. The reason is that it is a safer fix because <= 0.3.3 has that behavior. Doing the latter would require new tricky code that could introduce more issues :S.

comment:30 in reply to:  29 Changed 10 months ago by loskiq

I compiled the latest stable version of Tor (0.3.4.9), replacing the main.c file with yours. It works!

Replying to dgoulet

comment:31 Changed 10 months ago by dgoulet

Reviewer: nickm

comment:32 Changed 9 months ago by nickm

Milestone: Tor: unspecifiedTor: 0.3.4.x-final
Resolution: fixed
Status: needs_reviewclosed

I believe that the proposed patch above is the same patch as for #28912, which should be fixed in 0.3.4.10 and 0.3.5.7. Please reopen if this happens between a client and a relay running those versions or later?

Note: See TracTickets for help on using tickets.