Opened 3 years ago

Closed 7 months ago

Last modified 6 months ago

#24841 closed defect (duplicate)

Your relay has a very large number of connections to other relays. Is your outbound address the same as your relay address?

Reported by: tyng
Owned by: nickm
Priority: Medium
Milestone: Tor: 0.4.4.x-final
Component: Core Tor/Tor
Version: Tor: 0.3.1.1-alpha
Severity: Normal
Keywords: 043-backport-maybe, 042-backport-maybe, 033-triage-20180320, 033-removed-20180320, 035-deferred-20190115, 041-proposed
Cc: mikeperry
Actual Points:
Parent ID: #33880
Points: 0.5
Reviewer:
Sponsor:

Description (last modified by arma)

debian-tor-0.3.2.8-rc-1

SOCKSPort 9050
ORPort 80.67.176.53:9001
ORPort [2001:910:1035::1]:9001
ExitRelay 0

https://atlas.torproject.org/#details/A2547D13A3659ECF23AF8B9456B2E277110BF136

janv. 08 20:41:35 alex Tor[21674]: Tor 0.3.2.8-rc (git-3696eb720795a666) running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.0g, Zlib 1.2.8, Liblzma 5.2.2, and Libzstd 1.3.3.
janv. 08 20:41:35 alex Tor[21674]: Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
janv. 08 20:41:35 alex Tor[21674]: Read configuration file "/usr/share/tor/tor-service-defaults-torrc".
janv. 08 20:41:35 alex Tor[21674]: Read configuration file "/etc/tor/torrc".
janv. 08 20:41:35 alex Tor[21674]: Based on detected system memory, MaxMemInQueues is set to 8192 MB. You can override this by setting MaxMemInQueues by hand.
janv. 08 20:41:35 alex Tor[21674]: Scheduler type KIST has been enabled.
janv. 08 20:41:35 alex Tor[21674]: Opening Socks listener on 127.0.0.1:9050
janv. 08 20:41:35 alex Tor[21674]: Opening OR listener on 80.67.176.53:9001
janv. 08 20:41:35 alex Tor[21674]: Opening OR listener on [2001:910:1035::1]:9001
janv. 08 20:41:35 alex Tor[21674]: Parsing GEOIP IPv4 file /usr/share/tor/geoip.
janv. 08 20:41:35 alex Tor[21674]: Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
janv. 08 20:41:35 alex Tor[21674]: Configured to measure statistics. Look for the *-stats files that will first be written to the data directory in 24 hours from now.
janv. 08 20:41:35 alex Tor[21674]: Your Tor server's identity key fingerprint is 'tyng A2547D13A3659ECF23AF8B9456B2E277110BF136'
janv. 08 20:41:35 alex Tor[21674]: Bootstrapped 0%: Starting
janv. 08 20:41:38 alex Tor[21674]: Starting with guard context "default"
janv. 08 20:41:38 alex Tor[21674]: Bootstrapped 80%: Connecting to the Tor network
janv. 08 20:41:38 alex Tor[21674]: Signaled readiness to systemd
janv. 08 20:41:38 alex Tor[21674]: Opening Control listener on /var/run/tor/control
janv. 08 20:41:39 alex Tor[21674]: Bootstrapped 85%: Finishing handshake with first hop
janv. 08 20:41:39 alex Tor[21674]: Bootstrapped 90%: Establishing a Tor circuit
janv. 08 20:41:42 alex Tor[21674]: Tor has successfully opened a circuit. Looks like client functionality is working.
janv. 08 20:41:42 alex Tor[21674]: Bootstrapped 100%: Done
janv. 08 20:41:42 alex Tor[21674]: Now checking whether ORPort 80.67.176.53:9001 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)
janv. 08 20:41:43 alex Tor[21674]: Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
janv. 08 20:42:41 alex Tor[21674]: Performing bandwidth self-test...done.
janv. 08 23:41:38 alex Tor[21674]: Your relay has a very large number of connections to other relays. Is your outbound address the same as your relay address? Found 11 connections to 7 relays. Found 11 current canonical connections, in 0 of which we were a non-canonical peer. 4 relays had more than 1 connection, 0 had more than 2, and 0 had more than 4 connections.
janv. 09 02:41:38 alex Tor[21674]: Heartbeat: Tor's uptime is 5:59 hours, with 2 circuits open. I've sent 4.61 MB and received 16.47 MB.
janv. 09 02:41:38 alex Tor[21674]: Circuit handshake stats since last time: 0/0 TAP, 5/5 NTor.
janv. 09 02:41:38 alex Tor[21674]: Since startup, we have initiated 0 v1 connections, 0 v2 connections, 0 v3 connections, and 2 v4 connections; and received 0 v1 connections, 0 v2 connections, 1 v3 connections, and 37 v4 connections.
janv. 09 02:41:38 alex Tor[21674]: Your relay has a very large number of connections to other relays. Is your outbound address the same as your relay address? Found 14 connections to 9 relays. Found 14 current canonical connections, in 0 of which we were a non-canonical peer. 5 relays had more than 1 connection, 0 had more than 2, and 0 had more than 4 connections.
janv. 09 04:41:38 alex Tor[21674]: Your relay has a very large number of connections to other relays. Is your outbound address the same as your relay address? Found 15 connections to 10 relays. Found 15 current canonical connections, in 0 of which we were a non-canonical peer. 5 relays had more than 1 connection, 0 had more than 2, and 0 had more than 4 connections.
janv. 09 05:41:38 alex Tor[21674]: Your relay has a very large number of connections to other relays. Is your outbound address the same as your relay address? Found 11 connections to 7 relays. Found 11 current canonical connections, in 0 of which we were a non-canonical peer. 4 relays had more than 1 connection, 0 had more than 2, and 0 had more than 4 connections.
janv. 09 06:41:38 alex Tor[21674]: Your relay has a very large number of connections to other relays. Is your outbound address the same as your relay address? Found 9 connections to 6 relays. Found 9 current canonical connections, in 0 of which we were a non-canonical peer. 3 relays had more than 1 connection, 0 had more than 2, and 0 had more than 4 connections.
janv. 09 08:41:38 alex Tor[21674]: Heartbeat: Tor's uptime is 11:59 hours, with 0 circuits open. I've sent 8.13 MB and received 24.95 MB.
janv. 09 08:41:38 alex Tor[21674]: Average packaged cell fullness: 93.781%. TLS write overhead: 16%
janv. 09 08:41:38 alex Tor[21674]: Circuit handshake stats since last time: 0/0 TAP, 1/1 NTor.
janv. 09 08:41:38 alex Tor[21674]: Since startup, we have initiated 0 v1 connections, 0 v2 connections, 0 v3 connections, and 4 v4 connections; and received 0 v1 connections, 0 v2 connections, 4 v3 connections, and 74 v4 connections.

Child Tickets

Ticket | Status | Owner | Summary | Component
#26199 | closed | — | "Your relay has a very large number of connections to other relays." potential bug in OutboundBindAddress? | Core Tor/Tor

Attachments (1)

avoid_useless_warning.diff (492 bytes) - added by timkuijsten 2 years ago.
avoid useless warning for new relays


Change History (26)

comment:1 Changed 3 years ago by arma

Component: set to Core Tor/Tor
Description: modified (diff)

comment:2 Changed 3 years ago by arma

Mike: in the comment it says

/* If we average 1.5 or more connections per relay, something is wrong */

What is wrong? What are some of the things that could be wrong? What should we do? :)

comment:3 Changed 3 years ago by dgoulet

Milestone: Tor: 0.3.3.x-final

comment:4 Changed 3 years ago by mikeperry

My guess is that this relay has both IPv4 and IPv6 connections to some of its peers. Options include A) checking for that and not logging in that case. B) Marking one of these duplicate connections as non-canonical, so the duplicates won't persist.

Last edited 3 years ago by mikeperry (previous) (diff)

comment:5 in reply to:  4 Changed 3 years ago by teor

Replying to mikeperry:

My guess is that this relay has both IPv4 and IPv6 connections to some of its peers. Options include A) checking for that and not logging in that case. B) Marking one of these duplicate connections as non-canonical, so the duplicates won't persist.

Relays do not extend over IPv6, so this explanation is unlikely. IPv6 relay extends will be implemented in #24404.

My guess is that the canonical connection code behaves badly under load.

comment:6 Changed 3 years ago by teor

Parent ID: #24767

Your relay might be connecting to the same relay multiple times due to #24767.

comment:7 Changed 3 years ago by arma

No, I believe that relays do the right thing when there is a pending connection -- they set the circuit into state pending, and wait for the conn to either succeed or fail.

comment:8 Changed 3 years ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:9 Changed 3 years ago by nickm

Keywords: 033-removed-20180320 added

Mark all not-already-included tickets as pending review for removal from 0.3.3 milestone.

comment:10 Changed 3 years ago by nickm

Milestone: Tor: 0.3.3.x-final → Tor: unspecified

These tickets were marked as removed, and nobody has said that they can fix them. Let's remember to look at 033-removed-20180320 as we re-evaluate our triage process, to see whether we're triaging out unnecessarily, and to evaluate whether we're deferring anything unnecessarily. But for now, we can't do these: we need to fix the 033-must stuff now.

comment:11 Changed 3 years ago by dgoulet

Parent ID: #24767

comment:12 Changed 2 years ago by cypherpunks

#26199 was closed as a duplicate

comment:13 Changed 2 years ago by cypherpunks

For people finding this entry after stumbling on these log lines on their relays, it would be nice if you could share some information about how bad this is for the relay/user.

anything from

  • feel free to ignore this
  • this will affect your relay's performance

to

  • better shut down your relay than keep it running with this kind of problem

comment:14 Changed 2 years ago by teor

Feel free to ignore this, if all the numbers are less than 60. (60 is 1% of a relay's 6000 connections to other relays.)
Otherwise, please report it.

This message is logged so we can make sure that a relay isn't making too many connections to other relays.

We could change the code so it logs an info-level message when the number of duplicate connections is under 1% of the number of nodes.

Changed 2 years ago by timkuijsten

Attachment: avoid_useless_warning.diff added

avoid useless warning for new relays

comment:15 Changed 2 years ago by timkuijsten

Since I've been starting a couple of new relays as a new relay operator, this message occurs now and then in the first couple of days and it made me wonder[0]. If we should ignore these numbers as long as they are below 60, then why show the message before having connected to at least 60 relays? I would propose to crank MIN_RELAY_CONNECTIONS_TO_WARN from 5 to 60 (but I'm completely new :)).

[0] https://lists.torproject.org/pipermail/tor-relays/2018-August/016037.html

comment:16 in reply to:  15 ; Changed 2 years ago by teor

Replying to timkuijsten:

Since I've been starting a couple of new relays as a new relay operator, this message occurs now and then in the first couple of days and it made me wonder[0]. If we should ignore these numbers as long as they are below 60, then why show the message before having connected to at least 60 relays? I would propose to crank MIN_RELAY_CONNECTIONS_TO_WARN from 5 to 60 (but I'm completely new :)).

[0] https://lists.torproject.org/pipermail/tor-relays/2018-August/016037.html

Thanks for this patch!

I think this is a good fix, but we could make it better by:

  • counting the number of relays in the consensus, and using 1% of that as a minimum
  • if the limit is less than 5, use 5 as the minimum
  • if there is no consensus, use 5 relays as the minimum

You can find the number of relays in the consensus by doing something like:

const networkstatus_t *consensus = networkstatus_get_reasonably_live_consensus(now, usable_consensus_flavor());
int n_relays = consensus ? smartlist_len(consensus->routerstatus_list) : 0;

comment:17 Changed 2 years ago by teor

Milestone: Tor: unspecified → Tor: 0.3.5.x-final
Status: new → needs_revision

comment:18 Changed 22 months ago by nickm

Keywords: 035-deferred-20190115 041-proposed added
Milestone: Tor: 0.3.5.x-final → Tor: unspecified

Marking a number of 0.3.5 tickets as possible, maybe even a good idea, for later. Possibly backportable, some of them. But not currently things to do as part of 0.3.5 stabilization.

comment:19 Changed 7 months ago by teor

Parent ID: #33048
Points: 0.5
Sponsor: Sponsor55-can

This ticket might be fixed by #33817: connection_is_canonical_to_peer checks IPv6, but connection_is_canonical does not.

comment:20 in reply to:  16 Changed 7 months ago by teor

Milestone: Tor: unspecified → Tor: 0.4.4.x-final
Owner: set to nickm
Status: needs_revision → assigned

Replying to teor:

Replying to timkuijsten:

Since I've been starting a couple of new relays as a new relay operator, this message occurs now and then in the first couple of days and it made me wonder[0]. If we should ignore these numbers as long as they are below 60, then why show the message before having connected to at least 60 relays? I would propose to crank MIN_RELAY_CONNECTIONS_TO_WARN from 5 to 60 (but I'm completely new :)).

[0] https://lists.torproject.org/pipermail/tor-relays/2018-August/016037.html

Thanks for this patch!

I think this is a good fix, but we could make it better by:

  • counting the number of relays in the consensus, and using 1% of that as a minimum
  • if the limit is less than 5, use 5 as the minimum
  • if there is no consensus, use 5 relays as the minimum

You can find the number of relays in the consensus by doing something like:

const networkstatus_t *consensus = networkstatus_get_reasonably_live_consensus(now, usable_consensus_flavor());
int n_relays = consensus ? smartlist_len(consensus->routerstatus_list) : 0;

nickm, would you like to update the attached patch?

I think my patch in #33817 will fix one possible source of this issue. But we won't see the impact until 0.4.4 or later are deployed to most of the network. (And there may be other causes.)

So it's a good time to adjust the thresholds for this warning.

comment:21 Changed 7 months ago by teor

(This is not urgent at all - it would be nice to have this change in 0.4.4.)

comment:22 Changed 7 months ago by teor

Keywords: 043-backport-maybe 042-backport-maybe added
Version: Tor: 0.3.2.8-rc → Tor: 0.3.1.1-alpha

This log message was originally introduced in #17604 in 0.3.1.1-alpha.

Do you think we should backport this change?
How far?

Last edited 7 months ago by teor (previous) (diff)

comment:23 Changed 7 months ago by arma

Ah ha! I had forgotten this ticket. I believe it is fully explained by #33880 -- these are reachability tests by directory authorities, and in 0.3.1.1-alpha (ticket #17592, commit d5a151a0) we changed the timeout parameters such that it is quite likely that you will still have an old connection open from a directory authority when the next reachability test connection arrives. The numbers from the log message here line up perfectly with the numbers from the log message in that newer ticket.

comment:24 Changed 7 months ago by teor

Parent ID: #33048 → #33880
Resolution: duplicate
Status: assigned → closed

This is now a duplicate of #33880.

comment:25 Changed 6 months ago by teor

Sponsor: Sponsor55-can