Opened 3 years ago

Closed 23 months ago

#21073 closed defect (wontfix)

"PredictedPortsRelevanceTime 0" causes stagnant/uncommunicative onion services, stale descriptors

Reported by: alecmuffett
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.9.7-rc
Severity: Normal
Keywords: tor-hs dont-do-that-then predictedports liveness-detection
Cc: asn, arma
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

I am running 72 tor daemons with the following spec:

Tor 0.2.9.7-rc (git-6b6ad81c2e140d85) running on Linux with Libevent 2.0.21-stable, OpenSSL 1.0.1t and Zlib 1.2.8.

...on a cluster of identical Raspberry Pi hardware.

The goal is to experiment with Tor bandwidth via OnionBalance, so I have been tweaking configurations because a cluster of N tor daemons doesn't really benefit from predictive persistent anything.

In about 8% of cases, the configuration (text in the footer) creates a daemon which, after its initial upload, appears never (or only very rarely; I am unsure) to refresh its descriptors in an HSDir.

This behaviour stops when "PredictedPortsRelevanceTime 0" is commented out.

Using a small custom Stem script, I query the age of each of the 72 daemons' descriptors; the vast majority are less than 2 hours old, but some (the afflicted daemons) are 10+ hours old.
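For anyone wanting to reproduce the check, here is a minimal sketch of how such an age query could look. It is not the actual ls-hsdir script; the function names `descriptor_age_seconds`, `onion_ages` and `main` are hypothetical, and it assumes that Stem's `Controller.get_hidden_service_descriptor()` returns an object whose `published` attribute is a naive UTC datetime, and that a local tor with a control port on 9051 is available for lookups.

```python
from datetime import datetime, timezone


def descriptor_age_seconds(published, now=None):
    """Seconds elapsed since a descriptor's 'published' timestamp.

    Assumption: naive datetimes are in UTC.
    """
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    elif now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)
    return int((now - published).total_seconds())


def onion_ages(controller, onions, now=None):
    """Map each onion address to its descriptor age, or None if the fetch fails."""
    ages = {}
    for onion in onions:
        try:
            desc = controller.get_hidden_service_descriptor(onion)
            ages[onion] = descriptor_age_seconds(desc.published, now)
        except Exception:  # fetch failed or no descriptor was found
            ages[onion] = None
    return ages


def main(onions, control_port=9051):
    """Print one line per onion; control_port is an assumed local setting."""
    from stem.control import Controller  # third-party: pip install stem

    with Controller.from_port(port=control_port) as controller:
        controller.authenticate()
        for onion, age in onion_ages(controller, onions).items():
            print(onion, "unavailable" if age is None else "age=%d" % age)
```

Calling `main(["2pnhm32wvh2g6bod"])` would query a single service; the fetching tor must itself be a working client, so this is meant to run on a separate machine rather than on the onion-service daemons themselves.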

Sample output from my tool:

19:25:22 mistral:~ $ ls-hsdir `cat Dropbox/all-onions.txt`
v=2 age=5183 pub(2016-12-23 18:00:00) 2pnhm32wvh2g6bod
v=2 age=5183 pub(2016-12-23 18:00:00) 2ss5hl24km3cnedb
# unavailable 44kpqx3wj4pdj4x3
v=2 age=1583 pub(2016-12-23 19:00:00) 457vhfiipyfahsw2
v=2 age=5183 pub(2016-12-23 18:00:00) 4byeybc6yyqvxc64
v=2 age=12383 pub(2016-12-23 16:00:00) 4sj56yfqt6iimah2
v=2 age=5183 pub(2016-12-23 18:00:00) 57j6n5nsrvl2n3lm
# unavailable 5imawjwdy2332sk2
v=2 age=5183 pub(2016-12-23 18:00:00) 5k2ukr3gjxw4iuwo
v=2 age=12383 pub(2016-12-23 16:00:00) 6bdgdiyoqdaq65oh
v=2 age=1583 pub(2016-12-23 19:00:00) 6egxpvvszfzriamo
v=2 age=5183 pub(2016-12-23 18:00:00) 7rydmwifplyugjzg
v=2 age=5183 pub(2016-12-23 18:00:00) a7ls3tboibdtexpa
v=2 age=66383 pub(2016-12-23 01:00:00) apk2wb3qdwzovtdj
v=2 age=1583 pub(2016-12-23 19:00:00) av6plyhrd5j7enoo
v=2 age=1583 pub(2016-12-23 19:00:00) awocgbvyljq4nf2p
v=2 age=5183 pub(2016-12-23 18:00:00) ayzn2s76oh4eqw45
v=2 age=37583 pub(2016-12-23 09:00:00) b6rzknxn664juice
# unavailable bnuy3zlmrnvljylh
v=2 age=1583 pub(2016-12-23 19:00:00) btxtnep4ipsgiq6j
...
...
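To pick the afflicted daemons out of output like the above, something along these lines may help. It is a sketch only: the `classify` helper and its regular expressions are hypothetical, written against the line format shown in this ticket, and the 2-hour threshold is taken from the "less than 2 hours old" observation rather than from any tor constant.

```python
import re

# Matches lines such as:
#   v=2 age=5183 pub(2016-12-23 18:00:00) 2pnhm32wvh2g6bod
LINE_RE = re.compile(r"^v=(\d+) age=(\d+) pub\(([^)]+)\) (\S+)$")
# Matches lines such as:
#   # unavailable 44kpqx3wj4pdj4x3
UNAVAILABLE_RE = re.compile(r"^# unavailable (\S+)$")


def classify(lines, stale_after=2 * 60 * 60):
    """Split tool output into fresh, stale and unavailable onions.

    stale_after is in seconds; anything older counts as stale.
    Lines that match neither pattern (for example "...") are ignored.
    """
    fresh, stale, unavailable = [], [], []
    for line in lines:
        line = line.strip()
        match = LINE_RE.match(line)
        if match:
            age, onion = int(match.group(2)), match.group(4)
            (stale if age > stale_after else fresh).append((onion, age))
            continue
        match = UNAVAILABLE_RE.match(line)
        if match:
            unavailable.append(match.group(1))
    return fresh, stale, unavailable
```

Fed the sample above, the entries with age=12383, age=37583 and age=66383 would land in the stale list, while the age=1583 and age=5183 entries count as fresh.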

The daemons, despite some having such old descriptors, are all still reachable some 21 hours after launch.

I shall be taking these (cited) daemons down, but can recreate them pretty easily.

Purely speculatively, it does sound vaguely similar to this Ricochet issue which arma reported to Ricochet: https://github.com/ricochet-im/ricochet/issues/245

I have two 'debug' logs from the same physical machine, one from a 'good' daemon and the other from a 'stale' daemon, running concurrently. The 'good' log is 35 MB versus 27 MB for the 'stale' one, but comparison with other logs does not suggest a strong correlation between staleness and logfile size.

The files are presumably too large to attach? Even after compression they will be several MB.

Running carml on a stale daemon for HS_DESC activity showed little of note. Surprisingly little, even.
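The same kind of observation can be made with Stem instead of carml. The following is a rough sketch, not what was run here: `make_hs_desc_tally` and `watch` are hypothetical names, and it assumes the daemon's control socket accepts authentication without extra credentials. It counts HS_DESC events per (address, action) pair, so a healthy service should accumulate UPLOAD and UPLOADED entries over time while a stale one should not.

```python
from collections import Counter


def make_hs_desc_tally():
    """Return (counts, handler); handler tallies HS_DESC events by address and action."""
    counts = Counter()

    def on_hs_desc(event):
        # Stem's HSDescEvent exposes 'address' and 'action'
        # (e.g. UPLOAD, UPLOADED, FAILED).
        counts[(event.address, event.action)] += 1

    return counts, on_hs_desc


def watch(control_socket_path, seconds):
    """Listen for HS_DESC events on one daemon for a fixed time and return the tally."""
    import time
    from stem.control import Controller, EventType  # third-party: pip install stem

    counts, handler = make_hs_desc_tally()
    with Controller.from_socket_file(control_socket_path) as controller:
        controller.authenticate()
        controller.add_event_listener(handler, EventType.HS_DESC)
        time.sleep(seconds)
    return counts
```

For the daemon whose config is in the footer, `watch("/home/alecm/master/halfagig/hs2.d/control.sock", 3 * 60 * 60)` would cover a few expected republish cycles.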

I'm stuck for ideas, but am aware that a very large site uses this option in its 2.7 config, so it would be good to know whether it is needed and/or helpful for SingleOnions in 2.9, and/or whether it can be bugfixed.

19:28:24 rig2:hs2.d $ more config
DataDirectory /home/alecm/master/halfagig/hs2.d
HiddenServiceDir /home/alecm/master/halfagig/hs2.d
ControlPort unix:/home/alecm/master/halfagig/hs2.d/control.sock
SocksPort 0
Log debug file /home/alecm/master/halfagig/hs2.d/log.txt
SafeLogging 0
HeartbeatPeriod 60 minutes
# HiddenServicePort 19 localhost:8502
# HiddenServicePort 22 localhost:22
HiddenServicePort 80 localhost:10502
HiddenServiceNumIntroductionPoints 3
LongLivedPorts 19,22,80
#
# CircuitBuildTimeout 60
# LearnCircuitBuildTimeout 0
PredictedPortsRelevanceTime 0
# UseEntryGuards 0
# UseEntryGuardsAsDirGuards 0

Change History (6)

comment:1 Changed 3 years ago by dgoulet

Cc: dgoulet removed
Component: - Select a component → Core Tor/Tor
Keywords: tor-hs added
Milestone: Tor: 0.3.0.x-final

comment:2 Changed 3 years ago by teor

This is related to #17359, where __DisablePredictedCircuits 1 causes tor to hang while bootstrapping. The root cause is that tor relies on predicted circuits to generate network activity, and network activity causes tor to perform certain actions on a regular basis.

I'd recommend avoiding these options (and similar options) until we have a fix - I doubt a fix would be backported to any 0.2 series tor version.

comment:3 Changed 3 years ago by dgoulet

Milestone: Tor: 0.3.0.x-final → Tor: 0.3.1.x-final

Close to feature freeze. Deferring to 031.

comment:4 Changed 3 years ago by dgoulet

Milestone: Tor: 0.3.1.x-final → Tor: unspecified

comment:5 Changed 2 years ago by nickm

Keywords: dont-do-that-then predictedports liveness-detection added

comment:6 Changed 23 months ago by teor

Resolution: wontfix
Status: new → closed

This option has been removed from Tor.
