Opened 10 years ago

Closed 7 years ago

#1307 closed defect (fixed)

Hosting many hidden services causes many errors and takes hours to start up

Reported by: marked
Owned by: rransom
Priority: Low
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: 0.2.1.24
Severity:
Keywords: hidden services tor server bootstrap tor-hs
Cc: marked, Sebastian, nickm
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by phobos)

In my torrc I have configured about 200 hidden services (I have a reason for this; they are needed and I'm not just wasting them).

When I start tor it takes hours to start while using 99% CPU and giving many errors in the logs.
I have been having this problem for a while, but it seems to get worse as I add more hidden services.

Older versions of tor (0.2.1.22 and below) gave many errors like this:
[warn] Error launching circuit to node xxxx for service xxxxxxx.

Tor 0.2.1.24 (I never used 0.2.1.23) gives different errors and seems to take far longer to start.
Here is my log for 0.2.1.24.

Mar 13 02:11:11.841 [notice] Parsing GEOIP file.
Mar 13 02:11:12.077 [notice] OpenSSL OpenSSL 0.9.8m 25 Feb 2010 looks like version 0.9.8m or later; I will try SSL_OP to enable renegotiation
Mar 13 02:11:13.862 [notice] We now have enough directory information to build circuits.
Mar 13 02:11:13.862 [notice] Bootstrapped 80%: Connecting to the Tor network.
Mar 13 02:13:11.982 [notice] Bootstrapped 85%: Finishing handshake with first hop.
Mar 13 02:13:12.503 [notice] Bootstrapped 90%: Establishing a Tor circuit.
Mar 13 02:13:12.993 [notice] Your system clock just jumped 119 seconds forward; assuming established circuits no longer work.
Mar 13 02:15:25.191 [notice] Your system clock just jumped 133 seconds forward; assuming established circuits no longer work.
Mar 13 02:15:26.194 [notice] We stalled too much while trying to write 317968 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 7, type OR, state 8, marked at main.c:722).
Mar 13 02:15:28.954 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mar 13 02:15:28.954 [notice] Bootstrapped 100%: Done.
Mar 13 02:17:57.051 [notice] Your system clock just jumped 123 seconds forward; assuming established circuits no longer work.
Mar 13 02:20:08.157 [notice] Your system clock just jumped 131 seconds forward; assuming established circuits no longer work.
Mar 13 02:20:08.193 [notice] We stalled too much while trying to write 303728 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 15, type OR, state 8, marked at main.c:722).
Mar 13 02:20:08.194 [notice] We stalled too much while trying to write 309872 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 7, type OR, state 8, marked at main.c:722).
Mar 13 02:20:47.760 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mar 13 02:23:01.965 [notice] Your system clock just jumped 126 seconds forward; assuming established circuits no longer work.
Mar 13 02:25:13.073 [notice] Your system clock just jumped 132 seconds forward; assuming established circuits no longer work.
Mar 13 02:25:13.104 [notice] We stalled too much while trying to write 150640 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 13, type OR, state 8, marked at main.c:722).
Mar 13 02:25:13.144 [notice] We stalled too much while trying to write 159232 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 15, type OR, state 8, marked at main.c:722).
Mar 13 02:25:13.144 [notice] We stalled too much while trying to write 160768 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 11, type OR, state 8, marked at main.c:722).
Mar 13 02:25:14.505 [notice] We're missing a certificate from authority with signing key 08D85E2B51D1962DF9EAB4DAF1F1A0061FF0E954: launching request.
Mar 13 02:25:18.434 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mar 13 02:27:56.202 [notice] Your system clock just jumped 120 seconds forward; assuming established circuits no longer work.
Mar 13 02:30:09.555 [notice] Your system clock just jumped 133 seconds forward; assuming established circuits no longer work.
Mar 13 02:30:09.599 [notice] We stalled too much while trying to write 246384 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 28, type OR, state 8, marked at main.c:722).
Mar 13 02:30:09.600 [notice] We stalled too much while trying to write 245360 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 26, type OR, state 8, marked at main.c:722).
Mar 13 02:30:09.600 [notice] We stalled too much while trying to write 232960 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 24, type OR, state 8, marked at main.c:722).
Mar 13 02:30:16.401 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mar 13 02:32:57.525 [notice] Your system clock just jumped 120 seconds forward; assuming established circuits no longer work.
Mar 13 02:35:10.484 [notice] Your system clock just jumped 133 seconds forward; assuming established circuits no longer work.
Mar 13 02:35:10.528 [notice] We stalled too much while trying to write 154224 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 24, type OR, state 8, marked at main.c:722).
Mar 13 02:35:10.529 [notice] We stalled too much while trying to write 161392 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 26, type OR, state 8, marked at main.c:722).
Mar 13 02:35:10.530 [notice] We stalled too much while trying to write 161392 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 33, type OR, state 8, marked at main.c:722).
Mar 13 02:35:10.531 [notice] We stalled too much while trying to write 158320 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 25, type OR, state 8, marked at main.c:722).
Mar 13 02:35:10.532 [notice] We stalled too much while trying to write 170096 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 28, type OR, state 8, marked at main.c:722).
Mar 13 02:35:10.532 [notice] We stalled too much while trying to write 156672 bytes to address [scrubbed]. If this happens a lot, either something is wrong with your network connection, or something is wrong with theirs. (fd 9, type OR, state 8, marked at main.c:722).
Mar 13 02:35:13.440 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mar 13 02:38:00.857 [notice] Your system clock just jumped 122 seconds forward; assuming established circuits no longer work

... It continues like this for hours.
My clock did not jump; the missing time is when tor is stuck at 99% CPU use.
There is nothing wrong with my network either.
It works if I remove the many hidden services.
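As a rough way to quantify how much wall-clock time tor spends stalled, the "clock just jumped" notices can be summed. The following is a minimal sketch, not part of the original report: the names `JUMP_PATTERN` and `stalled_seconds` are made up for illustration, and the sample lines are copied from the log above.

```ruby
# Hypothetical helper: add up the seconds reported by tor's
# "Your system clock just jumped N seconds forward" notices.
JUMP_PATTERN = /Your system clock just jumped (\d+) seconds forward/

def stalled_seconds(log_lines)
  log_lines.sum do |line|
    match = JUMP_PATTERN.match(line)
    match ? match[1].to_i : 0
  end
end

# Three lines taken from the log in this ticket.
sample = [
  "Mar 13 02:13:12.993 [notice] Your system clock just jumped 119 seconds forward; assuming established circuits no longer work.",
  "Mar 13 02:15:25.191 [notice] Your system clock just jumped 133 seconds forward; assuming established circuits no longer work.",
  "Mar 13 02:15:28.954 [notice] Bootstrapped 100%: Done.",
]

puts stalled_seconds(sample) # prints 252
```

To run it over a whole log file, something like `stalled_seconds(File.readlines(path))` should work, where `path` is wherever your tor log is written.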

When it does finally start up, tor and all the hidden services work fine (at least they did in 0.2.1.22; I'm still waiting for 0.2.1.24 to start after 2 hours).

[Automatically added by flyspray2trac: Operating System: All]


Attachments (1)

torrc.tkt1307 (26.3 KB) - added by phobos 10 years ago.
500 hidden services of love


Change History (23)

comment:1 Changed 10 years ago by marked

I left it overnight and after 12 hours it was still not working.

I removed all the hidden services and started tor, then wrote a script that adds the services back in 10 at a time, sends a HUP signal to tor, waits 30 seconds, and then adds another 10.

This worked and got all my hidden services working within a few minutes.

There seems to be some problem with starting too many hidden services at once that causes tor to get stuck in a loop.
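The reporter's script was not posted, so the following is only a sketch of the workaround as described: append services in batches of 10, reload tor with HUP, and pause 30 seconds between batches. The function names, the `reload` and `pause` parameters, and the directory layout are assumptions made for illustration.

```ruby
# Hypothetical helpers; only the batch size (10), the HUP reload and the
# 30-second pause come from the comment above.

# Build the two torrc lines for one hidden service listening on `port`.
def service_stanza(datadir, port)
  "HiddenServiceDir #{datadir}/hidserv/#{port}\n" \
  "HiddenServicePort #{port} 127.0.0.1:#{port}\n"
end

# Append services to the torrc in batches, reloading tor after each batch.
# `reload` and `pause` are injectable so the loop can be exercised without
# a running tor process.
def add_in_batches(torrc_path, datadir, ports, batch_size: 10, delay: 30,
                   reload:, pause: ->(seconds) { sleep(seconds) })
  ports.each_slice(batch_size) do |batch|
    File.open(torrc_path, "a") do |file|
      batch.each { |port| file.write(service_stanza(datadir, port)) }
    end
    reload.call       # ask tor to re-read its configuration
    pause.call(delay) # give tor time to set up this batch
  end
end
```

A call might look like `add_in_batches("torrc", datadir, 10000...10200, reload: -> { Process.kill("HUP", tor_pid) })`, where `datadir` and `tor_pid` are placeholders for your data directory and the PID of the running tor process.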

comment:2 Changed 10 years ago by phobos

Description: modified (diff)
Owner: set to phobos
Status: new → accepted

Is this still a problem with 0.2.1.26 or 0.2.2.13-alpha?

comment:3 Changed 10 years ago by marked

Yes, it still happens with 0.2.1.26. I have not tried the alpha version.

comment:4 Changed 10 years ago by phobos

Keywords: hidden services tor server bootstrap added

Which operating system? And can you attach an info-level log of startup?

Changed 10 years ago by phobos

Attachment: torrc.tkt1307 added

500 hidden services of love

comment:5 Changed 10 years ago by phobos

Ok, I can recreate this problem. I've attached my test torrc with 500 hidden services created.

comment:6 Changed 10 years ago by phobos

Here's the quick script I wrote to create all the hidden services.

#
# Create a ton of hidden services in a tor config
# BSD 3-Clause license, Copyright 2010 The Tor Project, Inc.
#

start_port = 10000
datadir = "/home/phobos/.tor"
end_port = 10300

# Emit one HiddenServiceDir/HiddenServicePort pair per port,
# from start_port up to (but not including) end_port.
(start_port...end_port).each do |port|
  puts "HiddenServiceDir #{datadir}/hidserv/#{port}"
  puts "HiddenServicePort #{port} 127.0.0.1:#{port}"
end

comment:7 in reply to:  5 Changed 10 years ago by phobos

Replying to phobos:

Ok, I can recreate this problem. I've attached my test torrc with 500 hidden services created.

And by 500, I meant 300.

comment:8 Changed 10 years ago by nickm

gah. 500 hidden services means 1500 introduction point circuits plus 1500 hsdir uploads. The code isn't optimized to deal with this case at all. Somebody will need to figure out the right way to batch up and stagger these without leaking the linkages between hidden services. (Keeping those linkages secret might be a lost cause, though: if your Tor stops or your network connection drops, all 500 will go down at once, and that's hard to hide.)
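The arithmetic behind those figures can be written out as a small sketch. The per-service factor of 3 is inferred from the numbers in the comment (1500 / 500), not taken from the Tor source, and the constant and function names are hypothetical.

```ruby
# Assumption: 3 introduction-point circuits and 3 hsdir uploads per service,
# inferred from "500 hidden services means 1500 ... plus 1500 ...".
INTRO_CIRCUITS_PER_SERVICE = 3
HSDIR_UPLOADS_PER_SERVICE = 3

# Estimate how many circuits and uploads tor attempts at startup
# for a given number of hidden services.
def startup_load(services)
  {
    intro_circuits: services * INTRO_CIRCUITS_PER_SERVICE,
    hsdir_uploads: services * HSDIR_UPLOADS_PER_SERVICE,
  }
end

load = startup_load(500)
puts "intro circuits: #{load[:intro_circuits]}, hsdir uploads: #{load[:hsdir_uploads]}"
```

Under the same assumption, the 200 services in the original report would come to 600 introduction-point circuits and 600 uploads, all requested at once.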

comment:9 Changed 10 years ago by phobos

Is this a "won't fix" situation or something that we can say "don't do this"?

comment:10 Changed 10 years ago by nickm

Priority: major → minor

Neither, I think. It's more like a "low-priority, lots-of-work" thing. "Won't fix" implies that we wouldn't accept a patch for it even if we got one. "Don't do this" implies that we have a good workaround, and we don't, really, AFAICT.

comment:11 in reply to:  10 Changed 10 years ago by phobos

Owner: phobos deleted
Status: accepted → assigned

Replying to nickm:

Neither, I think. It's more like a "low-priority, lots-of-work" thing. "Won't fix" implies that we wouldn't accept a patch for it even if we got one. "Don't do this" implies that we have a good workaround, and we don't, really, AFAICT.

In my world, "don't do this" means either it will not work, or we don't support it due to code/sanity/reality constraints.

comment:12 Changed 10 years ago by nickm

Milestone: Tor: unspecified

comment:13 Changed 9 years ago by arma

Component: Tor Relay → Tor hidden services

comment:14 Changed 9 years ago by rransom

Owner: set to rransom

This is infeasible with our current HS protocols -- we can't implement it without either reworking all of Tor to be properly multithreaded or making the hidden services linkable, and we would need a new introduction-point protocol at a minimum.

comment:15 Changed 9 years ago by arma

What is Tor spending its cpu time doing in this case? A thousand public key ops shouldn't take this long.

(While I agree about the "don't do this" answer, learning exactly where we fail here could help us understand ways that we fail more subtly when hosting just a few hidden services.)

comment:16 in reply to:  15 Changed 9 years ago by rransom

Replying to arma:

> What is Tor spending its cpu time doing in this case? A thousand public key ops shouldn't take this long.

They would have been either RSA key-generation operations or RSA private-key operations.

> (While I agree about the "don't do this" answer, learning exactly where we fail here could help us understand ways that we fail more subtly when hosting just a few hidden services.)

See loud-hs-serv-pk-operations ( git://git.torproject.org/rransom/tor.git loud-hs-serv-pk-operations ).

comment:17 Changed 8 years ago by nickm

Keywords: tor-hs added

comment:18 Changed 8 years ago by nickm

Component: Tor Hidden Services → Tor

comment:19 Changed 7 years ago by cypherpunks

This is horo. I rewrote phobos's script:

#!/bin/bash
#
# Create a ton of hidden services in a tor config
# BSD 3-Clause license, Copyright 2010 The Tor Project, Inc.
#

echo "SocksPort 0" >> torrc-test
echo "Log debug file /home/amnesia/tor/debugtor.log" >> torrc-test
echo "SafeLogging 0" >> torrc-test

for i in {10000..12000}
do
	echo "HiddenServiceDir /home/amnesia/tor2/hidserv/$i" >> torrc-test
	echo "HiddenServicePort $i 127.0.0.1:$i" >> torrc-test
done

comment:20 Changed 7 years ago by cypherpunks

You need to run this afterwards:

for dir in `grep Dir torrc-test | cut -d" " -f2`; do mkdir -p "$dir"; done

comment:21 Changed 7 years ago by cypherpunks

2000 hidden services seem to work in Tails 0.15.

comment:22 Changed 7 years ago by phobos

Resolution: None → fixed
Status: assigned → closed

I can replicate these results in tor 0.2.3.25 on a normal system. Calling this fixed.
