Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#8679 closed defect (duplicate)

Tor: 100% CPU usage

Reported by: VladV Owned by:
Priority: Medium Milestone:
Component: Core Tor/Tor Version: Tor: 0.2.3.25
Severity: Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Sometimes Tor server starts using abnormally high CPU, occupying one processor core entirely. This has happened a few times now. After a restart it starts working properly again, and sometimes works happily for days.

I'm not sure what triggers the problem, last time it seemed to happen after an IP change, though that might be a coincidence. Tor traffic was quite low at the time (<10 KB/s), so that's clearly not the cause.

Here are the last log messages (I don't have a debug log yet):

Apr 10 20:13:17.000 [notice] Heartbeat: Tor's uptime is 12:07 hours, with 193 circuits open. I've sent 117.95 GB and received 113.66 GB.
Apr 11 00:15:23.000 [notice] Your IP address seems to have changed to [...]. Updating.
Apr 11 00:15:27.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
Apr 11 00:15:27.000 [notice] Your IP address seems to have changed to [...]. Updating.
Apr 11 00:15:27.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
Apr 11 00:16:27.000 [notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.
Apr 11 00:16:27.000 [notice] Performing bandwidth self-test...done.
Apr 11 00:23:31.000 [notice] New control connection opened.
Apr 11 00:23:31.000 [notice] New control connection opened.
Apr 11 00:23:40.000 [notice] Tor 0.2.3.25 (git-3fed5eb096d2d187) opening log file.
Apr 11 00:23:40.000 [notice] Renaming old configuration file to "/etc/tor/torrc.orig.2"
Apr 11 00:23:40.000 [warn] Couldn't rename configuration file "/etc/tor/torrc" to "/etc/tor/torrc.orig.2": Permission denied
Apr 11 00:23:40.000 [notice] Tor 0.2.3.25 (git-3fed5eb096d2d187) opening log file.

strace repeatedly shows the following:

read(35, 0xbbd26463, 5)                 = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
read(106, 0xb969128b, 5)                = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
epoll_wait(3, {{EPOLLOUT, {u32=35, u64=35}}, {EPOLLOUT, {u32=106, u64=106}}}, 1024, 49) = 2
clock_gettime(CLOCK_MONOTONIC, {430295, 713277822}) = 0
gettimeofday({1365630212, 877128}, NULL) = 0
read(35, 0xbbd26463, 5)                 = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
read(106, 0xb969128b, 5)                = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
epoll_wait(3, {{EPOLLOUT, {u32=35, u64=35}}, {EPOLLOUT, {u32=106, u64=106}}}, 1024, 48) = 2
clock_gettime(CLOCK_MONOTONIC, {430295, 713494994}) = 0
gettimeofday({1365630212, 877339}, NULL) = 0
read(35, 0xbbd26463, 5)                 = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
read(106, 0xb969128b, 5)                = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
epoll_wait(3, {{EPOLLOUT, {u32=35, u64=35}}, {EPOLLOUT, {u32=106, u64=106}}}, 1024, 48) = 2
clock_gettime(CLOCK_MONOTONIC, {430295, 713685322}) = 0
gettimeofday({1365630212, 877529}, NULL) = 0
read(35, 0xbbd26463, 5)                 = -1 EAGAIN (Resource temporarily unavailable)
time(NULL)                              = 1365630212
read(106, 0xb969128b, 5)                = -1 EAGAIN (Resource temporarily unavailable)

top shows that CPU is used by the main thread.
I've taken a few stack traces with gdb before accidentally killing the process.

Thread 1 (Thread 0xb70e1ac0 (LWP 3940)):
#0  0xb736a982 in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#1  0xb736af51 in lh_retrieve () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#2  0xb736d893 in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#3  0xb736e22d in ERR_get_state () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#4  0xb736e33b in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#5  0xb736e5fb in ERR_peek_error () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#6  0xb769dda3 in _check_no_tls_errors (fname=0xb76b6f2f "buffers.c", line=748) at tortls.c:2299
#7  0xb75ebdc1 in read_to_buf_tls (tls=0xb9698e48, at_most=16384, buf=0xb9978188) at buffers.c:748
#8  0xb7621fca in connection_read_to_buf (socket_error=0xbf9af59c, max_to_read=<synthetic pointer>, conn=0xbaf504f0) at connection.c:2804
#9  connection_handle_read_impl (conn=0xbaf504f0) at connection.c:2673
#10 connection_handle_read (conn=0xbaf504f0) at connection.c:2748
#11 0xb7622faa in connection_handle_write_impl (force=0, conn=0xbaf504f0) at connection.c:3234
#12 connection_handle_write (conn=0xbaf504f0, force=0) at connection.c:3340
#13 0xb7591a44 in conn_write_callback (fd=106, events=4, _conn=0xbaf504f0) at main.c:738
#14 0xb74c4ce9 in event_base_loop () from /usr/lib/libevent-2.0.so.5
#15 0xb759269b in do_main_loop () at main.c:1959
#16 0xb7594199 in tor_main (argc=4, argv=0xbf9af924) at main.c:2652
#17 0xb758e3b3 in main (argc=4, argv=0xbf9af924) at tor_main.c:30

Thread 1 (Thread 0xb70e1ac0 (LWP 3940)):
#0  0xb7561424 in __kernel_vsyscall ()
#1  0xb72a89db in read () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7362bbe in ?? () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#3  0xb7360422 in BIO_read () from /lib/i386-linux-gnu/libcrypto.so.1.0.0
#4  0xb747b507 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#5  0xb747c6fd in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#6  0xb7479769 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#7  0xb74933e9 in SSL_read () from /lib/i386-linux-gnu/libssl.so.1.0.0
#8  0xb769c8e4 in tor_tls_read (tls=0xb9698e48, 
    cp=0xbbc07f20 "\241\215\003\210\002\251(\"n\343\065$\264f\236L\372p[...]", len=16384) at tortls.c:1750
#9  0xb75ebdf3 in read_to_chunk_tls (at_most=16384, tls=0xb9698e48, chunk=<optimized out>, buf=<optimized out>) at buffers.c:667
#10 read_to_buf_tls (tls=0xb9698e48, at_most=16384, buf=0xb9978188) at buffers.c:766
#11 0xb7621fca in connection_read_to_buf (socket_error=0xbf9af59c, max_to_read=<synthetic pointer>, conn=0xbaf504f0) at connection.c:2804
#12 connection_handle_read_impl (conn=0xbaf504f0) at connection.c:2673
#13 connection_handle_read (conn=0xbaf504f0) at connection.c:2748
#14 0xb7622faa in connection_handle_write_impl (force=0, conn=0xbaf504f0) at connection.c:3234
#15 connection_handle_write (conn=0xbaf504f0, force=0) at connection.c:3340
#16 0xb7591a44 in conn_write_callback (fd=106, events=4, _conn=0xbaf504f0) at main.c:738
#17 0xb74c4ce9 in event_base_loop () from /usr/lib/libevent-2.0.so.5
#18 0xb759269b in do_main_loop () at main.c:1959
#19 0xb7594199 in tor_main (argc=4, argv=0xbf9af924) at main.c:2652
#20 0xb758e3b3 in main (argc=4, argv=0xbf9af924) at tor_main.c:30

Thread 1 (Thread 0xb70e1ac0 (LWP 3940)):
#0  0xb747c26d in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#1  0xb7479769 in ?? () from /lib/i386-linux-gnu/libssl.so.1.0.0
#2  0xb74933e9 in SSL_read () from /lib/i386-linux-gnu/libssl.so.1.0.0
#3  0xb769c8e4 in tor_tls_read (tls=0xba5b64c8, 
    cp=0xbba7ab08 "\034\032\003z\340\026\225\232[...]", len=16384) at tortls.c:1750
#4  0xb75ebdf3 in read_to_chunk_tls (at_most=16384, tls=0xba5b64c8, chunk=<optimized out>, buf=<optimized out>) at buffers.c:667
#5  read_to_buf_tls (tls=0xba5b64c8, at_most=16384, buf=0xbae703e0) at buffers.c:766
#6  0xb7621fca in connection_read_to_buf (socket_error=0xbf9af59c, max_to_read=<synthetic pointer>, conn=0xbaee2518) at connection.c:2804
#7  connection_handle_read_impl (conn=0xbaee2518) at connection.c:2673
#8  connection_handle_read (conn=0xbaee2518) at connection.c:2748
#9  0xb7622faa in connection_handle_write_impl (force=0, conn=0xbaee2518) at connection.c:3234
#10 connection_handle_write (conn=0xbaee2518, force=0) at connection.c:3340
#11 0xb7591a44 in conn_write_callback (fd=35, events=4, _conn=0xbaee2518) at main.c:738
#12 0xb74c4ce9 in event_base_loop () from /usr/lib/libevent-2.0.so.5
#13 0xb759269b in do_main_loop () at main.c:1959
#14 0xb7594199 in tor_main (argc=4, argv=0xbf9af924) at main.c:2652
#15 0xb758e3b3 in main (argc=4, argv=0xbf9af924) at tor_main.c:30

Thread 1 (Thread 0xb70e1ac0 (LWP 3940)):
#0  tor_tls_get_n_raw_bytes (tls=0xba5b64c8, n_read=0xbf9af5a0, n_written=0xbf9af5a4) at tortls.c:2284
#1  0xb762214f in connection_read_to_buf (socket_error=0xbf9af59c, max_to_read=<synthetic pointer>, conn=0xbaee2518) at connection.c:2847
#2  connection_handle_read_impl (conn=0xbaee2518) at connection.c:2673
#3  connection_handle_read (conn=0xbaee2518) at connection.c:2748
#4  0xb7622faa in connection_handle_write_impl (force=0, conn=0xbaee2518) at connection.c:3234
#5  connection_handle_write (conn=0xbaee2518, force=0) at connection.c:3340
#6  0xb7591a44 in conn_write_callback (fd=35, events=4, _conn=0xbaee2518) at main.c:738
#7  0xb74c4ce9 in event_base_loop () from /usr/lib/libevent-2.0.so.5
#8  0xb759269b in do_main_loop () at main.c:1959
#9  0xb7594199 in tor_main (argc=4, argv=0xbf9af924) at main.c:2652
#10 0xb758e3b3 in main (argc=4, argv=0xbf9af924) at tor_main.c:30

Versions:
Ubuntu 12.04 (Linux kernel 3.2.0-39-generic-pae)
Tor 0.2.3.25-1~precise+1 (from deb.torproject.org)
OpenSSL 1.0.1-4ubuntu5.8

Child Tickets

Change History (3)

comment:1 Changed 7 years ago by marmoute

This bug affect me once. After restart the relay was back to normal

Debian wheezy (up to date) on amd64;
tor: package at 0.2.3.25-1;

No suspicious log entry.

comment:2 Changed 7 years ago by nickm

Resolution: duplicate
Status: newclosed

Looks like a duplicate of #5650. Good info though.

I've got some questions about what's going on there -- could you head over to that ticket ?

comment:3 Changed 7 years ago by nickm

(There's now a possible fix for this issue on that ticket. Needs testing!)

Note: See TracTickets for help on using tickets.