Opened 9 years ago

Closed 3 years ago

#4626 closed defect (fixed)

Very high cpu usage for gabelmoo running with renegotiation-limiting code

Reported by: Sebastian Owned by: asn
Priority: High Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-dirauth
Cc: Actual Points:
Parent ID: #4668 Points:
Reviewer: Sponsor:

Description

Hey there,

gabelmoo is seeing almost full cpu utilization lately. I'm running openssl1 and libevent master. Traffic is at around 200KB/s, so not very much. Here's a profile for everything over 0.5%:

samples  %        image name               app name                 symbol name
397332   26.8226  libc.so.6                libc.so.6                /home/karsten/debug/libc.so.6
210739   14.2263  libpthread.so.0          libpthread.so.0          __pthread_mutex_unlock_usercnt
157849   10.6559  libpthread.so.0          libpthread.so.0          pthread_mutex_lock
62969     4.2508  tor                      tor                      connection_handle_write
56998     3.8477  tor                      tor                      _openssl_locking_cb
44452     3.0008  tor                      tor                      assert_connection_ok
38146     2.5751  tor                      tor                      connection_bucket_write_limit
37917     2.5597  [vdso] (tgid:17627 range:0x7fffb85ff000-0x7fffb8600000) tor                      [vdso] (tgid:17627 range:0x7fffb85ff000-0x7fffb8600000)
32683     2.2063  tor                      tor                      flush_buf_tls
29224     1.9728  tor                      tor                      connection_is_rate_limited
28245     1.9067  tor                      tor                      connection_bucket_round_robin
25259     1.7052  tor                      tor                      tor_tls_get_error
22309     1.5060  tor                      tor                      tor_tls_write
21562     1.4556  tor                      tor                      assert_buf_ok
20642     1.3935  tor                      tor                      get_options_mutable
19521     1.3178  tor                      tor                      approx_time
19272     1.3010  tor                      tor                      _check_no_tls_errors
19108     1.2899  tor                      tor                      conn_write_callback
18312     1.2362  tor                      tor                      tor_addr_is_internal
14932     1.0080  tor                      tor                      tor_tls_get_forced_write_size
14237     0.9611  tor                      tor                      tor_gettimeofday_cache_clear
12501     0.8439  librt.so.1               librt.so.1               /home/karsten/debug/librt.so.1
11918     0.8045  tor                      tor                      tor_mutex_acquire
11907     0.8038  tor                      tor                      tor_mutex_release
11376     0.7680  tor                      tor                      connection_bucket_refill
9770      0.6595  tor                      tor                      connection_is_listener
9582      0.6468  tor                      tor                      connection_is_reading
9493      0.6408  tor                      tor                      tor_tls_state_changed_callback
9087      0.6134  tor                      tor                      connection_is_writing
8689      0.5866  tor                      tor                      TO_OR_CONN
7890      0.5326  tor                      tor                      connection_state_is_connecting

Child Tickets

Change History (27)

comment:1 Changed 9 years ago by nickm

Wow, that pthread_mutex stuff is pretty surprising. Very little in Tor uses locks, and locks are usually pretty cheap unless there's contention. It could be locks used by libc or openssl, but I'd be a little surprised if so.

I know there are some tools out there that can debug locking performance issues, but I don't know which are best or how to use them.

Is there any way to drill down into the libc chunk there?

What build options was Tor/Libevent built with here? Is there anything crazy in your logs?

comment:2 Changed 9 years ago by nickm

Also, it's pretty weird that openssl doesn't show up in that profile at all.

comment:3 Changed 9 years ago by Sebastian

I think it isn't in libevent, because I've seen the same thing with libevent1.4 and tor master, and I don't think it's with openssl, because I've also seen it with openssl0.9.whatever debian has. I rebuilt the libs myself to get the debug info. I have the libc-dbg package installed, I don't know what else I could do to get into the libc chunk. Tor was built with --enable-gcc-warnings --disable-asciidoc --with-{openssl,zlib/libevent}-dir, libevent was built with --enable-gcc-warnings, and openssl was built with zlib shared.

comment:4 Changed 9 years ago by Sebastian

oh, another weird thing. For whatever reason, libssl and libcrypto don't appear in my profile at all, even tho they are in the same dir as the other libs and ldd shows that Tor wants to use them?

comment:5 Changed 9 years ago by rransom

Based on a later profile, it looks like we're seeing locking overhead inside OpenSSL, probably in or around its PRNG. Hopefully the locking calls are inside the PRNG, where we can avoid them by hooking in our own (new) PRNG implementation.

comment:6 Changed 9 years ago by Sebastian

latest profile:

samples  %        image name               app name                 symbol name
1632346  11.8094  libc.so.6                libc.so.6                /home/karsten/debug/libc.so.6
1498099  10.8382  tor                      tor                      sha1_block_data_order
1416223  10.2458  tor                      tor                      bn_mul_mont
565760    4.0931  tor                      tor                      lh_retrieve
470492    3.4038  tor                      tor                      CRYPTO_THREADID_cmp
460371    3.3306  tor                      tor                      CRYPTO_lock
453698    3.2823  tor                      tor                      ssleay_rand_add
334489    2.4199  tor                      tor                      assert_connection_ok
295292    2.1363  tor                      tor                      ssl3_read_n
291990    2.1124  tor                      tor                      ERR_clear_error
291403    2.1082  tor                      tor                      EVP_DigestInit_ex
272549    1.9718  tor                      tor                      SHA1_Update
216186    1.5640  tor                      tor                      ssl3_accept
187467    1.3563  tor                      tor                      connection_handle_write
178321    1.2901  tor                      tor                      ssl3_setup_read_buffer
170131    1.2308  tor                      tor                      ssl3_read_bytes
164986    1.1936  tor                      tor                      SHA1_Final
163540    1.1831  tor                      tor                      flush_buf_tls
161413    1.1678  tor                      tor                      connection_bucket_write_limit
155617    1.1258  tor                      tor                      connection_bucket_round_robin
152358    1.1023  tor                      tor                      conn_write_callback
150020    1.0853  tor                      tor                      tor_tls_write
140980    1.0199  tor                      tor                      SHA1_Init
139537    1.0095  tor                      tor                      BIO_read
135219    0.9783  tor                      tor                      int_thread_get_item
132170    0.9562  [vdso] (tgid:1517 range:0x7fff413ff000-0x7fff41400000) tor                      [vdso] (tgid:1517 range:0x7fff413ff000-0x7fff41400000)
124277    0.8991  tor                      tor                      get_error_values
122123    0.8835  tor                      tor                      ssl3_release_read_buffer
111395    0.8059  tor                      tor                      EVP_DigestFinal_ex
110316    0.7981  tor                      tor                      CRYPTO_add_lock
104822    0.7583  tor                      tor                      engine_table_select
103036    0.7454  tor                      tor                      EVP_MD_CTX_init
98293     0.7111  tor                      tor                      EVP_MD_CTX_cleanup
97970     0.7088  tor                      tor                      int_thread_get
97956     0.7087  tor                      tor                      tor_addr_is_internal
96442     0.6977  tor                      tor                      sock_read
91395     0.6612  tor                      tor                      assert_buf_ok
85942     0.6218  tor                      tor                      ssl3_write_bytes
84626     0.6122  tor                      tor                      ssl3_get_message
78316     0.5666  tor                      tor                      OPENSSL_cleanse
76337     0.5523  tor                      tor                      tor_tls_get_error
73666     0.5329  tor                      tor                      _log_fn
73623     0.5326  tor                      tor                      _check_no_tls_errors
71169     0.5149  tor                      tor                      int_thread_release
68475     0.4954  tor                      tor                      SSL_get_error
67685     0.4897  librt.so.1               librt.so.1               /home/karsten/debug/librt.so.1
67298     0.4869  tor                      tor                      ERR_get_state
62872     0.4549  tor                      tor                      tor_tls_state_changed_callback
60933     0.4408  tor                      tor                      CRYPTO_malloc
53032     0.3837  tor                      tor                      connection_state_is_connecting
49586     0.3587  tor                      tor                      SSL_version
48757     0.3527  tor                      tor                      connection_is_writing
46488     0.3363  tor                      tor                      TO_OR_CONN
46311     0.3350  tor                      tor                      approx_time
45217     0.3271  tor                      tor                      ssl3_check_client_hello
45038     0.3258  tor                      tor                      ERR_peek_error

comment:7 Changed 9 years ago by Sebastian

And I've managed to get the libc broken down. Looks like stuff from outside Tor got put in. Here we go with a better profile.

samples  %        image name               symbol name
816064   10.3630  tor                      sha1_block_data_order
326831    4.1503  tor                      lh_retrieve
270685    3.4374  libevent-2.1.so.1.0.0    event_base_loop
266122    3.3794  tor                      CRYPTO_THREADID_cmp
256188    3.2533  tor                      CRYPTO_lock
246810    3.1342  tor                      ssleay_rand_add
198834    2.5249  tor                      bn_mul_mont
176971    2.2473  tor                      EVP_DigestInit_ex
171173    2.1737  tor                      ssl3_read_n
156014    1.9812  tor                      ERR_clear_error
155584    1.9757  tor                      SHA1_Update
148804    1.8896  tor                      ssl3_read_bytes
144783    1.8386  libc-2.11.2.so           memcpy
138239    1.7555  libevent-2.1.so.1.0.0    event_process_active_single_queue
121508    1.5430  libevent-2.1.so.1.0.0    epoll_dispatch
110935    1.4087  tor                      assert_connection_ok
108831    1.3820  tor                      connection_handle_write
104542    1.3276  libc-2.11.2.so           _int_free
102253    1.2985  tor                      int_thread_get_item
98278     1.2480  libc-2.11.2.so           malloc
97648     1.2400  libevent-2.1.so.1.0.0    event_pending
96666     1.2275  tor                      conn_write_callback
96345     1.2235  tor                      ssl3_setup_read_buffer
95187     1.2088  tor                      tor_tls_write
90558     1.1500  tor                      connection_bucket_round_robin
87150     1.1067  tor                      flush_buf_tls
86564     1.0993  tor                      SHA1_Final
85288     1.0831  libevent-2.1.so.1.0.0    gettime
83158     1.0560  libc-2.11.2.so           memset
82328     1.0455  tor                      connection_bucket_write_limit
82056     1.0420  tor                      _log_fn
81218     1.0314  [vdso] (tgid:13332 range:0x7fff9719a000-0x7fff9719b000) [vdso] (tgid:13332 range:0x7fff9719a000-0x7fff9719b000)
79000     1.0032  libc-2.11.2.so           _int_malloc
76371     0.9698  tor                      ssl3_accept
72840     0.9250  tor                      ssl3_release_read_buffer
69932     0.8880  tor                      BIO_read
67766     0.8605  tor                      tor_addr_is_internal
66815     0.8485  tor                      get_error_values
65876     0.8365  libc-2.11.2.so           __read_nocancel
64578     0.8201  libevent-2.1.so.1.0.0    event_queue_remove_active
63060     0.8008  tor                      EVP_DigestFinal_ex
62984     0.7998  tor                      ssl3_get_message
61671     0.7831  tor                      CRYPTO_malloc
60856     0.7728  tor                      SHA1_Init
60739     0.7713  libevent-2.1.so.1.0.0    evmap_io_active
57774     0.7337  tor                      CRYPTO_add_lock
57171     0.7260  tor                      EVP_MD_CTX_init
56716     0.7202  tor                      EVP_MD_CTX_cleanup
53904     0.6845  tor                      OPENSSL_cleanse
53697     0.6819  libc-2.11.2.so           free
49858     0.6331  tor                      engine_table_select
49590     0.6297  tor                      ssl3_write_bytes
49328     0.6264  libc-2.11.2.so           __epoll_wait_nocancel
48919     0.6212  libevent-2.1.so.1.0.0    event_active_nolock
47917     0.6085  tor                      assert_buf_ok
47879     0.6080  tor                      tor_tls_get_error
45525     0.5781  libc-2.11.2.so           __errno_location
45152     0.5734  tor                      int_thread_get
45064     0.5723  libc-2.11.2.so           read
43563     0.5532  tor                      sock_read
40059     0.5087  tor                      int_thread_release
39132     0.4969  tor                      ERR_get_state
38059     0.4833  tor                      SSL_get_error

I still have no idea what tor is actually *DOING* here, there is absolutely no reason to spend all this cpu

comment:8 Changed 9 years ago by arma

moria1 experiences this too after upgrade

comment:9 Changed 9 years ago by arma

Priority: normalmajor

Running on the 58d1aa44023e branch seems to trigger it.

And running on the eaa3a379f0ba seems to not trigger it.

And running on 7920ea55b8d9 seems to trigger it.

The 7920ea55b8d9 commit looks like it's touching the right sort of things to be relevant to our bug.

Nicing my Tor process makes it much more friendly -- I then have some idle time on my cpu.

comment:10 in reply to:  9 Changed 9 years ago by arma

Replying to arma:

The 7920ea55b8d9 commit looks like it's touching the right sort of things to be relevant to our bug.

Alas, e5f2f10844d seems to trigger it too. So does 9814019a54.

comment:11 Changed 9 years ago by nickm

Status: newneeds_review

sebastian says his bisect blames 44619d5e4.

wanoskarnet found a problem there that could be the issue.

I have a possible fix in branch "bug4626_try1" in my public repository. Could use testing!

comment:12 Changed 9 years ago by Sebastian

So, unfortunately the patch doesn't seem to work for me, not even when I applied it directly on top of the (suspected) faulty commit and ran that.

But here's something very interesting: this is a profile of the commit directly before 44619d5e4, where the cpu usage stays below 10% typically

288491   51.1032  tor                      tor                      bn_mul_mont
31541     5.5872  tor                      tor                      _x86_64_AES_encrypt_compact
30061     5.3250  libz.so.1.2.5            tor                      deflate_slow
18985     3.3630  libz.so.1.2.5            tor                      build_tree
14140     2.5048  tor                      tor                      sha1_block_data_order
13043     2.3104  libc-2.11.2.so           tor                      _int_malloc
9262      1.6407  tor                      tor                      _x86_64_AES_decrypt_compact
8564      1.5170  tor                      tor                      tor_memeq
7717      1.3670  tor                      tor                      BN_mod_exp_mont_consttime
6585      1.1665  tor                      tor                      connection_bucket_refill
5682      1.0065  libc-2.11.2.so           tor                      strcasecmp
3646      0.6459  libz.so.1.2.5            tor                      fill_window
3573      0.6329  libc-2.11.2.so           tor                      memcpy
3544      0.6278  libz.so.1.2.5            tor                      compress_block
3124      0.5534  libc-2.11.2.so           tor                      memmem
3007      0.5327  tor                      tor                      digestmap_get
2703      0.4788  tor                      tor                      BN_mod_mul_montgomery
2394      0.4241  libz.so.1.2.5            tor                      send_tree
2268      0.4018  libz.so.1.2.5            tor                      _tr_flush_block
2106      0.3731  libc-2.11.2.so           tor                      _int_free
2090      0.3702  tor                      tor                      digestmap_iter_get
1935      0.3428  libc-2.11.2.so           tor                      memset
1902      0.3369  tor                      tor                      bn_mul_add_words
1870      0.3313  tor                      tor                      MOD_EXP_CTIME_COPY_TO_PREBUF
1727      0.3059  tor                      tor                      aes_crypt_inplace
1588      0.2813  tor                      tor                      md5_block_asm_data_order
1522      0.2696  tor                      tor                      dirserv_orconn_tls_done
1488      0.2636  tor                      tor                      buf_datalen
1469      0.2602  libc-2.11.2.so           tor                      malloc_consolidate
1348      0.2388  tor                      tor                      ssl_cipher_apply_rule
1251      0.2216  libc-2.11.2.so           tor                      free
1122      0.1988  tor                      tor                      connection_or_set_bad_connections
1089      0.1929  libc-2.11.2.so           tor                      vfprintf
1077      0.1908  libc-2.11.2.so           tor                      malloc
1063      0.1883  libz.so.1.2.5            tor                      adler32
1062      0.1881  libevent-2.1.so.1.0.0    tor                      event_base_loop
930       0.1647  tor                      tor                      BN_div
893       0.1582  tor                      tor                      ap_stream_wants_exit_attention
886       0.1569  tor                      tor                      AES_cbc_encrypt
882       0.1562  tor                      tor                      EVP_MD_CTX_cleanup

comment:13 Changed 9 years ago by Sebastian

In contrast, 44619d5e4 with bug4626_try1 cherry-picked on top of it looks like this:

30790    11.0720  tor                      sha1_block_data_order
12867     4.6269  tor                      lh_retrieve
10654     3.8311  tor                      ssleay_rand_add
10564     3.7988  tor                      CRYPTO_lock
10479     3.7682  tor                      CRYPTO_THREADID_cmp
9484      3.4104  tor                      ERR_clear_error
7611      2.7369  tor                      bn_mul_mont
7295      2.6233  tor                      connection_handle_write
5497      1.9767  libevent-2.1.so.1.0.0    event_process_active_single_queue
5220      1.8771  tor                      ssl3_read_n
4833      1.7379  libc-2.11.2.so           _int_free
4705      1.6919  tor                      assert_connection_ok
4533      1.6300  tor                      ssl3_accept
4526      1.6275  tor                      SHA1_Update
4372      1.5722  libc-2.11.2.so           memset
4193      1.5078  libevent-2.1.so.1.0.0    event_base_loop
4036      1.4513  tor                      EVP_DigestInit_ex
3990      1.4348  tor                      SHA1_Final
3906      1.4046  tor                      conn_write_callback
3665      1.3179  tor                      ssl3_setup_read_buffer
3582      1.2881  tor                      flush_buf_tls
3576      1.2859  tor                      ssl3_read_bytes
3462      1.2449  tor                      connection_bucket_write_limit
3380      1.2154  libevent-2.1.so.1.0.0    event_pending
3293      1.1841  libc-2.11.2.so           _int_malloc
3186      1.1457  tor                      connection_bucket_round_robin
3174      1.1414  tor                      tor_tls_write
2929      1.0533  tor                      ssl3_write_bytes
2713      0.9756  libc-2.11.2.so           __read_nocancel
2612      0.9393  tor                      int_thread_get_item
2546      0.9155  tor                      assert_buf_ok
2425      0.8720  libc-2.11.2.so           free
2393      0.8605  tor                      SHA1_Init
2374      0.8537  tor                      SSL_get_error
2353      0.8461  tor                      tor_tls_get_error
2351      0.8454  tor                      ssl3_release_read_buffer
2288      0.8228  tor                      EVP_MD_CTX_cleanup
2243      0.8066  tor                      CRYPTO_add_lock
2222      0.7990  tor                      ssl3_get_message
2046      0.7357  libc-2.11.2.so           malloc
2005      0.7210  tor                      get_error_values
1942      0.6983  libevent-2.1.so.1.0.0    gettime
1938      0.6969  libevent-2.1.so.1.0.0    event_active_nolock
1896      0.6818  libc-2.11.2.so           memcpy
1894      0.6811  libevent-2.1.so.1.0.0    evmap_io_active
1869      0.6721  tor                      int_thread_release

comment:14 Changed 9 years ago by asn

It seems like Sebastan's profiles hint that the bug is in 44619d5e4. And the bug exists even with bug4626_try1 cherry-picked! Fun stuff!

A difference between the pre-44619d5e4 and the post-44619d5e4 tor_tls_read(), is that in the latter case, when we first detect the renegotiation we return TOR_TLS_WANTREAD. pre-44619d5e4 we used to return a positive integer, since we only detected renegotiation with the first application data packet.

Could calling the TLS renegotiation callback and then returning TOR_TLS_WANTREAD cause all this chaos?

The profiles seem to indicate a huge amount of SSL reads and writes (connection_handle_write, flush_buf_tls, etc.), and there also seems to be too much ssl3_accept(), which probably means that we are getting in the SSL protocol state machine loop more than we should.

comment:15 Changed 9 years ago by asn

A plausible cause of the too much ssl3_accept() is that we are failing too many non-blocking SSL protocol read/write operations, and the ssl3_accept() loop gets continuously deferred.

comment:16 Changed 9 years ago by Sebastian

lalelu, to keep trac updated:

wanoskarnet and asn have been chatting, and had another idea which seems to not be the problem:

diff --git a/src/common/tortls.c b/src/common/tortls.c
index 1205599..a768634 100644
--- a/src/common/tortls.c
+++ b/src/common/tortls.c
@@ -1629,14 +1629,17 @@ tor_tls_read(tor_tls_t *tls, char *cp, size_t len)
 
 #ifdef V2_HANDSHAKE_SERVER
   if (tls->got_renegotiate) {
-    tor_assert(tls->server_handshake_count == 2);
-    /* XXX tor_assert(err == TOR_TLS_WANTREAD); */
-
-    /* Renegotiation happened! */
-    log_info(LD_NET, "Got a TLS renegotiation from %s", ADDR(tls));
-    if (tls->negotiated_callback)
-      tls->negotiated_callback(tls, tls->callback_arg);
-    tls->got_renegotiate = 0;
+    if ((err == TOR_TLS_WANTREAD) || (err == TOR_TLS_WANTWRITE)) {
+
+      tor_assert(tls->server_handshake_count == 2);
+      /* XXX tor_assert(err == TOR_TLS_WANTREAD); */
+
+      /* Renegotiation happened! */
+      log_info(LD_NET, "Got a TLS renegotiation from %s", ADDR(tls));
+      if (tls->negotiated_callback)
+        tls->negotiated_callback(tls, tls->callback_arg);
+      tls->got_renegotiate = 0;
+    }
   }
 #endif
 

comment:17 Changed 9 years ago by Sebastian

more updates

< wanoskarnet> the logic of 4fd79f9def289965 is broken. you can't to call connection_or_check_valid_tls_handshake() just after wantread from ssl_read(), that doesn't means that certs recved yet.
< asn> got_renegotiate && WANTREAD, does it for most cases
< asn> but in certain high-traffic scenarios, indeed, the certs might not be there.
< wanoskarnet> no for most cases there are no certs at all. server just send hello and keys and returns -1 as non blockin io (no yet answer with cert and finish by client).
< wanoskarnet> callback that sets got_renegotiate happens during first ssl_read just after client hello.

comment:18 Changed 9 years ago by nickm

asn suggests that his bug4626_callback_conditions_theory branch might hold the answer, if one possible theory is correct.

comment:19 Changed 9 years ago by asn

Parent ID: #4668

comment:20 Changed 9 years ago by asn

Owner: set to asn
Status: needs_reviewaccepted

Pulling this out of needs_review. I'm not sure how to do that, so I'm assigning it to me which I think will do the trick.

comment:21 Changed 9 years ago by Sebastian

Shouldn't this be closed with the revert now?

comment:22 in reply to:  21 Changed 9 years ago by nickm

Milestone: Tor: 0.2.3.x-finalTor: 0.2.4.x-final
Summary: Very high cpu usage for gabelmoo running masterVery high cpu usage for gabelmoo running with renegotiation-limiting code

Replying to Sebastian:

Shouldn't this be closed with the revert now?

Since we're planning to eventually re-apply (or rewrite) the relevant code, I'd like to keep tickets around that describe bugs in it, and make them child-tickets of #4668.

comment:23 Changed 8 years ago by nickm

Keywords: tor-auth added

comment:24 Changed 8 years ago by nickm

Component: Tor Directory AuthorityTor

comment:25 Changed 8 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: unspecified

comment:26 Changed 3 years ago by dgoulet

Keywords: tor-dirauth added; tor-auth removed

Turns out that tor-auth is for directory authority so make it clearer with tor-dirauth

comment:27 in reply to:  21 Changed 3 years ago by nickm

Resolution: fixed
Severity: Normal
Status: acceptedclosed

Replying to Sebastian:

Shouldn't this be closed with the revert now?

Given 5 years have passed, I think I should just say "Sebastian Was Right" :)

Note: See TracTickets for help on using tickets.