Opened 2 years ago

Closed 2 years ago

#23722 closed enhancement (fixed)

Somebody should profile a Tor 0.3.1.7 relay

Reported by: arma
Owned by:
Priority: Medium
Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords:
Cc: ahf, alex_y_xu@…
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

moria1, running git master, seems to be using twice as much cpu or more as it used to use.

We hear stories on tor-relays of people whose rpi's are melting when they didn't use to melt.

I just talked to a person on #tor whose Windows 0.3.1.7 relay goes to 100% cpu and stays there.

I think something changed in cpu use between 0.3.0 and 0.3.1. We should try to figure out what it is and do something to make it better.

Change History (13)

comment:1 Changed 2 years ago by cypherpunks

Consensus diffs and compression are the obvious suspects.

comment:2 Changed 2 years ago by dgoulet

I'm currently running a profile on latest master with my fast relay. I'll let it sit for a while and report back.

I did a quick profile on a relay on my desktop that had old data, and when liblzma kicks in, it hogs the CPU big time.

comment:3 Changed 2 years ago by dgoulet

After around 1h30 of profiling:

  20.87%  tor      [kernel.kallsyms]      [k] update_blocked_averages
   8.54%  tor      tor                    [.] curve25519_donna
   1.50%  tor      [kernel.kallsyms]      [k] native_queued_spin_lock_slowpath
   1.27%  tor      libc-2.23.so           [.] malloc
   1.21%  tor      tor                    [.] connection_bucket_refill
   1.09%  tor      [kernel.kallsyms]      [k] __bpf_prog_run
   0.97%  tor      libc-2.23.so           [.] 0x000000000007fdeb
   0.89%  tor      libevent-2.0.so.5.1.9  [.] _init
   0.86%  tor      libcrypto.so.1.0.0     [.] BN_num_bits_word
   0.81%  tor      tor                    [.] circuitmux_find_map_entry
   0.68%  tor      tor                    [.] curve25519_square_times
   0.66%  tor      [kernel.kallsyms]      [k] try_to_wake_up
   0.65%  tor      libcrypto.so.1.0.0     [.] BN_num_bits
   0.65%  tor      tor                    [.] ewma_cmp_cmux
   0.62%  tor      libc-2.23.so           [.] 0x0000000000081c78
   0.61%  tor      [nf_conntrack]         [k] __nf_conntrack_find_get
   0.60%  tor      libcrypto.so.1.0.0     [.] 0x00000000000c7567
   0.55%  tor      tor                    [.] buf_datalen
   0.54%  tor      libcrypto.so.1.0.0     [.] 0x00000000000c728e
   0.52%  tor      [kernel.kallsyms]      [k] __fget
   0.51%  tor      tor                    [.] circuit_get_by_circid_channel
   0.51%  tor      tor                    [.] ge25519_nielsadd2
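
To see where the listed samples land, the rows above can be grouped by shared object. The following is only a rough sketch: the function name share_by_object is made up for illustration, and it covers only the rows shown in this comment (the top of the report), not the full profile.

```python
from collections import defaultdict

# Rows copied from the profile in this comment:
# overhead, command, shared object, symbol type, symbol.
PROFILE = """\
20.87%  tor  [kernel.kallsyms]      [k] update_blocked_averages
 8.54%  tor  tor                    [.] curve25519_donna
 1.50%  tor  [kernel.kallsyms]      [k] native_queued_spin_lock_slowpath
 1.27%  tor  libc-2.23.so           [.] malloc
 1.21%  tor  tor                    [.] connection_bucket_refill
 1.09%  tor  [kernel.kallsyms]      [k] __bpf_prog_run
 0.97%  tor  libc-2.23.so           [.] 0x000000000007fdeb
 0.89%  tor  libevent-2.0.so.5.1.9  [.] _init
 0.86%  tor  libcrypto.so.1.0.0     [.] BN_num_bits_word
 0.81%  tor  tor                    [.] circuitmux_find_map_entry
 0.68%  tor  tor                    [.] curve25519_square_times
 0.66%  tor  [kernel.kallsyms]      [k] try_to_wake_up
 0.65%  tor  libcrypto.so.1.0.0     [.] BN_num_bits
 0.65%  tor  tor                    [.] ewma_cmp_cmux
 0.62%  tor  libc-2.23.so           [.] 0x0000000000081c78
 0.61%  tor  [nf_conntrack]         [k] __nf_conntrack_find_get
 0.60%  tor  libcrypto.so.1.0.0     [.] 0x00000000000c7567
 0.55%  tor  tor                    [.] buf_datalen
 0.54%  tor  libcrypto.so.1.0.0     [.] 0x00000000000c728e
 0.52%  tor  [kernel.kallsyms]      [k] __fget
 0.51%  tor  tor                    [.] circuit_get_by_circid_channel
 0.51%  tor  tor                    [.] ge25519_nielsadd2
"""


def share_by_object(report_text):
    """Sum the overhead percentages per shared object (hypothetical helper)."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        fields = line.split()
        # Expect: "NN.NN%", command, shared object, "[k]" or "[.]", symbol.
        if len(fields) < 5 or not fields[0].endswith("%"):
            continue  # skip blank or malformed lines
        totals[fields[2]] += float(fields[0].rstrip("%"))
    return dict(totals)


totals = share_by_object(PROFILE)
for obj, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{pct:6.2f}%  {obj}")
```

Grouped this way, the kernel rows outweigh everything in the tor binary itself, and most of that is the single update_blocked_averages entry.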

comment:4 Changed 2 years ago by ahf

Cc: ahf added

comment:5 Changed 2 years ago by nickm

Weird! Was the server seeing much traffic at the time? I'm surprised that none of the compression algorithms, digest algorithms, or AES appeared in the profile.

comment:6 Changed 2 years ago by dgoulet

~5.1MB/sec at the time for ~100 minutes.

Aren't we using AES-NI if available? Would it even show up in the profile?

comment:7 in reply to: 5 Changed 2 years ago by Hello71

Replying to nickm:

Weird! Was the server seeing much traffic at the time? I'm surprised that none of the compression algorithms, digest algorithms, or AES appeared in the profile.

well, consensus updates are uncommon, and AES-NI is very fast. suppose that you get https://calomel.org/aesni_ssl_performance.html performance, then if your relay is 100 megabit/sec and you can do 1700 megabyte/sec AES, then by my calculations, you should spend about 0.7% CPU time in AES. coincidentally, we have symbols in libcrypto.so at 0.60% and 0.54%. https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1598529 is probably relevant, upgrading your kernel will probably help.

as I said on IRC, KIST incidentally moderately improves CPU usage by calling epoll_ctl at a reasonable rate instead of the ridiculousness that was before.
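
The AES estimate above can be reproduced with a few lines of arithmetic. This is a back-of-the-envelope sketch only: the function name and the passes parameter are made up for illustration, and it assumes one core and a single AES pass over each relayed byte unless told otherwise, which the comment does not spell out.

```python
def aes_cpu_fraction(relay_mbit_per_s, aes_mbyte_per_s, passes=1):
    """Fraction of one core spent in AES (hypothetical helper).

    relay_mbit_per_s: relay throughput in megabits per second
    aes_mbyte_per_s:  AES throughput of one core in megabytes per second
    passes:           assumed number of AES passes over each relayed byte
    """
    relay_mbyte_per_s = relay_mbit_per_s / 8  # 8 bits per byte
    return passes * relay_mbyte_per_s / aes_mbyte_per_s


# Figures from the comment: a 100 megabit/sec relay and 1700 megabyte/sec AES.
print(f"{aes_cpu_fraction(100, 1700):.2%}")  # prints 0.74%
```

If a relay makes more than one AES pass per byte, the estimate scales linearly with the passes value, but it stays small next to the kernel entry at the top of the profile.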

comment:8 Changed 2 years ago by Hello71

Cc: alex_y_xu@… added

comment:9 Changed 2 years ago by s7r

I am running the latest git master 0.3.2.2-alpha-dev (git-51e47481fc6f131d) and I don't see a significant CPU usage increase compared to 0.3.0. It's true that this relay's bottleneck is bandwidth, because it's capped, but there is no big difference in CPU % usage per tor user / process compared to 0.3.0 - it's still around 9% - 20%, with very short bursts up to 98%.

comment:10 in reply to: 9 Changed 2 years ago by cypherpunks

Replying to s7r:

I am running the latest git master 0.3.2.2-alpha-dev (git-51e47481fc6f131d) and I don't see a significant CPU usage increase compared to 0.3.0. It's true that this relay's bottleneck is bandwidth, because it's capped, but there is no big difference in CPU % usage per tor user / process compared to 0.3.0 - it's still around 9% - 20%, with very short bursts up to 98%.

You should test with 0.3.1.x. According to Hello71, "KIST incidentally moderately improves CPU" usage, so a relay on master may not show the problem.

comment:11 in reply to: 10 Changed 2 years ago by Hello71

Replying to cypherpunks:

Replying to s7r:

I am running the latest git master 0.3.2.2-alpha-dev (git-51e47481fc6f131d) and I don't see a significant CPU usage increase compared to 0.3.0. It's true that this relay's bottleneck is bandwidth, because it's capped, but there is no big difference in CPU % usage per tor user / process compared to 0.3.0 - it's still around 9% - 20%, with very short bursts up to 98%.

You should test with 0.3.1.x. According to Hello71, "KIST incidentally moderately improves CPU" usage, so a relay on master may not show the problem.

I also experienced no noticeable increase in CPU usage with 0.3.1.7 compared to 0.3.0.10, although I only ran it for a few hours and didn't precisely measure anything.

comment:12 Changed 2 years ago by nickm

Hrrm. So, should we close this, or try to profile under other circumstances?

comment:13 Changed 2 years ago by dgoulet

Resolution: fixed
Status: new → closed

I think we can close this for now. I'm profiling 032 as we speak on a very fast Exit graciously provided by Moritz (#24127).

I say that if we have a performance issue on 031, let's open a ticket about it.

Thanks everyone that helped out here!
