Opened 7 months ago

Last modified 4 days ago

#29607 needs_information defect

2019 Q1: Denial of service on v2 and v3 onion service

Reported by: pidgin Owned by: pidgin
Priority: Immediate Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, tor-dos, network-team-roadmap-2019-Q1Q2, security, 041-longterm, 041-deferred-20190530, 042-deferred-20190918
Cc: Actual Points:
Parent ID: #29999 Points: 10
Reviewer: Sponsor: Sponsor27-must

Description

Dear Tor team,
We have set up a discussion board on the Tor network.
Someone is attacking our servers and taking the site down every time; the forum then responds with "Server not found".
We are pretty sure this problem is on the side of the TOR browser, is there anything we could do to sort this?
Many thanks for taking the time to read this.

Child Tickets

Ticket | Status | Owner | Summary | Component
#29610 | closed | | Server being attacked | Core Tor/Tor
#29637 | closed | | Denial of service on v2 onion service | Core Tor/Tor

Change History (66)

comment:1 Changed 7 months ago by pidgin

Some server-side info below:

onionbalance is active
vanguard is active
vanguard tor process is at 5%
serving tor process is at 5%

The attacker has found a way to DDoS us that is not based on exhausting Tor's CPU or Tor's traffic capacity.

comment:2 Changed 7 months ago by mcs

Component: - Select a component → Core Tor/Tor

Setting Trac component (someone please reset it if it is incorrect).

I don't understand the comment "We are pretty sure this problem is on the side of the TOR browser" since this sounds to me like a tor server-side issue. It is also unclear if this is a bug report or a support request.

Also, if you need immediate help please ask on IRC: https://www.torproject.org/about/contact.html.en#irc

comment:3 in reply to:  2 Changed 7 months ago by pidgin

Replying to mcs:

> Setting Trac component (someone please reset it if it is incorrect).
>
> I don't understand the comment "We are pretty sure this problem is on the side of the TOR browser" since this sounds to me like a tor server-side issue. It is also unclear if this is a bug report or a support request.
>
> Also, if you need immediate help please ask on IRC: https://www.torproject.org/about/contact.html.en#irc

My apologies if it's not on the Tor side; I am not here to discuss whether it is or not.
I am just trying to sort this problem out, so people won't have to deal with it in the future.
Thank you for the IRC link. I could not use IRC through Tor, so I will leave this ticket open.

Last edited 7 months ago by pidgin

comment:4 Changed 7 months ago by nickm

> We are pretty sure this problem is on the side of the TOR browser, is there anything we could do to sort this?

Probably the best way to get started would be to explain why you conclude this is a problem with the Tor browser, or with Tor.

comment:5 Changed 7 months ago by nickm

Closed #29610 as a duplicate here. What is causing you to conclude that this is an attack against Tor, instead of (say) an attack against some other component? Right now, there's not enough information in your report(s) to tell.

comment:6 Changed 7 months ago by pidgin

The service behind the onion HiddenService is fine; it is serving requests.
Before the DDoS there were no "Server Not Found" errors.
This is actually the attacker's third iteration.
First iteration: a brute-force DDoS that drove Tor's CPU load to 100%. Countermeasure: vanguards and ExcludeNodes (torrc).
Second iteration: the attacker used random nodes, 1000+ of them, to drive Tor's CPU load to 100%. Countermeasure: vanguards / onionbalance.
Third iteration: Tor Browser now gives "Server not found". No countermeasure found yet.

comment:7 Changed 7 months ago by pidgin

Owner: set to pidgin
Status: new → accepted

comment:8 Changed 7 months ago by teor

Summary: Need urgent help! → Denial of service on v2 onion service

#29610 and #29637 are duplicates of this ticket.

#29637 contains the same information from the user, and this response from me:

The attack probably isn't on the Tor client. It's more likely it's on the service, HSDir or Intro Points. It may be on the service's Guard or Rendezvous Point, but they're harder to find and target.

Have you tried v3 onion services?
They are much more resistant to attacks.
There is no OnionBalance for v3 yet, but the v3 crypto is more efficient, so you might not need it.

Are you sure that your service is still running ok?
Your service might be serving a small amount of traffic, and blocked on network requests. So its CPU would be low, and most clients would not be able to connect.

comment:9 Changed 7 months ago by pidgin

have not tried v3 onion services yet.

tor is again at 100% cpu

comment:10 Changed 6 months ago by pidgin

tor v3 services did not help.

comment:11 Changed 6 months ago by teor

Did you add a v3 service to the same tor instance?
Or did you start a separate tor instance with a v3 service?

comment:12 in reply to:  11 Changed 6 months ago by pidgin

Replying to teor:

> Did you add a v3 service to the same tor instance?
> Or did you start a separate tor instance with a v3 service?

it is a separate tor instance.

comment:13 Changed 6 months ago by teor

Parent ID: #25461

This looks like a duplicate of #25461.

We'll need to fix #25461, or confirm that you're not using an affected version, before we can isolate this bug.

How do you know there is an attacker, rather than just lots of clients using your service?

comment:14 in reply to:  13 Changed 6 months ago by pidgin

not sure when there is a forum post we can't delete it.

Last edited 6 months ago by pidgin

comment:15 Changed 6 months ago by pidgin

not sure when there is a forum post we can't delete it.

Sorry for the double reply.

Last edited 6 months ago by pidgin

comment:16 Changed 6 months ago by asn

We might need a log file (preferably info/debug level) to get deeper into this. Please make sure you don't reveal any sensitive info (e.g. guard names) through the log file.

Otherwise, do you see any warn logs, or any weird stuff on the info/debug severity that you can point out?
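
For anyone preparing such a log, here is a minimal sketch of one way to scrub it before attaching it. It assumes relay fingerprints appear as `$` followed by 40 hex digits, and it also masks strings that look like v2/v3 onion addresses and IPv4 addresses. The function name `scrub_line` and the patterns are illustrative assumptions, not part of Tor. Guard nicknames are not caught by these patterns and still need a manual check.

```python
import re

# Hypothetical patterns; adjust them to whatever appears in your own logs.
FINGERPRINT = re.compile(r"\$[0-9A-Fa-f]{40}")
# 56 (v3) or 16 (v2) base32 characters, with or without ".onion".
# This deliberately over-matches: any 16-letter lowercase word is masked too.
ONION = re.compile(r"\b(?:[a-z2-7]{56}|[a-z2-7]{16})(?:\.onion)?\b")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def scrub_line(line: str) -> str:
    """Mask fingerprints, onion addresses and IPv4 addresses in one log line."""
    line = FINGERPRINT.sub("$[scrubbed]", line)
    line = ONION.sub("[scrubbed-onion]", line)
    line = IPV4.sub("[scrubbed-ip]", line)
    return line
```

Running every line of the log through `scrub_line` and then grepping the result for known guard nicknames is probably the safest order.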

Last edited 6 months ago by asn

comment:17 in reply to:  16 Changed 6 months ago by pidgin

Replying to asn:

> We might need a log file (preferably info/debug level) to get deeper into this. Please make sure you don't reveal any sensitive info (e.g. guard names) through the log file.
>
> Otherwise, do you see any warn logs, or any weird stuff on the info/debug severity that you can point out?

Here is the log from the v3 onion service; a debug/info log will follow soon:

-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:47.000 [warn] No valid circuit build time data out of 1000 times, 3 modes, have_timeout=1, 2332.000000ms
-scrubbed-:42:48.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3125ms and we've abandoned 999 out of 1000 circuits. (on Tor 0.4.0.1-alpha 81f1b89efc94723f)
-scrubbed-:42:48.000 [warn] circuit_build_times_update_alpha(): Bug: Could not determine largest build time (0). Xm is 3125ms and we've abandoned 998 out of 1000 circuits. (on Tor 0.4.0.1-alpha 81f1b89efc94723f)
-scrubbed-:42:50.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
-scrubbed-:42:50.000 [warn] Giving up on launching a rendezvous circuit to [scrubbed] for hidden service [scrubbed]
-scrubbed-:42:52.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60s after 18 timeouts and 1000 buildtimes.
-scrubbed-:45:30.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
-scrubbed-:45:30.000 [warn] Giving up on launching a rendezvous circuit to [scrubbed] for hidden service [scrubbed]
-scrubbed-:46:40.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
-scrubbed-:46:40.000 [warn] Giving up on launching a rendezvous circuit to [scrubbed] for hidden service [scrubbed]
-scrubbed-:47:25.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
-scrubbed-:47:25.000 [warn] Giving up on launching a rendezvous circuit to [scrubbed] for hidden service [scrubbed]
-scrubbed-:47:27.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 60s after 18 timeouts and 131 buildtimes.
-scrubbed-:47:27.000 [notice] Your network connection speed appears to have changed. Resetting timeout to 120s after 18 timeouts and 0 buildtimes.
-scrubbed-:48:47.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
-scrubbed-:48:47.000 [warn] Failed to launch rendezvous circuit to [scrubbed]
-scrubbed-:50:34.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
-scrubbed-:50:34.000 [warn] Giving up on launching a rendezvous circuit to [scrubbed] for hidden service [scrubbed]

a debug/info log follows soon

comment:18 Changed 6 months ago by pidgin

Here is a debug log for a tor process under 100% CPU with a v2 onion service. Some info is scrubbed.
It created about 4 MB of debug log per second.

-scrubbed-:27.000 [debug] relay_send_command_from_edge_(): delivering 14 cell forward.
-scrubbed-:27.000 [debug] relay_send_command_from_edge_(): Sending a RELAY_EARLY cell; 5 remaining.
-scrubbed-:27.000 [debug] relay_encrypt_cell_outbound(): encrypting a layer of the relay cell.
-scrubbed-:27.000 [debug] relay_encrypt_cell_outbound(): encrypting a layer of the relay cell.
-scrubbed-:27.000 [debug] append_cell_to_circuit_queue(): Made a circuit active.
-scrubbed-:27.000 [debug] connection_or_process_cells_from_inbuf(): 11: starting, inbuf_datalen 1028 (0 pending in tls object).
-scrubbed-:27.000 [debug] channel_process_cell(): Processing incoming cell_t 0x7ffcaedf5950 for channel 0x563a770113e0 (global ID 2)
-scrubbed-:27.000 [debug] circuit_get_by_circid_channel_impl(): circuit_get_by_circid_channel_impl() returning circuit 0x563a779c4eb0 for circ_id 2380366636, channel ID 2 (0x563a770113e0)
-scrubbed-:27.000 [debug] circuit_receive_relay_cell(): Sending to origin.
-scrubbed-:27.000 [debug] connection_edge_process_relay_cell(): Now seen 10207 relay cells here (command 15, stream 0).
-scrubbed-:27.000 [debug] connection_edge_process_relay_cell(): Got an extended cell! Yay.
-scrubbed-:27.000 [info] circuit_finish_handshake(): Finished building circuit hop:
-scrubbed-:27.000 [info] internal circ (length 5, last hop $-scrubbed-): $-scrubbed-(open) $scrubbed(open) $scrubbed(open) $Cscrubbed(closed) $-scrubbed-(closed)
-scrubbed-:27.000 [debug] btc_cevent_rcvr(): CIRC gid=1541 evtype=2 reason=0 onehop=0
-scrubbed-:27.000 [debug] circuit_build_times_add_time(): Adding circuit build time 22290
-scrubbed-:27.000 [debug] circuit_send_intermediate_onion_skin(): starting to send subsequent skin.
-scrubbed-:27.000 [info] circuit_send_intermediate_onion_skin(): Sending extend relay cell.
-scrubbed-:27.000 [debug] relay_send_command_from_edge_(): delivering 14 cell forward.
-scrubbed-:27.000 [debug] relay_send_command_from_edge_(): Sending a RELAY_EARLY cell; 5 remaining.
-scrubbed-:27.000 [debug] relay_encrypt_cell_outbound(): encrypting a layer of the relay cell.
-scrubbed-:27.000 [debug] relay_encrypt_cell_outbound(): encrypting a layer of the relay cell.
-scrubbed-:27.000 [debug] relay_encrypt_cell_outbound(): encrypting a layer of the relay cell.
-scrubbed-:27.000 [debug] append_cell_to_circuit_queue(): Made a circuit active.
-scrubbed-:27.000 [debug] connection_or_process_cells_from_inbuf(): 11: starting, inbuf_datalen 514 (0 pending in tls object).
-scrubbed-:27.000 [debug] channel_process_cell(): Processing incoming cell_t 0x7ffcaedf5950 for channel 0x563a770113e0 (global ID 2)
-scrubbed-:27.000 [debug] circuit_get_by_circid_channel_impl(): circuit_get_by_circid_channel_impl() returning circuit 0x563a77524ad0 for circ_id 3378515571, channel ID 2 (0x563a770113e0)
-scrubbed-:27.000 [debug] circuit_receive_relay_cell(): Sending to origin.
-scrubbed-:27.000 [debug] connection_edge_process_relay_cell(): Now seen 10208 relay cells here (command 35, stream 0).
-scrubbed-:27.000 [info] rend_service_receive_introduction(): Received INTRODUCE2 cell for service "uuuuuuuuuuuuuuuuuuu" on circ 3378515571.
-scrubbed-:27.000 [debug] extend_info_from_node(): using YYYYYYYYYYYYYYY for scrubbed
-scrubbed-:27.000 [info] extend_info_from_node(): Including Ed25519 ID for $scrubbedscrubbed
-scrubbed-:27.000 [info] rep_hist_note_used_internal(): New port prediction added. Will continue predictive circ building for 3003 more seconds.
-scrubbed-:27.000 [debug] circuit_find_to_cannibalize(): Hunting for a circ to cannibalize: purpose 17, uptime 0, capacity 1, internal 1
-scrubbed-:27.000 [debug] new_route_len(): Chosen route length 5 (6571 direct and 6571 indirect routers suitable).
-scrubbed-:27.000 [info] onion_pick_cpath_exit(): Using requested exit node '$scrubbedscrubbed'
-scrubbed-:27.000 [debug] onion_extend_cpath(): Path is 0 long; we want 5
-scrubbed-:27.000 [info] select_primary_guard_for_circuit(): Selected primary guard rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrriiiiiiiiiiiiiiiiiii ($-scrubbed-) for circuit.
-scrubbed-:27.000 [debug] extend_info_from_node(): using XXXXXXXXXXXXXXzzzzzzzzzzzzzzzzzzzzzz for rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrriiiiiiiiiiiiiiiiiii
-scrubbed-:27.000 [info] extend_info_from_node(): Including Ed25519 ID for $-scrubbed-~rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrriiiiiiiiiiiiiiiiiii at XXXXXXXXXXXXXX
-scrubbed-:27.000 [debug] onion_extend_cpath(): Chose router $-scrubbed-~rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrriiiiiiiiiiiiiiiiiii at XXXXXXXXXXXXXX for hop #1 (exit is scrubbed)

-scrubbed-:27.000 [debug] onion_extend_cpath(): Path is 1 long; we want 5
-scrubbed-:27.000 [debug] choose_good_middle_server(): Contemplating intermediate hop #2: random choice.
-scrubbed-:27.000 [debug] choose_good_middle_server(): Picking a sticky node (cur_len = 1)
-scrubbed-:27.000 [debug] extend_info_from_node(): using rrrrrrrrrrrrrrrrrrrrrrrrzzzzzzzzzzzzzzzzzzzzzz for araglaucogularis
-scrubbed-:27.000 [info] extend_info_from_node(): Including Ed25519 ID for $scrubbed~scrubbed
-scrubbed-:27.000 [debug] onion_extend_cpath(): Chose router $scrubbed~scrubbed for hop #2 (exit is scrubbed)
-scrubbed-:27.000 [debug] onion_extend_cpath(): Path is 2 long; we want 5
-scrubbed-:27.000 [debug] choose_good_middle_server(): Contemplating intermediate hop #3: random choice.
-scrubbed-:27.000 [debug] choose_good_middle_server(): Picking a sticky node (cur_len = 2)
-scrubbed-:27.000 [debug] extend_info_from_node(): using scrubbed:kkkkkkkkkkkkkkkkkkkkkkkkkk for rrrrrrrrrrrrrrrrrrrrrrrrrrrr
-scrubbed-:27.000 [info] extend_info_from_node(): Including Ed25519 ID for $xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx~rrrrrrrrrrrrrrrrrrrrrrrrrrrr at scrubbed
-scrubbed-:27.000 [debug] onion_extend_cpath(): Chose router $xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx~rrrrrrrrrrrrrrrrrrrrrrrrrrrr at scrubbed for hop #3 (exit is scrubbed)
-scrubbed-:27.000 [debug] onion_extend_cpath(): Path is 3 long; we want 5
-scrubbed-:27.000 [debug] choose_good_middle_server(): Contemplating intermediate hop #4: random choice.
-scrubbed-:27.000 [debug] router_choose_random_node(): We found 5695 running nodes.
-scrubbed-:27.000 [debug] router_choose_random_node(): We removed 0 excludednodes, leaving 5695 nodes.
-scrubbed-:27.000 [debug] router_choose_random_node(): We removed 4 excludedsmartlist, leaving 5691 nodes.
-scrubbed-:27.000 [debug] compute_weighted_bandwidths(): Generated weighted bandwidths for rule weight as middle node based on weights Wg=0.400800 Wm=1.000000 We=0.000000 Wd=0.000000 with total bw 25389038256.000000
-scrubbed-:27.000 [debug] extend_info_from_node(): using fffffffffffffffffffffffffffffff:kkkkkkkkkkkkkkkkkkkkkkkkkk for wwwwwwwwwwwwwwwwwwwwww
-scrubbed-:27.000 [info] extend_info_from_node(): Including Ed25519 ID for $ggggggggggggggggggggggggggggggggggg~wwwwwwwwwwwwwwwwwwwwww at fffffffffffffffffffffffffffffff
-scrubbed-:27.000 [debug] onion_extend_cpath(): Chose router $ggggggggggggggggggggggggggggggggggg~wwwwwwwwwwwwwwwwwwwwww at fffffffffffffffffffffffffffffff for hop #4 (exit is scrubbed)
-scrubbed-:27.000 [debug] onion_extend_cpath(): Path is 4 long; we want 5
-scrubbed-:27.000 [debug] onion_extend_cpath(): Chose router $scrubbedscrubbed for hop #5 (exit is scrubbed)
-scrubbed-:27.000 [debug] onion_extend_cpath(): Path is complete: 5 steps long
-scrubbed-:27.000 [debug] btc_cevent_rcvr(): CIRC gid=2543 evtype=0 reason=0 onehop=0
-scrubbed-:27.000 [debug] circuit_handle_first_hop(): Looking for firsthop 'XXXXXXXXXXXXXXzzzzzzzzzzzzzzzzzzzzzz'
-scrubbed-:27.000 [debug] btc_event_rcvr(): CIRC gid=2543 chan=2 onehop=0
-scrubbed-:27.000 [debug] circuit_handle_first_hop(): Conn open. Delivering first onion skin.
-scrubbed-:27.000 [debug] circuit_send_first_onion_skin(): First skin; sending create cell.
-scrubbed-:27.000 [debug] circuit_get_by_circid_channel_impl(): circuit_get_by_circid_channel_impl() found nothing for circ_id 3035601791, channel ID 2 (0x563a770113e0)
-scrubbed-:27.000 [debug] circuit_deliver_create_cell(): Chosen circID 3035601791.
-scrubbed-:27.000 [debug] circuitmux_attach_circuit(): Attaching circuit 3035601791 on channel 2 to cmux 0x563a770079d0
-scrubbed-:27.000 [debug] append_cell_to_circuit_queue(): Made a circuit active.
-scrubbed-:27.000 [debug] btc_state_rcvr(): CIRC gid=2543 state=0 onehop=0
-scrubbed-:27.000 [info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$-scrubbed-~rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrriiiiiiiiiiiiiiiiiii at XXXXXXXXXXXXXX'
-scrubbed-:27.000 [info] rend_service_receive_introduction(): Accepted intro; launching circuit to [scrubbed] (cookie 2E7D742C) for service uuuuuuuuuuuuuuuuuuu.
-scrubbed-:27.000 [debug] connection_or_process_cells_from_inbuf(): 11: starting, inbuf_datalen 0 (0 pending in tls object).
-scrubbed-:27.000 [debug] conn_read_callback(): socket 10 wants to read.
-scrubbed-:27.000 [debug] connection_buf_read_from_socket(): 10: starting, inbuf_datalen 0 (0 pending in tls object). at_most 16448.
-scrubbed-:27.000 [debug] connection_buf_read_from_socket(): After TLS read of 1028: 1057 read, 0 written
-scrubbed-:27.000 [debug] connection_or_process_cells_from_inbuf(): 10: starting, inbuf_datalen 1028 (0 pending in tls object).
-scrubbed-:27.000 [debug] channel_process_cell(): Processing incoming cell_t 0x7ffcaedf5950 for channel 0x563a76ffd020 (global ID 1)
-scrubbed-:27.000 [debug] circuit_get_by_circid_channel_impl(): circuit_get_by_circid_channel_impl() returning circuit 0x563a773f5cd0 for circ_id 3703046038, channel ID 1 (0x563a76ffd020)
-scrubbed-:27.000 [debug] circuit_receive_relay_cell(): Sending to origin.
-scrubbed-:27.000 [debug] connection_edge_process_relay_cell(): Now seen 10209 relay cells here (command 15, stream 0).
-scrubbed-:27.000 [debug] connection_edge_process_relay_cell(): Got an extended cell! Yay.
-scrubbed-:27.000 [info] circuit_finish_handshake(): Finished building circuit hop:
-scrubbed-:27.000 [info] internal circ (length 5, last hop wwwwwwwwwwwwwwwwwwwwww): $hhhhhhhhhhhhhhhhhhhhhhhhh(open) eeeeeeeeeeeeeeeeeeeeeeeee(open) nnnnnnnnnnnnnnnnnnnnnnnnnnnnnn(open) vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv(open) lllllllllllllllllllllllllllllllllllllllll(open)
-scrubbed-:27.000 [debug] btc_cevent_rcvr(): CIRC gid=1023 evtype=2 reason=0 onehop=0
-scrubbed-:27.000 [info] entry_guards_note_guard_success(): Recorded success for primary confirmed guard ($hhhhhhhhhhhhhhhhhhhhhhhhh)
-scrubbed-:27.000 [debug] btc_state_rcvr(): CIRC gid=1023 state=4 onehop=0
-scrubbed-:27.000 [info] circuit_build_no_more_hops(): circuit built!
-scrubbed-:27.000 [debug] btc_cevent_rcvr(): CIRC gid=1023 evtype=1 reason=0 onehop=0
-scrubbed-:27.000 [info] rend_service_rendezvous_has_opened(): Done building circuit 3703046038 to rendezvous with cookie A9ADDEC1 for service uuuuuuuuuuuuuuuuuuu96EB1FD02A5(open) nnnnnnnnnnnnnnnnnnnnnnnnnnnnnn(open) vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv(open) lllllllllllllllllllllllllllllllllllllllll(open)
-scrubbed-:27.000 [debug] btc_cevent_rcvr(): CIRC gid=1023 evtype=2 reason=0 onehop=0
-scrubbed-:27.000 [info] entry_guards_note_guard_success(): Recorded success for primary confirmed guard ($hhhhhhhhhhhhhhhhhhhhhhhhh)
-scrubbed-:27.000 [debug] btc_state_rcvr(): CIRC gid=1023 state=4 onehop=0
-scrubbed-:27.000 [info] circuit_build_no_more_hops(): circuit built!
-scrubbed-:27.000 [debug] btc_cevent_rcvr(): CIRC gid=1023 evtype=1 reason=0 onehop=0
-scrubbed-:27.000 [info] rend_service_rendezvous_has_opened(): Done building circuit 3703046038 to rendezvous with cookie A9ADDEC1 for service uuuuuuuuuuuuuuuuuuu
-scrubbed-:27.000 [info] internal circ (length 5): $hhhhhhhhhhhhhhhhhhhhhhhhh(open) eeeeeeeeeeeeeeeeeeeeeeeee(open) nnnnnnnnnnnnnnnnnnnnnnnnnnnnnn(open) vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv(open) lllllllllllllllllllllllllllllllllllllllll(open)

Last edited 6 months ago by pidgin

comment:19 in reply to:  13 Changed 6 months ago by pidgin

Replying to teor:

> This looks like a duplicate of #25461.
>
> We'll need to fix #25461, or confirm that you're not using an affected version, before we can isolate this bug.
>
> How do you know there is an attacker, rather than just lots of clients using your service?

The attacker contacted us for extortion. He turned the DDoS off and on, and informed us of the times.
The CPU load matches those times: during an attack, every Tor instance is at 100% CPU. When no DDoS is taking place, all Tor processes are at 10% at most (far less on average).

comment:20 Changed 6 months ago by asn

Hm, those logs are OK but not useful enough unfortunately.

The only interesting thing is that your circuit build times are big (like 22.3 seconds for a single circuit):
circuit_build_times_add_time(): Adding circuit build time 22290

Not much I can get from it. You also seem to have borked the logs in the end since rend_service_rendezvous_has_opened() appears twice with duplicated stuff around it.

I think ideally we would need a log over a greater period of time (like 3-5 mins) to see what the attacker does over time. Perhaps the way to do it is to only gather info logs (I don't think we need debugs), and compress it before attaching it here. Otherwise, send it to us in a personal email.
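
To check whether build times like the 22290 ms value quoted above are typical over a longer log, something like the following could be used. This is only a sketch: it assumes the debug message keeps the exact wording shown in this ticket and that the value is in milliseconds. The helper names `build_times_ms` and `summarize` are made up for illustration.

```python
import re
import statistics

# Matches the debug line quoted in this ticket; other Tor versions may word it differently.
BUILD_TIME = re.compile(
    r"circuit_build_times_add_time\(\): Adding circuit build time (\d+)"
)


def build_times_ms(lines):
    """Return every recorded circuit build time (milliseconds) found in the lines."""
    times = []
    for line in lines:
        match = BUILD_TIME.search(line)
        if match:
            times.append(int(match.group(1)))
    return times


def summarize(times_ms):
    """Return (count, median_seconds, max_seconds), or None if the list is empty."""
    if not times_ms:
        return None
    return (len(times_ms), statistics.median(times_ms) / 1000, max(times_ms) / 1000)
```

Passing an open log file to `build_times_ms` works, since iterating over a file yields its lines.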

comment:21 Changed 6 months ago by teor

Parent ID: #25461

Ok, so this is not #25461.

Here's another interesting log:

-scrubbed-:47:25.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.

This looks like an instance of #28962, where Tor fails guards based on the absolute number of failures of each guard, rather than checking if the guard is (much) worse than the average number of failures.

We might end up fixing #28862 as part of this ticket, or as part of our IPv6 work.
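
To illustrate the difference between the two policies described above, here is a purely illustrative sketch. It is not Tor's actual guard code; the function names, the fixed limit, and the factor are all hypothetical. It only shows why an absolute failure count can mark every guard as failed during an attack, while a comparison against the average does not.

```python
from statistics import mean


def failed_absolute(failures, limit):
    """Guards whose failure count exceeds a fixed limit (hypothetical policy)."""
    return {guard for guard, count in failures.items() if count > limit}


def failed_relative(failures, factor):
    """Guards whose failure count is more than `factor` times the average."""
    if not failures:
        return set()
    average = mean(failures.values())
    return {guard for guard, count in failures.items() if count > factor * average}


# Under attack, every guard sees many failed circuits.
under_attack = {"A": 120, "B": 110, "C": 130}
print(failed_absolute(under_attack, limit=100))  # all three guards are marked failed
print(failed_relative(under_attack, factor=2))   # no guard stands out from the average
```

With the absolute rule there is no usable guard left for hop #1, which would be consistent with the warning quoted above.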

comment:22 Changed 6 months ago by teor

The v3 service doesn't have the guard issue, it's getting much further along. Have you tried running it on a separate machine? (Or by itself without v2 running on the same machine?)

comment:23 Changed 6 months ago by teor

Summary: Denial of service on v2 onion service → Denial of service on v2 and v3 onion service

comment:24 in reply to:  21 Changed 6 months ago by pidgin

Replying to teor:

> Ok, so this is not #25461.
>
> Here's another interesting log:
>
> -scrubbed-:47:25.000 [warn] Failed to find node for hop #1 of our path. Discarding this circuit.
>
> This looks like an instance of #28962, where Tor fails guards based on the absolute number of failures of each guard, rather than checking if the guard is (much) worse than the average number of failures.
>
> We might end up fixing #28862 as part of this ticket, or as part of our IPv6 work.

Hi Teor, what is the email address I can send it to? Also, do you have a public PGP key I can import, so I can send it encrypted?

Version 1, edited 6 months ago by pidgin

comment:25 Changed 6 months ago by pidgin

Any updates?

comment:26 in reply to:  25 ; Changed 6 months ago by asn

Replying to pidgin:

> Any updates?

You could send logs to my email address/PGP: https://www.torproject.org/about/corepeople.html.en#asn
Teor is also there: https://www.torproject.org/about/corepeople.html.en#teor

Keep in mind it's the weekend and we are generally all pretty busy! :/

comment:27 in reply to:  26 Changed 6 months ago by pidgin

Replying to asn:

> Replying to pidgin:
>
> > Any updates?
>
> You could send logs to my email address/PGP: https://www.torproject.org/about/corepeople.html.en#asn
> Teor is also there: https://www.torproject.org/about/corepeople.html.en#teor
>
> Keep in mind it's the weekend and we are generally all pretty busy! :/

Understandable, thank you for the information.
Sent to both email addresses.

Last edited 6 months ago by pidgin

comment:28 in reply to:  22 ; Changed 6 months ago by teor

Hi pidgin,

We need to know more about your setup to help you:

Replying to teor:

> The v3 service doesn't have the guard issue, it's getting much further along. Have you tried running it on a separate machine? (Or by itself without v2 running on the same machine?)

comment:29 Changed 6 months ago by asn

Hey Pidgin,

I looked at the logs but nothing jumped out at me as super weird. There is obviously lots of activity on the HS, but doesn't seem like something that would 100% CPU your Tor (you are receiving a rendezvous request every two seconds). There is probably something else going on that I couldn't see from that first look.

Another approach would be to try to profile Tor to see which functions are causing the CPU increase: https://gitweb.torproject.org/tor.git/tree/doc/HACKING/HelpfulTools.md#n183
It's likely that this might also not bear any fruit, but it might be worth trying...

Also, take extra care in sanitizing guard nicknames when sending logs (grep "for guard").

Finally, I'm pretty overwhelmed with stuff for March, so I will probably not have much time to look at this until April.
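
The request rate mentioned above (roughly one rendezvous request every two seconds) can be measured from an info-level log. The sketch below assumes Tor's default log timestamp at the start of each line (for example `Mar 01 12:34:56.000`) and the v2 message "Received INTRODUCE2 cell" that appears in the log from comment 18. A v3 service may log a different message, and the function name `introductions_per_minute` is made up for illustration.

```python
import re
from collections import Counter

# Captures "Mon DD HH:MM" from the timestamp and requires the v2 INTRODUCE2 message.
INTRO = re.compile(
    r"^(\w{3} \d{2} \d{2}:\d{2}):\d{2}\.\d{3} \[info\] .*Received INTRODUCE2 cell"
)


def introductions_per_minute(lines):
    """Count INTRODUCE2 log lines, keyed by the minute in which they were logged."""
    counts = Counter()
    for line in lines:
        match = INTRO.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A steady count of about 30 per minute would match what is described above; a count in the thousands would point to a flood of introductions.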

comment:30 in reply to:  28 ; Changed 6 months ago by pidgin

Replying to teor:

> Hi pidgin,
>
> We need to know more about your setup to help you:
>
> Replying to teor:
>
> > The v3 service doesn't have the guard issue, it's getting much further along. Have you tried running it on a separate machine? (Or by itself without v2 running on the same machine?)

there is enough cpu and bandwidth

comment:31 in reply to:  30 Changed 6 months ago by teor

Replying to pidgin:

> Replying to teor:
>
> > Hi pidgin,
> >
> > We need to know more about your setup to help you:
> >
> > Replying to teor:
> >
> > > The v3 service doesn't have the guard issue, it's getting much further along. Have you tried running it on a separate machine? (Or by itself without v2 running on the same machine?)
>
> there is enough cpu and bandwidth

Tor also uses many other resources that can become exhausted, like sockets, memory, and various kernel data structures.

Can you please test the v3 service on a machine by itself?
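
As a rough way to check one of those resources, the sketch below compares a process's open file descriptors (which include its sockets) against its "Max open files" soft limit. It assumes a Linux `/proc` filesystem and enough permissions to read the tor process's entries. The function name `fd_usage` and the `proc_root` parameter are illustrative choices, not an existing tool.

```python
import os


def fd_usage(pid="self", proc_root="/proc"):
    """Return (open_fd_count, soft_limit) for a process, read from Linux /proc.

    soft_limit is returned as the string found in the limits file
    (it can be "unlimited"), or None if the line is missing.
    """
    open_fds = len(os.listdir(f"{proc_root}/{pid}/fd"))
    soft_limit = None
    with open(f"{proc_root}/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", soft limit, hard limit, units
                soft_limit = line.split()[3]
                break
    return open_fds, soft_limit
```

Calling `fd_usage(pid)` with the tor process ID during an attack shows how close the process is to its limit; memory and kernel data structures need separate checks.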

comment:32 Changed 6 months ago by pidgin

The v3 onion tor process was at 100% CPU on another machine.

last notice log messages:

scrubbed:39:57.000 [notice] Extremely large value for circuit build timeout: 128s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
scrubbed:39:57.000 [notice] Extremely large value for circuit build timeout: 128s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
scrubbed:39:57.000 [notice] Extremely large value for circuit build timeout: 128s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
scrubbed:39:57.000 [notice] Extremely large value for circuit build timeout: 128s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
scrubbed:39:57.000 [notice] Extremely large value for circuit build timeout: 128s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
scrubbed:39:57.000 [notice] Extremely large value for circuit build timeout: 128s. Assuming clock jump. Purpose 14 (Measuring circuit timeout)
scrubbed:57:39.000 [warn] Unknown introduction point auth key on circuit 4162469458 for service [scrubbed]
scrubbed:57:39.000 [warn] circuit_mark_for_close_(): Bug: Duplicate call to circuit_mark_for_close at src/feature/hs/hs_service.c:3227 (first at src/feature/hs/hs_service.c:2446) (on Tor 0.4.0.1-alpha 81f1b89efc94723f)

Since the log message "57:39.000 [warn] circuit_mark_for_close_():",
the CPU is at 0% and the service is not accessible.

comment:33 Changed 6 months ago by 993872315

If the Tor developers need to analyze the attack traffic directly, I propose the following:

  1. User pidgin sets up an onionbalance server for the main onion that is being attacked.
  2. A Tor developer provides an onion URL to pidgin, so that pidgin can plug it into onionbalance as a mirror only.
  3. The developer's mirror onion then receives the DoS attack, and the developer can analyze the attack exactly.

comment:34 Changed 6 months ago by asn

Sponsor: Sponsor27-can

comment:35 Changed 6 months ago by asn

Summary: Denial of service on v2 and v3 onion service → 2019 Q1: Denial of service on v2 and v3 onion service

Closed #29919 as a duplicate for this one. More info over there.

comment:36 in reply to:  35 ; Changed 6 months ago by HelpDOS

Replying to asn:

> Closed #29919 as a duplicate for this one. More info over there.

Hi asn,

Understandable why you closed my ticket, at a point of desperation and just hoping someone will take real interest in looking into this. Which is why I am able to offer access to a server that is currently being attacked. I believe I saw a chat log of you first discussing complex mode in 2015 for OnionBalance. Do you have any links for how to enable it/configure it? I am going to try it out to see if it is a resolution for this, with the theory of introduction points being attacked.

Thank you.

comment:37 in reply to:  36 ; Changed 6 months ago by asn

Replying to HelpDOS:

> Replying to asn:
>
> > Closed #29919 as a duplicate for this one. More info over there.
>
> Hi asn,
>
> Understandable why you closed my ticket, at a point of desperation and just hoping someone will take real interest in looking into this. Which is why I am able to offer access to a server that is currently being attacked. I believe I saw a chat log of you first discussing complex mode in 2015 for OnionBalance. Do you have any links for how to enable it/configure it? I am going to try it out to see if it is a resolution for this, with the theory of introduction points being attacked.
>
> Thank you.

Hey, I just remembered that complex mode was never implemented for onionbalance, because it was harder to implement and we thought there was no real use for it.

I'm not currently interested (or have the time) to get access to a server that is under attack.

I think the most useful thing right now would be to have more logs that display the attack. I want debug or info logs that last for 1-2 hours of the attack and display the whole Tor lifetime (from startup to shutdown). Please sanitize them correctly (make sure that guard names and onion names are not visible).

Same for vanguard logs on debug or info if you use vanguards.

comment:38 in reply to:  37 ; Changed 6 months ago by HelpDOS

Replying to asn:

Replying to HelpDOS:

Replying to asn:

Closed #29919 as a duplicate for this one. More info over there.

Hi asn,

Understandable why you closed my ticket, at a point of desperation and just hoping someone will take real interest in looking into this. Which is why I am able to offer access to a server that is currently being attacked. I believe I saw a chat log of you first discussing complex mode in 2015 for OnionBalance. Do you have any links for how to enable it/configure it? I am going to try it out to see if it is a resolution for this, with the theory of introduction points being attacked.

Thank you.

Hey, I just remembered that complex mode was never implemented for onionbalance, because it was harder to implement and we thought there was no real use for it.

I'm not currently interested in (nor do I have the time for) getting access to a server that is under attack.

I think the most useful thing right now would be to have more logs that display the attack. I want debug or info logs that last for 1-2 hours of the attack and display the whole Tor lifetime (from startup to shutdown). Please sanitize them correctly (make sure that guard names and onion names are not visible).

Same for vanguard logs on debug or info if you use vanguards.

I will provide you with any logs I can later today. Could you please send a full list of anything that could help in debugging just to make sure you have everything relevant? Thank you

comment:39 in reply to:  38 Changed 6 months ago by asn

Replying to HelpDOS:

Replying to asn:

Replying to HelpDOS:

Replying to asn:

Closed #29919 as a duplicate for this one. More info over there.

Hi asn,

Understandable why you closed my ticket, at a point of desperation and just hoping someone will take real interest in looking into this. Which is why I am able to offer access to a server that is currently being attacked. I believe I saw a chat log of you first discussing complex mode in 2015 for OnionBalance. Do you have any links for how to enable it/configure it? I am going to try it out to see if it is a resolution for this, with the theory of introduction points being attacked.

Thank you.

Hey, I just remembered that complex mode was never implemented for onionbalance, because it was harder to implement and we thought there was no real use for it.

I'm not currently interested in (nor do I have the time for) getting access to a server that is under attack.

I think the most useful thing right now would be to have more logs that display the attack. I want debug or info logs that last for 1-2 hours of the attack and display the whole Tor lifetime (from startup to shutdown). Please sanitize them correctly (make sure that guard names and onion names are not visible).

Same for vanguard logs on debug or info if you use vanguards.

I will provide you with any logs I can later today. Could you please send a full list of anything that could help in debugging just to make sure you have everything relevant? Thank you

Hm. It would be great if we could have all debug logs from Tor startup to Tor shutdown. Please scrub the names of your primary guards, your onion address, and anything else that might seem sensitive, but please try not to destroy the accuracy of the logs (for example by double-pasting or removing surrounding lines).

Another thing that might be helpful would be to try with a blank state file so that Tor discards any previous circuit timeouts, performance measurements, etc. (You can find the state file in your data directory. Please don't delete it; just back it up somewhere else so that you can restore it afterwards.)

Also, it would be better to not use vanguards while you are collecting these logs because it's hard to keep track of what they are doing while reading the logs.

Another useful thing would be to try to profile Tor to see which functions are causing the CPU increase: https://gitweb.torproject.org/tor.git/tree/doc/HACKING/HelpfulTools.md#n183

Finally, please let us know of any custom configurations you've done on your onion service.

Cheers.

Last edited 5 months ago by asn
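For the log scrubbing requested in comment:39, here is a minimal sketch of one way to redact identifiers without changing the shape of the log. It is not a Tor tool. The regular expressions, the `make_scrubber` helper, and the `[relay-N]` / `[onion-N]` placeholder names are all assumptions made for illustration. Each distinct fingerprint or onion address maps to the same numbered placeholder every time, so lines can still be correlated, and no lines are added or removed.

```python
import re

# Assumption: relay fingerprints appear as 40 hex characters, optionally
# prefixed with "$" and followed by "~Nickname" or "=Nickname".
FPR_RE = re.compile(
    r"\$?\b(?P<fp>[0-9A-Fa-f]{40})\b(?:[~=][A-Za-z0-9]{1,19})?"
)
# Assumption: v3 onion addresses are 56 base32 characters and v2 addresses
# are 16, optionally followed by ".onion". Any ordinary 16-letter lowercase
# word is redacted too, which errs on the side of caution.
ONION_RE = re.compile(r"\b(?P<addr>[a-z2-7]{56}|[a-z2-7]{16})(?:\.onion)?\b")


def make_scrubber():
    """Return a function that redacts one log line at a time.

    The returned function remembers what it has seen, so the same
    identifier always maps to the same placeholder.
    """
    seen = {}

    def placeholder(kind, value):
        key = (kind, value)
        if key not in seen:
            count = sum(1 for k in seen if k[0] == kind) + 1
            seen[key] = f"[{kind}-{count}]"
        return seen[key]

    def scrub_line(line):
        line = FPR_RE.sub(
            lambda m: placeholder("relay", m.group("fp").upper()), line
        )
        line = ONION_RE.sub(
            lambda m: placeholder("onion", m.group("addr")), line
        )
        return line

    return scrub_line


def scrub_lines(lines):
    """Scrub an iterable of log lines, preserving order and line count."""
    scrub = make_scrubber()
    return [scrub(line) for line in lines]
```

This sketch does not catch guard nicknames that appear without a fingerprint, IP addresses, or anything else that could identify the service, so the output still needs a manual read-through before it is posted.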

comment:41 Changed 6 months ago by asn

Points: 10

Any news on the logs here?

comment:42 Changed 6 months ago by asn

Keywords: tor-hs tor-dos added

comment:43 Changed 6 months ago by nickm

Milestone: Tor: 0.4.1.x-final
Status: accepted → needs_information

Possible for 0.4.1 if we get the correct insights here, though it isn't guaranteed :/

comment:44 Changed 6 months ago by pidgin

The problem is still not solved; it is still the same error.
I have provided everything I could to you guys, and I have no clue what else to do.

comment:45 in reply to:  44 Changed 6 months ago by asn

Replying to pidgin:

The problem is still not solved; it is still the same error.
I have provided everything I could to you guys, and I have no clue what else to do.

Hello. No one has provided the logs asked in comment:39 yet.

comment:46 in reply to:  44 Changed 5 months ago by HelpDOS

Replying to pidgin:

The problem is still not solved; it is still the same error.
I have provided everything I could to you guys, and I have no clue what else to do.

I found a complete solution but it needs more work to be reliable enough. Get in touch with me via your onion.

I will also provide the solution here, but it is a novel method of identifying offending connections so that they can then be dropped, and it would likely not be implemented in core tor.

comment:47 in reply to:  44 Changed 5 months ago by HelpDOS

Replying to pidgin:

The problem is still not solved; it is still the same error.
I have provided everything I could to you guys, and I have no clue what else to do.

Via "librarytask"

comment:48 Changed 5 months ago by asn

HelpDOS, I don't know what 'librarytask' is. Also, feel free to send us any additional information over email. Please use asn@torproject.org or dgoulet@torproject.org. We are the main active onion service developers right now.

Last edited 5 months ago by asn

comment:49 Changed 5 months ago by asn

Sponsor: Sponsor27-can → Sponsor27-must

comment:50 Changed 5 months ago by asn

Parent ID: #29999

comment:51 Changed 5 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

Add keyword to tickets in network team's roadmap.

comment:52 Changed 5 months ago by dgoulet

Status update:

asn and I have set up an environment to reproduce this INTRODUCE2 DDoS, and we succeeded in reproducing the maximum CPU utilization on the service. However, we haven't yet figured out how the service can still receive INTRODUCE2 cells 30+ minutes after the circuit has been closed (found from the logs given in private).

Ticket #30291 has been opened for one cause of the high CPU usage. nickm has already worked on improvements, so we expect these upstream soon.

We'll be working on the DoS master ticket #29999, especially #15516 and #26294, in the coming weeks. Improvements will be coming to master incrementally, so expect more updates about the situation as we progress in this work.

comment:53 Changed 5 months ago by dgoulet

comment:54 in reply to:  52 Changed 4 months ago by HelpDOS

Replying to dgoulet:

Status update:

asn and I have set up an environment to reproduce this INTRODUCE2 DDoS, and we succeeded in reproducing the maximum CPU utilization on the service. However, we haven't yet figured out how the service can still receive INTRODUCE2 cells 30+ minutes after the circuit has been closed (found from the logs given in private).

Ticket #30291 has been opened for one cause of the high CPU usage. nickm has already worked on improvements, so we expect these upstream soon.

We'll be working on the DoS master ticket #29999, especially #15516 and #26294, in the coming weeks. Improvements will be coming to master incrementally, so expect more updates about the situation as we progress in this work.

Hi, any further updates? Great to see progress is being made, it is really appreciated!

comment:55 Changed 4 months ago by nickm

Keywords: security added

comment:56 Changed 4 months ago by nickm

Keywords: 041-longterm added

Marking tickets that I think are valuable but which are likely to need more work in 0.4.2.

comment:57 Changed 4 months ago by nickm

Closed #30620 as a duplicate of this, but possibly a useful one: it has debug logs.

comment:58 Changed 4 months ago by pidgin

Any updates on this problem ??

comment:59 in reply to:  58 Changed 4 months ago by dgoulet

Replying to pidgin:

Any updates on this problem ??

Unfortunately, there is not much that could stop this problem at once. To summarize:

  1. We've identified the cause of the DoS and possible defense vectors.
  2. This investigation also turned up a series of bugs, including one whose fix reduces CPU load on path selection (#30291).
  3. We decided to focus on one important defense, which will be done through #15516. It primarily focuses on defending the network by absorbing the huge number of introduction requests at the intro point so that the service doesn't get bombarded. It should help with availability (the service will not be overloaded) but not with reachability (the intro point could drop legitimate requests).

Our primary goal for now is to protect the network and to avoid, as much as possible, putting too much pressure on it, as happened during the last massive HS DoS in the early months of 2018.
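The trade-off in point 3 can be illustrated with a generic token-bucket rate limiter. This is only a sketch of the general technique, not the design or code from #15516. The `TokenBucket` class and the `rate` and `burst` parameters are hypothetical. Each incoming introduction request spends one token, tokens refill at a fixed rate, and requests that arrive when the bucket is empty are dropped whether they come from an attacker or from a legitimate client.

```python
import time


class TokenBucket:
    """Generic token bucket: allow at most `burst` requests at once and
    `rate` requests per second on average. All names are illustrative."""

    def __init__(self, rate, burst, now=None):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum number of stored tokens
        self.tokens = float(burst)   # start with a full bucket
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if a request may pass, False if it must be dropped.

        `now` can be supplied explicitly to make the behavior testable.
        """
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill according to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Usage: a burst of 3 is absorbed, and the 4th request in the same
# instant is dropped regardless of who sent it.
bucket = TokenBucket(rate=2, burst=3, now=0.0)
decisions = [bucket.allow(now=0.0) for _ in range(4)]
```

A limiter of this kind caps the load that reaches the service, which is the availability gain. It cannot tell clients apart, which is why reachability can still suffer during an attack.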

comment:60 Changed 4 months ago by nickm

Keywords: 041-deferred-20190530 added

Marking these tickets as deferred from 041.

comment:61 Changed 4 months ago by nickm

Milestone: Tor: 0.4.1.x-final → Tor: 0.4.2.x-final

comment:63 in reply to:  58 Changed 3 months ago by rckthe

Replying to pidgin:

Any updates on this problem ??

Please confirm sam-*-*- discussed by WC is official?

comment:64 Changed 3 months ago by pidgin

it's a beautiful sunny day today.

comment:65 in reply to:  62 Changed 8 weeks ago by HelpDOS

Replying to asn:

A plausible plan forward: https://lists.torproject.org/pipermail/tor-dev/2019-May/013849.html

Are there any further developments or recent discussions you could link to? My hidden service has been unavailable due to this since February. I'm glad a resolution is being worked on, and some of the CPU fixes helped a little, but I'm out of the loop as to where you are with the rate limiting and PoW. I'm more interested in the PoW, since I don't think the rate limiting will help with availability at all.

comment:66 Changed 4 days ago by nickm

Keywords: 042-deferred-20190918 added
Milestone: Tor: 0.4.2.x-final → Tor: 0.4.3.x-final

Defer numerous 0.4.2 tickets to 0.4.3.
