Opened 8 months ago

Closed 5 weeks ago

#28970 closed defect (fixed)

tor_bug_occurred_(): Bug: ../src/or/hs_client.c:624: setup_intro_circ_auth_key: Non-fatal assertion

Reported by: torcrash Owned by: dgoulet
Priority: High Milestone: Tor: 0.4.2.x-final
Component: Core Tor/Tor Version: Tor: 0.3.4.9
Severity: Critical Keywords: tor-client, tor-hs, postfreeze-ok, 040-unreached-must, network-team-roadmap-august, regression?, 041-unreached-must
Cc: asn Actual Points:
Parent ID: #29995 Points: 5
Reviewer: Sponsor: Sponsor27-must

Description

If this is duplicate I'm sorry, but I found nothing similar in old tickets. Sandbox is 1 in torrc. Set priority and severity of this ticket to appropriate values if it is not so critical.

Jan 01 05:31:32.000 [notice] Received reload signal (hup). Reloading config and resetting internal state.
Jan 01 05:31:32.000 [notice] Read configuration file "/usr/share/tor/tor-service-defaults-torrc".
Jan 01 05:31:32.000 [notice] Read configuration file "/etc/tor/torrc".
Jan 01 05:31:32.000 [notice] Tor 0.3.4.9 (git-074ca2e0054fded1) opening log file.
Jan 01 05:31:33.000 [warn] tor_bug_occurred_(): Bug: ../src/or/hs_client.c:624: setup_intro_circ_auth_key: Non-fatal assertion !(desc == NULL) failed. (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: Non-fatal assertion !(desc == NULL) failed in setup_intro_circ_auth_key at ../src/or/hs_client.c:624. Stack trace: (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x56345cea26e2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0xb7) [0x56345cebd587] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(hs_client_circuit_has_opened+0x2ca) [0x56345ce8027a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(circuit_send_next_onion_skin+0x372) [0x56345cdfd192] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x78fc3) [0x56345cd92fc3] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(circuit_receive_relay_cell+0x311) [0x56345cd95211] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(command_process_cell+0x280) [0x56345ce126f0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(channel_tls_handle_cell+0x273) [0x56345cdf4493] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x11de1c) [0x56345ce37e1c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_handle_read+0x892) [0x56345ce2e822] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x53a61) [0x56345cd6da61] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7f2c7be623dc] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x25f) [0x56345cd6fe3f] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_run_main+0x1165) [0x56345cd72315] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_main+0x3a) [0x56345cd6a32a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(main+0x19) [0x56345cd6a099] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c7a7e6b45] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x500e9) [0x56345cd6a0e9] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] tor_bug_occurred_(): Bug: ../src/or/hs_client.c:443: intro_circ_is_ok: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed. (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed in intro_circ_is_ok at ../src/or/hs_client.c:443. Stack trace: (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x56345cea26e2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0xb7) [0x56345cebd587] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(hs_client_send_introduce1+0x352) [0x56345ce7fda2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_ap_handshake_attach_circuit+0x33c) [0x56345ce1126c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_ap_attach_pending+0x1a0) [0x56345ce31cb0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(hs_client_receive_rendezvous_acked+0x79) [0x56345ce803c9] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(rend_process_relay_cell+0x28b) [0x56345cd9c2cb] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x791f7) [0x56345cd931f7] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(circuit_receive_relay_cell+0x311) [0x56345cd95211] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(command_process_cell+0x280) [0x56345ce126f0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(channel_tls_handle_cell+0x273) [0x56345cdf4493] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x11de1c) [0x56345ce37e1c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_handle_read+0x892) [0x56345ce2e822] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x53a61) [0x56345cd6da61] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7f2c7be623dc] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x25f) [0x56345cd6fe3f] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_run_main+0x1165) [0x56345cd72315] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_main+0x3a) [0x56345cd6a32a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(main+0x19) [0x56345cd6a099] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c7a7e6b45] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x500e9) [0x56345cd6a0e9] (on Tor 0.3.4.9 )

Tor on Linux. Is it related to #27471 or #19926?

Child Tickets

TicketStatusOwnerSummaryComponent
#30921closeddgouleths-v3: Close intro circuits when cleaning up the client descriptor cacheCore Tor/Tor

Change History (27)

comment:1 Changed 8 months ago by nickm

Milestone: Tor: 0.4.0.x-final
Priority: ImmediateMedium

comment:2 Changed 7 months ago by nickm

Keywords: postfreeze-ok added

Mark some tickets as postfreeze-ok, to indicate that I think they are okay to accept in 0.4.0 post-freeze. Does not indicate that they are all necessary to do postfreeze.

comment:3 Changed 6 months ago by nickm

Keywords: 040-must added

Marking tickets as 040-must based on triage with dgoulet.

comment:4 Changed 6 months ago by nickm

Priority: MediumHigh

comment:5 Changed 5 months ago by teor

Owner: set to asn
Status: newassigned

I think asn is the best person to do this ticket.

comment:6 Changed 5 months ago by asn

Thanks for the report.

The second non-fatal assert seems reasonable given the lack of error-checking of setup_intro_circ_auth_key(), but that's not the actual bug.

The actual bug is the lack of descriptor when setting up an intro circuit. Where did the descriptor go between launching the intro circuit and the intro circ opening?

This could happen naturally given a very rare race condition: If we fetch a new consensus before the intro circuit opens (but after launching it) and we call cache_clean_v3_as_client() and the descriptor was old, perhaps it expires, gets cleaned and causes the bug.

But I don't believe in very rare race conditions, so I'd take the bug explanation more easily. However, to debug such a bug we would need info logs. Can someone reproduce this or submit info logs for this?

Thank you very much! :)

comment:7 Changed 5 months ago by asn

Status: assignedneeds_information

comment:8 Changed 5 months ago by asn

Sponsor: Sponsor27-can

comment:9 Changed 5 months ago by asn

Milestone: Tor: 0.4.0.x-finalTor: 0.4.1.x-final

It's hard to debug this issue and I can't think of a useful log_warn to add when the bug triggers so that we can detect this better in the future. Info logs would help us figure it out, but we might want to look deeper to improve logging.

I have assigned this to Sponsor27 and I'm delaying it to 041 since this does not seem frequent enough or security-related.

comment:10 Changed 5 months ago by teor

Keywords: 040-unreached-must added; 040-must removed

comment:11 Changed 5 months ago by asn

Sponsor: Sponsor27-canSponsor27-must

comment:12 Changed 5 months ago by asn

Points: 5

comment:13 Changed 5 months ago by pili

Parent ID: #29995

comment:14 Changed 4 months ago by asn

Potentially related to #29034.

comment:15 Changed 4 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

Add keyword for tickets in the network team roadmap.

comment:16 in reply to:  6 Changed 4 months ago by torcrash

Replying to asn:

Thanks for the report.

Thank you for the feedback. Such bugs are not predictable. Last time I think about enabling debug level for tor logs by default, but it would solve the problem only partially. Debug logs are quite big. They contain too much information about Tor usage and local machine. It will be hard to clean huge chunks of such data reliably. Now Tor doesn't have safe debug logging (with logs which can be shared with anybody).

comment:17 Changed 3 months ago by nickm

Keywords: regression? added

comment:18 Changed 3 months ago by nickm

Keywords: 041-must added

comment:19 Changed 3 months ago by asn

Owner: asn deleted
Status: needs_informationassigned

We are hoping that #29034 will fix this issue, but I don't have capacity to fix this in 041 if that doesn't do it.

comment:20 Changed 3 months ago by dgoulet

Cc: dgoulet removed
Status: assignednew

comment:21 Changed 3 months ago by asn

Unclear if the fix from #29034 helped this issue. We managed to reproduce #29034 with HS circuits but need more investigation.

Last edited 3 months ago by asn (previous) (diff)

comment:22 in reply to:  21 Changed 3 months ago by dgoulet

Keywords: 041-unreached-must added; 041-must removed
Milestone: Tor: 0.4.1.x-finalTor: 0.4.2.x-final

Replying to asn:

Unclear if the fix from #29034 helped this issue. We managed to reproduce #29034 with HS circuits but need more investigation.

Honestly, I spent a lot of time trying to find out why this bug can still happen. My approach here would be to "think" it is fixed since #29034 (new version of it) was merged and released. No more occurrence of this would mean for me it is gone (after some time).

We could go back into investigating everything but I'm not sure this would be the best time investment for now.

For the above reasons, I'm moving it to the 042 milestone so we can re-visit it before 042 release. Change this if you disagree, I'll be fine with it :).

comment:23 Changed 3 months ago by asn

IMO before closing this issue we need at least an explanation of why we think that #29034 might affect this ticket. On all the rest, I agree.

comment:24 Changed 2 months ago by dgoulet

One strong possible explanation for this would be #30871. It has been established now that the has_opened() callback can be done on a "mark for close" circuit which means that the circuit was removed from the global circuit map and stripped of its HS identifier.

Second possibility is #30921 that I just opened.

I think both of these are strongly related to this one. More chances of fixing this once both these tickets are closed imo!

comment:25 Changed 2 months ago by dgoulet

Owner: set to dgoulet
Status: newaccepted

comment:26 Changed 6 weeks ago by gaba

Keywords: network-team-roadmap-august added; network-team-roadmap-2019-Q1Q2 removed

comment:27 Changed 5 weeks ago by dgoulet

Resolution: fixed
Status: acceptedclosed

As per discussion in Stockholm, we consider that most likely #29034, #30921 and #30871 (merged in master, awaiting backport) should make this problem go away.

Take 4, lets be hopeful! We can re-open if this still happens on >= tor-0.4.1.3-alpha.

Note: See TracTickets for help on using tickets.