Opened 12 months ago

Closed 3 weeks ago

#28970 closed defect (fixed)

tor_bug_occurred_(): Bug: ../src/or/hs_client.c:624: setup_intro_circ_auth_key: Non-fatal assertion

Reported by: torcrash Owned by: dgoulet
Priority: High Milestone: Tor: 0.3.5.x-final
Component: Core Tor/Tor Version: Tor: 0.3.4.9
Severity: Critical Keywords: consider-backport-after-0424, 042-backport, 041-backport, 040-backport, 035-backport, tor-client, tor-hs, postfreeze-ok, 040-unreached-must, network-team-roadmap-august, regression?, 041-unreached-must, 042-should
Cc: Actual Points: 0.1
Parent ID: #29995 Points: 5
Reviewer: teor Sponsor: Sponsor27-must

Description

If this is duplicate I'm sorry, but I found nothing similar in old tickets. Sandbox is 1 in torrc. Set priority and severity of this ticket to appropriate values if it is not so critical.

Jan 01 05:31:32.000 [notice] Received reload signal (hup). Reloading config and resetting internal state.
Jan 01 05:31:32.000 [notice] Read configuration file "/usr/share/tor/tor-service-defaults-torrc".
Jan 01 05:31:32.000 [notice] Read configuration file "/etc/tor/torrc".
Jan 01 05:31:32.000 [notice] Tor 0.3.4.9 (git-074ca2e0054fded1) opening log file.
Jan 01 05:31:33.000 [warn] tor_bug_occurred_(): Bug: ../src/or/hs_client.c:624: setup_intro_circ_auth_key: Non-fatal assertion !(desc == NULL) failed. (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: Non-fatal assertion !(desc == NULL) failed in setup_intro_circ_auth_key at ../src/or/hs_client.c:624. Stack trace: (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x56345cea26e2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0xb7) [0x56345cebd587] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(hs_client_circuit_has_opened+0x2ca) [0x56345ce8027a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(circuit_send_next_onion_skin+0x372) [0x56345cdfd192] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x78fc3) [0x56345cd92fc3] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(circuit_receive_relay_cell+0x311) [0x56345cd95211] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(command_process_cell+0x280) [0x56345ce126f0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(channel_tls_handle_cell+0x273) [0x56345cdf4493] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x11de1c) [0x56345ce37e1c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_handle_read+0x892) [0x56345ce2e822] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x53a61) [0x56345cd6da61] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7f2c7be623dc] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x25f) [0x56345cd6fe3f] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_run_main+0x1165) [0x56345cd72315] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_main+0x3a) [0x56345cd6a32a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(main+0x19) [0x56345cd6a099] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c7a7e6b45] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x500e9) [0x56345cd6a0e9] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] tor_bug_occurred_(): Bug: ../src/or/hs_client.c:443: intro_circ_is_ok: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed. (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed in intro_circ_is_ok at ../src/or/hs_client.c:443. Stack trace: (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x56345cea26e2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_bug_occurred_+0xb7) [0x56345cebd587] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(hs_client_send_introduce1+0x352) [0x56345ce7fda2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_ap_handshake_attach_circuit+0x33c) [0x56345ce1126c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_ap_attach_pending+0x1a0) [0x56345ce31cb0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(hs_client_receive_rendezvous_acked+0x79) [0x56345ce803c9] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(rend_process_relay_cell+0x28b) [0x56345cd9c2cb] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x791f7) [0x56345cd931f7] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(circuit_receive_relay_cell+0x311) [0x56345cd95211] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(command_process_cell+0x280) [0x56345ce126f0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(channel_tls_handle_cell+0x273) [0x56345cdf4493] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x11de1c) [0x56345ce37e1c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(connection_handle_read+0x892) [0x56345ce2e822] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x53a61) [0x56345cd6da61] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7f2c7be623dc] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(do_main_loop+0x25f) [0x56345cd6fe3f] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_run_main+0x1165) [0x56345cd72315] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(tor_main+0x3a) [0x56345cd6a32a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(main+0x19) [0x56345cd6a099] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c7a7e6b45] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug:     /usr/bin/tor(+0x500e9) [0x56345cd6a0e9] (on Tor 0.3.4.9 )

Tor on Linux. Is it related to #27471 or #19926?

Child Tickets

TicketStatusOwnerSummaryComponent
#30921closeddgouleths-v3: Close intro circuits when cleaning up the client descriptor cacheCore Tor/Tor

Change History (43)

comment:1 Changed 12 months ago by nickm

Milestone: Tor: 0.4.0.x-final
Priority: ImmediateMedium

comment:2 Changed 11 months ago by nickm

Keywords: postfreeze-ok added

Mark some tickets as postfreeze-ok, to indicate that I think they are okay to accept in 0.4.0 post-freeze. Does not indicate that they are all necessary to do postfreeze.

comment:3 Changed 10 months ago by nickm

Keywords: 040-must added

Marking tickets as 040-must based on triage with dgoulet.

comment:4 Changed 10 months ago by nickm

Priority: MediumHigh

comment:5 Changed 9 months ago by teor

Owner: set to asn
Status: newassigned

I think asn is the best person to do this ticket.

comment:6 Changed 9 months ago by asn

Thanks for the report.

The second non-fatal assert seems reasonable given the lack of error-checking of setup_intro_circ_auth_key(), but that's not the actual bug.

The actual bug is the lack of descriptor when setting up an intro circuit. Where did the descriptor go between launching the intro circuit and the intro circ opening?

This could happen naturally given a very rare race condition: If we fetch a new consensus before the intro circuit opens (but after launching it) and we call cache_clean_v3_as_client() and the descriptor was old, perhaps it expires, gets cleaned and causes the bug.

But I don't believe in very rare race conditions, so I'd take the bug explanation more easily. However, to debug such a bug we would need info logs. Can someone reproduce this or submit info logs for this?

Thank you very much! :)

comment:7 Changed 9 months ago by asn

Status: assignedneeds_information

comment:8 Changed 9 months ago by asn

Sponsor: Sponsor27-can

comment:9 Changed 9 months ago by asn

Milestone: Tor: 0.4.0.x-finalTor: 0.4.1.x-final

It's hard to debug this issue and I can't think of a useful log_warn to add when the bug triggers so that we can detect this better in the future. Info logs would help us figure it out, but we might want to look deeper to improve logging.

I have assigned this to Sponsor27 and I'm delaying it to 041 since this does not seem frequent enough or security-related.

comment:10 Changed 9 months ago by teor

Keywords: 040-unreached-must added; 040-must removed

comment:11 Changed 9 months ago by asn

Sponsor: Sponsor27-canSponsor27-must

comment:12 Changed 9 months ago by asn

Points: 5

comment:13 Changed 9 months ago by pili

Parent ID: #29995

comment:14 Changed 8 months ago by asn

Potentially related to #29034.

comment:15 Changed 8 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

Add keyword for tickets in the network team roadmap.

comment:16 in reply to:  6 Changed 8 months ago by torcrash

Replying to asn:

Thanks for the report.

Thank you for the feedback. Such bugs are not predictable. Last time I think about enabling debug level for tor logs by default, but it would solve the problem only partially. Debug logs are quite big. They contain too much information about Tor usage and local machine. It will be hard to clean huge chunks of such data reliably. Now Tor doesn't have safe debug logging (with logs which can be shared with anybody).

comment:17 Changed 7 months ago by nickm

Keywords: regression? added

comment:18 Changed 7 months ago by nickm

Keywords: 041-must added

comment:19 Changed 6 months ago by asn

Owner: asn deleted
Status: needs_informationassigned

We are hoping that #29034 will fix this issue, but I don't have capacity to fix this in 041 if that doesn't do it.

comment:20 Changed 6 months ago by dgoulet

Cc: dgoulet removed
Status: assignednew

comment:21 Changed 6 months ago by asn

Unclear if the fix from #29034 helped this issue. We managed to reproduce #29034 with HS circuits but need more investigation.

Last edited 6 months ago by asn (previous) (diff)

comment:22 in reply to:  21 Changed 6 months ago by dgoulet

Keywords: 041-unreached-must added; 041-must removed
Milestone: Tor: 0.4.1.x-finalTor: 0.4.2.x-final

Replying to asn:

Unclear if the fix from #29034 helped this issue. We managed to reproduce #29034 with HS circuits but need more investigation.

Honestly, I spent a lot of time trying to find out why this bug can still happen. My approach here would be to "think" it is fixed since #29034 (new version of it) was merged and released. No more occurrence of this would mean for me it is gone (after some time).

We could go back into investigating everything but I'm not sure this would be the best time investment for now.

For the above reasons, I'm moving it to the 042 milestone so we can re-visit it before 042 release. Change this if you disagree, I'll be fine with it :).

comment:23 Changed 6 months ago by asn

IMO before closing this issue we need at least an explanation of why we think that #29034 might affect this ticket. On all the rest, I agree.

comment:24 Changed 6 months ago by dgoulet

One strong possible explanation for this would be #30871. It has been established now that the has_opened() callback can be done on a "mark for close" circuit which means that the circuit was removed from the global circuit map and stripped of its HS identifier.

Second possibility is #30921 that I just opened.

I think both of these are strongly related to this one. More chances of fixing this once both these tickets are closed imo!

comment:25 Changed 6 months ago by dgoulet

Owner: set to dgoulet
Status: newaccepted

comment:26 Changed 5 months ago by gaba

Keywords: network-team-roadmap-august added; network-team-roadmap-2019-Q1Q2 removed

comment:27 Changed 5 months ago by dgoulet

Resolution: fixed
Status: acceptedclosed

As per discussion in Stockholm, we consider that most likely #29034, #30921 and #30871 (merged in master, awaiting backport) should make this problem go away.

Take 4, lets be hopeful! We can re-open if this still happens on >= tor-0.4.1.3-alpha.

comment:28 Changed 3 months ago by asn

Resolution: fixed
Status: closedreopened

Hmm, I got this on a recent 042 branch of mine (https://github.com/asn-d6/tor/tree/padanalysis) which is based on e57209dc32. That should include all the fixes from above as it is from about a month ago.

Fortunately I have info logs which might be helpful. The fun thing is that a NEWNYM occured between cannibalization and use of intro circ:

Sep 09 10:36:23.000 [info] circuit_launch_by_extend_info(): Cannibalizing circ 2971206838 (id: 18) for purpose 6 (Hidden service client: Connecting to intro point)
Sep 09 10:36:23.000 [info] circuit_send_intermediate_onion_skin(): Sending extend relay cell.
Sep 09 10:36:23.000 [info] rep_hist_note_used_internal(): New port prediction added. Will continue predictive circ building for 2484 more seconds.
Sep 09 10:36:23.000 [info] connection_ap_handshake_attach_circuit(): Intro 2971206838 (id: 18) and rend circuit 2792227503 (id: 17) circuits are not both ready. Stalling
 conn. (0 sec old)
Sep 09 10:36:24.000 [info] hs_client_receive_rendezvous_acked(): Received an RENDEZVOUS_ESTABLISHED. This circuit is now ready for rendezvous.
Sep 09 10:36:24.000 [info] circpad_setup_machine_on_circ(): Registering machine client_rp_circ to origin circ 17 (10)
Sep 09 10:36:24.000 [info] circpad_node_supports_padding(): Checking padding: supported
Sep 09 10:36:24.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 17 (10)
Sep 09 10:36:24.000 [info] circpad_machine_spec_transition(): Circpad machine 0 transitioning from 0 to 1
Sep 09 10:36:24.000 [info] circpad_choose_state_length(): State length sampled to 1.
Sep 09 10:36:24.000 [info] circpad_machine_schedule_padding():  Padding in 751 usec
Sep 09 10:36:24.000 [info] circpad_machine_schedule_padding():  Padding in 0 sec, 751 usec
Sep 09 10:36:24.000 [info] connection_ap_handshake_attach_circuit(): ready rend circ 2792227503 (id: 17) already here. Nointro-ack yet on intro 2971206838 (id: 18). (str
eam 1 sec old)
Sep 09 10:36:24.000 [info] connection_ap_handshake_attach_circuit(): Intro 2971206838 (id: 18) and rend circuit 2792227503 (id: 17) circuits are not both ready. Stalling
 conn. (1 sec old)
Sep 09 10:36:24.000 [info] circpad_send_padding_cell_for_callback(): Callback: Sending padding to origin circuit 17 (10) [length: 0]
Sep 09 10:36:24.000 [info] circpad_machine_spec_transition(): Circpad machine 0 transitioning from 1 to 65535
Sep 09 10:36:24.000 [info] circpad_handle_padding_negotiated(): Middle node did not accept our padding request.
Sep 09 10:36:24.000 [info] connection_ap_handshake_attach_circuit(): ready rend circ 2792227503 (id: 17) already here. Nointro-ack yet on intro 2971206838 (id: 18). (str
eam 1 sec old)
Sep 09 10:36:24.000 [info] connection_ap_handshake_attach_circuit(): Intro 2971206838 (id: 18) and rend circuit 2792227503 (id: 17) circuits are not both ready. Stalling
 conn. (1 sec old)
Sep 09 10:36:24.000 [info] circuit_predict_and_launch_new(): Have 7 clean circs (0 uptime-internal, 0 internal), need another hidden service circ.
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $0FEDEBE83C1F2DE3D6673876BBA1433CCF0ADE51~jannis at 31.31.74.47
Sep 09 10:36:24.000 [info] select_primary_guard_for_circuit(): Selected primary guard Unnamed ($C07622052739697129BA49A5DF3C503CE1AAA100) for circuit.
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $C07622052739697129BA49A5DF3C503CE1AAA100~Unnamed at 78.129.180.14
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $ACE29EC7700BD112DEB42B934E5E443476CD577C~StoneCat at 195.201.26.175
Sep 09 10:36:24.000 [info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$C07622052739697129BA49A5DF3C503CE1AAA100~Unnamed at 78.129.180.1
4'
Sep 09 10:36:24.000 [info] circuit_finish_handshake(): Finished building circuit hop:
Sep 09 10:36:24.000 [info] internal (high-uptime) circ (length 3, last hop jannis): $C07622052739697129BA49A5DF3C503CE1AAA100(open) $ACE29EC7700BD112DEB42B934E5E443476CD
577C(closed) $0FEDEBE83C1F2DE3D6673876BBA1433CCF0ADE51(closed)
Sep 09 10:36:24.000 [info] circuit_send_intermediate_onion_skin(): Sending extend relay cell.
Sep 09 10:36:24.000 [info] handle_deferred_signewnym_cb(): Honoring delayed NEWNYM request
Sep 09 10:36:24.000 [notice] Executing NEWNYM
Sep 09 10:36:24.000 [info] rend_cache_purge(): Purging HS v2 descriptor cache
Sep 09 10:36:24.000 [info] rend_cache_failure_purge(): Purging HS v2 failure cache
Sep 09 10:36:24.000 [info] hs_purge_last_hid_serv_requests(): Purging client last-HS-desc-request-time table
Sep 09 10:36:24.000 [info] cancel_descriptor_fetches(): Hidden service client descriptor fetches cancelled.
Sep 09 10:36:24.000 [info] hs_cache_client_intro_state_purge(): Hidden service client introduction point state cache purged.
Sep 09 10:36:24.000 [info] hs_cache_purge_as_client(): Hidden service client descriptor cache purged.
Sep 09 10:36:24.000 [info] hs_client_purge_state(): Hidden service client state has been purged.
Sep 09 10:36:24.000 [info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
Sep 09 10:36:24.000 [info] connection_edge_reached_eof(): conn (fd 15) reached eof. Closing.
Sep 09 10:36:24.000 [info] connection_handle_listener_read(): New SOCKS connection opened from 127.0.0.1.
Sep 09 10:36:24.000 [info] connection_ap_handle_onion(): No descriptor found in our cache for ieed562tnprqmalo7geqcgxqcdixbg2czptgmqxlxx4v2kwrmb74ufad. Fetching.
Sep 09 10:36:24.000 [info] rep_hist_note_used_internal(): New port prediction added. Will continue predictive circ building for 2484 more seconds.
Sep 09 10:36:24.000 [info] rep_hist_note_used_internal(): New port prediction added. Will continue predictive circ building for 2484 more seconds.
Sep 09 10:36:24.000 [info] connection_ap_make_link(): Making internal anonymized tunnel to 51.68.140.94:443 ...
Sep 09 10:36:24.000 [info] connection_ap_make_link(): ... application connection created and linked.
Sep 09 10:36:24.000 [info] directory_launch_v3_desc_fetch(): Descriptor fetch request for service QQg++1Nr4wYBbvmJARrwENFwm0LL5mZC6735XSrRYH8 with blinded key ecp1LrfJD36sKrkzHCxuel+psRktL3Cg6fZcaGb85io to directory $71EFE4314646C4613895509723BFD354D8EABCAF~Cosworth01 at 51.68.140.94
Sep 09 10:36:24.000 [info] connection_edge_process_inbuf(): data from edge while in 'waiting for rendezvous desc' state. Leaving it on buffer.
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $71EFE4314646C4613895509723BFD354D8EABCAF~Cosworth01 at 51.68.140.94
Sep 09 10:36:24.000 [info] onion_pick_cpath_exit(): Using requested exit node '$71EFE4314646C4613895509723BFD354D8EABCAF~Cosworth01 at 51.68.140.94'
Sep 09 10:36:24.000 [info] select_primary_guard_for_circuit(): Selected primary guard Unnamed ($C07622052739697129BA49A5DF3C503CE1AAA100) for circuit.
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $C07622052739697129BA49A5DF3C503CE1AAA100~Unnamed at 78.129.180.14
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $134DED8B9FC5EEFD61AEF65659CF4AF0AA7FBCB5~Unnamed at 213.239.204.62
Sep 09 10:36:24.000 [info] extend_info_from_node(): Including Ed25519 ID for $3F05B8C1DF491BBF5E7DD199440911A84D9165D8~Unnamed at 91.211.247.246
Sep 09 10:36:24.000 [info] circuit_send_first_onion_skin(): First hop: finished sending CREATE cell to '$C07622052739697129BA49A5DF3C503CE1AAA100~Unnamed at 78.129.180.14'
Sep 09 10:36:24.000 [info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
Sep 09 10:36:24.000 [info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
Sep 09 10:36:24.000 [info] circuit_finish_handshake(): Finished building circuit hop:
Sep 09 10:36:24.000 [info] internal (high-uptime) circ (length 4, last hop ): $C07622052739697129BA49A5DF3C503CE1AAA100(open) $46644B13F6409018E14ECEE1E9594740064C9F24(open) $DC79C9D75CFD7A91F9232922E05D225CBDCB7703(open) $7C499ACD60FB651DD486B6D802290A3EBBECCD84(open)
Sep 09 10:36:24.000 [info] entry_guards_note_guard_success(): Recorded success for primary confirmed guard Unnamed ($C07622052739697129BA49A5DF3C503CE1AAA100)
Sep 09 10:36:24.000 [info] circuit_build_no_more_hops(): circuit built!
Sep 09 10:36:24.000 [info] client_intro_circ_has_opened(): Introduction circuit 2971206838 has opened. Attaching streams.
Sep 09 10:36:24.000 [warn] tor_bug_occurred_(): Bug: src/feature/hs/hs_client.c:685: setup_intro_circ_auth_key: Non-fatal assertion !(desc == NULL) failed. (Future instances of this warning will be silenced.) (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug: Non-fatal assertion !(desc == NULL) failed in setup_intro_circ_auth_key at src/feature/hs/hs_client.c:685. Stack trace: (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(log_backtrace_impl+0x46) [0x560f75f444f6] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(tor_bug_occurred_+0x16b) [0x560f75f3f8fb] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(hs_client_circuit_has_opened+0x3b0) [0x560f75e40ce0] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(circuit_send_next_onion_skin+0x2b8) [0x560f75dbb588] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(+0xb8eb3) [0x560f75df1eb3] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(circuit_receive_relay_cell+0x2b5) [0x560f75df3ff5] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(command_process_cell+0x2c0) [0x560f75dd5bb0] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(channel_tls_handle_cell+0x253) [0x560f75db5433] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(+0xa595c) [0x560f75dde95c] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(connection_handle_read+0xaac) [0x560f75da2c4c] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(+0x6eebe) [0x560f75da7ebe] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6(+0x229ba) [0x7fedcecba9ba] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6(event_base_loop+0x5a7) [0x7fedcecbb537] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(do_main_loop+0xdd) [0x560f75da91dd] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(tor_run_main+0x1115) [0x560f75d96af5] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(tor_main+0x3a) [0x560f75d9418a] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(main+0x19) [0x560f75d93d69] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7fedce74009b] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:36:24.000 [warn] Bug:     ./src/app/tor(_start+0x2a) [0x560f75d93dba] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)

comment:29 Changed 3 months ago by asn

Seems like the NEWNYM is what's important here since I can repro this quite easily by spamming newnyms while doing onion client requests.

comment:30 Changed 3 months ago by asn

And right below the above assert I find this for the same circuit (this is on another run so diff circ id):

Sep 09 10:49:43.000 [info] connection_ap_handshake_attach_circuit(): Found open intro circ 3172098314 (id: 15). Rend circuit 3553720442 (id: 16); Sending introduction. (stream 1 sec old)
Sep 09 10:49:43.000 [warn] tor_bug_occurred_(): Bug: src/feature/hs/hs_client.c:491: intro_circ_is_ok: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed. (Future instances of this warning will be silenced.) (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed in intro_circ_is_ok at src/feature/hs/hs_client.c:491. Stack trace: (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(log_backtrace_impl+0x46) [0x5596ba3034f6] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(tor_bug_occurred_+0x16b) [0x5596ba2fe8fb] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(hs_client_send_introduce1+0x2ea) [0x5596ba1ff76a] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(connection_ap_handshake_attach_circuit+0x3df) [0x5596ba1930df] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(connection_ap_attach_pending+0x158) [0x5596ba197428] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(hs_client_receive_rendezvous_acked+0x80) [0x5596ba1ffda0] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(rend_process_relay_cell+0x174) [0x5596ba249624] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(+0xb8bb7) [0x5596ba1b0bb7] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(circuit_receive_relay_cell+0x2b5) [0x5596ba1b2ff5] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(command_process_cell+0x2c0) [0x5596ba194bb0] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(channel_tls_handle_cell+0x253) [0x5596ba174433] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(+0xa595c) [0x5596ba19d95c] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(connection_handle_read+0xaac) [0x5596ba161c4c] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(+0x6eebe) [0x5596ba166ebe] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6(+0x229ba) [0x7f3d99d5a9ba] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.1.so.6(event_base_loop+0x5a7) [0x7f3d99d5b537] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(do_main_loop+0xdd) [0x5596ba1681dd] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(tor_run_main+0x1115) [0x5596ba155af5] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(tor_main+0x3a) [0x5596ba15318a] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(main+0x19) [0x5596ba152d69] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb) [0x7f3d997e009b] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [warn] Bug:     ./src/app/tor(_start+0x2a) [0x5596ba152dba] (on Tor 0.4.2.0-alpha-dev eef7896e854c3d6d)
Sep 09 10:49:43.000 [info] circuit_mark_for_close_(): Circuit 3172098314 (id: 15) marked for close at src/feature/hs/hs_client.c:660 (orig reason: 2, new reason: 0)

comment:31 Changed 3 months ago by nickm

Keywords: 042-should added

comment:32 Changed 2 months ago by dgoulet

Cc: asn removed
Reviewer: asn
Status: reopenedassigned

comment:33 Changed 7 weeks ago by dgoulet

Actual Points: 0.1

Aaaaaaah!!!

hs_client_purge_state() is what is called on NEWNYM which purges _all_ descriptor from the client cache.

Which means that for an INTRO circuit being established, once it opens, we lookup the descriptor to get the IP object but hey surprised, no more descriptor.

And I bet you anything that the other case of this assert() in the ticket are because of hs_cache_clean_as_client() that cleans up the descriptor while an introduction is being made. Thin window for this race but a race nonetheless.

I see three options:

  1. Assume that because of the cache clean up, we can end up in this situation and thus remove the BUG() with a fat comment and close the circuit instead of keeping it around.
  1. When cleaning up descriptors, keep them in the cache if a circuit is being established. They should be removed from the cache at the next iteration of the cleanup (mainloop event).
  1. When purging a descriptor from the cache, we also close all related intro circuits that are _not_ yet opened.

I went with (1) which is probably what we want since this is a race that can happen and I wouldn't feel good about keeping out of date stuff in the cache or even worst, lying with the NEWNYM that the cache has been "purged-but-not-really-because-you-had-an-intro-circuit". (3) is also not so bad but (1) is very minimal.

PR: https://github.com/torproject/tor/pull/1455
Branch: ticket28970_042_01

comment:34 Changed 7 weeks ago by dgoulet

Reviewer: asnteor

(Putting teor since asn is still afk for a bit).

comment:35 Changed 7 weeks ago by dgoulet

Reviewer: teorasn
Status: assignedneeds_review

teor a bit overloaded with review. Going back to asn, no rush on this.

comment:36 Changed 7 weeks ago by teor

Status: needs_reviewneeds_revision

This looks fine, but do you want to backport it to 0.3.5 and later?
If we don't fix it back to 0.3.5, people will send us bug reports.

comment:37 Changed 7 weeks ago by teor

Keywords: 042-backport 041-backport 040-backport 035-backport added
Reviewer: asnteor

comment:38 Changed 7 weeks ago by dgoulet

Status: needs_revisionneeds_review

035 PR: https://github.com/torproject/tor/pull/1464

Merges cleanly forward.

comment:39 Changed 7 weeks ago by teor

Keywords: consider-backport-after-0424 added
Status: needs_reviewmerge_ready

This is just a cherry-pick, right?
Anyone can merge to 0.4.2 and later, after checking with Nick, because 0.4.2.3-alpha is pending.

comment:40 in reply to:  39 Changed 7 weeks ago by dgoulet

Replying to teor:

This is just a cherry-pick, right?

Yes.

comment:41 Changed 7 weeks ago by nickm

ok with me to merge.

comment:42 Changed 7 weeks ago by teor

Milestone: Tor: 0.4.2.x-finalTor: 0.4.1.x-final

Merged to 0.4.2 and later, leaving open for backport.

comment:43 Changed 3 weeks ago by teor

Milestone: Tor: 0.4.1.x-finalTor: 0.3.5.x-final
Resolution: fixed
Status: merge_readyclosed

Merged to 0.3.5 and later.

Merged #32575, #31939, #31548, #30344, #30258, #28970, #31091, and #32108 together.

Note: See TracTickets for help on using tickets.