Opened 2 weeks ago

Last modified 39 hours ago

#25691 needs_revision defect

Bridges don't work: Non-fatal assertion !(exit_ei == NULL) failed in onion_pick_cpath_exit

Reported by: teor Owned by: nickm
Priority: High Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: regression, 034-must 033-backport 033-maybe-must
Cc: teor Actual Points:
Parent ID: Points: 0.5
Reviewer: dgoulet, teor Sponsor:

Description

When running bridges-min and bridges+ipv6-min in chutney on macOS 10.13, Apple clang 9.0.0 with Xcode 9.2:

Apr 03 08:16:13.578 [info] circuit_predict_and_launch_new: Have 0 clean circs (0 internal), need another exit circ.
Apr 03 08:16:13.578 [info] choose_good_exit_server_general: Found 1 servers that might support 0/0 pending connections.
Apr 03 08:16:13.579 [info] choose_good_exit_server_general: Chose exit server '$2E832C796F35B96C3854941B01FF4E51E0968E24~test002r at 127.0.0.1'
Apr 03 08:16:13.580 [warn] tor_bug_occurred_: Bug: src/or/circuitbuild.c:2390: onion_pick_cpath_exit: Non-fatal assertion !(exit_ei == NULL) failed. (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.585 [warn] Bug: Non-fatal assertion !(exit_ei == NULL) failed in onion_pick_cpath_exit at src/or/circuitbuild.c:2390. Stack trace: (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.586 [warn] Bug:     0   tor                                 0x00665f83 log_backtrace + 67 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.587 [warn] Bug:     1   tor                                 0x006e0ad7 tor_bug_occurred_ + 503 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.587 [warn] Bug:     2   tor                                 0x0011843d circuit_establish_circuit + 18621 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.588 [warn] Bug:     3   tor                                 0x00186560 circuit_launch_by_extend_info + 992 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.589 [warn] Bug:     4   tor                                 0x0017f868 circuit_build_needed_circs + 3784 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.590 [warn] Bug:     5   tor                                 0x0044b33c second_elapsed_callback + 1468 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.591 [warn] Bug:     6   tor                                 0x0075b2f1 periodic_timer_cb + 129 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.592 [warn] Bug:     7   libevent-2.1.6.dylib                0x0104f6f3 event_process_active_single_queue + 1289 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.593 [warn] Bug:     8   libevent-2.1.6.dylib                0x0104bf12 event_base_loop + 1109 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.594 [warn] Bug:     9   tor                                 0x00449fd9 do_main_loop + 3289 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.595 [warn] Bug:     10  tor                                 0x00453990 tor_run_main + 704 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.596 [warn] Bug:     11  tor                                 0x006585a7 tor_main + 71 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.597 [warn] Bug:     12  tor                                 0x000a3546 main + 38 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.598 [warn] Bug:     13  libdyld.dylib                       0xa75576e1 start + 1 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.599 [info] circuit_mark_for_close_: Circuit 0 (id: 2) marked for close at src/or/circuitbuild.c:479 (orig reason: -2, new reason: 0)
Apr 03 08:16:13.600 [info] circuit_free_: Circuit 0 (id: 2) has been freed.

Other networks also log this bug but don't fail.

$ clang --version
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.3.0
...

Child Tickets

TicketStatusOwnerSummaryComponent
#25692needs_revisionnickmonion_extend_cpath: Non-fatal assertion info || client failed.Core Tor/Tor

Change History (20)

comment:1 Changed 2 weeks ago by nickm

I see this intermittently on main-0.3.3.

comment:2 Changed 2 weeks ago by dgoulet

I just updated my dirauth on the test network and when I restarted, I got this in a loop 122 times. So I'm guessing something is related to missing nodes information.

I have info.log for this if anyone wants it (but from the test network so can't be public).

comment:3 Changed 2 weeks ago by cypherpunks

I've seen this one.

[WARN] Bug:     /usr/bin/tor(_start+0x2a) [0x55a3a8ce609a] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7f558cd8c2b1] (on Tor 0.3.3.4-alpha )
[WARN] Bug:     /usr/bin/tor(main+0x19) [0x55a3a8ce6049] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(tor_main+0x3a) [0x55a3a8ce62da] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(tor_run_main+0x275) [0x55a3a8cece95] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(do_main_loop+0x2bc) [0x55a3a8ceb78c] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x6a0) [0x7f558e7b85a0] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(+0x53c78) [0x55a3a8ceac78] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(circuit_build_needed_circs+0x3c8) [0x55a3a8d83e28] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(circuit_launch_by_extend_info+0x7d) [0x55a3a8d8369d] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(circuit_establish_circuit+0xa22) [0x55a3a8d70832] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(tor_bug_occurred_+0xb9) [0x55a3a8e3b989] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(log_backtrace+0x44) [0x55a3a8e20154] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug: Non-fatal assertion !(exit_ei == NULL) failed in onion_pick_cpath_exit at ../src/or/circuitbuild.c:2390. Stack trace: (on Tor 0.3.3.4-alpha ) 
[WARN] tor_bug_occurred_(): Bug: ../src/or/circuitbuild.c:2390: onion_pick_cpath_exit: Non-fatal assertion !(exit_ei == NULL) failed. (on Tor 0.3.3.4-alpha )

It seems to be reproducible when your client has a newer consensus and the bridge lacks the descriptor to extend the circuit.

comment:4 Changed 9 days ago by catalyst

This bug seems to be in 0.3.3 as well; see the logs in #25769.

comment:5 Changed 9 days ago by catalyst

Keywords: 034-must added; 0.3.4-must removed

comment:6 Changed 9 days ago by nickm

Keywords: 033-backport 033-maybe-must added

comment:7 Changed 7 days ago by nickm

Owner: set to nickm
Status: newaccepted

So, I think that the likeliest culprit here is d1874b433953f64. If that's true, then this should be fixed by my branch bug25691_033.

comment:8 Changed 7 days ago by nickm

Status: acceptedneeds_review

comment:9 Changed 7 days ago by nickm

Cc: teor added

Adding teor to cc, since they reviewed #25213 and may have insight.

comment:10 Changed 7 days ago by teor

Status: needs_reviewmerge_ready

Oh, yes, for_direct_connect. Seems sensible.

(Let's eventually refactor all the node code so we don't need to keep on patching little bits like this.)

comment:11 Changed 7 days ago by nickm

Resolution: fixed
Status: merge_readyclosed

Okay; merged to 0.3.3 and forward.

comment:12 Changed 7 days ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: 0.3.3.x-final

comment:13 Changed 4 days ago by nickm

Resolution: fixed
Status: closedreopened

I just saw this on 0.3.3 as of b65024f57df964af880d8101ec137179c756c618. So even if we fixed a case of this, we didn't fix every case.

Last edited 4 days ago by nickm (previous) (diff)

comment:14 Changed 4 days ago by nickm

(This happened on a chutney network, running bridges+ipv6-min, on the bridge authority.)

comment:15 Changed 4 days ago by nickm

branch 25691_diagnostic has some diagnostic logs to try to sort this out.

comment:16 Changed 4 days ago by nickm

Status: reopenedaccepted

comment:17 Changed 4 days ago by nickm

Status: acceptedneeds_review

I have a new branch that fixes this for me on chutney: bug25691_033_again. The fix is more complicated this time, unfortunately -- so maybe we should merge it into 0.3.4 before we consider a backport?

There's a pull request at https://github.com/torproject/tor/pull/48 .

comment:18 Changed 3 days ago by teor

Reviewer: dgoulet

Assigning dgoulet as reviewer, because this branch covers #25691 and #25692, and dgoulet is reviewer on #25692.

comment:19 Changed 2 days ago by teor

Reviewer: dgouletdgoulet, teor
Status: needs_reviewneeds_revision

I added a review on github.

It looks good, there is a tricky change in compute_frac_paths_available() that I would probably defer to 0.3.4 Tor: unspecified.

I am happy for dgoulet to do the next review.

Edit: Tor: unspecified is for everything that isn't urgent, and isn't on the roadmap. Bridge clients shouldn't care about the amount of guard descriptors they have. But the current implementation works and doesn't need to be fixed urgently.

Last edited 2 days ago by teor (previous) (diff)

comment:20 Changed 39 hours ago by nickm

Thanks! I'll do another round of revisions once dgoulet has had a chance to review?

Note: See TracTickets for help on using tickets.