Opened 5 months ago

Closed 3 months ago

Last modified 6 weeks ago

#25691 closed defect (fixed)

Bridges don't work: Non-fatal assertion !(exit_ei == NULL) failed in onion_pick_cpath_exit

Reported by: teor Owned by: nickm
Priority: High Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: regression, 034-must 033-backport 033-maybe-must
Cc: Actual Points:
Parent ID: Points: 0.5
Reviewer: dgoulet, teor Sponsor:

Description

When running bridges-min and bridges+ipv6-min in chutney on macOS 10.13, Apple clang 9.0.0 with Xcode 9.2:

Apr 03 08:16:13.578 [info] circuit_predict_and_launch_new: Have 0 clean circs (0 internal), need another exit circ.
Apr 03 08:16:13.578 [info] choose_good_exit_server_general: Found 1 servers that might support 0/0 pending connections.
Apr 03 08:16:13.579 [info] choose_good_exit_server_general: Chose exit server '$2E832C796F35B96C3854941B01FF4E51E0968E24~test002r at 127.0.0.1'
Apr 03 08:16:13.580 [warn] tor_bug_occurred_: Bug: src/or/circuitbuild.c:2390: onion_pick_cpath_exit: Non-fatal assertion !(exit_ei == NULL) failed. (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.585 [warn] Bug: Non-fatal assertion !(exit_ei == NULL) failed in onion_pick_cpath_exit at src/or/circuitbuild.c:2390. Stack trace: (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.586 [warn] Bug:     0   tor                                 0x00665f83 log_backtrace + 67 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.587 [warn] Bug:     1   tor                                 0x006e0ad7 tor_bug_occurred_ + 503 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.587 [warn] Bug:     2   tor                                 0x0011843d circuit_establish_circuit + 18621 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.588 [warn] Bug:     3   tor                                 0x00186560 circuit_launch_by_extend_info + 992 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.589 [warn] Bug:     4   tor                                 0x0017f868 circuit_build_needed_circs + 3784 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.590 [warn] Bug:     5   tor                                 0x0044b33c second_elapsed_callback + 1468 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.591 [warn] Bug:     6   tor                                 0x0075b2f1 periodic_timer_cb + 129 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.592 [warn] Bug:     7   libevent-2.1.6.dylib                0x0104f6f3 event_process_active_single_queue + 1289 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.593 [warn] Bug:     8   libevent-2.1.6.dylib                0x0104bf12 event_base_loop + 1109 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.594 [warn] Bug:     9   tor                                 0x00449fd9 do_main_loop + 3289 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.595 [warn] Bug:     10  tor                                 0x00453990 tor_run_main + 704 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.596 [warn] Bug:     11  tor                                 0x006585a7 tor_main + 71 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.597 [warn] Bug:     12  tor                                 0x000a3546 main + 38 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.598 [warn] Bug:     13  libdyld.dylib                       0xa75576e1 start + 1 (on Tor 0.3.4.0-alpha-dev 3df954549232bf55)
Apr 03 08:16:13.599 [info] circuit_mark_for_close_: Circuit 0 (id: 2) marked for close at src/or/circuitbuild.c:479 (orig reason: -2, new reason: 0)
Apr 03 08:16:13.600 [info] circuit_free_: Circuit 0 (id: 2) has been freed.

Other networks also log this bug but don't fail.

$ clang --version
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.3.0
...

Child Tickets

TicketStatusOwnerSummaryComponent
#25692closednickmonion_extend_cpath: Non-fatal assertion info || client failed.Core Tor/Tor
#26029closedonion_extend_cpathCore Tor/Tor

Change History (27)

comment:1 Changed 5 months ago by nickm

I see this intermittently on main-0.3.3.

comment:2 Changed 5 months ago by dgoulet

I just updated my dirauth on the test network and when I restarted, I got this in a loop 122 times. So I'm guessing something is related to missing nodes information.

I have info.log for this if anyone wants it (but from the test network so can't be public).

comment:3 Changed 5 months ago by cypherpunks

I've seen this one.

[WARN] Bug:     /usr/bin/tor(_start+0x2a) [0x55a3a8ce609a] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7f558cd8c2b1] (on Tor 0.3.3.4-alpha )
[WARN] Bug:     /usr/bin/tor(main+0x19) [0x55a3a8ce6049] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(tor_main+0x3a) [0x55a3a8ce62da] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(tor_run_main+0x275) [0x55a3a8cece95] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(do_main_loop+0x2bc) [0x55a3a8ceb78c] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x6a0) [0x7f558e7b85a0] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(+0x53c78) [0x55a3a8ceac78] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(circuit_build_needed_circs+0x3c8) [0x55a3a8d83e28] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(circuit_launch_by_extend_info+0x7d) [0x55a3a8d8369d] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(circuit_establish_circuit+0xa22) [0x55a3a8d70832] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(tor_bug_occurred_+0xb9) [0x55a3a8e3b989] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug:     /usr/bin/tor(log_backtrace+0x44) [0x55a3a8e20154] (on Tor 0.3.3.4-alpha ) 
[WARN] Bug: Non-fatal assertion !(exit_ei == NULL) failed in onion_pick_cpath_exit at ../src/or/circuitbuild.c:2390. Stack trace: (on Tor 0.3.3.4-alpha ) 
[WARN] tor_bug_occurred_(): Bug: ../src/or/circuitbuild.c:2390: onion_pick_cpath_exit: Non-fatal assertion !(exit_ei == NULL) failed. (on Tor 0.3.3.4-alpha )

It seems to be reproducible when your client has a newer consensus and the bridge lacks the descriptor to extend the circuit.

comment:4 Changed 4 months ago by catalyst

This bug seems to be in 0.3.3 as well; see the logs in #25769.

comment:5 Changed 4 months ago by catalyst

Keywords: 034-must added; 0.3.4-must removed

comment:6 Changed 4 months ago by nickm

Keywords: 033-backport 033-maybe-must added

comment:7 Changed 4 months ago by nickm

Owner: set to nickm
Status: newaccepted

So, I think that the likeliest culprit here is d1874b433953f64. If that's true, then this should be fixed by my branch bug25691_033.

comment:8 Changed 4 months ago by nickm

Status: acceptedneeds_review

comment:9 Changed 4 months ago by nickm

Cc: teor added

Adding teor to cc, since they reviewed #25213 and may have insight.

comment:10 Changed 4 months ago by teor

Status: needs_reviewmerge_ready

Oh, yes, for_direct_connect. Seems sensible.

(Let's eventually refactor all the node code so we don't need to keep on patching little bits like this.)

comment:11 Changed 4 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

Okay; merged to 0.3.3 and forward.

comment:12 Changed 4 months ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: 0.3.3.x-final

comment:13 Changed 4 months ago by nickm

Resolution: fixed
Status: closedreopened

I just saw this on 0.3.3 as of b65024f57df964af880d8101ec137179c756c618. So even if we fixed a case of this, we didn't fix every case.

Last edited 4 months ago by nickm (previous) (diff)

comment:14 Changed 4 months ago by nickm

(This happened on a chutney network, running bridges+ipv6-min, on the bridge authority.)

comment:15 Changed 4 months ago by nickm

branch 25691_diagnostic has some diagnostic logs to try to sort this out.

comment:16 Changed 4 months ago by nickm

Status: reopenedaccepted

comment:17 Changed 4 months ago by nickm

Status: acceptedneeds_review

I have a new branch that fixes this for me on chutney: bug25691_033_again. The fix is more complicated this time, unfortunately -- so maybe we should merge it into 0.3.4 before we consider a backport?

There's a pull request at https://github.com/torproject/tor/pull/48 .

comment:18 Changed 4 months ago by teor

Reviewer: dgoulet

Assigning dgoulet as reviewer, because this branch covers #25691 and #25692, and dgoulet is reviewer on #25692.

comment:19 Changed 4 months ago by teor

Reviewer: dgouletdgoulet, teor
Status: needs_reviewneeds_revision

I added a review on github.

It looks good, there is a tricky change in compute_frac_paths_available() that I would probably defer to 0.3.4 Tor: unspecified.

I am happy for dgoulet to do the next review.

Edit: Tor: unspecified is for everything that isn't urgent, and isn't on the roadmap. Bridge clients shouldn't care about the amount of guard descriptors they have. But the current implementation works and doesn't need to be fixed urgently.

Last edited 4 months ago by teor (previous) (diff)

comment:20 Changed 4 months ago by nickm

Thanks! I'll do another round of revisions once dgoulet has had a chance to review?

comment:21 Changed 4 months ago by dgoulet

Nothing much more to say after teor's review here. Lets get it revised and I can do the final review if needed.

comment:22 Changed 4 months ago by nickm

Status: needs_revisionneeds_review

I've made the requested revisions and opened tickets #25885 and #25886 for the more complicated issues.

comment:23 Changed 4 months ago by teor

Status: needs_reviewneeds_revision

There are four comment typos: I added comments on GitHub,

Once that's done, this branch is ready for merge!

comment:24 Changed 4 months ago by nickm

Status: needs_revisionmerge_ready

wow; four typos on two lines. I am in rare form!

Branch updated on github, and squashed as bug25691_033_again_squashed.

That latter branch has now been merged on 0.3.4. If it causes no trouble there, we can backport to 0.3.3

comment:25 Changed 3 months ago by teor

Can we backport this fix to 0.3.3?

Relay operators on 0.3.3.6 are complaining about #25692:
https://lists.torproject.org/pipermail/tor-relays/2018-May/015309.html

comment:26 Changed 3 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

Merged to 0.3.3. Let's keep an eye out for reports of trouble in 0.3.4.1-alpha before we release this in 0.3.3 though.

comment:27 Changed 6 weeks ago by teor

Cc: teor removed

Remove useless CC

Note: See TracTickets for help on using tickets.