Opened 5 months ago

Closed 3 weeks ago

#21576 closed defect (fixed)

Bug: Assertion linked_dir_conn_base failed in connection_ap_handshake_send_begin

Reported by: alecmuffett
Owned by: teor
Priority: Medium
Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor
Version: Tor: 0.2.9.3-alpha
Severity: Normal
Keywords: crash, 029-backport
Cc:
Actual Points: 0.5
Parent ID:
Points: 0.2
Reviewer:
Sponsor:

Description

I am using Tor with OnionBalance. While uploading descriptors for a large OnionBalance config, Tor crashed with this message:

...
Feb 28 09:59:44.000 [info] connection_free_(): Freeing linked Directory connection [client reading] with 0 bytes on inbuf, 0 on outbuf.
Feb 28 09:59:44.000 [info] circuit_finish_handshake(): Finished building circuit hop:
Feb 28 09:59:44.000 [info] internal circ (length 4, last hop kouettland): $566D30FF44DFBB163F632C439A86CC5B1431EC7C(open) $254EB51B0B85B2FB8A70997875DA493420A30458(open) $9844B981A80B3E4B50897098E2D65167E6AEF127(open) $B9FB493DC3CAC92A1F923DC05A251DCF3F4A4410(open)
Feb 28 09:59:44.000 [info] circuit_send_next_onion_skin(): circuit built!
Feb 28 09:59:44.000 [info] pathbias_count_build_success(): Got success count 273.152447/280.411541 for guard Unnamed ($566D30FF44DFBB163F632C439A86CC5B1431EC7C)
Feb 28 09:59:44.000 [info] internal circ (length 4): $566D30FF44DFBB163F632C439A86CC5B1431EC7C(open) $254EB51B0B85B2FB8A70997875DA493420A30458(open) $9844B981A80B3E4B50897098E2D65167E6AEF127(open) $B9FB493DC3CAC92A1F923DC05A251DCF3F4A4410(open)
Feb 28 09:59:44.000 [info] link_apconn_to_circ(): Looks like completed circuit to [scrubbed] does allow optimistic data for connection to [scrubbed]
Feb 28 09:59:44.000 [info] connection_ap_handshake_send_begin(): Sending relay cell 1 on circ 3620016270 to begin stream 11043.
Feb 28 09:59:44.000 [err] tor_assertion_failed_(): Bug: src/or/connection_edge.c:2443: connection_ap_handshake_send_begin: Assertion linked_dir_conn_base failed; aborting. (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug: Assertion linked_dir_conn_base failed in connection_ap_handshake_send_begin at src/or/connection_edge.c:2443. Stack trace: (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(log_backtrace+0x4c) [0x54c0ea48] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(tor_assertion_failed_+0x90) [0x54c2a0f0] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(connection_ap_handshake_send_begin+0x574) [0x54bbce5c] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(connection_ap_handshake_attach_chosen_circuit+0xf4) [0x54b94438] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(connection_ap_handshake_attach_circuit+0x2ec) [0x54b96ed4] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(connection_ap_attach_pending+0x198) [0x54bbc02c] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(circuit_try_attaching_streams+0x2c) [0x54b93c04] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(circuit_send_next_onion_skin+0x2c8) [0x54b7fe70] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(+0x5c470) [0x54b19470] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(circuit_receive_relay_cell+0x2c8) [0x54b1b8e8] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(command_process_cell+0x324) [0x54b984a8] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(channel_tls_handle_cell+0x238) [0x54b774d8] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(+0x104ad0) [0x54bc1ad0] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(+0xfb40c) [0x54bb840c] (on Tor 0.2.9.9 56788a2489127072)
Feb 28 09:59:44.000 [err] Bug:     tor(+0x3293c) [0x54aef93c] (on Tor 0.2.9.9 56788a2489127072)

OnionBalance logs and other log/config files are available on request. The Tor config file is as follows:

DataDirectory /home/pi/eotk/onionbalance.d
ControlPort unix:/home/pi/eotk/onionbalance.d/tor-control.sock
PidFile /home/pi/eotk/onionbalance.d/tor.pid
Log info file /home/pi/eotk/onionbalance.d/tor.log
SafeLogging 1
HeartbeatPeriod 60 minutes
RunAsDaemon 1
# onionbalance
SocksPort 127.0.0.1:9055
CookieAuthentication 1
MaxClientCircuitsPending 1024
- alec

Attachments (1)

tor-crash-logs.tar.bz2 (258.8 KB) - added by alecmuffett 5 months ago.
Tor logfiles, OnionBalance logfiles, Configs for both.

Change History (12)

comment:1 Changed 5 months ago by alecmuffett

It turns out that this is repeatable; logfiles to come.

Changed 5 months ago by alecmuffett

Tor logfiles, OnionBalance logfiles, Configs for both.

comment:2 Changed 5 months ago by asn

  • Component changed from Core Tor to Core Tor/Tor
  • Milestone set to Tor: 0.3.0.x-final

Triaging this in 0.3.0 given that it's an assert bug. We can consider moving to 0.3.1 if we feel like it.

comment:3 Changed 5 months ago by alecmuffett

  • Component changed from Core Tor/Tor to Core Tor
  • Milestone Tor: 0.3.0.x-final deleted

@asn: If it continues, this fatally blocks my ability to use and deploy OnionBalance into corporate environments. That would be somewhat deleterious.

For reference, in case it's relevant, the worker onions for OnionBalance all have configs that look something like this:

# -*- conf -*-
# eotk (c) 2017 Alec Muffett

# template note: here we use TOR_DIR not PROJECT_DIR because of the
# relocation of Tor directories under `softmap`
DataDirectory /home/pi/eotk/projects.d/wiki.d/hs-1.d
ControlPort unix:/home/pi/eotk/projects.d/wiki.d/hs-1.d/tor-control.sock
PidFile /home/pi/eotk/projects.d/wiki.d/hs-1.d/tor.pid
Log notice file /home/pi/eotk/projects.d/wiki.d/hs-1.d/tor.log
SafeLogging 1
HeartbeatPeriod 60 minutes
LongLivedPorts 80,443
RunAsDaemon 1

# use single onions
SocksPort 0 # have to disable this for single onions
HiddenServiceSingleHopMode 1 # yep, i want single onions
HiddenServiceNonAnonymousMode 1 # yes, really, honest, i swear

# softmap
HiddenServiceDir /home/pi/eotk/projects.d/wiki.d/hs-1.d
HiddenServicePort 80 unix:/home/pi/eotk/projects.d/wiki.d/hs-1.d/port-80.sock
HiddenServicePort 443 unix:/home/pi/eotk/projects.d/wiki.d/hs-1.d/port-443.sock
HiddenServiceNumIntroductionPoints 3

comment:4 Changed 5 months ago by teor

  • Component changed from Core Tor to Core Tor/Tor
  • Keywords crash 029-backport added
  • Milestone set to Tor: 0.3.0.x-final

(Fixing the accidental reverts caused by the last update.)

Marking for possible backport to 0.2.9: this crash bug causes denial-of-service, so it may qualify as a security issue.

A simple fix for 0.2.9 is to BUG() and return -1 rather than crashing when either of the relevant pair of assertions fails.

Earlier in the function, we should probably BUG() and return -1 if the stream is marked for close. (Or is in an inconsistent state: there should be no way that a linked connection has a NULL linked_conn.)
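
A minimal sketch of the approach proposed in this comment, not the actual patch. Everything here is a simplified assumption for illustration: `sketch_conn_t`, `SKETCH_BUG`, `send_begin_asserting` and `send_begin_checked` are hypothetical names, and Tor's real BUG() macro and connection types are more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a BUG()-style check: report the problem and
 * evaluate to 1 so the caller can bail out instead of aborting. */
static int sketch_report_bug(const char *expr, const char *file, int line)
{
  fprintf(stderr, "sketch: unexpected condition (%s) at %s:%d\n",
          expr, file, line);
  return 1;
}
#define SKETCH_BUG(cond) \
  ((cond) ? sketch_report_bug(#cond, __FILE__, __LINE__) : 0)

/* Simplified, hypothetical connection type; not Tor's connection_t. */
typedef struct sketch_conn {
  struct sketch_conn *linked_conn; /* NULL once the peer has been unlinked */
  int marked_for_close;            /* nonzero if the stream is going away */
} sketch_conn_t;

/* Old behaviour: a missing linked connection aborts the whole process. */
int send_begin_asserting(const sketch_conn_t *ap_conn)
{
  assert(ap_conn->linked_conn);
  return 0;
}

/* Proposed behaviour: log the inconsistency and fail only this stream. */
int send_begin_checked(const sketch_conn_t *ap_conn)
{
  if (SKETCH_BUG(ap_conn->marked_for_close))
    return -1;
  if (SKETCH_BUG(ap_conn->linked_conn == NULL))
    return -1;
  return 0; /* safe to go on and send the BEGIN cell */
}
```

With this shape, a stream whose linked connection has disappeared costs one failed stream and a log line, while the process keeps running.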

comment:5 Changed 5 months ago by teor

  • Owner set to teor
  • Status changed from new to assigned

comment:6 Changed 5 months ago by teor

  • Actual Points set to 0.5
  • Points set to 0.2
  • Status changed from assigned to needs_review
  • Version changed from Tor: 0.2.9.9 to Tor: 0.2.9.3-alpha

This timing-dependent crash occurs when a directory connection is unlinked and then tries to send a BEGIN cell shortly afterwards. It could happen on any tor node that uses BEGINDIR, which includes clients, bridges, and relays (when using BEGINDIR to an ORPort-only mirror, introduced in #20711 in tor-0.3.0.4-rc). Authorities are unaffected: they never use BEGINDIR.

It particularly affects hidden services, hidden service clients, bridges, bridge clients, and anything with AllDirActionsPrivate set, because these roles/options use BEGINDIR more, over longer connections.

Since it is network-timing dependent, I can't rule out it being triggered remotely.

Please see my branch bug21576_029_v2, which stops opening invalid connections, rather than crashing.
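
The event order described in this comment can be modelled with a toy pair of linked connections. The log in the description shows the same order: connection_free_() frees the linked Directory connection, the circuit finishes building, and connection_ap_handshake_send_begin() then runs. This is a sketch under stated assumptions, not code from the branch: `model_conn_t` and the `model_*` functions are hypothetical, and the assumption that unlinking clears the pointer on both ends is part of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of one end of a linked connection pair. */
typedef struct model_conn {
  struct model_conn *linked_conn;
} model_conn_t;

/* Link a stream (AP) connection with its directory connection. */
void model_link(model_conn_t *ap, model_conn_t *dir)
{
  assert(ap != NULL && dir != NULL); /* caller error, not a runtime race */
  ap->linked_conn = dir;
  dir->linked_conn = ap;
}

/* Tear down the directory side. Assumption of this model: both ends
 * forget each other, so the stream is left with a NULL linked_conn. */
void model_unlink(model_conn_t *dir)
{
  if (dir->linked_conn)
    dir->linked_conn->linked_conn = NULL;
  dir->linked_conn = NULL;
}

/* Runs when a circuit finishes building and pending streams are attached.
 * Returns 0 if a BEGIN cell may be sent, -1 if the stream is stale. */
int model_attach_pending(const model_conn_t *ap)
{
  if (ap->linked_conn == NULL)
    return -1; /* directory side went away while the circuit was building */
  return 0;
}
```

If the unlink happens after the attach, nothing goes wrong, which is why the crash depends on timing.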

comment:7 follow-up: Changed 5 months ago by asn

  • Status changed from needs_review to merge_ready

I'm a big newbie with this part of the codebase (particularly linked connections and how they work), but the fix looks reasonable given the description above, and it doesn't seem like it can make matters worse.

I guess a side-question is whether that's the best place in the codebase to bail if the linked connection does not exist. Maybe we shouldn't even start sending the BEGIN cell if the linked conn has been torn down?

Anyway, I think Nick will be better placed to evaluate this patch, so I'm marking this as merge_ready.

comment:8 in reply to: ↑ 7 Changed 5 months ago by teor

Replying to asn:

> I'm a big newbie with this part of the codebase (particularly linked connections and how they work), but the fix looks reasonable given the description above, and it doesn't seem like it can make matters worse.
>
> I guess a side-question is whether that's the best place in the codebase to bail if the linked connection does not exist. Maybe we shouldn't even start sending the BEGIN cell if the linked conn has been torn down?

We could do that too, but that's a separate ticket.

We must always do the check right before we do the circuit length anonymity check, because we can't work out if a directory circuit needs to be anonymous or not without its linked directory connection.

> Anyway, I think Nick will be better placed to evaluate this patch, so I'm marking this as merge_ready.
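
The ordering constraint in teor's reply can be sketched as follows. This is an illustration under assumptions, not the code in the branch: the types, the `needs_anonymity` flag, the hop-count parameter and `sketch_check_before_begin` are all hypothetical. The point it shows is that the anonymity decision reads a field of the linked directory connection, so the NULL check has to come first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical directory connection: only it knows whether the request
 * has to travel over an anonymous (multi-hop) circuit. */
typedef struct sketch_dir_conn {
  int needs_anonymity;
} sketch_dir_conn_t;

/* Hypothetical stream connection holding a pointer to its linked peer. */
typedef struct sketch_ap_conn {
  const sketch_dir_conn_t *linked_dir_conn; /* NULL once unlinked */
} sketch_ap_conn_t;

/* Returns 0 if a BEGIN may be sent on a circuit with circuit_hops hops,
 * -1 if the stream must be refused. */
int sketch_check_before_begin(const sketch_ap_conn_t *ap, int circuit_hops)
{
  const sketch_dir_conn_t *dir;

  assert(ap != NULL); /* caller error, not a runtime race */
  dir = ap->linked_dir_conn;

  /* Step 1: without the linked directory connection there is nothing to
   * base the anonymity decision on, so refuse the stream here. */
  if (dir == NULL)
    return -1;

  /* Step 2: the circuit length anonymity check, which dereferences dir.
   * Assumption: a one-hop circuit is treated as non-anonymous. */
  if (dir->needs_anonymity && circuit_hops <= 1)
    return -1;

  return 0;
}
```

Swapping the two steps would dereference a NULL pointer in exactly the case this ticket is about.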

comment:9 Changed 5 months ago by nickm

  • Milestone changed from Tor: 0.3.0.x-final to Tor: 0.2.9.x-final

This looks okay to me. I'll take it in 0.3.0 and mark it for possible 0.2.9 backport as a crash bug.

comment:10 Changed 4 weeks ago by teor

nickm: we should probably backport this to 0.2.9 at some point; it seems to work in 0.3.0.

comment:11 Changed 3 weeks ago by nickm

  • Resolution set to fixed
  • Status changed from merge_ready to closed

Backported to 0.2.9. Thanks for the reminder!
