Opened 5 months ago

Last modified 2 months ago

#33582 new defect

Make bridges wait until they have bootstrapped, before publishing their descriptor

Reported by: teor
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-bridge, tor-relay, prop311, outreachy-ipv6, easy, 044-deferred
Cc:
Actual Points:
Parent ID: #33050
Points: 1
Reviewer:
Sponsor: Sponsor55-can

Description (last modified by teor)

Instead of this fix, we can make chutney check tor's logs for reachability self-test successes (#34037), or implement strict self-tests (#33222).

On bridges, there's a race condition when bridges try to publish their descriptor to the bridge authority:

  • bridges try to publish their descriptors before bootstrapping
  • but bridges can't publish their descriptors, because they don't have enough directory info to build a circuit to the bridge authority

Bridges will eventually try to publish their descriptors again, once those descriptors become dirty.

We should make bridges wait until they have bootstrapped, before they try to publish their descriptors. (This might be a good change for relays as well: there isn't much point in publishing a relay that can't bootstrap.)
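
As a rough illustration of the proposed ordering, here is a minimal Python sketch of a publish gate. It is a model only, not tor's C code: `BridgeState`, `should_publish`, and `on_tick` are hypothetical names, and the state is reduced to a bootstrap percentage and a dirty flag.

```python
from dataclasses import dataclass


@dataclass
class BridgeState:
    """Hypothetical, simplified bridge state (not tor's real data structures)."""
    bootstrap_percent: int = 0
    descriptor_dirty: bool = True


def should_publish(state: BridgeState) -> bool:
    """Publish only when the descriptor is dirty and bootstrap has finished."""
    return state.descriptor_dirty and state.bootstrap_percent >= 100


def on_tick(state: BridgeState, upload) -> bool:
    """Periodic check: upload once bootstrapped, otherwise leave the descriptor dirty."""
    if not should_publish(state):
        return False
    upload()
    state.descriptor_dirty = False
    return True
```

In this model, a descriptor that is dirty before bootstrap stays dirty, so the first upload attempt happens on the first tick after bootstrap reaches 100%, rather than failing early and waiting for the next dirty event.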

This issue happens regardless of AssumeReachable. It is most obvious in chutney networks.

This ticket isn't essential. But the workarounds seem to cause weird race conditions, which are time-consuming to diagnose and fix.

Child Tickets

Ticket | Status | Owner | Summary | Component
#33408 | new | | Make tor versions sortable, by adding the commit number to EXTRA_INFO | Core Tor/Tor
#33581 | new | | Restore bridge networkstatus checks in chutney | Core Tor/Chutney

Change History (10)

comment:1 Changed 5 months ago by arma

Yes please, this sounds like a fine plan.

To clarify, right now both bridges and relays *do* wait until they have reached 100% bootstrapped, if AssumeReachable is default (0)? But this bug appears when AssumeReachable is set to 1, and then it tries to publish without regard for whether it has bootstrapped to 100%?

comment:2 in reply to: 1; Changed 5 months ago by teor

Replying to arma:

> To clarify, right now both bridges and relays *do* wait until they have reached 100% bootstrapped, if AssumeReachable is default (0)? But this bug appears when AssumeReachable is set to 1, and then it tries to publish without regard for whether it has bootstrapped to 100%?

That's not quite right.

When AssumeReachable is 0, relays and bridges wait until they have done reachability self-checks. These self-checks can't happen until the relay/bridge is 100% bootstrapped.

AssumeReachable 1 breaks the dependency between descriptor publication and reachability self-checks. (Which is fine for relays, but the transitive 100% bootstrap requirement is actually needed for bridges.)

In detail:

If AssumeReachable is 0, then relays and bridges:

  1. Bootstrap, then they have enough descriptors to build circuits to
  2. Perform reachability self-checks, then they decide to
  3. Publish their descriptor.

If AssumeReachable is 1, then relays:

  1. Immediately publish their descriptor to all directory authority DirPorts.

If AssumeReachable is 1, then bridges:

  1. Immediately try to publish their descriptor to the bridge authority ORPort, but fail, because they don't have enough descriptors to build circuits.
  2. Bootstrap, then they have enough descriptors to build circuits.
  3. Eventually, their descriptor becomes dirty again, and they decide to publish it to the bridge authority ORPort, and succeed.
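
The three cases above can be written down as a small Python model. This is only a restatement of the lists in this comment, not tor code: `startup_steps` and the step labels are made up for illustration.

```python
from typing import List


def startup_steps(is_bridge: bool, assume_reachable: bool) -> List[str]:
    """Return the order of startup events described in this comment (a model only)."""
    if not assume_reachable:
        # Relays and bridges behave the same: each step enables the next.
        return ["bootstrap", "self-check", "publish: ok"]
    if not is_bridge:
        # Relays publish straight to the directory authority DirPorts.
        return ["publish: ok"]
    # Bridges publish to the bridge authority ORPort, which needs a circuit.
    return [
        "publish: failed (no circuit yet)",
        "bootstrap",
        "publish: ok (after descriptor becomes dirty)",
    ]
```

For example, `startup_steps(is_bridge=True, assume_reachable=True)` starts with the failed publish, which is the race this ticket is about.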

Last edited 5 months ago by teor

comment:3 Changed 5 months ago by teor

#33583 doesn't fix this issue, because bridges think they are reachable as soon as the bridge client connects to them.

For details, see:
https://gitweb.torproject.org/torspec.git/tree/proposals/311-relay-ipv6-reachability.txt#n498

comment:4 Changed 5 months ago by teor

Parent ID: changed from #33232 to #33050

comment:5 Changed 5 months ago by teor

Keywords: easy added

comment:6 in reply to: 2; Changed 3 months ago by teor

Description: modified

Replying to teor:

> Replying to arma:
>
> > To clarify, right now both bridges and relays *do* wait until they have reached 100% bootstrapped, if AssumeReachable is default (0)? But this bug appears when AssumeReachable is set to 1, and then it tries to publish without regard for whether it has bootstrapped to 100%?
>
> That's not quite right.
>
> When AssumeReachable is 0, relays and bridges wait until they have done reachability self-checks. These self-checks can't happen until the relay/bridge is 100% bootstrapped.

It turns out that this isn't true, either.

Tor's current relay and bridge reachability self-test only checks for the first inbound create cell. This cell can be sent by a bridge client, even before the bridge has bootstrapped.

We might end up fixing this bug in #33222, if we check for returned created cells *and* their corresponding inbound create cells, rather than just inbound create cells.
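
To make the difference concrete, here is a hedged Python sketch comparing the two checks. It is a toy model: cells are `(kind, tag)` pairs, and the `tag` used to pair a created cell with its create cell is a hypothetical stand-in, since this ticket doesn't say how tor would match them.

```python
from typing import Iterable, Tuple

Cell = Tuple[str, str]  # (cell kind, hypothetical tag used to pair cells)


def current_test(cells: Iterable[Cell]) -> bool:
    """Model of the current check: any inbound create cell counts as reachable."""
    return any(kind == "create" for kind, _ in cells)


def strict_test(cells: Iterable[Cell]) -> bool:
    """Model of the stricter check: a created cell must pair with an inbound create."""
    cells = list(cells)
    creates = {tag for kind, tag in cells if kind == "create"}
    createds = {tag for kind, tag in cells if kind == "created"}
    return bool(creates & createds)
```

With a single create cell from a bridge client, `current_test` passes but `strict_test` does not, which matches the early "reachable" result described above.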

Last edited 3 months ago by teor

comment:7 Changed 3 months ago by teor

Description: modified

Instead of this fix, we can make chutney check tor's logs for reachability self-test successes. See #34037.
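
A minimal sketch of that kind of log check, in Python. This is not chutney's actual code: `self_test_succeeded` is a hypothetical helper, and the default pattern is an assumption about the wording of tor's self-test notice, which should be checked against the tor version under test.

```python
import re
from typing import Iterable

# Assumed message text; verify the wording for the tor version being tested.
DEFAULT_PATTERN = r"Self-testing indicates your ORPort .*reachable"


def self_test_succeeded(log_lines: Iterable[str], pattern: str = DEFAULT_PATTERN) -> bool:
    """Return True if any log line matches the reachability success pattern."""
    regex = re.compile(pattern)
    return any(regex.search(line) for line in log_lines)
```

A caller would pass in the lines of one node's log and wait, or fail the test, until this returns True for every relay and bridge.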

comment:8 Changed 3 months ago by teor

Description: modified

comment:9 Changed 3 months ago by teor

Description: modified

comment:10 Changed 2 months ago by nickm

Keywords: 044-deferred added
Milestone: changed from Tor: 0.4.4.x-final to Tor: unspecified

Bulk-remove tickets from 0.4.4. Add the 044-deferred label to them.
