Opened 9 years ago

Closed 9 years ago

Last modified 7 years ago

#2572 closed defect (fixed)

Bridge authority crashes on SIGHUP

Reported by: rransom
Owned by: rransom
Priority: Very High
Milestone: Tor: 0.2.2.x-final
Component: Core Tor/Tor
Version:
Severity:
Keywords: tor-relay
Cc: shamrock
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Tonga has crashed twice, once on January 31, again when it was restarted on February 16:

Feb 16 06:25:01.238 [notice] Tor 0.2.2.22-alpha (git-21b3de6cf37d4e60) opening new log file.
Feb 16 06:25:01.327 [err] Unable to add own descriptor to directory: Skipping router descriptor: not in consensus.
Feb 16 06:25:01.327 [warn] options_act(): Bug: Error initializing keys; exiting
Feb 16 06:25:01.327 [err] set_options(): Bug: Acting on config options left us in a broken state. Dying.

Both crashes were at the same time of day, and with identical log messages. Tonga is now running 0.2.1.29 again as a stopgap measure, but we need to find and fix this problem.

The "Skipping router descriptor: not in consensus." message came from src/or/routerlist.c line 3313, in router_add_to_routerlist. The only changes 'git blame' turned up earlier in that function that are present in 0.2.2.22-alpha and not in 0.2.1.29 were made in 9ca311f6 (Allow using regular relays as bridges).

Change History (10)

comment:1 in reply to:  description Changed 9 years ago by rransom

Replying to rransom:

The "Skipping router descriptor: not in consensus." message came from src/or/routerlist.c line 3313, in router_add_to_routerlist. The only changes 'git blame' turned up earlier in that function that are present in 0.2.2.22-alpha and not in 0.2.1.29 were made in 9ca311f6 (Allow using regular relays as bridges).

I missed two other possibly relevant commits:

comment:2 Changed 9 years ago by rransom

Those three commits seem to be innocent.

comment:3 Changed 9 years ago by rransom

851a980065e6b2df8d could be the culprit. (Found using 'git log -Scurrent_consensus tor-0.2.1.29..tor-0.2.2.22-alpha'.)

If so, Tonga was probably publishing a consensus document listing all bridges (rather than mirroring the public network consensus as a bridge authority should) at the time of the crash.

comment:4 in reply to:  3 Changed 9 years ago by rransom

Replying to rransom:

If so, Tonga was probably publishing a consensus document listing all bridges (rather than mirroring the public network consensus as a bridge authority should) at the time of the crash.

Probably not -- BridgeAuthoritativeDir is checked in src/or/directory.c before responding to directory requests.

comment:5 in reply to:  3 ; Changed 9 years ago by rransom

Owner: set to rransom
Status: new → assigned

Replying to rransom:

851a980065e6b2df8d could be the culprit.

No.

We know that the following control flow led to the "Unable to add own descriptor to directory: Skipping router descriptor: not in consensus." log message, and to the crash:

This bug seems to have been caused by two problems:

  • The fix for #2433 caused Tor to call init_keys more frequently than it was originally intended to be called. This part is why Tonga crashed while running 0.2.2.22-alpha and not while running 0.2.1.29.
  • init_keys insisted on adding Tonga's own descriptor to its routerlist because Tonga was an authority for descriptors with some purpose, but Tonga couldn't force its own descriptor into its routerlist because it was not an authority for descriptors with purpose general. This part is why the other directory authorities never crash in this manner.

Additionally, in 0.2.1.29, init_keys would have been called only during startup, before Tor had loaded or obtained a consensus, so router_add_to_routerlist would not have failed even in a bridge authority not listed in the current network consensus.

The fix for this bug is to replace authdir_mode(options) with authdir_mode_handles_descs(options, ROUTER_PURPOSE_GENERAL) in init_keys (on line 632 of src/or/router.c as of tor-0.2.2.22-alpha and current maint-0.2.2 HEAD).
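
The difference between the two checks can be illustrated with a small, simplified model. This is only a sketch under stated assumptions, not Tor's actual code: every name prefixed with sketch_ (and the purpose_t enum) is hypothetical, the two boolean fields stand in for the real authority options, and the consensus check in router_add_to_routerlist is reduced to a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the descriptor purposes discussed above. */
typedef enum { PURPOSE_GENERAL, PURPOSE_BRIDGE } purpose_t;

/* Hypothetical, reduced view of the configuration. */
typedef struct {
  bool general_authority; /* authority for descriptors with purpose general */
  bool bridge_authority;  /* authority for bridge descriptors (e.g. Tonga) */
} sketch_options_t;

/* Models the old check: true for an authority of any kind. */
static bool sketch_authdir_mode(const sketch_options_t *o) {
  return o->general_authority || o->bridge_authority;
}

/* Models the new check: true only for an authority for this purpose. */
static bool sketch_handles_descs(const sketch_options_t *o, purpose_t p) {
  return p == PURPOSE_GENERAL ? o->general_authority : o->bridge_authority;
}

/* Models adding our own (purpose general) descriptor to the routerlist:
 * it is accepted if it passes the consensus check, or if we are an
 * authority for purpose general and can therefore force it in. */
static bool sketch_add_own_descriptor(const sketch_options_t *o,
                                      bool passes_consensus_check) {
  return passes_consensus_check || sketch_handles_descs(o, PURPOSE_GENERAL);
}

/* Models the relevant step of init_keys. Returns 0 on success and -1 on
 * the path that ends in "Error initializing keys; exiting". */
static int sketch_init_keys(const sketch_options_t *o,
                            bool passes_consensus_check,
                            bool use_fixed_check) {
  assert(o);
  bool must_add = use_fixed_check
                      ? sketch_handles_descs(o, PURPOSE_GENERAL)
                      : sketch_authdir_mode(o);
  if (must_add && !sketch_add_own_descriptor(o, passes_consensus_check))
    return -1;
  return 0;
}
```

In this model, a bridge-only authority whose descriptor fails the consensus check reaches the fatal path with the old check and skips the insertion with the new one. A general-purpose authority succeeds either way, which matches the observation that the other directory authorities never crash in this manner.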

comment:6 Changed 9 years ago by rransom

Status: assigned → needs_review

See bug2572 ( ssh://mob@repo.or.cz/srv/git/tor/rransom.git bug2572 ).

comment:7 in reply to:  5 Changed 9 years ago by rransom

Replying to rransom:

The fix for this bug is to replace authdir_mode(options) with authdir_mode_handles_descs(options, ROUTER_PURPOSE_GENERAL) in init_keys (on line 632 of src/or/router.c as of tor-0.2.2.22-alpha and current maint-0.2.2 HEAD).

ROUTER_PURPOSE_GENERAL may need to be changed if a directory authority can also be a bridge relay. (But we should not allow directory authorities to be bridge relays.)

comment:8 Changed 9 years ago by nickm

Resolution: fixed
Status: needs_review → closed

merged! thanks!

comment:9 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:10 Changed 7 years ago by nickm

Component: Tor Relay → Tor