Opened 3 weeks ago

Closed 13 days ago

#27003 closed defect (fixed)

Regression: 'SETCONF ORPort' can kill tor process

Reported by: atagar Owned by: nickm
Priority: High Milestone: Tor: 0.3.5.x-final
Component: Core Tor/Tor Version: Tor: unspecified
Severity: Normal Keywords: regression, 035-must 034-backport
Cc: Actual Points:
Parent ID: Points:
Reviewer: asn Sponsor:

Description

Hi network team. Recently-ish Stem's integ tests began failing when ran with network tests ('run_tests.py --integ --target ONLINE'). Here's the repro...

  1. My torrc...
DataDirectory /home/atagar/.tor
ORPort 9052
ControlPort 9051
ExitRelay 0
  1. Start tor.
  1. Connect to the control port and toggle the ORPort off and on...
% telnet localhost 9051

AUTHENTICATE
250 OK

RESETCONF ORPort
250 OK

SETCONF ORPort=9090
Connection closed by foreign host.

The tor dies with...

Jul 31 14:49:21.000 [notice] Opening OR listener on 0.0.0.0:9090
Jul 31 14:49:21.000 [warn] Can't get my x509 link cert.
Jul 31 14:49:21.000 [err] Unable to update Ed25519 keys!  Exiting.
Jul 31 14:49:21.000 [notice] Your Tor server's identity key fingerprint is 'Unnamed 4853AB6F9215A837EA3562CF4AF00713737FDF01'
Jul 31 14:49:21.000 [notice] Now checking whether ORPort 71.231.87.208:9090 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Jul 31 14:49:21.000 [notice] Configured to measure directory request statistics, but no GeoIP database found. Please specify a GeoIP database using the GeoIPFile option.

This was with the latest tor git commit (7e4ac02).

Child Tickets

Change History (15)

comment:1 Changed 3 weeks ago by teor

Keywords: regression 035-must added
Milestone: Tor: 0.3.5.x-final
Version: Tor: unspecified

comment:2 Changed 3 weeks ago by arma

Severity: MinorNormal

comment:3 Changed 3 weeks ago by nickm

Keywords: 034-backport added

confirmed; and I think 034 has this problem too. 033 does not.

comment:4 Changed 3 weeks ago by nickm

Owner: set to nickm
Status: newaccepted

comment:5 Changed 3 weeks ago by nickm

git bisect reports that 61d5ce83a1ecc459798b530730316bfebe3bb192 is the first bad commit here, but that confuses me because it's just a merge.

eta: I'll check whether I bisected wrong.

Last edited 3 weeks ago by nickm (previous) (diff)

comment:6 Changed 3 weeks ago by nickm

It looks like this might be caused by our #25761 fix somehow. If I revert 7e8c5e36622d4581840bb0a037a66d9504c433a5 and 5dc00c06615ef1e99e1d82dfdf41823c0d4bf7f6, this bug goes away. Now to find out why.

comment:7 Changed 3 weeks ago by nickm

Here is the bottom of the call-stack when we hit the error:

#0  check_ed_keys_callback (now=1533216819, options=0x555555a3ce30)
    at src/or/main.c:1840
#1  0x00005555555bf081 in periodic_event_dispatch (ev=0x555555a78060, 
    data=0x555555a06380 <periodic_events+576>) at src/or/periodic.c:55
#2  0x00005555555a592d in rescan_periodic_events (options=<optimized out>)
    at src/or/main.c:1605
#3  0x00005555556b9055 in hs_service_map_has_changed ()
    at src/or/hs_service.c:2978
#4  0x00005555555d6dca in rend_service_prune_list_impl_ ()
    at src/or/rendservice.c:622
#5  rend_service_prune_list () at src/or/rendservice.c:637
#6  0x00005555556aff50 in stage_services (service_list=0x5555579032d0)
    at src/or/hs_config.c:51
#7  hs_config_service_all (options=options@entry=0x555555a3ce30, 
    validate_only=validate_only@entry=0) at src/or/hs_config.c:565
#8  0x0000555555649852 in options_act (old_options=0x555558251ab0)
    at src/or/config.c:1853

comment:8 Changed 3 weeks ago by nickm

Okay, I think the problem here is actually in the implementation of rescan_periodic_events -- it invokes the events that it enables, which shouldn't really happen from this low in the call stack.

comment:9 Changed 3 weeks ago by nickm

Status: acceptedneeds_review

Fix in branch bug27003_034_v2; PR at https://github.com/torproject/tor/pull/261

comment:10 Changed 2 weeks ago by ahf

Status: needs_reviewmerge_ready

Looks good. Have we stopped having the "Roger newline" at the end of files?

comment:11 Changed 2 weeks ago by nickm

we're phasing out the extra newline-- I'm not convinced Roger ever actually wanted it, and everybody seems to have trouble with it.

comment:12 Changed 2 weeks ago by nickm

For the record, I'd like an extra look here from somebody who can tell me whether I'm going to mess up the HS code with this change.

comment:13 Changed 2 weeks ago by teor

Reviewer: asn
Status: merge_readyneeds_review

Since asn wrote #25761 and dgoulet is on holidays, your best reviewer is asn.

(For what it's worth, I checked this fix and #25761 and they look fine.)

comment:14 Changed 13 days ago by asn

Status: needs_reviewmerge_ready

Hmm, I can't seem to be able to repro this bug neither with bug27003_034_v2^ or with 7e4ac02. See logs in the end of this post.

In any case, I manually analyzed this patch with regards to the #25939 changes to make sure that it doesn't impact HSes. I think there is no problem here for the following reason:

  • If after a HUP we end up with HSes enabled, this change should not cause issues since the periodic HS event will just trigger immediately after in the event loop and do all the useful HS stuff.
  • If after a HUP we end up with no HSes enabled, this patch does not impact behavior since it only mods periodic_event_enable() and not period_event_disable().

Furthermore, chutney tests run well for me, so this seems like a good thing to merge_ready.


Here are logs of a successful run with an unpatched bug27003_034_v2^:

$ ./src/or/tor -f ../confs/bug27003
Aug 07 11:48:01.860 [notice] Tor 0.3.4.4-rc-dev (git-861d690018c213ad) running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.0h, Zlib 1.2.11, Liblzma N/A, and Libzstd N/A.
Aug 07 11:48:01.860 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Aug 07 11:48:01.860 [notice] Read configuration file "/home/f/Computers/tor/mytor/../confs/bug27003".
Aug 07 11:48:01.863 [notice] Your ContactInfo config option is not set. Please consider setting it, so we can contact you if your server is misconfigured or something else goes wrong.
Aug 07 11:48:01.863 [notice] Based on detected system memory, MaxMemInQueues is set to 5754 MB. You can override this by setting MaxMemInQueues by hand.
Aug 07 11:48:01.863 [warn] ControlPort is open, but no authentication method has been configured.  This means that any program on your computer can reconfigure your Tor.  That's bad!  You should upgrade your Tor controller as soon as possible.
Aug 07 11:48:01.863 [notice] Scheduler type KIST has been enabled.
Aug 07 11:48:01.863 [notice] Opening Control listener on 127.0.0.1:9059
Aug 07 11:48:01.863 [notice] Opening OR listener on 0.0.0.0:0
Aug 07 11:48:01.863 [notice] OR listener listening on port 34639.
Aug 07 11:48:01.000 [notice] Configured to measure directory request statistics, but no GeoIP database found. Please specify a GeoIP database using the GeoIPFile option.
Aug 07 11:48:01.000 [notice] Your Tor server's identity key fingerprint is 'Unnamed F56258E818AEFB2BF55F5DDF8DD7D63DF0C0F63F'
Aug 07 11:48:01.000 [notice] Bootstrapped 0%: Starting
Aug 07 11:48:02.000 [warn] Please upgrade! This version of Tor (0.3.4.4-rc-dev) is not recommended, according to the directory authorities. Recommended versions are: 0.2.9.14,0.2.9.15,0.2.9.16,0.3.2.10,0.3.2.11,0.3.3.2-alpha,0.3.3.3-alpha,0.3.3.4-alpha,0.3.3.5-rc,0.3.3.6,0.3.3.7,0.3.3.8,0.3.3.9,0.3.4.1-alpha,0.3.4.2-alpha,0.3.4.3-alpha,0.3.4.4-rc,0.3.4.5-rc,0.3.4.6-rc
Aug 07 11:48:04.000 [notice] Starting with guard context "default"
Aug 07 11:48:04.000 [notice] Bootstrapped 80%: Connecting to the Tor network
Aug 07 11:48:04.000 [notice] Bootstrapped 85%: Finishing handshake with first hop
Aug 07 11:48:04.000 [notice] Guessed our IP address as XXX (source: 204.13.164.118).
Aug 07 11:48:05.000 [notice] Bootstrapped 90%: Establishing a Tor circuit
Aug 07 11:48:06.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Aug 07 11:48:06.000 [notice] Bootstrapped 100%: Done
Aug 07 11:48:06.000 [notice] Now checking whether ORPort XXX:34639 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Aug 07 11:48:07.000 [notice] New control connection opened from 127.0.0.1.
Aug 07 11:48:11.000 [warn] ControlPort is open, but no authentication method has been configured.  This means that any program on your computer can reconfigure your Tor.  That's bad!  You should upgrade your Tor controller as soon as possible.
Aug 07 11:48:11.000 [notice] Closing no-longer-configured OR listener on 0.0.0.0:34639
Aug 07 11:48:11.000 [notice] Closing old OR listener on 0.0.0.0:34639
Aug 07 11:48:14.000 [notice] Your ContactInfo config option is not set. Please consider setting it, so we can contact you if your server is misconfigured or something else goes wrong.
Aug 07 11:48:14.000 [warn] ControlPort is open, but no authentication method has been configured.  This means that any program on your computer can reconfigure your Tor.  That's bad!  You should upgrade your Tor controller as soon as possible.
Aug 07 11:48:14.000 [notice] Opening OR listener on 0.0.0.0:9999
Aug 07 11:48:14.000 [notice] Heartbeat: It seems like we are not in the cached consensus.
Aug 07 11:48:14.000 [notice] Heartbeat: Tor's uptime is 0:00 hours, with 6 circuits open. I've sent 15 kB and received 16 kB.
Aug 07 11:48:14.000 [notice] Circuit handshake stats since last time: 0/0 TAP, 0/0 NTor.
Aug 07 11:48:14.000 [notice] Since startup, we have initiated 0 v1 connections, 0 v2 connections, 0 v3 connections, and 1 v4 connections; and received 0 v1 connections, 0 v2 connections, 0 v3 connections, and 0 v4 connections.
Aug 07 11:48:14.000 [notice] DoS mitigation since startup: 0 circuits killed with too many cells. [cc not enabled] [conn not enabled] 0 single hop clients refused.
Aug 07 11:48:14.000 [notice] Your Tor server's identity key fingerprint is 'Unnamed F56258E818AEFB2BF55F5DDF8DD7D63DF0C0F63F'
Aug 07 11:48:14.000 [notice] Now checking whether ORPort XXX:9999 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Aug 07 11:48:14.000 [notice] Configured to measure directory request statistics, but no GeoIP database found. Please specify a GeoIP database using the GeoIPFile option.
^CAug 07 11:48:17.000 [notice] Interrupt: we have stopped accepting new connections, and will shut down in 30 seconds. Interrupt again to exit now.
$ ncat localhost 9059
AUTHENTICATE
250 OK
RESETCONF ORPort
250 OK
SETCONF ORPort=9999
250 OK

comment:15 Changed 13 days ago by nickm

Resolution: fixed
Status: merge_readyclosed

okay, merged to 0.3.4 and forward.

Note: See TracTickets for help on using tickets.