Opened 2 years ago

Closed 18 months ago

Last modified 18 months ago

#23693 closed defect (fixed)

0.3.1.7: Assertion threadpool failed in cpuworker_queue_work

Reported by: alif Owned by: nickm
Priority: Medium Milestone: Tor: 0.3.4.x-final
Component: Core Tor/Tor Version: Tor: 0.3.1.9
Severity: Normal Keywords: prop286, 034-triage-20180328, 034-must crash 033-backport 032-backport 031-backport
Cc: arma Actual Points:
Parent ID: Points:
Reviewer: asn Sponsor:

Description (last modified by arma)

On Ubuntu 14.04 I installed Tor version 0.3.1.7 (git-5fa14939bca67c23)

Upon starting tor as a service, it soon crashes. The following are the log entries:

Sep 29 02:26:03.000 [notice] Tor 0.3.1.7 (git-5fa14939bca67c23) opening log file.
Sep 29 02:26:03.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
Sep 29 02:26:03.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
Sep 29 02:26:03.000 [warn] Could not open "/usr/share/doc/tor/tor-exit-notice.html": Permission denied
Sep 29 02:26:03.000 [warn] DirPortFrontPage file '/usr/share/doc/tor/tor-exit-notice.html' not found. Continuing anyway.
Sep 29 02:26:03.000 [notice] Bootstrapped 0%: Starting
Sep 29 02:26:04.000 [notice] Starting with guard context "default"
Sep 29 02:26:04.000 [notice] Opening Socks listener on /var/run/tor/socks
Sep 29 02:26:04.000 [notice] Opening Control listener on /var/run/tor/control
Sep 29 02:26:04.000 [notice] Bootstrapped 5%: Connecting to directory server
Sep 29 02:26:04.000 [notice] Bootstrapped 10%: Finishing handshake with directory server
Sep 29 02:26:04.000 [notice] Bootstrapped 15%: Establishing an encrypted directory connection
Sep 29 02:26:05.000 [notice] Bootstrapped 20%: Asking for networkstatus consensus
Sep 29 02:26:05.000 [notice] Bootstrapped 25%: Loading networkstatus consensus
Sep 29 02:26:08.000 [err] tor_assertion_failed_(): Bug: ../src/or/cpuworker.c:499: cpuworker_queue_work: Assertion threadpool failed; aborting. (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug: Assertion threadpool failed in cpuworker_queue_work at ../src/or/cpuworker.c:499. Stack trace: (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x5624134a32b2] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(tor_assertion_failed_+0x94) [0x5624134bb904] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(cpuworker_queue_work+0x65) [0x56241345f395] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(consdiffmgr_add_consensus+0x2f3) [0x562413450fe3] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(networkstatus_set_current_consensus+0x9f1) [0x562413395971] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(connection_dir_reached_eof+0xc09) [0x5624134678d9] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(+0x105e6b) [0x562413440e6b] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(+0x4e921) [0x562413389921] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x754) [0x7eff0e3a9f24] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(do_main_loop+0x24d) [0x56241338aa4d] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(tor_main+0x1c35) [0x56241338e215] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(main+0x19) [0x5624133863c9] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7eff0d556f45] (on Tor 0.3.1.7 )
Sep 29 02:26:08.000 [err] Bug:     /usr/bin/tor(+0x4b41b) [0x56241338641b] (on Tor 0.3.1.7 )

Change History (42)

comment:1 Changed 2 years ago by arma

Component: - Select a component → Core Tor/Tor
Milestone: Tor: 0.3.2.x-final

comment:2 Changed 2 years ago by arma

Description: modified

comment:3 Changed 2 years ago by arma

Is this repeatable?

comment:4 Changed 2 years ago by arma

Can you paste your torrc file? It looks like you modified it from the original.

Also, is this the Tor deb? Or did you install Tor in some other way?

comment:5 Changed 2 years ago by arma

Summary: 0.3.1.7 daemon fails → 0.3.1.7: Assertion threadpool failed in cpuworker_queue_work

comment:6 Changed 2 years ago by nickm

alif, if you could answer any of the questions above, that would help us diagnose and fix this bug. I have some guesses below, but they're just guesses.

Some ideas, based on looking at the code: There are two ways I think this could happen: if we reach cpuworker_queue_work() without having called cpu_init(), or if we somehow fail to create a threadpool in cpu_init() when we do call it. But I don't think it can be the second case, since that would have created a nonfatal assertion from threadpool_new().

We call cpu_init() in two cases: when our settings change, the transition affects workers, and we have become a server; or when we start as a server in main.c.

I think that the check in the first cpu_init() case might be wrong: if we start as a client, and then transition to a bridge (not a public server), I don't think we will trigger options_transition_affects_workers().
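
The suspected gap can be sketched as a toy model. This is not Tor's code: the ToyTor class, its public_server flag, and the predicate standing in for options_transition_affects_workers() are all assumptions made up for illustration. The model only shows how a client that becomes a bridge could end up with no threadpool if the transition check looks at public-server status alone.

```python
class ToyTor:
    """Toy model of the two cpu_init() call sites described above (not Tor's code)."""

    def __init__(self, server=False, public_server=False):
        self.server = server
        self.public_server = public_server
        self.threadpool = None
        if self.server:
            # Case 2: we start as a server.
            self.cpu_init()

    def cpu_init(self):
        if self.threadpool is None:
            self.threadpool = object()  # stand-in for a real thread pool

    def transition_affects_workers(self, new_public_server):
        # ASSUMPTION: the check only notices changes in *public* server status.
        return new_public_server != self.public_server

    def set_options(self, server, public_server):
        affects = self.transition_affects_workers(public_server)
        self.server, self.public_server = server, public_server
        if affects and self.server:
            # Case 1: settings changed, the change affects workers,
            # and we are now a server.
            self.cpu_init()

    def queue_work(self):
        # Mirrors the failing assertion: queuing work needs a threadpool.
        if self.threadpool is None:
            raise RuntimeError("Assertion threadpool failed")
```

In this model, `ToyTor()` followed by `set_options(server=True, public_server=False)` (client to bridge) leaves `threadpool` unset, so `queue_work()` raises, while a client that becomes a public relay gets a threadpool as expected.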

comment:7 Changed 2 years ago by nickm

Owner: set to nickm
Status: new → accepted

comment:8 Changed 2 years ago by nickm

Status: accepted → needs_review

Possible fix in branch bug23693_029 in my public repository, assuming I have the diagnosis right.

comment:9 Changed 2 years ago by nickm

Keywords: 029-backport 030-backport 031-backport added

comment:10 Changed 2 years ago by alif

Well, I'm no longer able to reproduce this, nickm! Sorry.
It persisted for a couple of days after I updated Tor to 0.3.1.7 using a deb from the project's repository, until I had to reboot for a different reason.

Now I'm back to "[notice] While fetching directory info, no running dirservers known. Will try again later. (purpose 6)" which is preventing me from making a circuit via obfs3, even though I'm able to do so in the Tor-browser via obfs4. But that's a different issue.

Anyway, my torrc at the time of the errors is the following (I had disabled bridges to try to debug and to make the report less complicated). I removed commented lines for clarity and redacted secrets:

<begin torrc>

Log notice file /var/log/tor/notices.log

ControlPort 9051
HashedControlPassword 16:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

PortForwarding 1

Address redacted.example.com

Nickname XXXXXXXX

ContactInfo XXXXXXXXelsewhereXXXXXX

DirPort 9030 # what port to advertise for directory connections
DirPortFrontPage /usr/share/doc/tor/tor-exit-notice.html

ExitPolicy reject *:* # no exits allowed

HiddenServiceStatistics 1

UseBridges 0
UpdateBridgesFromAuthority 1

ClientTransportPlugin obfs2,obfs3,ScrambleSuit exec /usr/bin/obfsproxy managed

#Some bridge definitions go here; obfs3 and plain

<end torrc>

Also here's my /etc/apparmor.d/abstractions/tor since I had modified it to be able to run obfsproxy in ubuntu 14.04:
<begin /etc/apparmor.d/abstractions/tor>

# vim:syntax=apparmor

  #include <abstractions/base>
  #include <abstractions/nameservice>

  network tcp,
  network udp,

  capability chown,
  capability dac_read_search,
  capability fowner,
  capability fsetid,
  capability setgid,
  capability setuid,

  /usr/bin/tor r,
  /usr/sbin/tor r,

  # Needed by obfs4proxy
  /proc/sys/net/core/somaxconn r,

  /proc/sys/kernel/random/uuid r,
  /sys/devices/system/cpu/ r,
  /sys/devices/system/cpu/** r,

  /etc/tor/* r,
  /usr/share/tor/** r,

  /usr/bin/obfsproxy PUx,
  /usr/bin/obfs4proxy Pix,

<end /etc/apparmor.d/abstractions/tor>

Last edited 2 years ago by alif

comment:11 Changed 2 years ago by alif

Now, trying to solve my connectivity problem, I installed obfs4proxy from the Xenial repository, and copied over the obfs4 bridge definitions from Tor-browser's torrc to /var/lib/tor but still nothing changed. I still got "[notice] While fetching directory info, no running dirservers known. Will try again later. (purpose 6)"

But after I copied the "cached-x" files from Tor browser's Data directory to my system and restarted the Tor service, the exception occurred again:

Oct 03 00:54:11.000 [notice] Tor 0.3.1.7 (git-5fa14939bca67c23) opening log file.
Oct 03 00:54:11.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
Oct 03 00:54:11.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
Oct 03 00:54:11.000 [warn] Could not open "/usr/share/doc/tor/tor-exit-notice.html": Permission denied
Oct 03 00:54:11.000 [warn] DirPortFrontPage file '/usr/share/doc/tor/tor-exit-notice.html' not found. Continuing anyway.
Oct 03 00:54:11.000 [notice] Bootstrapped 0%: Starting
Oct 03 00:54:12.000 [notice] Starting with guard context "bridges"
Oct 03 00:54:12.000 [notice] new bridge descriptor 'XXXXXXX' (cached): $XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX~XXXXXXXXX at XX.XX.XXX.XX
Oct 03 00:54:12.000 [notice] new bridge descriptor 'XXXXXXXX' (cached): $XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX~XXXXXXXXXXXXX at XX.XXX.XX.XX
…
…
…
Oct 03 00:54:12.000 [notice] new bridge descriptor 'XXXX' (cached): $XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX~XXXX at XX.XX.XX.XX
Oct 03 00:54:12.000 [notice] new bridge descriptor 'XXXXX' (cached): $XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX~XXXXX at XXX.XX.XX.XX
Oct 03 00:54:12.000 [notice] Delaying directory fetches: Pluggable transport proxies still configuring
Oct 03 00:54:12.000 [notice] Opening Socks listener on /var/run/tor/socks
Oct 03 00:54:12.000 [notice] Opening Control listener on /var/run/tor/control
Oct 03 00:54:13.000 [err] tor_assertion_failed_(): Bug: ../src/or/cpuworker.c:499: cpuworker_queue_work: Assertion threadpool failed; aborting. (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug: Assertion threadpool failed in cpuworker_queue_work at ../src/or/cpuworker.c:499. Stack trace: (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(log_backtrace+0x42) [0x55fb088902b2] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(tor_assertion_failed_+0x94) [0x55fb088a8904] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(cpuworker_queue_work+0x65) [0x55fb0884c395] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(consdiffmgr_rescan+0x9a7) [0x55fb0883f037] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(+0x4ec7d) [0x55fb08776c7d] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x754) [0x7fa5e1eecf24] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(do_main_loop+0x24d) [0x55fb08777a4d] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(tor_main+0x1c35) [0x55fb0877b215] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(main+0x19) [0x55fb087733c9] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fa5e1099f45] (on Tor 0.3.1.7 )
Oct 03 00:54:13.000 [err] Bug:     /usr/bin/tor(+0x4b41b) [0x55fb0877341b] (on Tor 0.3.1.7 )

the files I copied are:

  • cached-certs
  • cached-descriptors
  • cached-descriptors.new
  • cached-microdesc-consensus
  • cached-microdescs
  • cached-microdescs.new

lines changed in torrc:

ClientTransportPlugin obfs2,obfs3,obfs4,scramblesuit exec /usr/bin/obfs4proxy
#ClientTransportPlugin obfs2,obfs3,ScrambleSuit exec /usr/bin/obfsproxy managed

following the previous two lines in torrc are some obfs4 definitions copied from tor-browser

Last edited 2 years ago by alif

comment:12 Changed 2 years ago by alif

Commenting out #DirPort 9030 solves it. Re-enabling it reproduces that assertion failure.

I now have a working Tor service that is able to go all the way to Bootstrapped 100%: Done.

Please note that I haven't tested commenting out DirPort in my original configuration, that is, before I introduced obfs4, the bridge definitions, and the data files copied from Tor-Browser.

Also note that when the assertion failure disappeared and I was left with "[notice] While fetching directory info, no running dirservers known. Will try again later. (purpose 6)" in Comment:10, I had DirPort 9030 enabled!

Last edited 2 years ago by alif

comment:13 Changed 2 years ago by nickm

Keywords: review-group-24 added

review-group-24 is now open.

comment:14 in reply to:  8 Changed 2 years ago by dgoulet

Status: needs_review → merge_ready

Replying to nickm:

Possible fix in branch bug23693_029 in my public repository, assuming I have the diagnosis right.

lgtm; I confirm that going from client -> bridge is working properly.

Agree on the backport.

comment:15 Changed 2 years ago by nickm

Thanks! I've merged this to 0.2.9 and forward.

comment:16 Changed 2 years ago by nickm

Milestone: Tor: 0.3.2.x-final → Tor: 0.2.9.x-final
Resolution: fixed
Status: merge_ready → closed

(please reopen if this bug occurs in any version released _after_ today.)

comment:17 Changed 23 months ago by rustybird

Resolution: fixed
Status: closed → reopened

(please reopen if this bug occurs in any version released _after_ today.)

It still occurs if server_mode() is false but dir_server_mode() is true. Doesn't seem to make a difference (with 0.3.1.9) if it is set up like that in torrc on startup, or the result of being reconfigured.

(Use case for this configuration: http://github.com/rustybird/corridor calls SETCONF DirPort="127.0.0.1:9030 NoAdvertise" to ensure the client continues to refresh the consensus even when dormant.)

comment:18 Changed 23 months ago by nickm

Milestone: Tor: 0.2.9.x-final → Tor: 0.3.2.x-final

comment:19 Changed 22 months ago by nickm

Milestone: Tor: 0.3.2.x-final → Tor: 0.3.4.x-final

comment:20 in reply to:  17 Changed 22 months ago by teor

Keywords: prop286 added; 029-backport 030-backport 031-backport review-group-24 removed
Version: Tor: 0.3.1.7 → Tor: 0.3.1.9

Replying to rustybird:

(please reopen if this bug occurs in any version released _after_ today.)

It still occurs if server_mode() is false but dir_server_mode() is true. Doesn't seem to make a difference (with 0.3.1.9) if it is set up like that in torrc on startup, or the result of being reconfigured.

(Use case for this configuration: http://github.com/rustybird/corridor calls SETCONF DirPort="127.0.0.1:9030 NoAdvertise" to ensure the client continues to refresh the consensus even when dormant.)

Running a directory mirror will cause a lot of unnecessary load and disk usage, particularly with newer tor versions. You'll generate a whole bunch of compressed diffs that you'll never serve.

Try using FetchDirInfoEarly 1 instead. If this doesn't work, just issue a RESOLVE request to a common address every hour or two at random, to keep tor alive. We have a proposal for a better controller API for this, it's at https://gitweb.torproject.org/torspec.git/tree/proposals/286-hibernation-api.txt

Also, if you want a consensus with IPv6 addresses on a client, use UseMicrodescriptors 0.

If you don't care about descriptors, and want to save bandwidth, use FetchServerDescriptors 0. You might find some bugs using this option, it's not well-tested.

You can set SOCKSPort 0 if you're not using it. It might add a bit of security.
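
The "RESOLVE every hour or two" suggestion could look roughly like the sketch below. It assumes the controller RESOLVE command is what is meant, that the control port is a TCP port on 127.0.0.1:9051 with password authentication, and that example.com is an acceptable "common address". The function names are hypothetical, and reply handling is deliberately minimal.

```python
import random
import socket


def control_commands(hostname, password):
    """Build the control-port lines for one keepalive: authenticate, resolve, quit."""
    # Escape backslashes and double quotes inside the quoted password.
    escaped = password.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f'AUTHENTICATE "{escaped}"\r\n'.encode("ascii"),
        f"RESOLVE {hostname}\r\n".encode("ascii"),
        b"QUIT\r\n",
    ]


def next_delay_seconds(rng=random):
    """Pick a random wait of between one and two hours."""
    return rng.uniform(3600, 7200)


def send_keepalive(hostname, password, host="127.0.0.1", port=9051):
    """Send one keepalive; raise if any reply does not start with 250."""
    with socket.create_connection((host, port), timeout=10) as conn:
        for command in control_commands(hostname, password):
            conn.sendall(command)
            reply = conn.recv(4096)
            if not reply.startswith(b"250"):
                raise RuntimeError(f"unexpected control reply: {reply!r}")
```

A caller would loop over `send_keepalive("example.com", password)` followed by `time.sleep(next_delay_seconds())`. Whether this actually keeps a dormant client fetching the consensus is the open question from the comment above, not something the sketch verifies.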

Edit: link to prop286

Last edited 22 months ago by teor

comment:21 Changed 22 months ago by arma

Running maint-0.3.2, I start my Tor client with fetchuselessdescriptors 1 dirport 9030, and on startup I get this stacktrace and abort:

Jan 06 04:25:28.000 [notice] Bootstrapped 85%: Finishing handshake with first hop
Jan 06 04:25:29.000 [err] tor_assertion_failed_(): Bug: src/or/cpuworker.c:499: cpuworker_queue_work: Assertion threadpool failed; aborting. (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug: Assertion threadpool failed in cpuworker_queue_work at src/or/cpuworker.c:499. Stack trace: (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(log_backtrace+0x42) [0x55f592aa5922] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(tor_assertion_failed_+0x8c) [0x55f592ac071c] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(cpuworker_queue_work+0x6f) [0x55f592a4bb1f] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(consdiffmgr_rescan+0x82f) [0x55f592a3e44f] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(+0x51aaf) [0x55f592973aaf] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7fbd389a03dc] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(do_main_loop+0x244) [0x55f5929747c4] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(tor_main+0x1c25) [0x55f592978005] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(main+0x19) [0x55f59296ff29] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514) 
Jan 06 04:25:29.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fbd3794ab45] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)
Jan 06 04:25:29.000 [err] Bug:     src/or/tor(+0x4df79) [0x55f59296ff79] (on Tor 0.3.2.8-rc-dev 5f2c7a85671ee514)  
Aborted

Looks like the consensus diff manager wants to use the threadpool, but I'm not a relay so nothing set it up.
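
To compare traces like this one with the earlier ones, a small parser is enough. The sketch below is not part of Tor; the regular expression is an assumption based only on the log lines in this ticket. It extracts the symbol from each backtrace frame and reports who called cpuworker_queue_work.

```python
import re

# Matches frames such as:
#   [err] Bug:     src/or/tor(consdiffmgr_rescan+0x82f) [0x55f592a3e44f] (on Tor ...)
FRAME_RE = re.compile(
    r"\[err\] Bug:\s+(\S+)\((\w*)\+0x[0-9a-fA-F]+\)\s+\[0x[0-9a-fA-F]+\]"
)


def parse_frames(log_text):
    """Return the function name of each backtrace frame, '?' when unresolved."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(match.group(2) or "?")
    return frames


def caller_of(frames, function):
    """Return the frame directly below `function` in the trace, or None."""
    if function in frames:
        index = frames.index(function)
        if index + 1 < len(frames):
            return frames[index + 1]
    return None
```

Applied to the traces in this ticket, `caller_of(parse_frames(log), "cpuworker_queue_work")` gives consdiffmgr_add_consensus for the original report and consdiffmgr_rescan for the later ones, which matches the observation that the consensus diff manager is the code asking for the threadpool.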

comment:22 Changed 19 months ago by nickm

Keywords: 034-triage-20180328 added

comment:23 Changed 19 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:24 Changed 19 months ago by nickm

Keywords: 034-must crash 033-backport 032-backport added; 034-removed-20180328 removed

comment:25 Changed 19 months ago by tiejohg2sahth

I am affected by this bug too with the same stack trace as the original poster.

I have just migrated from Ubuntu 14.04 to 18.04. Tor was installed from the official PPA.

I am running a relay and have very few changes from the default configuration.
I am available if you need more data to reproduce.

comment:26 Changed 19 months ago by nickm

The most useful information here would be the tor version and your configuration (the torrc file)

comment:27 in reply to:  21 Changed 19 months ago by arma

Replying to arma:

Running maint-0.3.2, I start my Tor client with fetchuselessdescriptors 1 dirport 9030, and on startup I get this stacktrace and abort

To be clear, this is repeatable. I just did it again now, with Tor master:

Apr 09 21:59:53.749 [err] Bug: Assertion threadpool failed in cpuworker_queue_work at src/or/cpuworker.c:510. Stack trace: (on Tor 0.3.4.0-alpha-dev 21c81348a39dd235)
Apr 09 21:59:53.749 [err] Bug:     src/or/tor(log_backtrace+0x42) [0x5649ed2260b2] (on Tor 0.3.4.0-alpha-dev 21c81348a39dd235)
Apr 09 21:59:53.749 [err] Bug:     src/or/tor(tor_assertion_failed_+0x8c) [0x5649ed24140c] (on Tor 0.3.4.0-alpha-dev 21c81348a39dd235)
Apr 09 21:59:53.749 [err] Bug:     src/or/tor(cpuworker_queue_work+0x6f) [0x5649ed1c9cbf] (on Tor 0.3.4.0-alpha-dev 21c81348a39dd235)
Apr 09 21:59:53.749 [err] Bug:     src/or/tor(consdiffmgr_rescan+0x839) [0x5649ed1bc169] (on Tor 0.3.4.0-alpha-dev 21c81348a39dd235)
[...]

You could too, I think!

comment:28 in reply to:  26 Changed 19 months ago by tiejohg2sahth

Replying to nickm:

The most useful information here would be the tor version and your configuration (the torrc file)

Tor version as reported by apt-cache show tor: 0.3.2.10-1~bionic+1
My torrc: https://pastebin.com/raw/CWTMmwHc

comment:29 Changed 19 months ago by nickm

Keywords: 031-backport added
Status: reopened → needs_review

Okay, there's a fix in bug23693_031_redux, probably.

comment:30 Changed 18 months ago by dgoulet

Reviewer: dgoulet

comment:31 Changed 18 months ago by asn

Reviewer: dgoulet → asn

comment:32 Changed 18 months ago by asn

Hmm, I took the torrc from comment:28 to test the patch. The original assert seems to be fixed, but now it crashes in a different place:

Apr 17 14:01:00.000 [notice] Bootstrapped 0%: Starting
Apr 17 14:01:00.000 [notice] Starting with guard context "default"
Apr 17 14:01:00.000 [err] tor_assertion_failed_(): Bug: src/or/router.c:142: dup_onion_keys: Assertion onionkey failed; aborting. (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug: Assertion onionkey failed in dup_onion_keys at src/or/router.c:142. Stack trace: (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(log_backtrace+0x43) [0x557795fffab3] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(tor_assertion_failed_+0x8d) [0x557796018add] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(dup_onion_keys+0x10f) [0x557795f22a9f] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(server_onion_keys_new+0x41) [0x557795ef2f91] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(+0x1283b7) [0x557795fb93b7] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(threadpool_new+0x18b) [0x55779601f91b] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(cpu_init+0xad) [0x557795fb97dd] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(do_main_loop+0x15d) [0x557795ee0d2d] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(tor_main+0xe25) [0x557795ee3b25] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(main+0x19) [0x557795edc729] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7faf9e096a87] (on Tor 0.3.1.10-dev 386f8016b7373bec)
Apr 17 14:01:00.000 [err] Bug:     ./src/or/tor(_start+0x2a) [0x557795edc77a] (on Tor 0.3.1.10-dev 386f8016b7373bec)

I guess that when we are not in server mode, Tor won't create the onionkey in init_keys()... I wonder if we should try to fix these situations with patches like the one from comment:29, or we should just disallow having a DirPort without an ORPort and abort if such a configuration is seen. IIUC, we are planning to eventually deprecate DirPort anyhow and just use BEGIN_DIR, right?

Here is the torrc:

DirPort 9030
SocksPort 0
Log notice stdout
DataDirectory /tmp/tor
RelayBandwidthRate 4 MBytes
RelayBandwidthBurst 5 MBytes
ExitRelay 0
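
The "disallow DirPort without ORPort" idea can be sketched as a standalone torrc check. This is a hypothetical lint, not Tor's option validation: it assumes that a port option counts as enabled when its first value token is not 0, and it ignores defaults, command-line options, and included files.

```python
def enabled_ports(torrc_text):
    """Return lower-cased names of *Port options set to something other than 0."""
    enabled = set()
    for raw in torrc_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0].lower()  # torrc option names are case-insensitive
        value = parts[1].strip() if len(parts) > 1 else ""
        if key.endswith("port") and value and value.split()[0] != "0":
            enabled.add(key)
    return enabled


def is_dirport_only(torrc_text):
    """True when a DirPort is configured but no ORPort is."""
    ports = enabled_ports(torrc_text)
    return "dirport" in ports and "orport" not in ports
```

Run against the torrc above, `is_dirport_only` returns True. It also returns True for the torrc in comment:10, and False once the DirPort line is commented out as in comment:12.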

comment:33 Changed 18 months ago by asn

Status: needs_review → needs_revision

comment:34 Changed 18 months ago by teor

Maybe we will deprecate DirPort some time in the future. Maybe we won't. There are bootstrapping and diagnostic issues.

But here are some questions we can answer right now:

  • Do we support DirPort without ORPort?
  • If we do, when was the last Tor release that it actually worked?
  • Why don't we have any tests for DirPort only operation?

As far as I can tell, people who get this bug seem to be setting DirPort as a workaround.
They don't actually want to serve descriptors, they just want them available locally.

If we can't find a use case that involves serving descriptors, I think we should:

  • fix the hibernation options so they allow people to download descriptors every hour if that's what they need, then
  • deprecate DirPort-only operation

But I'm not sure if we can remove features as a backport, so we are stuck with fixing crashes like this (or saying "don't do that").

comment:35 Changed 18 months ago by nickm

Cc: arma added
Status: needs_revision → needs_information

Putting this into needs_information based on Teor's questions above. Roger, what do you think here?

comment:36 Changed 18 months ago by teor

Status: needs_information → new

I can reproduce this issue with the following minimal test case:

tor DataDirectory `mktemp -d` DirPort 12345 ORPort 0 SOCKSPort 0

This command works with these tor versions:

  • 0.2.5.16-dev
  • 0.2.9.15-dev

This command fails for these tor versions:

  • 0.3.1.10-dev
  • 0.3.2.10-dev
  • 0.3.3.5-rc
  • master

With minor variations on:

Apr 18 10:08:27.000 [notice] Bootstrapped 40%: Loading authority key certs
Apr 18 10:08:34.000 [err] tor_assertion_failed_: Bug: src/or/cpuworker.c:499: cpuworker_queue_work: Assertion threadpool failed; aborting. (on Tor 0.3.1.10-dev ce8e7427b9284ef1)
...

So I suggest:

  • now: we fix the bugs in this feature in 0.3.1 and later
  • in 0.3.4 or 0.3.5: we decide if we want to support DirPort-only and write tests for it, or if we want to deprecate it

One use case for DirPort-only is a local directory mirror for large deployments. It can be configured using the FallbackDir torrc option, to take load off relays or authorities. But we could just tell people to use ORPort 12345 PublishDescriptor 0 as a workaround.
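
For checking several tor builds against the minimal test case above, a small wrapper may help. This is only a sketch: the function names are made up, it assumes tor logs to stdout when no Log option is given, and it treats "no assertion before the timeout" as a pass, which is an arbitrary choice.

```python
import subprocess
import tempfile

ASSERTION = "Assertion threadpool failed"


def repro_argv(data_dir, tor_binary="tor"):
    """Command line of the minimal test case, with DataDirectory filled in."""
    return [tor_binary, "DataDirectory", data_dir,
            "DirPort", "12345", "ORPort", "0", "SOCKSPort", "0"]


def hit_assertion(log_text):
    """True if the log contains the threadpool assertion from this ticket."""
    return ASSERTION in log_text


def run_repro(tor_binary="tor", timeout=120):
    """Run one build; return True if it hit the assertion within `timeout` seconds."""
    with tempfile.TemporaryDirectory() as data_dir:
        try:
            proc = subprocess.run(
                repro_argv(data_dir, tor_binary),
                stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                text=True, timeout=timeout,
            )
            output = proc.stdout or ""
        except subprocess.TimeoutExpired as exc:
            # A build that does not crash keeps running until the timeout.
            output = exc.stdout or ""
            if isinstance(output, bytes):
                output = output.decode(errors="replace")
        return hit_assertion(output)
```

Calling `run_repro("/path/to/tor")` once per build would reproduce the works/fails lists above, assuming each binary can reach the network to bootstrap far enough.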

Edit: forgot a version

Last edited 18 months ago by teor

comment:37 Changed 18 months ago by nickm

Status: new → needs_review

I've updated bug23693_031_redux with an actual commit to actually work. I'm fine merging it to 0.3.1 and forward; we can open a separate ticket to test or disable the feature.

comment:38 Changed 18 months ago by asn

Status: needs_review → merge_ready

LGTM and tests pass.

comment:39 Changed 18 months ago by nickm

Resolution: fixed
Status: merge_ready → closed

merged!

comment:40 Changed 18 months ago by nickm

The original version of this patch had a bug: #23693.

comment:41 Changed 18 months ago by tiejohg2sahth

May I ask when the fix will be available through the official PPA on Ubuntu 18.04?

I see with:

curl -s https://deb.torproject.org/torproject.org/pool/main/t/tor/ | grep -o '"tor_.*bionic.*_amd64\.deb"'

that 0.3.3.5-rc1 and 0.3.4.0-alpha are available in the PPA, presumably with the fix for this bug merged in. However, when I update with apt, I am still stuck with version 0.3.2.10, which is unfortunately unusable.

Or is there a particular step to do to be able to update to the pre-release versions?

Thank you

comment:42 Changed 18 months ago by teor

No, this fix is not in 0.3.3.5-rc:
https://blog.torproject.org/tor-0335-rc-released
But it is in nightly.

The fix will be available in the next 0.3.1, 0.3.2, and 0.3.3 releases.

Your options are:
