Opened 5 weeks ago

Last modified 5 weeks ago

#32059 new defect

Bug when reading accounting interval while starting a new relay

Reported by: willbarr Owned by:
Priority: Low Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor Version: Tor: 0.3.5.8
Severity: Normal Keywords: backport?
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

[warn] router_should_be_dirserver(): Bug: An accounting interval is not allowed to be zero seconds long. Raising to 1. (on Tor 0.3.5.8 )
[notice] Your Tor server's identity key fingerprint is 'x'
[notice] Configured hibernation. This interval begins at 2019-09-30 00:00:00 and ends at 2019-09-30 00:00:00. We have no prior estimate for bandwidth, so we will start out awake and hibernate when we exhaust our quota.
[notice] Configured hibernation. This interval begins at 2019-10-14 00:00:00 and ends at 2019-10-15 00:00:00. We have no prior estimate for bandwidth, so we will start out awake and hibernate when we exhaust our quota.

To reproduce: start a new relay with following torrc

SocksPort 0

DataDirectory /var/lib/tor/relay3

ORPort 1234

Nickname test

AccountingMax 100 GBytes

AccountingStart day 00:00

Child Tickets

Change History (5)

comment:1 Changed 5 weeks ago by willbarr

Forgot to specify: it happens on CentOS 7.7(3.10.0-957.27.2.el7.x86_64)

comment:2 Changed 5 weeks ago by arma

I am not able to reproduce.

It seems like, to reproduce, we want to get some situation where
router_should_be_dirserver() gets called before accounting has been set up.

comment:3 Changed 5 weeks ago by nickm

Keywords: backport? added
Milestone: Tor: 0.4.3.x-final

comment:4 Changed 5 weeks ago by willbarr

gdb output of tor 0.4.2.2-alpha compiled from source on centos 7.6:

[root@test app]# gdb ./tor
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-115.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/tor-0.4.2.2-alpha/src/app/tor...done.
(gdb) break router_should_be_dirserver
Breakpoint 1 at 0x14886c: file src/feature/relay/router.c, line 1213.
(gdb) run -f ~/test
Starting program: /root/tor-0.4.2.2-alpha/src/app/./tor -f ~/test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Oct 15 00:47:33.252 [notice] Tor 0.4.2.2-alpha running on Linux with Libevent 2.0.21-stable, OpenSSL 1.1.1d, Zlib 1.2.7, Liblzma 5.2.2, and Libzstd 1.4.2.
Oct 15 00:47:33.252 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Oct 15 00:47:33.252 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Oct 15 00:47:33.252 [notice] Read configuration file "/root/test".
Oct 15 00:47:33.253 [notice] Your ContactInfo config option is not set. Please consider setting it, so we can contact you if your server is misconfigured or something else goes wrong.
Oct 15 00:47:33.253 [notice] Based on detected system memory, MaxMemInQueues is set to 743 MB. You can override this by setting MaxMemInQueues by hand.
Oct 15 00:47:33.253 [notice] By default, Tor does not run as an exit relay. If you want to be an exit relay, set ExitRelay to 1. To suppress this message in the future, set ExitRelay to 0.
Oct 15 00:47:33.254 [notice] Opening OR listener on 0.0.0.0:1111
Oct 15 00:47:33.254 [notice] Opened OR listener on 0.0.0.0:1111
Oct 15 00:47:33.000 [notice] You are running a new relay. Thanks for helping the Tor network! If you wish to know what will happen in the upcoming weeks regarding its usage, have a look at https://blog.torproject.org/blog/lifecycle-of-a-new-relay
Oct 15 00:47:33.000 [notice] It looks like I need to generate and sign a new medium-term signing key, because I don't have one. To do that, I need to load (or create) the permanent master identity key. If the master identity key was not moved or encrypted with a passphrase, this will be done automatically and no further action is required. Otherwise, provide the necessary data using 'tor --keygen' to do it manually.
​
Breakpoint 1, decide_to_advertise_dir_impl (options=0x555555ab8b20, dir_port=<optimized out>, supports_tunnelled_dir_requests=1)
    at src/feature/relay/router.c:1299
1299      return router_should_be_dirserver(options, dir_port);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64 libevent-2.0.21-4.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libscrypt-1.21-1.el7.x86_64 libseccomp-2.3.1-3.el7.x86_64 libzstd-1.4.2-1.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) n
Oct 15 00:47:39.000 [warn] router_should_be_dirserver(): Bug: An accounting interval is not allowed to be zero seconds long. Raising to 1. (on Tor 0.4.2.2-alpha )
1300    }
(gdb) bt
#0  decide_to_advertise_dir_impl (options=0x555555ab8b20, dir_port=<optimized out>, supports_tunnelled_dir_requests=<optimized out>)
    at src/feature/relay/router.c:1300
#1  0x000055555569dc7b in router_should_advertise_begindir (supports_tunnelled_dir_requests=<optimized out>, options=0x555555ab8b20)
    at src/feature/relay/router.c:1321
#2  router_dump_router_to_string (router=0x555555ad5490, ident_key=0x555555acd480, tap_key=<optimized out>, ntor_keypair=<optimized out>, 
    signing_keypair=0x555555acd7f0) at src/feature/relay/router.c:2959
#3  0x000055555569f942 in router_dump_and_sign_routerinfo_descriptor_body (ri=0x555555ad5490) at src/feature/relay/router.c:2240
#4  router_build_fresh_descriptor (r=0x7fffffffdca8, e=0x7fffffffdcb0) at src/feature/relay/router.c:2298
#5  0x000055555569fc23 in router_rebuild_descriptor (force=0) at src/feature/relay/router.c:2352
#6  0x000055555569fd8f in router_get_my_routerinfo_with_err (err=0x0) at src/feature/relay/router.c:1647
#7  0x000055555569fefd in router_get_my_routerinfo () at src/feature/relay/router.c:1629
#8  router_get_my_descriptor () at src/feature/relay/router.c:1676
#9  0x00005555556a0334 in init_keys () at src/feature/relay/router.c:1077
#10 0x0000555555656e9e in accounting_set_wakeup_time () at src/feature/hibernate/hibernate.c:608
#11 configure_accounting (now=1571100453) at src/feature/hibernate/hibernate.c:472
#12 0x00005555556575fd in accounting_run_housekeeping (now=now@entry=1571100453) at src/feature/hibernate/hibernate.c:588
#13 0x00005555556e2538 in or_state_save (now=1571100453) at src/app/config/statefile.c:519
#14 0x00005555556e27f2 in or_state_load () at src/app/config/statefile.c:465
#15 0x00005555556defc5 in options_act (old_options=0x0) at src/app/config/config.c:1984
#16 set_options (new_val=new_val@entry=0x555555ab8b20, msg=msg@entry=0x7fffffffe0d8) at src/app/config/config.c:983
#17 0x00005555556e0153 in options_init_from_string (cf_defaults=cf_defaults@entry=0x555555ab3d80 "", 
    cf=cf@entry=0x555555ab1f60 "SocksPort 0\n\nDataDirectory /var/lib/tor/relayd\n\nORPort 1111\n\nNickname test\n\nAccountingMax 100 GBytes\---Type <return> to continue, or q <return> to quit---
n\nAccountingStart day 00:00\n", command=command@entry=0, command_arg=command_arg@entry=0x0, msg=msg@entry=0x7fffffffe0d8)
    at src/app/config/config.c:5492
#18 0x00005555556e069f in options_init_from_torrc (argc=argc@entry=3, argv=argv@entry=0x555555ab1e20) at src/app/config/config.c:5317
#19 0x00005555555b55b2 in tor_init (argc=argc@entry=3, argv=argv@entry=0x555555ab1e20) at src/app/main/main.c:620
#20 0x00005555555b5dee in tor_run_main (tor_cfg=tor_cfg@entry=0x555555a8f010) at src/app/main/main.c:1300
#21 0x00005555555b44ba in tor_main (argc=3, argv=0x7fffffffe4b8) at src/feature/api/tor_api.c:164
#22 0x00005555555b4029 in main (argc=<optimized out>, argv=<optimized out>) at src/app/main/tor_main.c:32
(gdb)

comment:5 Changed 5 weeks ago by arma

Thanks. Yes, this is definitely a bug.

I recommend against a backport, since the user-facing issue is "just" a confusing log message that says "Bug".

The underlying problem is that our start-up / initialization code is all tied around itself. We're loading the torrc file, so we load the state file, but we don't have one yet so we write a new state file, but to save the state file we need to check on our current accounting status, so we initialize that, but somewhere along the way we decided that accounting_set_wakeup_time() should be deterministic, and we base it on your relay identity key, but on first start we don't have one yet, so we generate one, and after we've made it we want to make sure it parses, so we make a router descriptor for ourselves, but to properly make a router descriptor we need to know if we're going to list our DirPort in the descriptor, and to decide that we need to figure out if we're likely to exceed our accountingmax value in the current accounting interval, because if we are then we'll opt not to list the dirport. And remember, this is all happening because we loaded Tor's state file and found that it wasn't there yet. Then way way way later in options_act, we call things like

  /* Set up accounting */
  if (accounting_parse_options(options, 0)<0) {
    ...
  }
  if (accounting_is_enabled(options))
    configure_accounting(time(NULL));

You can see why I recommend against a backport. :)

We could fix this particular issue by making that log_warn(LD_BUG) in router_should_be_dirserver() be quieter. I guess that is orthogonal to whether we should try to untangle the startup initialization stuff -- which is really related to the upcoming modularization work.

Note: See TracTickets for help on using tickets.