Opened 3 months ago

Last modified 8 days ago

#31524 assigned defect

GETINFO bw-event-cache spike value(s) in it

Reported by: toralf Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.4.1.4-rc
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Discussing with atagar an issue with the bandwidth graph of Nyx yield into the fact, that Tor reports 1 or 2 unusal high values when nyx is started. After a minute or so the effect is gone and Nyx can scale the y-axis to a reasonable value.

atagar found out, that there are 1 or 2 spike values in it (look for 3495411062,3952777936) of the debug out out of Nyx:

08/25/2019 21:18:09 [TRACE] Sent to tor: GETINFO bw-event-cache
08/25/2019 21:18:09 [TRACE] Received from tor: 250-bw-event-cache=1374134393,1426052109 20086925,20125282 21800116,21982023 21443601,21381953 21284145,21223794 20736818,20840129 20095450,20002302 20327971,20091436 18481818,18770265 20642518,20484788 17916670,17984439 21675050,21693959 21112734,20784723 19053526,19193284 19295884,19489502 21443785,21575604 21134294,21020113 24810387,24663932 21866896,22876759 22146840,22135880 21265637,21323808 20355415,20195913 22161894,22076797 23243894,23320227 22162244,22280711 20837266,20883852 21569369,21709237 19729547,19565307 19509953,19770894 20445427,20681393 21164201,22030989 23182698,23054018 22427910,22683231 24570368,24540675 23788945,23736745 24326209,24063010 23491274,23547576 24465451,24291718 23117418,23200615 22771292,22946982 23969237,23981784 23612713,23562155 23382374,23423884 20351264,20211681 21590861,21769434 21812207,21644025 23301464,23486635 23190866,23611494 25596210,25579703 23296743,23781312 24948896,24956987 25415713,25555740 26407789,26504165 25006181,25166607 3495411062,3952777936 28124111,26406950 26007090,26935060 26897763,26935089 26866687,27170833 28168859,26699345 25750280,26406950 27570110,26935089 28007323,26935089 26590862,26935089 28579673,26935089 112518425,114606735 27113917,27462685 26127629,26406921 79812696,79748989 25846716,26896706 29074858,26973472 24947719,26935089 27687620,26864938 25539933,26838674 27171107,27068897 27140138,26879337 26899976,26495460 25950990,26844241 28442790,27025937 26116568,26829132 27827545,26960905 27035490,26415775\n250 OK

Just a guess: The 2 affected relays have about 20 - 40 MByte/sec bw. The spike values are often above 3,000,000,000, looking like an arbitrary unsigned long int.

Whilst the severity on the Nyx graph is negligible I do wonder about the root cause.

Child Tickets

Change History (11)

comment:1 Changed 3 months ago by teor

Keywords: 042-should security-low added
Milestone: Tor: 0.4.2.x-final
Severity: TrivialNormal
Status: newneeds_information

Marking as security-low, because this issue could be:

  • reading uninitialised values,
  • unsigned int underflow, or
  • some other calculation error.

We do expect some spikes for bandwidth self-tests and consensus downloads, but a 3 GB spike is unexpected.

We recently did a refactor of this code.
Does the issue occur in 0.2.9 or 0.3.5?

comment:2 Changed 6 weeks ago by toralf

I can confirm this under 0.4.1.x at a stable hardened Gentoo and at 0.4.0.5 at a Debian 10 system. I do not run relays with older Tor versions.

comment:3 Changed 6 weeks ago by teor

Status: needs_informationnew

If we can't reproduce these issues on older Tor versions, that makes it harder for us to diagnose this issue,

Here's how I think we can move forward:

  • set an "unreasonably large" bandwidth amount threshold, and log a BUG() warning when:
    • we try to add an amount over that threshold,
    • any of our totals are over that threshold, or
    • we try to put an amount over that threshold in our descriptor, extra-info, or control port.
  • when we subtract a bandwidth value, log a BUG() if the final amount would be less than zero.

(We recently added code that subtracts bandwidth values if the circuit is closed before writing.)

BUG() warnings log stack traces, so we should be able to locate the source of the issue more easily.

Last edited 6 weeks ago by teor (previous) (diff)

comment:4 Changed 6 weeks ago by teor

Owner: set to teor
Status: newassigned

comment:5 Changed 6 weeks ago by arma

I always figured this was from Tor starting up and seeding itself with its previous self-measured bandwidth value from the state file, and essentially thinking of itself as having done it all at once just then.

comment:6 in reply to:  5 ; Changed 6 weeks ago by nickm

Replying to arma:

I always figured this was from Tor starting up and seeding itself with its previous self-measured bandwidth value from the state file, and essentially thinking of itself as having done it all at once just then.

Let's see if that looks like the case. Toralf, does the spike correspond to a time in your logs when you restarted Tor?

comment:7 in reply to:  6 Changed 6 weeks ago by toralf

Replying to nickm:

Replying to arma:

I always figured this was from Tor starting up and seeding itself with its previous self-measured bandwidth value from the state file, and essentially thinking of itself as having done it all at once just then.

Let's see if that looks like the case. Toralf, does the spike correspond to a time in your logs when you restarted Tor?

This could indeed be the case. The issue is not seen immediately after restarting the Tor process, after a while I do have a spike of 100 MB, after a little bit longer while I do have the 3 GB.

comment:8 Changed 6 weeks ago by nickm

Are those spikes at the same interval each time?

comment:9 in reply to:  8 Changed 6 weeks ago by toralf

Replying to nickm:

Are those spikes at the same interval each time?

The behaviour is always the same after restarting a Tor process. The values are the same. They do start with few MB - when I start Nyx immediately after restarting the Tor process - and after few minutes - when I restart Nyx - the spike is about 3 GB.

comment:10 Changed 4 weeks ago by teor

Keywords: security-low removed
Milestone: Tor: 0.4.2.x-finalTor: unspecified
Owner: teor deleted

This doesn't look like a security issue, so I don't think we need to fix it in 0.4.2.

comment:11 Changed 8 days ago by teor

Keywords: 042-should removed
Note: See TracTickets for help on using tickets.