Opened 4 years ago

Closed 4 years ago

Last modified 6 months ago

#13988 closed defect (fixed)

Report bandwidth usage aggregated over a longer period

Reported by: nickm Owned by:
Priority: Medium Milestone: Tor: 0.2.5.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-relay stats 025-backport
Cc: robgjansen, karsten, atagar Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

NUM_SECS_BW_SUM_INTERVAL == 15 minutes seems pretty short; can we go up to 4 hours or 6 hours or so?

Note that while 0.2.2.x was still on the network we couldn't do this:

/** How large are the intervals for which we track and report bandwidth use? */
/* XXXX Watch out! Before Tor 0.2.2.21-alpha, using any other value here would
 * generate an unparseable state file. */
#define NUM_SECS_BW_SUM_INTERVAL (15*60)

But now, nobody should be downgrading to 0.2.2.x; this change should be approximately trivial.

Child Tickets

Change History (19)

comment:1 Changed 4 years ago by karsten

Cc: karsten added

Related to #13838. I don't have a good answer yet. We can probably do this, but I want to think about what this might break. Already on my list.

comment:2 Changed 4 years ago by nickm

Okay. Do we think this is answerable on a next-month-or-so timeframe? I'd hate to delay indefinitely.

comment:3 Changed 4 years ago by karsten

Cc: atagar added

Okay, I looked at all metrics-related code that might be affected and future-proofed one that would only accept the current 15-minute intervals. I think I'm prepared for this change. And if we change the interval and something breaks, I can still fix it then.

Regarding the new interval length, I prefer either 4 hours or 12 hours with a slight preference towards 4 hours. I'm already using those two intervals in the "1 month" and "3 months" bandwidth graphs that Onionoo provides (see this example), so I wouldn't have to write new code. 6 hours would require me to write new code.

I suggest announcing this change on tor-dev@ before actually merging/deploying it. I'm also cc'ing atagar in case he knows of any Stem applications using relay or bridge bandwidth histories.

comment:4 Changed 4 years ago by nickm

Keywords: 025-backport added; easy removed
Status: newneeds_review

See branch bug13988_025 in my public repository.

comment:5 Changed 4 years ago by karsten

FWIW, your branch looks correct. :)

comment:6 Changed 4 years ago by atagar

Hi. I'm not clear, does this impact the BWHistoryReadValues and BWHistoryWriteValues entries in tor's state file? If so then this will break arm's bandwidth graph prepopulation...

https://gitweb.torproject.org/arm.git/tree/arm/util/__init__.py#n123

If not then exactly what does this impact?

comment:7 Changed 4 years ago by nickm

It affects that, and the extrainfo values. (It wouldn't "break" the graph prepopulation, but it would make that prepopulation inaccurate by a constant factor until arm implements support for BWHistory{Read,Write}Interval.)

comment:8 Changed 4 years ago by atagar

It wouldn't "break" the graph prepopulation, but it would make that prepopulation inaccurate by a constant factor until arm implements support for BWHistory{Read,Write}Interval.

Don't follow. Could you please explain further? The individual BWHistory*Values correspond to graph columns so if individual values constitutes the sum over four hours or so then it's useless for graphing.

comment:9 Changed 4 years ago by atagar

On a site note I wouldn't care about this interval if the control protocol had a method for "give me the last, say, 300 (5 minutes) worth of BW events" so I could prepopulate the graph at higher granularities. I'm just using the state file because it's the only game in town. :)

comment:10 Changed 4 years ago by arma

What does this do to reporting of bandwidth stats in extrainfo descriptors? In particular, a) do we lose any time periods since the descriptors are published only every 18 hours? Seems like currently we hear recent bandwidth stats but with this change we'll not hear as consistently about the most recent couple of hours. And b) For relays that restart often, are we going to fail to report more of their stats?

comment:11 in reply to:  10 Changed 4 years ago by karsten

Replying to arma:

What does this do to reporting of bandwidth stats in extrainfo descriptors? In particular, a) do we lose any time periods since the descriptors are published only every 18 hours? Seems like currently we hear recent bandwidth stats but with this change we'll not hear as consistently about the most recent couple of hours.

No. Total reported stats will still cover 24 hours. We shouldn't lose any data if descriptors are published at least every 18 to 24 hours. Only if descriptors were published less often than every 24 hours, we'd lose data.

And b) For relays that restart often, are we going to fail to report more of their stats?

Yes. Relays will not report incomplete intervals, but they'll at least start the new interval right after restarting. On average we'll lose something around half the interval time per restart, so 2 hours instead of 7.5 minutes right now. Should be okay.

comment:12 Changed 4 years ago by nickm

See also #14128

comment:13 Changed 4 years ago by nickm

Milestone: Tor: 0.2.6.x-finalTor: 0.2.5.x-final

Merged to master; marking for possible 0.2.5 backport.

comment:14 Changed 4 years ago by arma

Sounds like a great thing to call a feature in 0.2.6. I don't think a backport is needed. (As a bonus, that means #14128 doesn't need a backport either.)

comment:15 Changed 4 years ago by cypherpunks

This should be backported; it's a security improvement.

comment:16 Changed 4 years ago by nickm

Resolution: fixed
Status: needs_reviewclosed

I agree with cypherpunks; backported to 0.2.5.

comment:17 Changed 4 years ago by dcf

Cross-referencing #15453; it seems this change broke the "3 days" and "1 week" graphs in Globe.

comment:18 in reply to:  17 Changed 3 years ago by toralf

Severity: Normal

Replying to dcf:

Cross-referencing #15453; it seems this change broke the "3 days" and "1 week" graphs in Globe.

Which I was wonering about - whether the stats are broken or just hidden to misdirect an advisary ?

comment:19 in reply to:  description Changed 6 months ago by AnubuGoering

Replying to nickm:

NUM_SECS_BW_SUM_INTERVAL == 15 minutes seems pretty short; can we go up to 4 hours or 6 hours or so?

Note that while 0.2.2.x was still on the network we couldn't do this:

/** How large are the intervals for which we track and report bandwidth use? */
/* XXXX Watch out! Before Tor 0.2.2.21-alpha, using any other value here would
 * generate an unparseable state file. */
#define NUM_SECS_BW_SUM_INTERVAL (15*60)

But now, nobody should be downgrading to 0.2.2.x; this change should be approximately trivial.

Note: See TracTickets for help on using tickets.