Opened 12 years ago

Last modified 7 years ago

#696 closed defect (Duplicate)

WFU not computed right for never-down relay

Reported by: arma
Owned by:
Priority: Low
Milestone: post 0.2.1.x
Component: Core Tor/Tor
Version: 0.2.0.19-alpha
Severity:
Keywords:
Cc: arma, nickm, karsten, seeess
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

corfu has been up since it generated its key -- it has never been down.

Its entry in moria1's router-stability file is

R 7CAA2F5F998053EF5D2E622563DEB4A6175E49AC
+MTBF 0 0.00000 S=2008-05-15 23:26:01
+WFU 0 0

But in the cached-consensus, we have
r corfu fKovX5mAU+9dLmIlY960phdeSaw fuzu6FZU2vwuiT47PLMblVITIAI 2008-06-09 06:19:48 140.247.60.83 443 80
s Fast Guard Running V2Dir Valid

It has no Stable flag because its WFU is 0, even though its only uptime
run currently stands at about 26 days.

moria1 has been up since May 19.

So are we forgetting to update our WFU entries for relays that don't have
an up or down event? I wonder how much this is skewing our 'stable'
calculations.
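
To make the question concrete, here is a minimal sketch of the suspected failure mode. It assumes the two numbers on the "+WFU" line are the stored weighted uptime and the stored total weighted time, and that the run still in progress is only counted if the code adds it explicitly. The function name and the choice of "now" are illustrative, not taken from Tor's source.

```python
from datetime import datetime

def weighted_fractional_uptime(weighted_uptime, weighted_time,
                               run_start=None, now=None):
    """Return WFU, optionally counting a run that is still in progress."""
    up = float(weighted_uptime)
    total = float(weighted_time)
    if run_start is not None and now is not None and now > run_start:
        # A relay that is up right now has been up for the whole current run.
        elapsed = (now - run_start).total_seconds()
        up += elapsed
        total += elapsed
    # With no history at all there is nothing to divide; report 0.
    return up / total if total > 0 else 0.0

# corfu: "+WFU 0 0", run started at S=2008-05-15 23:26:01.
run_start = datetime(2008, 5, 15, 23, 26, 1)
# Assumption: use the timestamp from corfu's consensus entry as a stand-in for "now".
now = datetime(2008, 6, 9, 6, 19, 48)

stored_only = weighted_fractional_uptime(0, 0)
with_current_run = weighted_fractional_uptime(0, 0, run_start, now)
print(stored_only, with_current_run)
```

If only the stored totals are consulted, a relay with no recorded up or down event comes out at 0; once the current run is counted, the same relay comes out at 1.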

[Automatically added by flyspray2trac: Operating System: All]

Change History (14)

comment:1 Changed 11 years ago by nickm

I've added a little info-level logging for this; next I'll add a debugging output page.

comment:2 Changed 11 years ago by nickm

r16981 adds a /tor/dbg-stability.txt URL to dump current MTBF/WFU calculations.

comment:3 Changed 11 years ago by arma

For example:

router AW3UQowu8EsOOxf6yqfYwvMlv7A sandvine 69.241.40.20
uptime-started 2008-09-19 11:50:59 UTC
wfu 0.994

weighted-time 870050
weighted-uptime 860142

mtbf 537295.6

weighted-run-length 860142
total-run-weights 1.839504

The consensus says

r sandvine AW3UQowu8EsOOxf6yqfYwvMlv7A 3oibvNRbu74b0c+yokHRRakg3zw 2008-09-26 13:36:07 69.241.40.20 9001 9030
s Fast Guard Named Running V2Dir Valid

It's been up for 8 days, but it is still not Stable.

moria2's cutoffs are
Sep 27 00:46:09.085 [info] Cutoffs: For Stable, 255514 sec uptime, 364304
sec MTBF. For Fast: 20480 bytes/sec. For Guard: WFU 98.221%, time-known
691200 sec, and bandwidth 51200 or 61440 bytes/sec.

and moria2 is rating sandvine as Stable in its v2 status.

moria1's cutoffs are
Sep 27 00:48:12.217 [info] Cutoffs: For Stable, 263177 sec uptime, 430457
sec MTBF. For Fast: 20480 bytes/sec. For Guard: WFU 97.682%, time-known
691200 sec, and bandwidth 51200 or 61440 bytes/sec.

which means moria1 should be rating it Stable. But it doesn't end up Stable
in the consensus.
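
The comparison above can be checked mechanically. The following is a rough sketch that pulls the Stable cutoffs out of a "Cutoffs:" info log line and compares sandvine's numbers against each one. The regular expression and the names are illustrative only, and the sketch deliberately does not model how an authority combines the uptime and MTBF tests.

```python
import re

# Matches the Stable part of the "Cutoffs:" info-level log line quoted above.
CUTOFF_RE = re.compile(r"For Stable, (\d+) sec uptime, (\d+)\s+sec MTBF")

def parse_stable_cutoffs(log_line):
    """Return (uptime_cutoff, mtbf_cutoff) in seconds from a Cutoffs log line."""
    match = CUTOFF_RE.search(log_line)
    if match is None:
        raise ValueError("no Stable cutoffs found in log line")
    return int(match.group(1)), int(match.group(2))

# moria1's line, rejoined from the wrapped text in this comment.
moria1_log = ("Sep 27 00:48:12.217 [info] Cutoffs: For Stable, 263177 sec uptime, "
              "430457 sec MTBF. For Fast: 20480 bytes/sec.")
uptime_cutoff, mtbf_cutoff = parse_stable_cutoffs(moria1_log)

sandvine_mtbf = 537295.6          # "mtbf" from the dbg-stability entry
sandvine_uptime = 8 * 24 * 3600   # "up for 8 days", taken as a round figure

mtbf_ok = sandvine_mtbf >= mtbf_cutoff
uptime_ok = sandvine_uptime >= uptime_cutoff
print(uptime_cutoff, mtbf_cutoff, uptime_ok, mtbf_ok)
```

Both comparisons pass against moria1's cutoffs, which is consistent with the expectation that moria1 should rate sandvine Stable.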

So perhaps moria1 isn't rating it Stable, and there's a bug there. Or perhaps
there are some other authorities not rating it Stable when they should?

Perhaps one of the next steps is to figure out a way to read v3 votes? E.g.
make them available reliably at some url, or write them to the datadir.

comment:4 Changed 11 years ago by nickm

I added more code in r17003 to log all the other things that are supposed to influence the stability calculation.

comment:5 Changed 11 years ago by arma

The plot thickens!

moria1, dizum, tor26 think sandvine is stable.
ides, dannenberg, gabelmoo think it is not.
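
A three-to-three split would explain the missing flag. This small sketch assumes that the consensus assigns a flag only when strictly more than half of the counted votes list it, and that these six votes are the only ones counted. The function name is hypothetical.

```python
def flag_in_consensus(votes):
    """Return True if strictly more than half of the votes set the flag."""
    yes = sum(1 for has_flag in votes.values() if has_flag)
    return yes > len(votes) / 2

# Stable votes for sandvine, as listed in this comment.
stable_votes = {
    "moria1": True,
    "dizum": True,
    "tor26": True,
    "ides": False,
    "dannenberg": False,
    "gabelmoo": False,
}

sandvine_is_stable = flag_in_consensus(stable_votes)
print(sandvine_is_stable)
```

Under that assumption a tie is not enough, so one authority changing its vote would decide whether sandvine gets the flag.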

comment:6 Changed 11 years ago by arma

"Perhaps one of the next steps is to figure out a way to read v3 votes? E.g.
make them available reliably at some url, or write them to the datadir."

Implemented in r17008

comment:7 Changed 11 years ago by arma

ides's stability measurements are all wonky.
http://fscked.org:9030/tor/dbg-stability.txt

comment:8 Changed 11 years ago by arma

(Didn't we have a bug with the reliability stats file at one point, and we made
a backward incompatible change, and figured hey, no problem, we'll just get
the authorities to delete the old one?)

comment:9 Changed 11 years ago by nickm

That is indeed so. Had ides not ever deleted the old one? Is this bug still there?

comment:10 Changed 11 years ago by arma

I had vaguely thought that ides was born after that bug got fixed. I had
thought it was just moria1 and tor26 that needed to delete their stability
files.

My understanding is that the bug remains. Part of the challenge is that we
don't have any good stats scripts to compare which relay is which.

For example, I just did some poking through the v3-status-votes file, and
everybody marks gabelmoo as Stable but ides. Bug or different perspective or
what? We don't have any good way to know.

comment:11 Changed 11 years ago by nickm

So probably what we want to debug this is a dbg-stability.txt *and* a vote from each authority for a given timeslice,
so we can figure out what's up with the authorities that vote strangely.
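
As a starting point for that kind of comparison, here is a minimal sketch of reading a dbg-stability.txt dump into per-relay records. It assumes the layout seen in the sandvine entry quoted in comment 3: a "router" line opens a record, and each following non-blank line is a keyword followed by its value. The parser and its field handling are illustrative, not an existing tool.

```python
def parse_stability_dump(text):
    """Split a dbg-stability.txt dump into one dict per "router" entry."""
    records = []
    current = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # blank lines only separate groups of fields
        key, values = parts[0], parts[1:]
        if key == "router":
            # Assumed field order: identity digest, nickname, address.
            current = {"router": values}
            records.append(current)
        elif current is not None:
            current[key] = " ".join(values)
    return records

sample = """router AW3UQowu8EsOOxf6yqfYwvMlv7A sandvine 69.241.40.20
uptime-started 2008-09-19 11:50:59 UTC
wfu 0.994

weighted-time 870050
weighted-uptime 860142

mtbf 537295.6

weighted-run-length 860142
total-run-weights 1.839504
"""

records = parse_stability_dump(sample)
for record in records:
    print(record["router"][1], record["wfu"], record["mtbf"])
```

With one such dump per authority for the same timeslice, the records could be keyed by identity digest and lined up against the Stable flag in each authority's vote.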

comment:12 Changed 11 years ago by arma

See also bug 969. I'm going to close this one as a duplicate of that one,
since that one has more info about the actual bug.

comment:13 Changed 11 years ago by arma

flyspray2trac: bug closed.
duplicate of 969

comment:14 Changed 7 years ago by nickm

Component: Tor Relay → Tor