Opened 9 months ago

Closed 5 months ago

Last modified 4 months ago

#28563 closed defect (implemented)

Work out how sbws can report excluded relays in the bandwidth file

Reported by: teor Owned by: juga
Priority: Medium Milestone: sbws: 1.1.x-final
Component: Core Tor/sbws Version:
Severity: Normal Keywords: tor-bwauth, sbws-1.0-must-moved-20181128, sbws-11x-final-removed-20190312, sbws-110-proposed, changes-version-minor
Cc: Actual Points:
Parent ID: #28547 Points:
Reviewer: nickm Sponsor:

Description

If we report excluded relays in the bandwidth file, then they will be publicly archived and available for anyone to analyse.

We just need to work out a syntax that makes tor ignore excluded relays.

Child Tickets

TicketStatusOwnerSummaryComponent
#29813closedjugaAdd unmeasured and vote Line KeyValues in the bandwidth-file-specCore Tor/Tor
#29853closedjugasbws should show relays for diagnostics, even when MIN_REPORT has not been reachedCore Tor/sbws

Change History (40)

comment:1 Changed 9 months ago by juga

I see some inconvenients to do this:

  • once we figure out why are being relays excluded, we might not want to keep the same format.
  • we need to wait until longclaw update to the code that publish the files
  • it'd add like around 1000 extra relays with some extra data, though this might not be a problem.
  • the delay that implies creating the spec before

I was thinking either on something temporal:

  1. produce a different file, with the relays excluded and useful data*
  2. implement other script to dump the data to a DB. It sounds kind of crazy, but it might not be much work and from that it's easier to make queries

While currently it's only me accessing to the original data, i can publish the results of that.

*Currently relays excluded, can be because:

  • circuits timeout, this is already in the raw results file
  • when scaling, doesn't find 2 measurements that are at least 1 day away and 5 days recent
  • something else we don't know yet

What about if i try one of the two other approaches before?. Otherwise i'm fine with this.

comment:2 in reply to:  1 Changed 9 months ago by teor

Replying to juga:

I see some inconvenients to do this:

  • once we figure out why are being relays excluded, we might not want to keep the same format.

I'm not sure what you mean here.
Do you think we'll change the bandwidth file format?
I expect that we'll add extra keys to the relay lines that count the relays excluded at each stage. Then we will add more keys for the stages that exclude a lot of relays.

  • we need to wait until longclaw update to the code that publish the files

Or you or micah can sync the file to a public web server, like people.torproject.org.
Most of the other directory authority operators sync their bandwidth files somewhere public.

  • it'd add like around 1000 extra relays with some extra data, though this might not be a problem.

I don't think it's a problem.

  • the delay that implies creating the spec before

I'm not sure what you mean here.
Are you concerned that the spec will take too much time?
It is ok to try a few things in the code, then update the spec.
And I can work on the spec next week.

I was thinking either on something temporal:

  1. produce a different file, with the relays excluded and useful data*
  2. implement other script to dump the data to a DB. It sounds kind of crazy, but it might not be much work and from that it's easier to make queries

While currently it's only me accessing to the original data, i can publish the results of that.

*Currently relays excluded, can be because:

  • circuits timeout, this is already in the raw results file
  • when scaling, doesn't find 2 measurements that are at least 1 day away and 5 days recent
  • something else we don't know yet

Let's make a pad and list all the reasons from the code:
https://pad.riseup.net/p/sbws-exclude-reasons-keep

I think there are a lot more, see the children of #28547.

What about if i try one of the two other approaches before?. Otherwise i'm fine with this.

Relay operators, authority operators, and developers need to be able to find out why a relay isn't being measured.

If we put that information in the bandwidth file, authorities serve the bandwidth file, and metrics archives it, then the information is public and available.

Your other options are good for you to do a quick analysis. But they do not help everyone else.

comment:3 Changed 9 months ago by teor

To complete this ticket, we should work out how to make all supported versions of tor ignore a bandwidth file line. We should put the excluded relays at the end of the file, so they are not put in the header in the consensus.

Here are the things we should test:

  • something that stops tor parsing the file
  • no bw= key
  • a bw=0 key

comment:4 Changed 9 months ago by teor

Keywords: sbws-1.0-must-moved-20181128 added
Milestone: sbws 1.0 (MVP must)sbws 1.0.4

Moving all sbws 1.0 must planning and feature tickets to 1.0.4.

comment:5 Changed 9 months ago by teor

Milestone: sbws 1.0.4sbws 1.1

Milestone renamed

comment:6 Changed 9 months ago by teor

Milestone: sbws 1.1sbws: 1.1.x

Milestone renamed

comment:7 Changed 9 months ago by teor

Milestone: sbws: 1.1.xsbws: 1.1.x-final

Milestone renamed

comment:8 Changed 7 months ago by teor

Long-term, we might want to put excluded relays in their own section, after a section terminator =====.

comment:9 Changed 5 months ago by juga

Keywords: sbws-11x-final-removed-20190312 sbws-110-proposed changes-version-minor added
Milestone: sbws: 1.1.x-finalsbws: 1.1.0

Move tickets that imply a minor version change to 1.1.0.

comment:10 Changed 5 months ago by juga

After the other sibling tickets are merged, these would be added KeyValues[0]:

Header KeyValues:

recent_consensus_count=1
recent_measurement_attempt_count=15
recent_measurement_exclusion_not_distanciated_count=0
recent_measurement_exclusion_not_min_num_count=4
recent_measurement_exclusion_not_recent_count=0
recent_measurement_exclusion_not_success_count=11
recent_measurement_failure_count=0
recent_priority_list_count=1
recent_priority_relay_count=15

Bandwidth lines KeyValues:

error_destination=0
error_second_relay=0
relay_recent_measurement_exclusion_not_distanciated_count=0
relay_recent_measurement_exclusion_not_min_num_count=1
relay_recent_measurement_exclusion_not_recent_count=0
relay_recent_measurement_exclusion_not_success_count=0
relay_recent_priority_list_count=1
relay_recent_measurement_attempt_count=1

[0] The value doesn't matter, it's just a copy paste of a test.
There are no KeyValues with the word new, only recent, to do not have to parse an old bandwidth file.
If after running this for 5 days (default recent number of days) with the public network we decide that we still need the new keys, we can create new tickets.

comment:11 Changed 5 months ago by juga

12:11 < teor4> juga: can you add a comment on #28563 that links to the commit that lists all the relays in the bandwidth file, even the

excluded ones?

Sorry i understood all the KeyValues, not all the relays.

No, there is not such a commit, and even when we implement #28158, we might not see all the relays because of some other type of problem (Network problems, programming bugs): https://github.com/juga0/sbws/commit/6fc0c16fc756f9cf4820b94d60b26c59b350a125.

What would you suggest to do with this ticket or #28158?

comment:12 Changed 5 months ago by teor

I think we should do two things:

  1. show all the relays in the consensus in the sbws bandwidth file
  2. when sbws wants tor to exclude a relay from the vote, make tor ignore the bandwidths for that relay

You can do whatever you need to do to the tickets to make these things happen.

comment:13 in reply to:  12 ; Changed 5 months ago by juga

Replying to teor:

I think we should do two things:

  1. show all the relays in the consensus in the sbws bandwidth file
  2. when sbws wants tor to exclude a relay from the vote, make tor ignore the bandwidths for that relay

You can do whatever you need to do to the tickets to make these things happen.

Just to clarify whether i understand this correctly, from https://trac.torproject.org/projects/tor/ticket/28158#comment:13:

One thing would be to include all the "non-eligible relays" lines, as lines that Tor will not currently parse.

Let's change the relays that are in the bandwidth file, but tor won't vote on them, in #28563.

Ok, so i'll add non-eligible relays without bw key so Tor doesn't parse them.

Other different thing is to report all "eligible relays", even if they are not the 60% of the relays in the network.

So, from 1. above, i'll include all relays (eligible and not eligible) without the bw key so Tor doesn't parse them, when they're not the 60%.

Correct?

comment:14 in reply to:  13 ; Changed 5 months ago by teor

Replying to juga:

Replying to teor:

I think we should do two things:

  1. show all the relays in the consensus in the sbws bandwidth file
  2. when sbws wants tor to exclude a relay from the vote, make tor ignore the bandwidths for that relay

You can do whatever you need to do to the tickets to make these things happen.

Just to clarify whether i understand this correctly, from https://trac.torproject.org/projects/tor/ticket/28158#comment:13:

One thing would be to include all the "non-eligible relays" lines, as lines that Tor will not currently parse.

Let's change the relays that are in the bandwidth file, but tor won't vote on them, in #28563.

Ok, so i'll add non-eligible relays without bw key so Tor doesn't parse them.

Other different thing is to report all "eligible relays", even if they are not the 60% of the relays in the network.

So, from 1. above, i'll include all relays (eligible and not eligible) without the bw key so Tor doesn't parse them, when they're not the 60%.

Correct?

I think I know what you mean, but I am not sure. Please use shorter sentences?

Finding Every Relay

Relay operators want to know about their relays.
So we need to know what sbws thinks about every relay.
We can find out what sbws thinks by putting some information about every relay in every bandwidth file.

It doesn't matter if we are excluding the relays due to eligibility, or 60% measured, or some other rule. We want all the relays in the file.

There are two different ways to find every relay:

  1. all the relays which sbws has tried to test in the last 5 days
  2. all the relays in the current consensus

I think option 1 is better, because it includes more relays. But option 2 is also ok. You can choose the option that is easier or faster. Please document the option that you choose.

Making Tor Ignore Relays

We don't want this change to change the relays that tor is voting on. So we want tor to ignore the new relays added by this change.

Tor ignores lines without a "bw=" key. But Tor 0.3.5 and later log an "Incomplete bandwidth file" warning for every incomplete line, every time they parse the file. That's not ok: we can not fill up authority operators logs with warnings.

Instead, you could use "bw=0 unmeasured=1 vote=0" and patch tor so that it ignores relay lines with "vote=0". Can you open another ticket for this Tor change?

Tor versions without the patch will give more 0 votes to unmeasured relays, so they will get lower bandwidth values in the consensus. That's ok.

We might want to vote 0 for unmeasured relays in a future sbws or tor version. That's why I added the "unmeasured=1" key. But it's not a change I want to make now, because you only have 10 paid days left. Can you open another ticket for this sbws change?

Please also document the way that you make tor ignore relays.

comment:15 in reply to:  14 Changed 5 months ago by juga

Replying to teor:

Tor ignores lines without a "bw=" key. But Tor 0.3.5 and later log an "Incomplete bandwidth file" warning for every incomplete line, every time they parse the file. That's not ok: we can not fill up authority operators logs with warnings.

Sorry, i forgot about the warning.

Instead, you could use "bw=0 unmeasured=1 vote=0" and patch tor so that it ignores relay lines with "vote=0". Can you open another ticket for this Tor change?

We have always say to do not give "bw=0" but "bw=1" because 0 can cause problems in Tor.
Shouldn't "bw=0" be "bw=1"?.

Tor versions without the patch will give more 0 votes to unmeasured relays, so they will get lower bandwidth values in the consensus. That's ok.

We might want to vote 0 for unmeasured relays in a future sbws or tor version.

Hmm you mean to vote that the relay has 0 bandwidth?, right? (not to don't vote for the unmeasured relay).

That's why I added the "unmeasured=1" key. But it's not a change I want to make now, because you only have 10 paid days left. Can you open another ticket for this sbws change?

Hmm, if i add "unmeasured=1" and make a Tor patch that ignores lines with "unmeasured=1", then we do not need to add "vote=0" and in a future Tor version we can still change Tor to vote on those lines, right?.

comment:16 in reply to:  14 ; Changed 5 months ago by juga

Replying to teor:

Instead, you could use "bw=0 unmeasured=1 vote=0" and patch tor so that it ignores relay lines with "vote=0". Can you open another ticket for this Tor change?

Regarding the word "unmeasured", i'm not sure whether it should be named "excluded" or something similar, since maybe the relay was measured but excluded for some other reason.
I guess this word is not very important though.

comment:17 Changed 5 months ago by juga

Status: newneeds_review

To get this in review this week, not waiting for teor answers, i implemented "bw=1 unmeasured=1 vote=0".
If it needs changes after teor answer, the changes are fast/easy to do.
To review only the last commit:
https://github.com/torproject/sbws/pull/353

comment:18 Changed 5 months ago by juga

Owner: set to juga
Status: needs_reviewassigned

set assigned

comment:19 Changed 5 months ago by juga

Status: assignedneeds_review

set back to review

comment:20 Changed 5 months ago by asn

Reviewer: asn

comment:21 Changed 5 months ago by asn

Reviewer: asnnickm

comment:22 in reply to:  16 ; Changed 5 months ago by teor

Replying to juga:

Replying to teor:

Tor ignores lines without a "bw=" key. But Tor 0.3.5 and later log an "Incomplete bandwidth file" warning for every incomplete line, every time they parse the file. That's not ok: we can not fill up authority operators logs with warnings.

Sorry, i forgot about the warning.

I only found out about the warning when I tested your design on an authority,

Instead, you could use "bw=0 unmeasured=1 vote=0" and patch tor so that it ignores relay lines with "vote=0". Can you open another ticket for this Tor change?

We have always say to do not give "bw=0" but "bw=1" because 0 can cause problems in Tor.
Shouldn't "bw=0" be "bw=1"?.

Yes, I think 1 is better.

Tor versions without the patch will give more 0 votes to unmeasured relays, so they will get lower bandwidth values in the consensus. That's ok.

We might want to vote 0 for unmeasured relays in a future sbws or tor version.

Hmm you mean to vote that the relay has 0 bandwidth?, right? (not to don't vote for the unmeasured relay).

Vote 0 bandwidth for the relay. Or vote 1 bandwidth for the relay.

That's why I added the "unmeasured=1" key. But it's not a change I want to make now, because you only have 10 paid days left. Can you open another ticket for this sbws change?

What is the ticket number of the new ticket for the sbws change to vote on unmeasured relays?

Hmm, if i add "unmeasured=1" and make a Tor patch that ignores lines with "unmeasured=1", then we do not need to add "vote=0" and in a future Tor version we can still change Tor to vote on those lines, right?.

We could change Tor. But that takes 6-12 months from change to release to authority deployment.
So I want to be able to change sbws.

To make changing sbws easy, I want to separate reasons and actions.
"unmeasured=1" is the reason, "vote=0" is the suggested action.
In future, sbws might have more reasons, or sbws might suggest different actions.

Replying to juga:

Replying to teor:

Instead, you could use "bw=0 unmeasured=1 vote=0" and patch tor so that it ignores relay lines with "vote=0". Can you open another ticket for this Tor change?

What is the ticket number of the new ticket for the Tor patch to ignore vote=0?

Regarding the word "unmeasured", i'm not sure whether it should be named "excluded" or something similar, since maybe the relay was measured but excluded for some other reason.
I guess this word is not very important though.

No, I mean "unmeasured", because that's the word we use in the consensus, when a relay doesn't have enough votes with measured bandwidths.

Last edited 5 months ago by teor (previous) (diff)

comment:23 Changed 5 months ago by teor

Please also document the way that you make tor ignore relays.

We'll also need a ticket to update the bandwidth file spec. It will help the reviewer if you update the spec at the same time as the code.

comment:24 in reply to:  22 Changed 5 months ago by teor

Replying to teor:

Replying to juga:

Replying to teor:

Instead, you could use "bw=0 unmeasured=1 vote=0" and patch tor so that it ignores relay lines with "vote=0". Can you open another ticket for this Tor change?

What is the ticket number of the new ticket for the Tor patch to ignore vote=0?

Oh, I see, it is #29806.

comment:25 in reply to:  22 Changed 5 months ago by juga

Replying to teor:
[...]

To make changing sbws easy, I want to separate reasons and actions.
"unmeasured=1" is the reason, "vote=0" is the suggested action.
In future, sbws might have more reasons, or sbws might suggest different actions.

ok, i see. No need to change then the PR then.

comment:26 in reply to:  23 ; Changed 5 months ago by juga

Replying to teor:

Please also document the way that you make tor ignore relays.

We'll also need a ticket to update the bandwidth file spec. It will help the reviewer if you update the spec at the same time as the code.

Created #29813, will work on it in the next hours.

comment:27 in reply to:  26 Changed 5 months ago by juga

Replying to juga:

Replying to teor:

Please also document the way that you make tor ignore relays.

We'll also need a ticket to update the bandwidth file spec. It will help the reviewer if you update the spec at the same time as the code.

Created #29813, will work on it in the next hours.

Done.

I put this ticket as the parent of #29806 and #29813, since they might need to be reviewed with this.

comment:28 Changed 5 months ago by nickm

Left a code review on github; I think this looks okay as a design, though we need to make sure we don't actually give this output to any tor authority versions that aren't running with #29806.

I also set myself as reviewer of, and did review on, the child tickets.

comment:29 Changed 5 months ago by nickm

Status: needs_reviewneeds_revision

comment:30 in reply to:  28 Changed 5 months ago by teor

Replying to nickm:

Left a code review on github; I think this looks okay as a design, though we need to make sure we don't actually give this output to any tor authority versions that aren't running with #29806.

We'd really like to deploy this change in sbws, even if the authorities haven't upgraded to a version of Tor with #29806. (Or even if we decide not to backport #29806.)

I made some additions to the spec on the pull request, to explain why that's ok:

Because Tor versions 0.3.4.11, 0.3.5.8, 0.4.0.2-alpha, and earlier
ignore this KeyValue, generator implementations MUST set "bw=1"
for a relay when Tor directory authorities SHOULD ignore its relay
line. Using the minimum bw value allows authorities that do not
understand "vote=0" or "unmeasured=1" to produce sensible votes
for unmeasured relays.

comment:31 Changed 5 months ago by juga

Status: needs_revisionneeds_review

nickm, i think i've solved all that you commented.

comment:32 Changed 5 months ago by juga

Cc: pastly juga removed

wops, sorry pastly for the spam

comment:33 Changed 5 months ago by nickm

Status: needs_reviewneeds_revision

lgtm now

comment:34 Changed 5 months ago by nickm

Status: needs_revisionmerge_ready

whoops.

comment:35 Changed 5 months ago by teor

Juga, after you merge this code, please check that the number of relays in the bandwidth file is about the same as the number of relays in the consensus.

comment:36 in reply to:  35 ; Changed 5 months ago by juga

Replying to teor:

Juga, after you merge this code, please check that the number of relays in the bandwidth file is about the same as the number of relays in the consensus.

ok, i'm merging everything in my master branch (not in git.tpo) and i'll run it for at at least 1 loop in a test server, then check number of relays lines is aprox. the same as relays in the consensus.

comment:37 Changed 5 months ago by juga

merged, but can't close cause of children

comment:38 in reply to:  36 Changed 5 months ago by juga

Replying to juga:

Replying to teor:

Juga, after you merge this code, please check that the number of relays in the bandwidth file is about the same as the number of relays in the consensus.

ok, i'm merging everything in my master branch (not in git.tpo) and i'll run it for at at least 1 loop in a test server, then check number of relays lines is aprox. the same as relays in the consensus.

Yeah, it's aprox. the same.

comment:39 in reply to:  37 Changed 5 months ago by juga

Resolution: implemented
Status: merge_readyclosed

Replying to juga:

merged, but can't close cause of children

All children now closed.

comment:40 Changed 4 months ago by teor

Milestone: sbws: 1.1.0sbws: 1.1.x-final

Move sbws 1.1.0 tickets into 1.1.x-final, to match Tor's milestone scheme.

Note: See TracTickets for help on using tickets.