Opened 7 months ago

Last modified 6 months ago

#33832 needs_information defect

For relays that change ip, only the measurements with the last ip are kept

Reported by: juga Owned by: gk
Priority: Medium Milestone: sbws: 1.1.x-final
Component: Core Tor/sbws Version: sbws: 1.1.0
Severity: Normal Keywords: sbws-roadmap, intro, GeorgKoppen202004
Cc: juga, gk Actual Points:
Parent ID: #33121 Points:
Reviewer: Sponsor:

Description

which makes those relays more likely to don't be "eligible" cause don't have enough results and therefore, sbws voting in less relays.

Child Tickets

Change History (9)

comment:2 Changed 7 months ago by juga

Keywords: intro added

To solve this, the function trim_results_ip_changed in resultdump.py should be changed

comment:3 Changed 7 months ago by gk

Keywords: GeorgKoppen202004 added
Owner: set to gk
Status: newassigned

comment:4 in reply to:  2 ; Changed 7 months ago by gk

Status: assignedneeds_information

Replying to juga:

To solve this, the function trim_results_ip_changed in resultdump.py should be changed

What is the intended behavior? Not ignoring previous measurements anymore? Do we still need the trim_results_ip_changed function in that case (you seem to indicate that)?

comment:5 in reply to:  4 ; Changed 6 months ago by juga

Replying to gk:

Replying to juga:

To solve this, the function trim_results_ip_changed in resultdump.py should be changed

What is the intended behavior? Not ignoring previous measurements anymore? Do we still need the trim_results_ip_changed function in that case (you seem to indicate that)?

Hmm, i thought that we could ignore the other ips but still keep the results. After your questions, i realize that it'd not make sense to do that.
So, we can probably close this ticket as invalid. Maybe let fix other tickets first and then rethink about this one?

comment:6 in reply to:  5 ; Changed 6 months ago by gk

Replying to juga:

Replying to gk:

Replying to juga:

To solve this, the function trim_results_ip_changed in resultdump.py should be changed

What is the intended behavior? Not ignoring previous measurements anymore? Do we still need the trim_results_ip_changed function in that case (you seem to indicate that)?

Hmm, i thought that we could ignore the other ips but still keep the results. After your questions, i realize that it'd not make sense to do that.
So, we can probably close this ticket as invalid. Maybe let fix other tickets first and then rethink about this one?

We can do that. But why do we keep only the measurements just with the last IP address? I mean it's still the *same* relay, just has a new IP address (you could probably argue that the IP address is just an attribute like an option in the torrc file and we do not exclude measurements when any of those changes). Is the assumption that bandwidth improves with a new IP address? Or is Torflow doing that and sbws is trying to mimic it? Or...?

I looked at the git history of this code to figure out whether there is some rationale given in a commit message or ticket it points to or some older comment but did not find any hint.

comment:7 Changed 6 months ago by arma

I would expect that we would aggregate all our results of testing the relay "somewhat recently", regardless of what IP address it was on when we did each test.

Otherwise, as juga says, we're more likely to end up relays on dynamic ip addresses being unmeasured.

I think this behavior that I describe ("use them all") is what Torflow does, too, but I'm not certain.

comment:8 in reply to:  6 ; Changed 6 months ago by juga

Replying to gk and arma:

i'm replying to both of you, last comments.

So, my first thought was, yeah, the relay is the same, it's probably changing IP cause of having dynamic IP address, so we should keep the recent measurements with a different IP.

Then, i thought, what if the relay changes IP cause it changes physical location? (while keeping the keys/configuration), in that case the measurements could actually be quite different and the previous ones would not make sense.

So we have at least 2 cases in which a relay might change IP and think the first one is the more common one.

I don't know what's was the initial thought to implement this, i also didn't find i in the git log or tickets.

I realize now that we don't need to change the code, we can just change the default configuration to set reset_bw_ipv4_changes to false.

I don't know what Torflow does and it's not so easy to check.

What we can do, it's to wait a few days until longclaw uses nightly changes, see how many relays it's measuring with the current fixes and see whether we should change this.

comment:9 in reply to:  8 Changed 6 months ago by gk

Replying to juga:

Replying to gk and arma:

i'm replying to both of you, last comments.

So, my first thought was, yeah, the relay is the same, it's probably changing IP cause of having dynamic IP address, so we should keep the recent measurements with a different IP.

Then, i thought, what if the relay changes IP cause it changes physical location? (while keeping the keys/configuration), in that case the measurements could actually be quite different and the previous ones would not make sense.

So we have at least 2 cases in which a relay might change IP and think the first one is the more common one.

I don't know what's was the initial thought to implement this, i also didn't find i in the git log or tickets.

I realize now that we don't need to change the code, we can just change the default configuration to set reset_bw_ipv4_changes to false.

Yep, that's what I came up with as well.

I don't know what Torflow does and it's not so easy to check.

What we can do, it's to wait a few days until longclaw uses nightly changes, see how many relays it's measuring with the current fixes and see whether we should change this.

I am fine with that.

Note: See TracTickets for help on using tickets.