Opened 8 years ago

Closed 14 months ago

#1854 closed task (wontfix)

Investigate raising the minimum bandwidth for getting the Fast flag

Reported by: arma
Owned by: arma
Priority: Medium
Milestone:
Component: Metrics/Analysis
Version:
Severity:
Keywords: performance loadbalancing
Cc: karsten, gsathya, asn, robgjansen, aaron.m.johnson@…, iang, mo, adrelanos@…, zen@…
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by arma)

Mike's performance work has shown that the smaller relays -- for example, the ones that set BandwidthRate and BandwidthBurst to 20KB/s -- are never good news to have in your circuit.

Damon McCoy's HotPETs 2010 paper showed in more detail how you could improve performance by dumping the bottom X% of the relays.

Of course, there's a network effect issue here: clearly you get better performance if you're the only one ignoring the slower relays.

But I think there's something to this even when everybody is doing it. Our load balancing makes a 500KB/s relay 10x more likely to be used than a 50KB/s relay, but given a whole lot of users building paths, the 50KB/s relay will get overloaded more often and show worse characteristics when overloaded than the 500KB/s relay -- in large part because we're load balancing by circuit rather than by byte.
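To make the load-balancing point concrete, here is a minimal sketch of weight-proportional selection (not Tor's actual path-selection code): with weights 500 and 50, the faster relay is chosen about 10x as often, yet each selection assigns it a whole circuit regardless of how many bytes that circuit will carry.

{{{
#!python
import random

# Toy consensus: (nickname, weight). A relay is picked with
# probability proportional to its weight, so "fast" is chosen
# roughly 10x as often as "slow".
relays = [("fast", 500), ("slow", 50)]

def pick_relay(relays):
    names, weights = zip(*relays)
    return random.choices(names, weights=weights, k=1)[0]

counts = {"fast": 0, "slow": 0}
for _ in range(10000):
    counts[pick_relay(relays)] += 1
print(counts)  # roughly {'fast': 9090, 'slow': 910}
}}}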

So I'd like to do a series of performance experiments where the directory authorities take away the Fast flag from everybody whose consensus bandwidth is under X.

Ideally we'd do it while the network is under a variety of load conditions (removing capacity from the network when there's a lot of load seems like it would hurt us more, but then, using overloaded relays when there's a lot of load could hurt us a lot too).

This could even be a research task that we try to give to a research group that wants to work on simulated Tor network performance. But I think that's a separate project.

Along with the performance simulations we need to consider the anonymity implications of reducing the diversity of relays. How much anonymity do we lose if we treat anonymity as entropy? How much do we lose if we consider the location-based anonymity metrics of Feamster or Edman? Ideally we'd figure out some way to compare performance and anonymity so we can decide if we like various points in the tradeoff space. Really, we should be working on this piece already to analyze whether Mike's bwauth algorithm is worth it.
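For reference, "anonymity as entropy" here presumably means the standard degree-of-anonymity formulation (an assumption; the ticket doesn't pin down the exact metric):

{{{
H = -sum_{i=1..n} p_i * log2(p_i)    # entropy of the relay-selection distribution
d = H / H_max = H / log2(n)          # degree of anonymity; 1.0 means uniform selection
}}}

where p_i is the probability of a client selecting relay i.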

Finally, should we consider keeping slow relays in the network if they have nice exit policies?

Relays that are too slow should be encouraged to become bridges. Even better, we should help people recognize when they ought to start out as a bridge rather than trying to be a relay.

Child Tickets

Attachments (10)

degree-of-anonymity-min-cw-2012-09-18.png (59.0 KB) - added by karsten 6 years ago.
entropy-min-cw-2012-09-18.png (65.2 KB) - added by karsten 6 years ago.
0001-store-all-server-descs-in-memory.patch (2.9 KB) - added by gsathya 6 years ago.
linf-min-adv-bw-2012-11-26.pdf (135.1 KB) - added by karsten 6 years ago.
linf-min-adv-bw-2012-11-26-a.pdf (143.7 KB) - added by karsten 6 years ago.
linf-min-adv-bw-2012-11-26-b.pdf (158.2 KB) - added by karsten 6 years ago.
linf-min-adv-bw-2012-11-26-c.pdf (199.9 KB) - added by karsten 6 years ago.
linf-min-adv-bw-2012-11-26-c.2.pdf (199.9 KB) - added by karsten 6 years ago.
linf-min-adv-bw-2012-11-27.pdf (342.6 KB) - added by karsten 6 years ago.
linf-min-adv-bw-2012-11-27-a.pdf (357.3 KB) - added by karsten 6 years ago.


Change History (70)

comment:1 Changed 8 years ago by Sebastian

What about still using them for directory information? Seems like once we have microdescriptors out, they could cache these and distribute the new descriptors to clients (though bootstrapping off them might be a bit painful).

comment:2 Changed 8 years ago by arma

Description: modified (diff)

comment:3 in reply to:  description ; Changed 8 years ago by arma

Component: Tor Relay → Metrics
Milestone: Deliverable-Mar2011
Owner: set to karsten

Replying to arma:

So I'd like to do a series of performance experiments where the directory authorities take away the Running flag from everybody whose consensus bandwidth is under X.

A cleaner way to run the experiment would be to take away their Fast flag. I could imagine putting in a consensus param that authorities look at (off by default), so we can easily modify the values over time and see how the torperf output changes.

We should also ponder if we really mean consensus bandwidth here, or if we mean relay descriptor bandwidth. Currently the Fast flag is assigned based on relay descriptor bandwidth.
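For reference, such a parameter would be set in the directory authorities' torrc via ConsensusParams; the parameter name below is the one #3946 later introduced, and the value and units are illustrative assumptions:

{{{
# In each directory authority's torrc (hypothetical value; units
# assumed to be bytes/second):
ConsensusParams FastFlagMinThreshold=102400
}}}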

comment:4 Changed 8 years ago by arma

Component: Metrics → Analysis

comment:5 Changed 7 years ago by karsten

Summary: Project: Raise the minimum bandwidth for being a relay? → Investigate raising the minimum bandwidth for being a relay
Type: enhancement → task

This sounds like an analysis task among many others we should work on. Removing the "Project: " part from the summary. If this is really a project, please change ticket type to "project."

comment:6 in reply to:  3 Changed 7 years ago by arma

Replying to arma:

A cleaner way to run the experiment would be to take away their Fast flag. I could imagine putting in a consensus param that authorities look at (off by default), so we can easily modify the values over time and see how the torperf output changes.

Added as #3946.

comment:7 Changed 7 years ago by arma

Description: modified (diff)
Keywords: performance added
Summary: Investigate raising the minimum bandwidth for being a relay → Investigate raising the minimum bandwidth for getting the Fast flag

Modified this task to be more clearly about analyzing the effects of taking the Fast flag away.

comment:8 Changed 7 years ago by arma

Keywords: loadbalancing added

comment:9 Changed 7 years ago by karsten

Owner: changed from karsten to arma
Status: new → assigned

Why is this ticket assigned to me? Am I supposed to do something here? If so, please tell me and re-assign to me. Assigning to ticket reporter for now.

comment:10 Changed 6 years ago by arma

Cc: karsten gsathya asn robgjansen added

Based on Rob's CSET paper, I am now less optimistic that we can answer this question with simulations: messing with what relays make up a test network is among the least solved pieces of simulating Tor networks.

So I think we should proceed in two directions:

A) We should get gsathya or asn or whoever to confirm that dropping relays with bandwidth less than X doesn't change any of the diversity metrics much (because they're never picked often enough to matter). What's the largest X for which you can reasonably say that?

B) Then we should do an actual performance experiment on the live Tor network, using the FastFlagMinThreshold consensus param added in #3946, and see what we see on torperf.

Judging the performance experiment on the live Tor network will be especially messy because there are so many variables, but I think despite that it may still be the best route.

comment:11 in reply to:  10 Changed 6 years ago by robgjansen

Replying to arma:

Based on Rob's CSET paper, I am now less optimistic that we can answer this question with simulations: messing with what relays make up a test network is among the least solved pieces of simulating Tor networks.

Unless we use rpw's machine to simulate all of the existing relays. Then we don't have to worry about downsampling problems. ;)

comment:12 in reply to:  10 ; Changed 6 years ago by karsten

Replying to arma:

A) We should get gsathya or asn or whoever to confirm that dropping relays with bandwidth less than X doesn't change any of the diversity metrics much (because they're never picked often enough to matter).

So, is this ticket about dropping relays from the consensus, or taking away their Fast flag? I can see how we can graph the former, but I'm not sure about the latter.

What's the largest X for which you can reasonably say that?

Sounds like we want #6232 graphs with the minimum bandwidth to keep relays in the consensus on the X axis. For example, a graph similar to https://trac.torproject.org/projects/tor/attachment/ticket/6232/entropy-august.png would have its blue lines decreasing steadily, because we're taking away relays, but the red lines would stay on the same level and only drop in the last third or so, because we start taking away relays from the slowest ones.

Is that what you have in mind here?

gsathya, asn, is this something you want to look into?

comment:13 in reply to:  12 ; Changed 6 years ago by gsathya

Replying to karsten:

gsathya, asn, is this something you want to look into?

Yep. What do you want the output of the script to look like?

comment:14 in reply to:  12 ; Changed 6 years ago by arma

Replying to karsten:

So, is this ticket about dropping relays from the consensus, or taking away their Fast flag? I can see how we can graph the former, but I'm not sure about the latter.

Shouldn't matter much. I guess that leads to: do your consensus diversity analysis tools consider the Fast flag? They probably should, since clients do.

comment:15 in reply to:  12 ; Changed 6 years ago by arma

Replying to karsten:

Sounds like we want #6232 graphs with the minimum bandwidth to keep relays in the consensus on the X axis. For example, a graph similar to https://trac.torproject.org/projects/tor/attachment/ticket/6232/entropy-august.png would have its blue lines decreasing steadily, because we're taking away relays, but the red lines would stay on the same level and only drop in the last third or so, because we start taking away relays from the slowest ones.

Is that what you have in mind here?

Sounds plausible. One nice way of looking at it might be: what's the highest bandwidth cutoff such that the red lines in your graph lose 1% or less? Then the same question for 2%, 3%, 4%, 5%.

Of course, that needs a definition of what it means for two lines to differ. We might try defining the difference as the point x where f1(x) and f2(x) differ the most. If there's noise, we might define it as the 10th percentile of these points x, which would let us say "90% of the time there was at most a 1% difference."
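A minimal sketch of both definitions, assuming f1 and f2 are sampled on a common grid of x values; the "10th percentile of these points" is interpreted here as the 90th percentile of the pointwise differences (discarding the noisiest 10%):

{{{
#!python
import numpy as np

def max_difference(f1, f2):
    # Worst-case pointwise gap between the two lines.
    return np.max(np.abs(f1 - f2))

def robust_difference(f1, f2):
    # "90% of the time there was at most this much difference":
    # the 90th percentile of the pointwise gaps.
    return np.percentile(np.abs(f1 - f2), 90)
}}}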

comment:16 Changed 6 years ago by arma

s/as the point x/at the point x/ and s/as the 10th/at the 10th/

comment:17 Changed 6 years ago by arma

Or you could bust out real stats and use that, if you prefer. :)

comment:18 in reply to:  13 ; Changed 6 years ago by karsten

Replying to gsathya:

Replying to karsten:

gsathya, asn, is this something you want to look into?

Yep. What do you want the output of the script to look like?

Cool! How about a format similar to #6232?

validafter,min_cw,relays,all,max_all,exit,max_exit,guard,max_guard,country,max_country,as,max_as
2012-09-10 01:00:00,1,3040,7.44,11.26,5.79,9.73,6.12,8.99,3.23,6.26,5.44,9.57
2012-09-10 01:00:00,2,[...]

In that output, min_cw is the minimum consensus weight of relays that we keep in the consensus. That value would start at the smallest consensus weight in the network, and we'd calculate entropy values for all relays in the consensus. Then we'd raise the minimum to the second-smallest value in the network, throw out all relays below that value, and compute new entropy values. Continue until we're at the relay with highest consensus weight.

The first column, validafter, is the consensus valid-after time. The third column, relays, contains the number of relays left. The other columns (all, max_all, etc.) are defined similar to #6232.

Roger, please note that I assumed you want to cut out relays based on consensus weight, not advertised bandwidth. Please correct me if that assumption is wrong. (Writing the analysis script for consensus weights is probably easier, so we could later extend it to advertised bandwidth if required.)
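A minimal sketch of that sweep, covering only the validafter, min_cw, relays, and all columns (the exit/guard/country/AS columns would follow the #6232 definitions; this is not the pyentropy.py code itself):

{{{
#!python
import math

def entropy(weights):
    total = sum(weights)
    return -sum((w / total) * math.log(w / total, 2)
                for w in weights if w > 0)

def min_cw_sweep(validafter, relays):
    # relays: {fingerprint: consensus_weight}
    rows = []
    for min_cw in sorted(set(relays.values())):
        kept = [w for w in relays.values() if w >= min_cw]
        rows.append("%s,%d,%d,%.2f" %
                    (validafter, min_cw, len(kept), entropy(kept)))
    return rows
}}}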

comment:19 in reply to:  14 Changed 6 years ago by karsten

Replying to arma:

Replying to karsten:

So, is this ticket about dropping relays from the consensus, or taking away their Fast flag? I can see how we can graph the former, but I'm not sure about the latter.

Shouldn't matter much.

Really?

I guess that leads to: do your consensus diversity analysis tools consider the Fast flag? They probably should, since clients do.

Our tools don't consider the Fast flag. They're only based on relays' consensus weights, their Exit and Guard flags, and the bandwidth-weights line.

Simulating what clients would do, including considering the Fast flag, is almost impossible. Which relays a client picks depends on too many variables, including what other relays are already in the circuit, family settings, and same /16 networks, for us to model reasonably. If we want results this precise, we'll have to run simulations with the actual Tor code.

comment:20 in reply to:  15 Changed 6 years ago by karsten

Replying to arma:

Sounds plausible. One nice way of looking at it might be: what's the highest bandwidth cutoff such that the red lines in your graph lose 1% or less? Then the same question for 2%, 3%, 4%, 5%.

Sure, that's something that the CDF I suggested above should show. We could put percent values on the y axis and start with current diversity at 100%. Then you could read what x value corresponds to 99% (98%, ...).

Of course, that needs a definition of what it means for two lines to differ. We might try defining the difference as the point x where f1(x) and f2(x) differ the most. If there's noise, we might define it as the 10th percentile of these points x, which would let us say "90% of the time there was at most a 1% difference."

Ah, my idea was to start with a single consensus. Combining multiple consensuses would be step 2. (The data format I suggested above should support the graphs you suggest here.)

comment:21 in reply to:  18 Changed 6 years ago by gsathya

Cc: aaron.m.johnson@… added
Status: assigned → needs_review

Replying to karsten:

Cool! How about a format similar to #6232?

validafter,min_cw,relays,all,max_all,exit,max_exit,guard,max_guard,country,max_country,as,max_as
2012-09-10 01:00:00,1,3040,7.44,11.26,5.79,9.73,6.12,8.99,3.23,6.26,5.44,9.57
2012-09-10 01:00:00,2,[...]

In that output, min_cw is the minimum consensus weight of relays that we keep in the consensus. That value would start at the smallest consensus weight in the network, and we'd calculate entropy values for all relays in the consensus. Then we'd raise the minimum to the second-smallest value in the network, throw out all relays below that value, and compute new entropy values. Continue until we're at the relay with highest consensus weight.

The first column, validafter, is the consensus valid-after time. The third column, relays, contains the number of relays left. The other columns (all, max_all, etc.) are defined similar to #6232.

Please review my bug_1854 branch! Thanks!

There seem to be quite a few relays with "None" bandwidth. Should we consider such relays? (They count when calculating the number of relays but don't contribute any bandwidth.)

comment:22 in reply to:  18 ; Changed 6 years ago by arma

Replying to karsten:

Roger, please note that I assumed you want to cut out relays based on consensus weight, not advertised bandwidth. Please correct me if that assumption is wrong. (Writing the analysis script for consensus weights is probably easier, so we could later extend it to advertised bandwidth if required.)

The Fast and Guard flags look at descriptor bandwidth, not consensus bandwidth. So yes, eventually we should do a version of this analysis that looks at descriptor bandwidth.

comment:23 Changed 6 years ago by arma

Cc: iang added

Changed 6 years ago by karsten

Changed 6 years ago by karsten

comment:24 Changed 6 years ago by karsten

Replying to gsathya:

Please review my bug_1854 branch! Thanks!

Fixed two bugs, but otherwise looks good. Merged.

Please see the output graphs here and here.

There seem to be quite a few relays with "None" bandwidth. Should we consider such relays? (They count when calculating the number of relays but don't contribute any bandwidth.)

I think this was caused by your code looking at position-dependent consensus weights, not raw consensus weights. Should be fixed.

comment:25 in reply to:  22 ; Changed 6 years ago by karsten

Status: needs_review → needs_revision

Replying to arma:

Replying to karsten:

Roger, please note that I assumed you want to cut out relays based on consensus weight, not advertised bandwidth. Please correct me if that assumption is wrong. (Writing the analysis script for consensus weights is probably easier, so we could later extend it to advertised bandwidth if required.)

The Fast and Guard flags look at descriptor bandwidth, not consensus bandwidth. So yes, eventually we should do a version of this analysis that looks at descriptor bandwidth.

A version of this analysis that looks at descriptor bandwidth would sort relays by advertised bandwidth and cut off the slowest relays based on that. In the graphs, the x axis would say "Minimum advertised bandwidth" instead of "Minimum consensus weight", and of course the lines might be slightly different. But everything else would remain the same, including how we calculate guard entropy for the "All guards" sub graph.

We'll mostly have to change router.bandwidth to router.advertised_bw a few times in the code. Shouldn't be too hard.

Sathya, do you want to look into this, or shall I?

comment:26 in reply to:  25 ; Changed 6 years ago by gsathya

Status: needs_revision → needs_review

Replying to karsten:

Sathya, do you want to look into this, or shall I?

Done. I just monkey-patched router.bandwidth to router.advertised_bw; I think this is fine for now.

comment:27 Changed 6 years ago by arma

I talked to Ian and Aaron a bit more about this analysis. What we'd like to see, for a given consensus, is a graph with bandwidth cutoff on the x axis and L_\inf on the y axis. L_\inf is the largest distance between the two probability distributions -- one being the probability distribution of which relay you'd pick from the pristine consensus, and the other the distribution in the modified consensus. "largest distance" means the element (i.e. relay) with the largest difference.

Then we should consider time: looking at C consensuses over the past year or something, for a given cutoff, we should graph the cdf of these C data points where each data point is the L_\inf of that consensus for that cutoff. The hope is that for some cutoffs, the cdf has very high area-under-the-curve.

comment:28 in reply to:  27 ; Changed 6 years ago by karsten

Status: needs_review → needs_revision

Replying to arma:

I talked to Ian and Aaron a bit more about this analysis. What we'd like to see, for a given consensus, is a graph with bandwidth cutoff on the x axis and L_\inf on the y axis. L_\inf is the largest distance between the two probability distributions -- one being the probability distribution of which relay you'd pick from the pristine consensus, and the other the distribution in the modified consensus. "largest distance" means the element (i.e. relay) with the largest difference.

Sounds doable. I'd say let's start with plain consensus weight fractions and postpone exit, guard, country, and AS probabilities until we have a better handle on this type of analysis.

A possible output file could look like this:

validafter,min_advbw,relays,linf
2012-09-10 01:00:00,1,3040,0.03553
2012-09-10 01:00:00,2,[...]

Here, validafter is the consensus valid-after time, min_advbw is the minimum advertised bandwidth of relays kept in the modified consensus, relays is the number of those relays, and linf is the largest difference between consensus weight fractions of all relays. The probability in the pristine consensus is always the consensus weight fraction. The probability in the modified consensus is 0 if the relay was excluded, or the consensus weight fraction relative to the new consensus weight sum (which is lower than the original consensus weight sum, because we cut out some relays). We'll want to compare probabilities of all relays, including those that we excluded, because they have non-zero probability in the modified consensus.
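A minimal sketch of that computation (not the pylinf.py code that was eventually merged); each relay is represented as an (advertised bandwidth, consensus weight) pair:

{{{
#!python
def linf(relays, min_advbw):
    # Pristine probabilities use the full consensus weight sum;
    # modified probabilities renormalize over the kept relays and
    # are 0 for excluded relays.
    total = float(sum(cw for _, cw in relays))
    kept = float(sum(cw for advbw, cw in relays if advbw >= min_advbw))
    worst = 0.0
    for advbw, cw in relays:
        pristine = cw / total
        modified = cw / kept if advbw >= min_advbw else 0.0
        worst = max(worst, abs(modified - pristine))
    return worst
}}}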

Then we should consider time: looking at C consensuses over the past year or something, for a given cutoff, we should graph the cdf of these C data points where each data point is the L_\inf of that consensus for that cutoff. The hope is that for some cutoffs, the cdf has very high area-under-the-curve.

Sure, we should be able to plot those graphs from the file format above.

Sathya, want to look into modifying pyentropy.py for the linf stuff?

comment:29 in reply to:  26 Changed 6 years ago by karsten

Replying to gsathya:

Done. I just monkey-patched router.bandwidth to router.advertised_bw; I think this is fine for now.

Hmm. I didn't look very closely, but I think this doesn't work. We'll want to exclude relays based on descriptor bandwidth but calculate the various metrics based on consensus weight. With your patch we're using descriptor bandwidth for everything.

This specific patch is probably moot, now that we're going to change the analysis from entropy values to L_\inf. But we'll want to have a similar patch for the L_\inf stuff, too.

comment:30 in reply to:  28 ; Changed 6 years ago by gsathya

Status: needs_revision → needs_review

Replying to karsten:

Here, validafter is the consensus valid-after time, min_advbw is the minimum advertised bandwidth of relays kept in the modified consensus, relays is the number of those relays, and linf is the largest difference between consensus weight fractions of all relays. The probability in the pristine consensus is always the consensus weight fraction. The probability in the modified consensus is 0 if the relay was excluded, or the consensus weight fraction relative to the new consensus weight sum (which is lower than the original consensus weight sum, because we cut out some relays). We'll want to compare probabilities of all relays, including those that we excluded, because they have non-zero probability in the modified consensus.

"The probability in the modified consensus is 0 if the relay was excluded," and "including those that we excluded, because they have non-zero probability in the modified consensus" seem to be contradicting?

Sathya, want to look into modifying pyentropy.py for the linf stuff?

I'm ignoring the probabilities of relays that we excluded because they have 0 probability. Please check my bug_1854_v2 branch. Thanks!

comment:31 in reply to:  30 ; Changed 6 years ago by karsten

Status: needs_review → needs_revision

Replying to gsathya:

"The probability in the modified consensus is 0 if the relay was excluded," and "including those that we excluded, because they have non-zero probability in the modified consensus" seem to be contradicting?

What I meant is "including those that we excluded, because they have non-zero probability in the pristine consensus". Sorry.

Sathya, want to look into modifying pyentropy.py for the linf stuff?

I'm ignoring the probabilities of relays that we excluded because they have 0 probability. Please check my bug_1854_v2 branch. Thanks!

Can you change the above? I just had a quick look, but I'd want to look closer once it's doing the thing that I think arma et al. had in mind.

And can you either remove pyentropy.py, or move your changes to pyentropy.py and remove pylinf.py, so that there's only the code file that we're actually using in the repository?

Thanks!

comment:32 in reply to:  31 Changed 6 years ago by gsathya

Status: needs_revision → needs_review

Replying to karsten:

Replying to gsathya:

"The probability in the modified consensus is 0 if the relay was excluded," and "including those that we excluded, because they have non-zero probability in the modified consensus" seem to be contradicting?

What I meant is "including those that we excluded, because they have non-zero probability in the pristine consensus". Sorry.

Sathya, want to look into modifying pyentropy.py for the linf stuff?

I'm ignoring the probabilities of relays that we excluded because they have 0 probability. Please check my bug_1854_v2 branch. Thanks!

Can you change the above? I just had a quick look, but I'd want to look closer once it's doing the thing that I think arma et al. had in mind.

Done

And can you either remove pyentropy.py, or move your changes to pyentropy.py and remove pylinf.py, so that there's only the code file that we're actually using in the repository?

Done

Please check my bug_1854_v2 branch. Thanks!

comment:33 Changed 6 years ago by karsten

Status: needs_review → new

Merged with a minor tweak that otherwise would bite us when evaluating the data.

That's part one of the analysis. Next steps are:

  • rewrite plot-entropy.R to visualize a single consensus,
  • run pylinf.py on, say, 1 year of consensuses (and 13 months of server descriptors, to be sure we have all server descriptors referenced from the consensuses),
  • generate graph data for L_\inf for a given min_adv_bw over time, and
  • visualize previously generated graph data, probably using R.

Want to look into one of these next steps?

comment:34 Changed 6 years ago by arma

Over the past few days, the minimum bandwidth for the Fast flag looks like it moved from 32KB/s up to 50KB/s and then back down. So maybe there is data to analyze even without explicitly doing the experiment. :)

Changed 6 years ago by gsathya

comment:35 Changed 6 years ago by gsathya

Status: new → needs_review

Some background from Karsten's email -

I usually take another approach for combining network statuses and server descriptors in an analysis: parse *all* server descriptors, extract the relevant parts, keep them in memory stored under their descriptor digest, parse consensuses, use server descriptor parts from memory. This is faster, because we only have to parse a server descriptor once, not every time it's referenced from a consensus, which can be 12 times or more. There's also the option to store intermediate results from parsing server descriptors in a temp file and only read that when re-running the analysis, which typically happens quite often. This approach is also more efficient, because we can parse server descriptors contained in tarballs without extracting them.

I've changed pylinf to be able to read a single tar file or a bunch of server descriptor tar files and store them in memory. I haven't had the chance to test it much, let me know if you find any bugs.
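A sketch of that parse-once approach using stem's DescriptorReader (attribute names follow stem's server descriptor API, but treat the exact fields as an assumption):

{{{
#!python
from stem.descriptor.reader import DescriptorReader

# Parse every server descriptor in the tarball once and index the
# advertised bandwidth by descriptor digest; consensus entries can
# then look descriptors up by digest instead of re-parsing them.
advertised_bw = {}
with DescriptorReader(["server-descriptors-2012-09.tar"]) as reader:
    for desc in reader:
        advertised_bw[desc.digest()] = min(desc.average_bandwidth,
                                           desc.observed_bandwidth)
}}}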

comment:36 Changed 6 years ago by karsten

Sounds good. Did the code produce meaningful output? I won't be able to review the code today, but I could try tomorrow or Friday. Knowing that the code probably works as expected would be good though. Thanks!

comment:37 in reply to:  36 ; Changed 6 years ago by gsathya

Replying to karsten:

Sounds good. Did the code produce meaningful output? I won't be able to review the code today, but I could try tomorrow or Friday. Knowing that the code probably works as expected would be good though. Thanks!

Made more changes here - https://github.com/gsathya/metrics-tasks/compare/bug_1854_v2 It's been running on my tiny VPS for more than 4 hours, processing 2 months of server descriptors and 1 month of consensus data. I'm going to run this on lemmonni now.

comment:38 in reply to:  37 Changed 6 years ago by gsathya

Replying to gsathya:

Replying to karsten:

Sounds good. Did the code produce meaningful output? I won't be able to review the code today, but I could try tomorrow or Friday. Knowing that the code probably works as expected would be good though. Thanks!

Made more changes here - https://github.com/gsathya/metrics-tasks/compare/bug_1854_v2 It's been running on my tiny VPS for more than 4 hours, processing 2 months of server descriptors and 1 month of consensus data. I'm going to run this on lemmonni now.

1 month of consensus data with 3 months of server descriptor data: http://codesurfers.net/~gsathya/entropy.csv

comment:39 Changed 6 years ago by karsten

Status: needs_review → new

Code looks good, merged. I also graphed your 1 month of data. I'm currently running your script in an EC2 instance on 1 year of data. Will let you know once I have results.

Changed 6 years ago by karsten

comment:40 Changed 6 years ago by karsten

Status: new → needs_information

Running this code on an EC2 m1.large instance took 15 minutes to set up (download and uncompress tarballs) and 9 hours to run.

Here are some results. gsathya, does this look plausible? arma and iang, is this what you had expected?

comment:41 Changed 6 years ago by iang

What's going on at the right end of the linf graph there?

Other than that, the plot shows that setting the cutoff to 1 MB/s (using only the top 400 relays or so) would affect the choice of relays by a tiny amount that I can't read from the graph. (Can you make that graph log/log?)

What is the linf comparison to? A cutoff of 20 KB/s? No cutoff? There are relays appearing in the upper plot with speeds < 20 KB/s.

comment:42 Changed 6 years ago by iang

Can you also plot total advertised bandwidth with the same x-axis? A rough eyeballing of the top figure suggests that the ~2100 relays with bandwidths below 1 MB/s contribute a total of ~500 MB/s, but it seems to me that that should produce more than a negligible change in the probability distribution.

Changed 6 years ago by karsten

comment:43 Changed 6 years ago by karsten

Replying to iang:

What's going on at the right end of the linf graph there?

You mean why is it skyrocketing and then dropping to almost zero? I think when there's only 1 relay left, the probability of it being picked grows to 100%, so linf is 100% minus its previous probability of being picked. And when no relay is left, linf goes down to the maximum probability of a relay being picked in the pristine consensus that now cannot be picked anymore.
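A toy example to make the endpoint behavior concrete: suppose the pristine probabilities are 0.4, 0.35, and 0.25, and the 0.25 relay has the highest advertised bandwidth. Once it is the only relay left, its probability becomes 1.0 and linf = 1.0 - 0.25 = 0.75 (the spike). Once it too is cut, every modified probability is 0 and linf = 0.4, the largest pristine probability (the drop).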

Other than that, the plot shows that setting the cutoff to 1 MB/s (using only the top 400 relays or so) would affect the choice of relays in a tiny amount I can't read from the graph. (Can you make that graph log/log?)

Attached. (I left the original graph in and added another graph for log/log, because number of relays looks funny on a log scale and there's no easy way to use different scales for both sub graphs.)

What is the linf comparison to? A cutoff of 20 KB/s? No cutoff? There are relays appearing in the upper plot with speeds < 20 KB/s.

No cutoff, that is, comparing to the pristine consensus where any relay could be picked. That's how linf is defined in the script right now. We can change that, but new results would be at least 9+ hours away.

Changed 6 years ago by karsten

comment:44 Changed 6 years ago by karsten

Replying to iang:

Can you also plot total advertised bandwidth with the same x-axis? A rough eyeing of the top figure suggests that the ~2100 relays with bandwidths below 1 MB/s contribute a total of ~500 MB/s, but it seems to me that that should produce more than a negligible change in probability distribution.

Attached. Please note that the first graphs were previously wrongly labeled. They showed data from 2011-11-19 23:00:00, not 2012-10-31 23:00:00. The new PDF shows the correct data, including total excluded advertised bandwidth.

comment:45 Changed 6 years ago by iang

OK, so my eyeballing of ~500 MB/s excluded at a 1 MB/s cutoff turned out to be pretty darned close. ;-)

So at that cutoff, about 15% of the network bandwidth disappears. But that 15% was spread (highly unevenly) over 2100 relays. Each of those relays, according to the linf figure, contributed a maximum of about 0.5% of the bandwidth, and in turn, the remaining relays see at most 0.5% extra users. (NOTE: that's 0.5% of *all* the users, not 0.5% of what it had before.)

OK, here's the plot I'm interested in now: x-axis: bandwidth of relay (log scale). y-axis: one line showing the probability distribution of relay selection with a 20 KB/s cutoff, and one with a 1 MB/s cutoff. Feel free to throw other intermediate values in there as well. We'll probably need a version with a linear y-axis and one with a log y-axis.

Is that easy to do?

Changed 6 years ago by karsten

Changed 6 years ago by karsten

comment:46 Changed 6 years ago by karsten

Replying to iang:

OK, here's the plot I'm interested in now: x-axis: bandwidth of relay (log scale). y-axis: one line showing the probability distribution of relay selection with a 20 KB/s cutoff, and one with a 1 MB/s cutoff. Feel free to throw other intermediate values in there as well. We'll probably need a version with a linear y-axis and one with a log y-axis.

This PDF (accidentally uploaded twice, *-c.2.pdf is the same file) now contains two new graphs of cumulative probability distributions, along with the existing graphs.

comment:47 Changed 6 years ago by iang

Can you make it a pdf (a density plot) instead of a cdf? Just scatter-plot (bandwidth, probability) for each relay and join up the points with lines.

Changed 6 years ago by karsten

comment:48 Changed 6 years ago by karsten

Attached. It's not an actual probability distribution function though, because multiple relays can have exactly the same advertised bandwidth, and I figured you don't want a graph with probabilities of those relays being summed up. (Unless you actually wanted such a graph, in which case I could easily make one.)

comment:49 Changed 6 years ago by iang

That's indeed just what I was looking for.

One more? Same as above, but the y axis is the ratio (prob with 1 MB cutoff / prob with 20 KB cutoff).

I expect 0 below 1 MB/s and a fairly constant value (~ 1.15 I think?) above 1 MB/s. Is that what we see?

If so, then the question is: since many relays are capped, what happens if 15% more users try to use them?

Changed 6 years ago by karsten

comment:50 Changed 6 years ago by karsten

Replying to iang:

That's indeed just what I was looking for.

Great!

One more?

Sure, this is fun! :)

Same as above, but the y axis is the ratio (prob with 1 MB cutoff / prob with 20 KB cutoff).

I expect 0 below 1 MB/s and a fairly constant value (~ 1.15 I think?) above 1 MB/s. Is that what we see?

Attached. I cut out relays below 1 MiB/s, because we set them to exactly 0.0, so we'd run into div/0 there. 1.087879 is the exact constant value for all relays above 1 MiB/s.
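Assuming probabilities are plain consensus weight fractions as defined earlier, that constant follows directly: for every kept relay the ratio is (cw/W') / (cw/W) = W/W' = 1 / (1 - f), where f is the excluded fraction of total consensus weight. So 1.087879 implies f = 1 - 1/1.087879, roughly 8.1% of consensus weight excluded -- lower than the ~15% of advertised bandwidth eyeballed above, since the two measures differ.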

comment:51 Changed 6 years ago by iang

Wouldn't it be 0/something, not something/0? In any event, yes, the >1 MB/s ones are the interesting ones.

So if we put the cutoff even as high as 1 MB/s, the remaining ~400 routers see a 9% increase in usage, and no one's on a crappy router.

Hmm.

comment:52 in reply to:  51 Changed 6 years ago by karsten

Replying to iang:

Wouldn't it be 0/something, not something/0?

Erm, yes, you're right. ;)

comment:53 Changed 6 years ago by mo

Cc: mo added

comment:54 Changed 6 years ago by cypherpunks

Post by Paul, who still doesn't have a proper account:
Been talking to Ian about this today here at Dagstuhl. I don't think all the effects of significantly shrinking the set of nodes that are ever chosen has been considered. If the network shrinks to c. 1/4 it's current size, this has the potential for tremendous psychological impact on users, relay volunteers, some adversaries, funders, etc. There is thus a big difference between switching a lot of nodes to be never chosen vs. changing the distributions to make them more rarely chosen. Instead of changing the fast flag, it would then make more sense to alter the bandwidth weighting. And the more gradual the change in probability of being chosen, the less any nodes will naturally count as the group that has been simply excluded. If performance is best served by more of a step function, then perhaps something in between will still significantly improve performance statistics without, e.g., resulting in graphs showing a 75% drop in the number of nodes with the fast flag when the change is rolled out.

comment:55 Changed 5 years ago by keb

In Canada at least, the assumption that home relay users can provide no more than 500KB/s of bandwidth may be obsolete by next year. Cable and DSL carriers are pushing connections with either 2 Mbit/s or 10 Mbit/s upload as their "standard" package, albeit with restrictions on total bytes transferred that result in hibernation.
http://www.rogers.com/web/link/hispeedBrowseFlowDefaultPlans
http://www.bell.ca/Bell_Internet/Internet_access
However, that might mean users' expectations of speed from Tor will also be higher, at least in Canada.

comment:56 Changed 5 years ago by proper

Cc: adrelanos@… added

Please consider the psychological effects.

  • Please never let users who have run Vidalia for years on slow connections find out from the press that their connection was considered too slow and that they wasted a year of electricity running Vidalia with the volunteer option without actually volunteering anything.
  • If someone's relay won't be used anymore, please show them a big fat warning so they don't waste their uptime.
  • As Paul suggested: please keep the numbers. Try to talk them into becoming bridges or just select them very rarely.
  • And finally, please don't tell people that residential connections aren't much help; keep the community spirit alive!

comment:57 Changed 5 years ago by iang

Encouraging users on slow connections to be bridges seems to make much more sense than encouraging them to be relays, no? Even "all (stable?) clients are automatically bridges" makes plausible sense, whereas "all (stable?) clients are automatically relays" will make the consensus melt.

comment:58 in reply to:  56 Changed 5 years ago by zenaan

Cc: zen@… added

Replying to proper:

Please consider the psychological effects.

  • Please never let users who have run Vidalia for years on slow connections find out from the press that their connection was considered too slow and that they wasted a year of electricity running Vidalia with the volunteer option without actually volunteering anything.
  • If someone's relay won't be used anymore, please show them a big fat warning so they don't waste their uptime.

We ought to be mindful of the future - it may bring exciting possibilities for even low-bandwidth relays:

  • parallel (torrent-like, i2p-like) pathways/cells
  • distributed data store possibilities
  • every relay a small encrypted (opt-in or opt-out) data store provider perhaps - I guess a la Freenet, but different
  • distributed redundant hash table(s) - greater redundancy is usually not a bad thing for a censorship-resistant distributed data store, I thought...
  • build a network-of-trust (GPG sort of style) on the Tor network - certain data models may depend on a greater number of nodes in the future, and would be weakened by a reduction in node count

The point is: Tor is not finished! We have a _long_ way to go to fully decentralise communications authority amongst the broader community. Let's definitely _not_ pre-empt our future by reducing possibilities.

  • As Paul suggested: please keep the numbers. Try to talk them into becoming bridges or just select them very rarely.

Those who have a spirit of contribution, will be grateful they can contribute, even if only a little. In the future, they may be able to make bigger contributions.

  • And finally, please don't tell people that residential connections aren't much help; keep the community spirit alive!

A big Ack!

A small help is a big help, is what we ought to say!

In addition: Building community, and building the future "more significant contributors" - which might be in many terms - financial, bandwidth, useful hidden services, brilliant ideas for future development, or even development itself. Every contributor starts somewhere!

  • Emphasize the genuine contribution directions, as said above, such as bridges; I believe the website appears to do this pretty well now - anecdotal, but I recently became an exit relay, and the website's encouragement steered me there with "this is needed, especially full exits" (the website wording can still be improved here - less fear, more reference to established legal precedents regarding 'carriers' in various jurisdictions) to steer people in the most useful directions.
  • Parallel paths may make those slow relays useful in the future. So discouraging people to do something towards their freedom and others' freedom is counter-productive to the long term health of the broader community.
  • Contributor graduation - building the community of those who wish to contribute; in time, some will graduate to be bigger contributors.

Again, future thinking.

A small help is a big help.

Everyone doing a little bit makes the future jobs easier.

We have not yet solved all the problems, so we definitely benefit from more people putting attention, action, and in many cases future intention, towards our broader goals.

Optimising bandwidth for current tor facilities is good.

Maximising our community base from which future development and technology can build is very good.

Last edited 5 years ago by zenaan

comment:59 Changed 4 years ago by arma

I appear to have raised what is turning into a similar topic on #13822.

comment:60 Changed 14 months ago by karsten

Resolution: wontfix
Status: needs_informationclosed

Closing tickets in Metrics/Analysis that were created 5+ years ago and have not seen progress recently, except for the ones that "nickm-cares" about.
