Opened 5 years ago

Closed 5 years ago

#12690 closed defect (fixed)

Raise the bandwidth threshold for being a guard

Reported by: asn
Owned by: nickm
Priority: Medium
Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor
Version:
Severity:
Keywords: tor-guard tor-auth
Cc:
Actual Points:
Parent ID: #11480
Points:
Reviewer:
Sponsor:

Description

From proposal236:

   From dir-spec.txt:
      "Guard" -- A router is a possible 'Guard' if its Weighted Fractional
       Uptime is at least the median for "familiar" active routers, and if
       its bandwidth is at least median or at least 250KB/s.

   When this proposal becomes effective, authorities should change the
   bandwidth threshold for being a guard node to 2000KB/s instead of
   250KB/s.

Attachments (1)

perf_cdf_guard_bw_desc_1500.png (107.9 KB) - added by asn 5 years ago.

Change History (17)

comment:1 Changed 5 years ago by asn

Some notes:

Here is the relevant torspec entry:

   "Guard" -- A router is a possible 'Guard' if its Weighted Fractional
   Uptime is at least the median for "familiar" active routers, and if
   its bandwidth is at least median or at least 250KB/s.

        To calculate weighted fractional uptime, compute the fraction
        of time that the router is up in any given day, weighting so that
        downtime and uptime in the past counts less.

        A node is 'familiar' if 1/8 of all active nodes have appeared more
        recently than it, OR it has been around for a few weeks.

And here is the corresponding piece of code from dirserv.c:

  if (node->is_fast &&
      ((options->AuthDirGuardBWGuarantee &&
        routerbw_kb >= options->AuthDirGuardBWGuarantee/1000) ||
       routerbw_kb >= MIN(guard_bandwidth_including_exits_kb,
                       guard_bandwidth_excluding_exits_kb)) &&
      is_router_version_good_for_possible_guard(ri->platform)) {
    long tk = rep_hist_get_weighted_time_known(
                                      node->identity, now);
    double wfu = rep_hist_get_weighted_fractional_uptime(
                                      node->identity, now);
    rs->is_possible_guard = (wfu >= guard_wfu && tk >= guard_tk) ? 1 : 0;
  } else {
    rs->is_possible_guard = 0;
  }

We note that the requirement for Fast in the implementation does not match the spec.

I'm wondering what should happen with the "if its bandwidth is at least median or at least 250KB/s" clause (or 2000KB/s, as it will soon be). Do we still want the "at least median" alternative? We should probably check what the median is on the real network, to see how far it is from 2000KB/s.
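
As a rough sketch of just the bandwidth part of the dirserv.c check above: the two alternatives are ORed, and the guarantee is configured in bytes and divided by 1000 before being compared against kilobytes. The helper name `guard_bw_ok` and its parameter names are made up for illustration; they are not Tor functions.

```c
#include <stdint.h>

/* Hypothetical helper (not part of dirserv.c): returns 1 if a relay's
 * bandwidth alone would qualify it for Guard under the quoted check.
 * guarantee_bytes plays the role of AuthDirGuardBWGuarantee (bytes/sec);
 * the two cutoffs are the thresholds computed over active relays,
 * including and excluding exits (kilobytes/sec). */
static int
guard_bw_ok(uint32_t routerbw_kb, uint64_t guarantee_bytes,
            uint32_t cutoff_incl_exits_kb, uint32_t cutoff_excl_exits_kb)
{
  /* Same role as MIN(...) in the quoted code. */
  uint32_t cutoff_kb = cutoff_incl_exits_kb < cutoff_excl_exits_kb ?
    cutoff_incl_exits_kb : cutoff_excl_exits_kb;

  /* A guarantee of 0 disables the absolute threshold. */
  if (guarantee_bytes && routerbw_kb >= guarantee_bytes / 1000)
    return 1;

  /* Otherwise the relay must reach the smaller of the two cutoffs. */
  return routerbw_kb >= cutoff_kb;
}
```

With invented cutoffs of 335 and 265, `guard_bw_ok(300, 2000000, 335, 265)` still returns 1, because 300 clears the smaller cutoff even though it is far below a 2000KB/s guarantee. When a guarantee is set, the effective threshold is the smaller of the guarantee and the cutoff, so raising the guarantee above the median-derived cutoff changes nothing by itself.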

Also, Roger, here is the part about the testing network you worried about:

  if (options->TestingTorNetwork &&
      routerset_contains_routerstatus(options->TestingDirAuthVoteGuard,
                                      rs, 0)) {
    rs->is_possible_guard = 1;
  }

comment:2 Changed 5 years ago by arma

I've just been looking at exactly this code too. Here's the answer on moria1 currently:

Jul 23 19:50:01.941 [info] dirserv_compute_performance_thresholds(): Cutoffs: For Stable, 814436 sec uptime, 1144433 sec MTBF. For Fast: 16 kilobytes/sec. For Guard: WFU 98.000%, time-known 691200 sec, and bandwidth 335 or 265 kilobytes/sec. We have enough stability data.

So currently the median bw is 265 kilounits (which is higher than the 250 minimum!).

comment:3 Changed 5 years ago by arma

I changed moria1 to use the 3/4 position rather than the 1/2 position (the median) of relays, and now it reports:
Jul 23 22:50:01.391 [info] dirserv_compute_performance_thresholds(): Cutoffs: For Stable, 824828 sec uptime, 1157130 sec MTBF. For Fast: 15 kilobytes/sec. For Guard: WFU 98.000%, time-known 691200 sec, and bandwidth 2980 or 2060 kilobytes/sec. We have enough stability data.

So the 3/4 spot is right around the 2000 kilounits that we want.
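
To make "the 3/4 spot" concrete, here is a small sketch. It is not the code running on moria1: `bw_at_fraction` and `cmp_u32` are hypothetical names, the index convention floor(n * num / den) is an assumption, and it sorts with the standard `qsort` rather than using Tor's own selection helper.

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order of uint32_t values. */
static int
cmp_u32(const void *a, const void *b)
{
  uint32_t x = *(const uint32_t *)a;
  uint32_t y = *(const uint32_t *)b;
  return (x > y) - (x < y);
}

/* Hypothetical helper: sort arr in place (ascending) and return the
 * element at position floor(n * num / den).
 * Requires n > 0 and num < den; n * num must not overflow size_t. */
static uint32_t
bw_at_fraction(uint32_t *arr, size_t n, size_t num, size_t den)
{
  size_t idx = n * num / den;
  qsort(arr, n, sizeof(uint32_t), cmp_u32);
  return arr[idx];
}
```

For the invented list {100, 3000, 250, 2000, 400, 50, 1500, 800}, `bw_at_fraction(bws, 8, 1, 2)` gives 800 and `bw_at_fraction(bws, 8, 3, 4)` gives 2000, so moving from the 1/2 spot to the 3/4 spot raises the cutoff and shrinks the set of relays above it.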

For comparison, that moves moria1 to voting 1182 Guard flags, compared to the 2565 in the consensus.

comment:4 Changed 5 years ago by arma

Status: new → needs_review

See my ticket12690 branch, intended for maint-0.2.5.

In particular, somebody should see if my arithmetic wants more bulletproofing.
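
One possible way to bulletproof the index arithmetic, as a sketch only: assuming the index is computed as n*3/4 (the form quoted in comment:13), the product 3*n could overflow for a very large n. The helper below avoids forming that product. The name `three_quarters_index` is hypothetical, and with realistic relay counts the plain expression would not overflow anyway.

```c
#include <stddef.h>

/* Hypothetical helper: computes floor(3n/4) without forming 3*n.
 * Writing n = 4q + r with 0 <= r < 4 gives 3n/4 = 3q + 3r/4, and
 * neither term can overflow. For n > 0 the result is always < n,
 * so it is a valid index into an array of n elements. */
static size_t
three_quarters_index(size_t n)
{
  return (n / 4) * 3 + (n % 4) * 3 / 4;
}
```

The caller still has to handle n == 0 separately, since an empty array has no valid index.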

Also I keep thinking I should lower the number to 1500 kilounits, to get more guards, because surely 1500 is close enough to 2000, and if we cut out a lot of our non-exit guards then the only remaining place for them is the middle hop, which is exactly where diversity isn't so helpful.

comment:5 Changed 5 years ago by arma

For comparison, we have 1350 guards when I lower the number to 1500 kilounits. That's a few more but not many more.

Changed 5 years ago by asn: attachment perf_cdf_guard_bw_desc_1500.png added

comment:6 Changed 5 years ago by asn

Looks good to me.

Also, see my branch arma-ticket12690 for some unittests on the third quartile functionality.

I think lowering the number to 1500 kilounits should be OK. I made a CDF graph for the 1500 kilounits threshold (see the attachment perf_cdf_guard_bw_desc_1500.png). You can compare it with the one I made for 2000:
https://people.torproject.org/~asn/guards2/perf_cdf_guard_bw_desc.png

As you can see, the one-guard curve of the 1500 graph is a bit slower than the 2000 graph, but the difference is not great and it's definitely better than one guard without any bw restrictions. The main difference is in the [0, 0.1] probability range, for the "unlucky" clients that pick the 1500kb/s guards.

If you need help reading those graphs, see the "Performance implications of switching to 1 guard" paragraph of:
https://lists.torproject.org/pipermail/tor-dev/2014-March/006458.html

comment:7 Changed 5 years ago by arma

So 10% of the clients get crummier performance with 1500 vs 2000. Let's leave it at 2000 for now then.

comment:8 Changed 5 years ago by arma

(There's some room for cool analysis here, where we notice that the first hop is the fast stable relays, the third hop is the exit relays, and the middle hop is the fast not-stable relays (the ones that will be guards but aren't yet guards). And the slow relays get left out in the cold. But that's already how things work in practice, since we choose relays by capacity. Carry on.)

comment:9 Changed 5 years ago by cypherpunks

do clients pick new guards when their existing guards lose the guard flag?

comment:10 in reply to:  9 Changed 5 years ago by arma

Replying to cypherpunks:

do clients pick new guards when their existing guards lose the guard flag?

Yes. Or more precisely, they go to the next on their list. That's not ideal for a variety of reasons, but I think this change is still a net win.

comment:11 Changed 5 years ago by nickm_mobile

This patch seems plausible to me.

comment:12 Changed 5 years ago by andrea

I think this patch looks okay to me.

comment:13 in reply to:  6 Changed 5 years ago by arma

Replying to asn:

Also, see my branch arma-ticket12690 for some unittests on the third quartile functionality.

asn, your patch makes us sort bandwidths_kb twice, and also doesn't replace the

      find_nth_uint32(bandwidths_excluding_exits_kb,
                      n_active_nonexit, n_active_nonexit*3/4);

call.

comment:14 in reply to:  12 Changed 5 years ago by arma

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.6.x-final
Owner: set to nickm
Status: needs_review → assigned

Replying to andrea:

I think this patch looks okay to me.

I have merged my patch into maint-0.2.5.

I'm going to leave this ticket open and reassign it to Nick, so he can either merge asn's unit tests into master, or not, or whatever he thinks is smart.

Thanks!

comment:15 Changed 5 years ago by arma

Status: assigned → needs_review

comment:16 Changed 5 years ago by nickm

Resolution: fixed
Status: needs_review → closed

Cherry-picked the tests.
