Opened 10 years ago

Closed 10 years ago

Last modified 9 years ago

#1566 closed task (wontfix)

Calculate directory request shares from descriptor archives

Reported by: karsten
Owned by: karsten
Priority: Medium
Milestone:
Component: Metrics/CollecTor
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

We should not rely on the dirreq-v3-share that directory mirrors report, but (be able to) calculate directory request shares from the descriptor archives. Three possible approaches are:

  1. advertised bandwidth / sum of advertised bandwidth
  2. measured bandwidth / sum of measured bandwidth
  3. weighted measured bandwidth / sum of weighted measured bandwidth

The first approach is used by 0.2.0.x clients to decide which directory mirror to pick when requesting a network status consensus (among other things). For every relay with a non-zero directory port, the weight is the advertised bandwidth (the minimum of bandwidth rate and observed bandwidth) as reported by the relay in its server descriptor.
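
As a rough illustration of the first approach, the following sketch computes each directory mirror's share from advertised bandwidth. The `Relay` record and its field names are hypothetical stand-ins for values parsed from server descriptors; they are not part of any existing parser.

```python
from dataclasses import dataclass


@dataclass
class Relay:
    # Hypothetical record; field names are assumptions made for this sketch.
    nickname: str
    dir_port: int            # 0 means the relay is not a directory mirror
    bandwidth_rate: int      # bytes/s, as reported in the server descriptor
    observed_bandwidth: int  # bytes/s, as reported in the server descriptor


def advertised_bandwidth(relay):
    """Advertised bandwidth: minimum of bandwidth rate and observed bandwidth."""
    return min(relay.bandwidth_rate, relay.observed_bandwidth)


def advertised_shares(relays):
    """Map nickname -> advertised bandwidth / sum of advertised bandwidth.

    Only relays with a non-zero directory port are considered.
    """
    mirrors = [r for r in relays if r.dir_port != 0]
    total = sum(advertised_bandwidth(r) for r in mirrors)
    if total == 0:
        return {}
    return {r.nickname: advertised_bandwidth(r) / total for r in mirrors}
```

Calling `advertised_shares()` once per descriptor snapshot would give a time series of shares for a given mirror, which is what the attached graphs plot.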

The second approach is used by 0.2.1.x clients and is based on measured bandwidths as written to consensuses. (To be precise, the Bandwidth lines in consensuses are only measured bandwidths if at least three votes contain Measured lines for relays; otherwise, the Bandwidth lines contain the advertised bandwidths as used by 0.2.0.x clients.)
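
The parenthetical rule above can be sketched as a small function that decides which value a consensus Bandwidth line carries for one relay. The ticket only states the three-vote threshold; combining the Measured values with a low median is an assumption of this sketch, and the function and parameter names are hypothetical.

```python
from statistics import median_low

MIN_MEASURED_VOTES = 3  # threshold stated in the ticket


def consensus_bandwidth(measured_votes, advertised):
    """Return the value assumed to appear in the consensus Bandwidth line.

    measured_votes: Measured values for this relay, one per vote that has one.
    advertised: the relay's advertised bandwidth, used as the fallback.
    """
    if len(measured_votes) >= MIN_MEASURED_VOTES:
        # Assumption: the votes are combined by taking their low median.
        return median_low(measured_votes)
    # Too few measurements: fall back to the advertised bandwidth.
    return advertised
```

The share for the second approach is then this value divided by the sum of the same value over all directory mirrors.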

The third approach is used by 0.2.2.x clients and is based on the measured bandwidths plus bandwidth weights as written to consensuses. Before summing up bandwidths, they are weighted depending on a directory's flags: Wbg for Guard, Wbe for Exit, Wbd for Guard+Exit, and Wbm for neither Guard nor Exit.
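
A minimal sketch of the third approach follows, assuming the bandwidth weights have already been read from a consensus into a dictionary keyed by weight name. The tuple layout for relays and both function names are hypothetical.

```python
def bandwidth_weight(flags, weights):
    """Pick the directory bandwidth weight that matches a relay's flags."""
    is_guard = "Guard" in flags
    is_exit = "Exit" in flags
    if is_guard and is_exit:
        return weights["Wbd"]
    if is_guard:
        return weights["Wbg"]
    if is_exit:
        return weights["Wbe"]
    return weights["Wbm"]


def weighted_shares(relays, weights):
    """Map nickname -> weighted bandwidth / sum of weighted bandwidth.

    relays: iterable of (nickname, bandwidth, flags) tuples, one per
            directory mirror, where flags is a set of flag names.
    """
    weighted = {
        name: bandwidth * bandwidth_weight(flags, weights)
        for name, bandwidth, flags in relays
    }
    total = sum(weighted.values())
    if total == 0:
        return {}
    return {name: value / total for name, value in weighted.items()}
```

Because a share is a ratio, any common scale factor applied to all four weights cancels out, so the weights can be used as written without normalizing them first.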

The graphs in the attachment show the calculated directory request shares for the first and second approach plus the reported dirreq-v3-shares before bandwidth weights were introduced to consensuses. New graphs will follow for the third described approach.

We need to decide whether these calculated shares are stable enough to estimate user numbers from them. If not, we should find out what causes the volatility, or rather, what has made dirreq-v3-share such a nice, stable metric so far.

Attachments (3)

dirreq-shares.png (91.0 KB) - added by karsten 10 years ago.
Directory request shares calculated for directory mirror trusted
recurring-users.png (74.7 KB) - added by karsten 10 years ago.
Recurring user estimates based on calculated directory request shares
weighted-measured-bw-dirreq-share.pdf (105.9 KB) - added by karsten 10 years ago.
Directory request shares calculated from weighted measured bandwidth (4 graphs)

Change History (7)

Changed 10 years ago by karsten

Attachment: dirreq-shares.png added

Directory request shares calculated for directory mirror trusted

Changed 10 years ago by karsten

Attachment: recurring-users.png added

Recurring user estimates based on calculated directory request shares

comment:1 Changed 10 years ago by karsten

Mike says on #tor-dev:

TorCtl.PathSupport.BwWeightedGenerator has
another implementation of the old 0.2.1.x bandwidth
weighting mechanisms, if you want an additional reference
other than smartlist_choose_by_bandwidth()

Changed 10 years ago by karsten

Attachment: weighted-measured-bw-dirreq-share.pdf added

Directory request shares calculated from weighted measured bandwidth (4 graphs)

comment:2 Changed 10 years ago by karsten

I added a third file with an analysis of directory request shares based on weighted measured bandwidth. While the bandwidth weights look pretty stable to me (pages 1 and 2), trusted's measured bandwidth and the resulting dirreq-v3-share (pages 3 and 4) seem rather useless for estimating user numbers.

comment:3 Changed 10 years ago by karsten

Resolution: wontfix
Status: new → closed

This task is obsolete.

A recent comparison of directory request statistics and entry statistics has shown that our current approach of estimating user numbers from directory requests is not very reliable. We should instead switch to user numbers based on entry statistics.

comment:4 Changed 9 years ago by karsten

Milestone: Calculate directory request shares (removed)

Removing the milestone. This isn't how milestones are supposed to work. "Calculate directory request shares" is a task or maybe a project, but not a milestone. Next time we should use a parent ticket instead. I'm not creating one now, because nobody cares anymore.