Analysis Ticket Results


What would be the result of avoiding slower relays? Paraphrased from iang based on linf-min-adv-bw-2012-11-27-a.pdf: if a 1MB/s cutoff was used, the ~400 faster relays might see 9% more usage and 0.5% extra users (from the pool of all Tor users) while more than 2100 relays, representing 15% of network bandwidth (distributed highly unevenly among them) would not be used.

The figure illustrates l_inf for various cutoff values. l_inf is the largest difference, among all relays, between the base and new consensus weight fractions, with the new consensus weight fraction calculated using the new consensus sum. The difference is zero if the relay was excluded due to the cutoff.
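
The l_inf definition above can be sketched in a few lines of Python. This is a minimal illustration, not the script behind the figure: the `relays` mapping, the field order, and the bandwidth unit are assumptions made for the example.

```python
def l_inf(relays, cutoff):
    """Largest change in consensus weight fraction after applying a cutoff.

    relays maps a relay name to (consensus_weight, advertised_bandwidth);
    this layout is hypothetical and chosen only for the sketch.
    """
    base_sum = sum(w for w, _ in relays.values())
    # The new consensus sum only counts relays that survive the cutoff.
    new_sum = sum(w for w, bw in relays.values() if bw >= cutoff)
    if base_sum == 0 or new_sum == 0:
        return 0.0
    largest = 0.0
    for w, bw in relays.values():
        if bw < cutoff:
            continue  # excluded relays contribute a difference of zero
        largest = max(largest, abs(w / new_sum - w / base_sum))
    return largest
```

Calling `l_inf` for a range of cutoff values gives one point per cutoff, which is the shape of data the figure plots.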


Why might a consensus not be generated? A consensus may not be reached due to poor vote timing, arising, in part, from clock jumps or connectivity.

deviant-consensus-times.png suggests a knock-on effect for missed consensuses. The consensus archive does not archive consensuses generated at the half-hour mark.


How can one determine if a slow entry guard is used? If torperf is used, torperf-guard-bandwidths.png suggests that the slowest guards yield different results, but the data is otherwise noisy. Could circuit timeout times be considered as a metric?


What is the distribution of ratios for measured bandwidth to self-reported (advertised) bandwidth? bandwidth-comparison-measured-votes-2011-06-29.png suggests that self-reported bandwidth is lower than measured for a majority of nodes in the guard and guard+exit(non-default policy) categories while good agreement generally exists between votes and the consensus.


Bridge to pool (https, email, unassigned (other/manual)) assignments are sanitized for future exploration (e.g., determine if the unassigned users notice and try to switch to one of the other pools).


Old metrics repository is archived and a new metrics-tasks repository is created for analysis tasks, which are to be organized by ticket number.


Should the uptime requirement for HSDir nodes be increased to over 24 hours, for example, to 24.5 or 25 hours? Nodes tend to disappear on 24 hour cycles. hsdir-set-instability-graph.pdf suggests that less than 10% unreachability, based on 1/6 of the 3 hour values to account for uptime noise, is a good starting point. Reducing the unreachability rate is an open question.


Bridge data is standardized to lower the barrier for data analysis work. User-supplied information is noted to be inconsistent.


Torperf runs with non-standard circuit timeout values and custom guard nodes were dropped. Torperf is a tool to evaluate Tor network performance based on the download times of files containing random data.


Refer to #2649. The 24 hour cycle is also noted.


Based on past user count estimates for a given country, what user count estimate could be considered to be an outlier representing a possible censorship event? An outlier would be a value below the expected user count range obtained from a trend defined by the top 50 countries with the most users (minus outliers, probability 0.9999, normal distribution) and a Poisson distribution with the mean centered about the estimated user count of the individual country of interest (0.9999 probability) using a seven day window. The outlier detection scheme is imprecise due to noise, odd 7-day patterns, and validation difficulties.

Sample graph shows data for Iran (IR) from January 2010 to July 2011.
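
The outlier test described above can be approximated as follows. This is a simplified sketch under stated assumptions, not the ticket's implementation: `trend_ratios` is assumed to hold the seven-day count ratios of the top countries with outliers already removed, and the way the normal and Poisson ranges are combined (multiplying the bounds) is one plausible reading of the description. All function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def poisson_quantile(mu, p):
    """Smallest k with P(X <= k) >= p for X ~ Poisson(mu), mu > 0."""
    cumulative = 0.0
    limit = int(mu + 20 * math.sqrt(mu)) + 50  # safety bound on the search
    k = 0
    for k in range(limit):
        # The pmf is evaluated in log space so large means do not overflow.
        cumulative += math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        if cumulative >= p:
            break
    return k


def expected_range(prev_count, trend_ratios, prob=0.9999):
    """Expected user count range given the count seven days earlier."""
    tail = (1.0 - prob) / 2.0
    trend = NormalDist(mean(trend_ratios), stdev(trend_ratios))
    ratio_low, ratio_high = trend.inv_cdf(tail), trend.inv_cdf(1.0 - tail)
    count_low = poisson_quantile(prev_count, tail)
    count_high = poisson_quantile(prev_count, 1.0 - tail)
    return max(0.0, ratio_low * count_low), ratio_high * count_high


def is_possible_censorship(observed, prev_count, trend_ratios):
    """Flag an estimate that falls below the expected range."""
    low, _ = expected_range(prev_count, trend_ratios)
    return observed < low
```

Only values below the range are flagged, matching the definition of an outlier given above; values above the range are ignored in this sketch.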


data-2011-03-14.pdf, a technical paper describing available metrics data, also includes bridge pool assignments.


A 3D visualization of data based on the Google Earth plugin was not realized because the expected assistance did not materialize.


Java tool to verify signatures for network consensus and server descriptor documents.


torperf-bwscanners.pdf overlays intervals of bandwidth scanner failures on torperf graphs.


How stable are bridges over time? still-running-bridges.png suggests a gradual decay over the period of a month (January 2011) for both bridges and bridges that have not changed IPs (the more useful variant).


What happens to bridges in the unassigned pool? Unresolved; what does happen?


What do relay churn and uptime look like? relay-stability-2011-06-30.pdf provides an overview with simulated flag assignments based on previous data.


What are the network connection speeds as inferred from network status download times reported by directory mirrors? client-speed-trends.png suggests the data is noisy, with the median speed around 100-150 KiB/s and the 10th percentile between 10 and 25 KiB/s.


How do bridges with an uptime of below 24 hours, which are omitted from stats reporting, affect the usage data overview? stats-coverage-bridges.2.png suggests a large majority of bridges are reporting statistics, whether weighted on written bytes (90%) or uptime for a given day, based on bandwidth history (70%) or running flag (80%). Not included are bridges that were previously relays and bridges that were up for less than 24 hours.


Are onion keys rotated after 7 days, as expected, based on relay descriptor archives? onion-key-lifetimes.png suggests that the key replacement is on a 7 day cycle for data from May 2011, with some exceptions (19 of > 15000 onion keys).


When do relays with the HSDir flag disappear? hsdir-sessions-sim.png for May 2010 to April 2011 suggests that once earning the HSDir flag, less than half of the relays remain after 24 hours, decreasing to 20 percent after 3 days. With regard to #2649, the low resolution data may be inconclusive about the effect of requiring an uptime of slightly greater than 24 hours before assigning the flag.


Why do user spikes exist in user count estimates? daily-users.pdf suggests spikes are correlated with drops in the estimated fraction of directory requests seen by reporting directory mirrors. Increasing the proportion of reporting directory mirrors is expected to resolve this issue.


How much would a cloud bridge operating at the 99th percentile for bandwidth use (100GiB per month; 25KiB/s) cost? About 30 USD/month on Amazon based on a week of live testing, serving 200 estimated users.


Based on stats sent by a bridge, can the bridge be determined to be blocked in a given country? Possibly; in-country verification is needed for confirmation. blocking-2011-09-15.pdf describes one approach using an absolute threshold (32) to determine blocking based on bridges that have reported over some number of estimated users (100) from a given country (China: CN).

The individual bridge trends also do not match the aggregate trend.


Should descriptors be published by relays and bridges after a stats cycle to avoid lost stats due to different cycle intervals (24 hours for stats; 18 hours for descriptors)? delay.png illustrates the delay between the end of a stats interval and stats publication. #3261 suggests the missing statistics are minimal and a change is not needed at this time.


What fraction of exit IPs are different from those in their descriptors, based on exit capacity? Around 15% according to different-exit-address.png, which cross-references data from the consensus documents and exit lists from January to February 2012.


What metric(s) should be used to ensure that stable and usable bridges are distributed to users? According to bridge-stability-2011-10-31.pdf, a metric that takes into account the Weighted Mean Time Between Address Change and Weighted Fractional Uptime would be suitable as both uptime and IP stability are considered. In this scheme, bridges that are better than the median in both areas would be considered stable. Back-testing data from July 2010 to July 2011 suggests roughly one in three bridges would be considered stable.
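
The "better than the median in both areas" rule can be sketched as below. The input layout and the strict comparison are assumptions made for illustration; computing the two weighted metrics themselves is outside the scope of this sketch.

```python
from statistics import median


def stable_bridges(metrics):
    """Return the bridges that beat the median on both metrics.

    metrics maps a bridge identifier to (wmtbac, wfu), where wmtbac is the
    Weighted Mean Time Between Address Change and wfu is the Weighted
    Fractional Uptime. Larger values are treated as better for both.
    """
    if not metrics:
        return set()
    median_wmtbac = median(v[0] for v in metrics.values())
    median_wfu = median(v[1] for v in metrics.values())
    return {
        bridge
        for bridge, (wmtbac, wfu) in metrics.items()
        if wmtbac > median_wmtbac and wfu > median_wfu
    }
```

If the two metrics were independent, about a quarter of bridges would pass both tests, so a share near one in three would point to some positive correlation between uptime and address stability.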


How should the bridge authority infrastructure (BridgeDB, metrics-db, authority) be changed to scale beyond 10000 bridge descriptors? Testing on generated data, described in bridge-scaling.png, suggests that increases in terms of space and time for BridgeDB and metrics-db are generally linear. BridgeDB should be modified to not have blocking descriptor loads while metrics-db should handle descriptor sanitization in an asynchronous way as the tested processing time exceeds the cron job time interval. Authority scaling is to be considered when necessary.


Should consensus download times be archived? Not at the moment as the data can be noisy and inconclusive. download-stats.png shows download times from a US site while download-stats-comparison.png compares data from US and European sites.


What is the distribution of advertised bandwidths? advbw-cdf.tar provides CDFs for February 2011 to January 2012, suggesting limited variation over time. In addition, the mean advertised bandwidth is likely higher than the median advertised bandwidth.


How many bytes do directory authorities use for answering requests? dirauth-written-dirreq-bytes.png indicates some variation over time (December to mid-January 2012), but suggests around 100 KiB/s.


What do graphs of consensus weight fraction for specific nodes over time look like? consensus-weights-Amunet1.png and consensus-weights-noiseexit01a.png show variation and decreases over time (January to April 2012). path-selection-weights-2012-07-17.png includes path selection probabilities for July 2012.


How should bridge users be counted? counting-daily-bridge-users-2012-10-24.pdf suggests applying a variant of the estimation method used for direct connections (i.e., extrapolating to get a directory download count by using reported counts and the fraction of reporting bridges, dividing by 10 to get estimated user counts, and separating countries based on the proportion of unique IPs seen). bridge-dirreq-stats-2012-10-02.png shows the values used in the described calculation for about a year ending October 2012.

Fewer-than-expected consensus documents for a given day appear to result in more directory requests.
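
The estimation steps described above can be written out as a short calculation. This is a sketch of the arithmetic only; the function and parameter names are hypothetical, and how the reporting fraction is obtained is not covered.

```python
def estimate_bridge_users(reported_requests, reporting_fraction,
                          unique_ips_by_country, requests_per_user=10):
    """Estimate daily bridge users in total and per country.

    reported_requests: directory requests counted by reporting bridges.
    reporting_fraction: estimated fraction of bridges that reported (0 to 1].
    unique_ips_by_country: unique IP counts per country code.
    """
    if not 0 < reporting_fraction <= 1:
        raise ValueError("reporting_fraction must be in (0, 1]")
    # Step 1: extrapolate to the directory requests of all bridges.
    total_requests = reported_requests / reporting_fraction
    # Step 2: divide by the assumed number of requests per user per day.
    total_users = total_requests / requests_per_user
    # Step 3: split users by each country's share of unique IPs.
    total_ips = sum(unique_ips_by_country.values())
    if total_ips == 0:
        return total_users, {}
    by_country = {
        country: total_users * count / total_ips
        for country, count in unique_ips_by_country.items()
    }
    return total_users, by_country
```

The division in step 1 makes the estimate sensitive to the reporting fraction: a small fraction magnifies any noise in the reported counts.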


What does the entropy of consensus weights over time look like? entropy-august.png provides a comparison between the current entropy values and maximum expected entropy values based on the number of nodes.
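
For reference, the comparison can be reproduced with Shannon entropy over the consensus weight fractions. The sketch below assumes base-2 logarithms and takes the maximum as the entropy of a uniform distribution over the same number of nodes; whether the original graph uses the same base is not stated.

```python
import math


def consensus_weight_entropy(weights):
    """Shannon entropy (in bits) of the consensus weight fractions."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    fractions = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in fractions)


def max_entropy(node_count):
    """Entropy if every node had the same weight: log2 of the node count."""
    return math.log2(node_count) if node_count > 0 else 0.0
```

The gap between the two values indicates how unevenly the weight is spread: the more weight is concentrated in a few relays, the further the observed entropy falls below the maximum.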


Python script to aggregate the consensus weight of running relays by country.
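
The core of such an aggregation might look like the sketch below. It is not the script from the ticket: the record fields (`running`, `country`, `consensus_weight`) are hypothetical, and resolving a relay's country from its address is assumed to have happened already.

```python
from collections import defaultdict


def weight_by_country(relays):
    """Sum the consensus weight of running relays per country code."""
    totals = defaultdict(int)
    for relay in relays:
        if relay["running"]:  # relays without the Running flag are skipped
            totals[relay["country"]] += relay["consensus_weight"]
    return dict(totals)
```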


What is the chance of selecting an exit relay in the top k nodes, based on bandwidth? exit-probability-cdf-b.png suggests around 80% if k is 50, 50% if k is 20, and 30% if k is 10, based on data from 2010 to 2012.
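
A figure like this can be derived by ranking exits by their selection weight and summing the share held by the top k. The sketch below assumes a single weight per exit that is proportional to its selection probability, which simplifies real path selection.

```python
def top_k_exit_probability(exit_weights, k):
    """Share of total exit weight held by the k largest exits."""
    total = sum(exit_weights)
    if total <= 0 or k <= 0:
        return 0.0
    top = sorted(exit_weights, reverse=True)[:k]
    return sum(top) / total
```

Evaluating the function for every k from 1 to the number of exits yields the cumulative distribution that the graph shows.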


How should multiple GeoIP files be handled to quickly answer historical IP lookup requests? Use a Java-based compilation program and Python lookup script.


Compile exits that are fast and almost-fast.


The set of fast exits has 95+ Mbit/s bandwidth rate, 5000+ KB/s advertised bandwidth, accepts ports 80/443/554/1755, and has at most 2 relays per /24 network.

The set of almost fast exits has 80+ Mbit/s bandwidth rate, 2000+ KB/s advertised bandwidth, accepts ports 80/443, does not have (95+ Mbit/s bandwidth rate and 5000+ KB/s advertised bandwidth and allows ports 80/443/554/1755), and has as many relays per /24 network as there are.

The set of fast exits without network restriction has 95+ Mbit/s bandwidth rate, 5000+ KB/s advertised bandwidth, accepts ports 80/443/554/1755, and has as many relays per /24 network as there are.
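
The three sets defined above can be expressed as filters over a relay list. This is a sketch under assumptions: the record fields are hypothetical, and since the text does not say which relays to keep when a /24 network holds more than two, the sketch keeps those with the highest advertised bandwidth.

```python
import ipaddress
from collections import defaultdict

FAST_PORTS = {80, 443, 554, 1755}
ALMOST_FAST_PORTS = {80, 443}


def meets(relay, rate_mbit, advbw_kb, ports):
    """True if the relay reaches both thresholds and accepts all ports."""
    return (relay["rate_mbit"] >= rate_mbit
            and relay["advbw_kb"] >= advbw_kb
            and ports <= relay["ports"])


def classify_exits(relays, max_per_24=2):
    """Return (fast, almost_fast, fast_without_network_restriction)."""
    fast_any = [r for r in relays if meets(r, 95, 5000, FAST_PORTS)]
    almost_fast = [
        r for r in relays
        if meets(r, 80, 2000, ALMOST_FAST_PORTS)
        and not meets(r, 95, 5000, FAST_PORTS)
    ]
    # Apply the /24 limit, preferring higher advertised bandwidth
    # (this tie-breaking rule is an assumption of the sketch).
    per_network = defaultdict(list)
    for r in sorted(fast_any, key=lambda r: r["advbw_kb"], reverse=True):
        network = ipaddress.ip_network(r["address"] + "/24", strict=False)
        if len(per_network[network]) < max_per_24:
            per_network[network].append(r)
    fast = [r for group in per_network.values() for r in group]
    return fast, almost_fast, fast_any
```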


How should families be grouped? Group families based on the largest mutual relationship unit and merge overlapping families for a first order approximation.
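
One reading of this approach: treat two relays as related only when each lists the other, then merge every group that shares a member. Merging all overlapping mutual groups is equivalent to taking the connected components of the mutual-declaration graph, which is what the sketch below computes. The input layout is an assumption.

```python
def group_families(declared):
    """Group relays into families from their declared family lists.

    declared maps a relay name to the set of relays it lists as family.
    Only mutual declarations count as a relationship.
    """
    neighbors = {relay: set() for relay in declared}
    for a, listed in declared.items():
        for b in listed:
            if a != b and a in declared.get(b, ()):
                neighbors[a].add(b)
                neighbors[b].add(a)
    families = []
    seen = set()
    for start in neighbors:
        if start in seen or not neighbors[start]:
            continue  # already grouped, or no mutual relationship
        component = set()
        stack = [start]
        while stack:
            relay = stack.pop()
            if relay in component:
                continue
            component.add(relay)
            stack.extend(neighbors[relay] - component)
        seen |= component
        families.append(component)
    return families
```

Because overlapping groups are merged, a family can contain relays that do not list each other directly, which is why the result is only a first order approximation.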


How usable are past consensus documents based on the fraction of current relays and their corresponding consensus weights? 2012_frac_cw.png suggests the fraction of current relays is generally at least 0.7, with medians comfortably above 0.8 while 2012_frac_relays.png suggests the fraction of corresponding consensus weights is generally at least 0.8, with medians above 0.9, even after 7 days, for the year 2012.

Dips are noted at 12 and 36 hours, while jumps are noted at 24 and 48 hours, suggesting 24 hour cycles.


What's the distribution of relay lifetime and uptime? Some simple scripts to generate graphs of these distributions using Onionoo data.

Last modified on Aug 9, 2013, 1:07:43 PM