Obfuscation Simulation

We'd like to improve directory-request statistics by obfuscating values on relays before they are reported to the directory authorities. One possible obfuscation method is to add Laplace noise to the request counts for all ~250 countries, so that it's unclear whether a reported request was actually made by a user or is just noise.
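To make the idea concrete, here is a minimal sketch of adding Laplace noise to per-country request counts. It is not the code used on relays or in the simulation. The function names, the placeholder country codes, the made-up counts, and the noise scale of 10 are all assumptions chosen for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential variables with
    mean `scale` is Laplace-distributed with scale parameter `scale`.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def obfuscate_counts(counts, countries, scale, rng):
    """Add independent Laplace noise to the count of every country.

    Countries without any observed request start from 0 and still
    receive noise, so a non-zero value alone does not reveal a user.
    """
    return {c: counts.get(c, 0) + laplace_noise(scale, rng) for c in countries}


rng = random.Random(42)            # fixed seed only to make the sketch repeatable
countries = ["aa", "bb", "cc"]     # placeholder codes standing in for ~250 countries
observed = {"aa": 120, "bb": 8}    # made-up counts; "cc" saw no requests at all
reported = obfuscate_counts(observed, countries, scale=10.0, rng=rng)
print(reported)
```

Note that noise is added to every country in the list, not only to those with observed requests. Otherwise the mere presence of a country in the report would already show that somebody from there made a request.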

But before we do this, we need to find out whether obfuscated values would still be useful enough to estimate user numbers in the Tor network. Let's run a simulation using archived descriptors.
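The kind of comparison such a simulation makes can be sketched as follows. This is a toy version under stated assumptions, not the actual simulation code: it works on made-up daily request counts instead of archived descriptors, it compares raw request counts rather than user number estimates derived from them, and the names `mean_abs_difference` and `relative_error` as well as the noise scale are hypothetical.

```python
import random
from statistics import mean


def laplace_noise(scale, rng):
    """Zero-mean Laplace sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mean_abs_difference(daily_counts, scale, rng):
    """Per country, average |obfuscated - original| over all days."""
    result = {}
    for country, counts in daily_counts.items():
        diffs = []
        for original in counts:
            obfuscated = original + laplace_noise(scale, rng)
            diffs.append(abs(obfuscated - original))
        result[country] = mean(diffs)
    return result


def relative_error(daily_counts, abs_diffs):
    """Mean absolute difference divided by the mean original count."""
    return {c: abs_diffs[c] / mean(daily_counts[c]) for c in daily_counts}


rng = random.Random(7)  # fixed seed only to make the sketch repeatable
# Made-up data: one country with few requests, one with many.
daily_counts = {"aa": [8, 16, 8, 24], "bb": [8000, 8400, 7900, 8100]}
abs_diffs = mean_abs_difference(daily_counts, scale=10.0, rng=rng)
print(abs_diffs)
print(relative_error(daily_counts, abs_diffs))
```

The expected absolute difference equals the noise scale regardless of the original count, so the same noise that barely affects a country with thousands of requests can dominate the value of a country with only a handful. That is why it makes sense to look at the differences per country and day.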

Summary of findings and results


Detailed results

Graph: Absolute difference to user number estimates per country and day when obfuscating directory-request statistics

CSV files

Code for simulation

git clone -b dirreqstats
cd metrics-web
tar xf libs-for-metrics-web.tar
mv lib shared/
cd modules/clients/
./     # this takes a while and prints a lot of output to the console!
R --slave -f compare-simulations.R

Related work

