Version 4 (modified by teor, 2 weeks ago)

PrivCount in Tor

PrivCount makes Tor relay statistics more secure. It securely aggregates Tor relay statistics and adds noise to them, which makes it much harder to identify individual Tor users from their network usage. PrivCount uses differential privacy to ensure that the final statistics hide individual users' activity.

Background

Proposals

Notes

Next Steps

Optimise for "simplest possible decisions at first" so that we can deploy it.

Revise Tickets

Ticket Resolution Summary Component Milestone Modified Owner Reporter Cc Parent ID
#26637 Privcount noise generation implemented and deployed Core Tor/Tor Tor: 0.3.5.x-final 5 days ago nickm teor #22898
#25669 Privcount: blinding and encryption should be finished up Core Tor/Tor Tor: 0.3.5.x-final 16 hours ago nickm teor, nickm, chelseakomlo #22898
#25381 Add crypto_rand_double_sign() in C and Rust Core Tor/Tor Tor: unspecified 5 days ago teor teor catalyst, chelseakomlo #26637
#23061 crypto_rand_double() should produce all possible outputs on platforms with 32-bit int Core Tor/Tor Tor: 0.3.5.x-final 4 days ago teor teor catalyst #26637

Noise

  • we'll need to do measurements with an actual client implementation to discover an appropriate action bound for our desired anonymity set size/security bounds.
    • should we protect the average case statistically, or some factor of the average?
  • need a detailed spec on which stats we collect and their noise levels
  • versioning for stats when we want to change and/or tweak noisiness
    • with PrivCount, mixed versioning is tricky, because the total noise across all statistics on all reporting relays determines user privacy
      • do we just pick the latest counter version, as long as enough relays support it? (it's not safe to report multiple copies of counters)
    • if a statistic's version is too old or we believe its noise to be insufficient to maintain privacy:
      • we should have a mechanism for telling those clients to simply not report that data
      • or we could increase the noise on old statistics
    • how do we impose a delay when the noise parameters change? (this delay ensures differential privacy even when the old and new counters are compared)
      • or should we try to monotonically increase counter noise?
  • we still need to specify how to allocate noise between counters, between relay partitions (to avoid outliers), between relays, and between consensuses
    • need threat modelling and decisions on potential bad relays that decide to stop adding noise to their collected statistics
      • the proposed attack is that a relay could omit its noise in order to learn more from the data collected by other relays
        • we could decide not to care, because any relay that wanted to be malicious could do so more effectively by exposing its own users
        • we could add additional noise based on consensus weight
        • we can allocate noise based on the number of relays in the network, such that each of the N relays adds 1/Nth of the noise
          • if the noise budget is X, each relay adds X*1/R of the noise, where R is the number of relays that support PrivCount in Tor. (Technically it's X*sqrt(1/R), because the standard deviations of independent noise aren't additive; the variances are.)
      • another proposed attack is that a relay adds infinite noise to destroy the statistic for the day
        • with N relay partitions, we can resist O(N) malicious or broken relays destroying stats for a day, but we probably need better resistance than that
        • the tally reporters can run a large-noise round where they add large additional noise to each relay (multiple times the total network noise), total each relay individually, and eliminate the ones that are larger than the expected noise. This leaks an extra bit of information (large/not large) to a malicious tally reporter.
  • run a simulation of splitting the noise from 6000 relays, and work out if the integer truncation makes our noise too low. (Or we could just use ceil() to be safe.)
  • if we ran PrivCount on all our current statistics, how many of them would we no longer be able to collect because it's not possible to add sufficient noise?
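The noise-splitting and truncation questions above can be explored with a small simulation. The following is a minimal sketch, not the PrivCount implementation. It assumes Gaussian noise, a total noise budget expressed as a standard deviation, and two hypothetical integer conversions: truncation toward zero, and rounding the magnitude up (one possible reading of "use ceil()"). All function names are made up for illustration.

```python
import math
import random


def per_relay_sigma(total_sigma, num_relays):
    """Standard deviation each relay adds so the network total is total_sigma.

    Variances of independent noise add, so R relays that each add
    sigma_relay produce a total standard deviation of sigma_relay * sqrt(R).
    """
    if num_relays <= 0:
        raise ValueError("num_relays must be positive")
    return total_sigma * math.sqrt(1.0 / num_relays)


def ceil_away_from_zero(x):
    """Round the magnitude up and keep the sign (hypothetical 'safe' rounding)."""
    return int(math.copysign(math.ceil(abs(x)), x))


def simulate_total_sigma(total_sigma, num_relays, rounds, to_int, seed=0):
    """Empirical standard deviation of the summed, integer-converted noise."""
    rng = random.Random(seed)
    sigma = per_relay_sigma(total_sigma, num_relays)
    totals = [
        sum(to_int(rng.gauss(0.0, sigma)) for _ in range(num_relays))
        for _ in range(rounds)
    ]
    mean = sum(totals) / rounds
    variance = sum((t - mean) ** 2 for t in totals) / (rounds - 1)
    return math.sqrt(variance)


if __name__ == "__main__":
    budget, relays = 100.0, 6000  # illustrative values only
    print("per-relay sigma:", per_relay_sigma(budget, relays))
    print("truncated total:", simulate_total_sigma(budget, relays, 200, math.trunc))
    print("rounded-up total:", simulate_total_sigma(budget, relays, 200, ceil_away_from_zero))
```

With a small per-relay sigma, truncating each sample toward zero discards a noticeable share of the noise, so the summed noise falls below the budget, while rounding magnitudes up overshoots it. The real distribution and rounding rule would need to come from the spec.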

Cryptography

  • Should we use multi-level signing keys? That is, have an identity key for each tally reporter (TR) and each data collector (DC), and use those to sign short-term keys?

Configuration

  • How do we tell the DCs the parameters of the system, including:
    • who the TRs are, and what their keys are?
    • what the counters are, and how much noise to add to each?
    • when the collection intervals start and end?
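To make the list above concrete, here is one hypothetical shape for the parameters a DC would need. This is only a sketch under stated assumptions: none of these class names, field names, or encodings come from a PrivCount spec, and the way the parameters are distributed to DCs is still the open question.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TallyReporter:
    nickname: str          # hypothetical identifier for a TR
    public_key_hex: str    # hypothetical key encoding


@dataclass(frozen=True)
class CounterSpec:
    version: int           # lets noise levels change between versions
    noise_sigma: float     # noise to add to this counter


@dataclass(frozen=True)
class PrivCountParams:
    tally_reporters: Tuple[TallyReporter, ...]
    counters: Dict[str, CounterSpec]
    epoch_start: int       # Unix time at which the first interval starts
    interval_seconds: int  # length of each collection interval

    def interval_for(self, now):
        """Return (start, end) of the collection interval containing now."""
        if now < self.epoch_start:
            raise ValueError("collection has not started yet")
        index = (now - self.epoch_start) // self.interval_seconds
        start = self.epoch_start + index * self.interval_seconds
        return start, start + self.interval_seconds


if __name__ == "__main__":
    params = PrivCountParams(
        tally_reporters=(TallyReporter("tr1", "00"),),
        counters={"example-counter": CounterSpec(version=1, noise_sigma=10.0)},
        epoch_start=1000,
        interval_seconds=86400,
    )
    print(params.interval_for(90000))
```

Deriving interval boundaries from a shared start time and length means every DC computes the same intervals without extra coordination, but that is only one possible design.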

Transmission

  • What should we say about persistence on the DC side?
  • How is data uploaded to DCs?

Aggregation

  • How do the TRs agree on which DCs' counters to collect?

Tickets

PrivCount Parent Ticket #22898:

Ticket Resolution Summary Component Milestone Modified Owner Reporter Cc Parent ID
#26637 Privcount noise generation implemented and deployed Core Tor/Tor Tor: 0.3.5.x-final 5 days ago nickm teor #22898
#25669 Privcount: blinding and encryption should be finished up Core Tor/Tor Tor: 0.3.5.x-final 16 hours ago nickm teor, nickm, chelseakomlo #22898
#25263 Fix the hidden service statistics noise (and the privcount noise by extension) Core Tor/Tor Tor: 0.3.5.x-final 5 days ago teor teor #22898
#25153 Specify how PrivCount in Tor statistics are configured and interpreted Core Tor/Tor Tor: unspecified 5 days ago teor teor #22898
#15272 Think of more research questions that we can answer with statistics Core Tor/Tor Tor: unspecified 5 days ago asn isabela, karsten, dgoulet, teor #22898

All tickets tagged PrivCount:

Ticket Resolution Summary Component Milestone Modified Owner Reporter Cc Parent ID
#25669 Privcount: blinding and encryption should be finished up Core Tor/Tor Tor: 0.3.5.x-final 16 hours ago nickm teor, nickm, chelseakomlo #22898
#25381 Add crypto_rand_double_sign() in C and Rust Core Tor/Tor Tor: unspecified 5 days ago teor teor catalyst, chelseakomlo #26637
#25153 Specify how PrivCount in Tor statistics are configured and interpreted Core Tor/Tor Tor: unspecified 5 days ago teor teor #22898
#24468 Measure HSDir usage to guide parameter choices Core Tor/Tor Tor: unspecified 3 months ago teor teor
#24047 Add new stats for v2 and v3 onion service traffic Core Tor/Tor Tor: unspecified 3 months ago teor
#23573 Do we want to close all connections when tor closes? Core Tor/Tor Tor: 0.3.5.x-final 12 days ago teor #25510
#23523 Handle extreme values better in add_laplace_noise() Core Tor/Tor Tor: unspecified 3 months ago teor teor catalyst #25263
#23501 Refactor rep_hist_format_hs_stats() to add noise when counters are initialised Core Tor/Tor Tor: unspecified 3 months ago teor #25263
#23416 Document the precision and limits of sample_laplace_distribution() Core Tor/Tor Tor: unspecified 3 months ago teor teor #25263
#23415 sample_laplace_distribution() should take multiple random inputs Core Tor/Tor Tor: unspecified 3 weeks ago teor teor #25263
#23414 rep_hist_format_hs_stats() should add noise, then round Core Tor/Tor Tor: unspecified 3 weeks ago teor teor #25263
#23126 HSDirs should publish some count about new-style onion addresses Core Tor/Tor Tor: unspecified 10 months ago arma
#23061 crypto_rand_double() should produce all possible outputs on platforms with 32-bit int Core Tor/Tor Tor: 0.3.5.x-final 4 days ago teor teor catalyst #26637
#22422 Add noise to PaddingStatistics Core Tor/Tor Tor: unspecified 3 months ago mikeperry teor karsten, mikeperry, robgjansen, amj703
#20594 hs: Make HSDir produce HS statistics for prop224 Core Tor/Tor Tor: unspecified 3 months ago dgoulet
#18268 Make Tor aware of the top-30 destinations of Tor Exit traffic Core Tor/Tor Tor: unspecified 13 months ago naif
#18082 Log separate HS extra-info stats for Single Onion Services Core Tor/Tor Tor: unspecified 13 months ago teor mikeperry
#17627 Add missing controller events so we can link every step of the HS dance Core Tor/Tor Tor: unspecified 13 months ago robgjansen
#17366 Track consensus fetch times per country? Core Tor/Tor Tor: unspecified 13 months ago arma mrphs
#16997 Gather and report metrics for the number of channels a relay is servicing. Core Tor/Tor Tor: unspecified 13 months ago yawning
#15272 Think of more research questions that we can answer with statistics Core Tor/Tor Tor: unspecified 5 days ago asn isabela, karsten, dgoulet, teor #22898
#13987 Apply laplace noise to other statistics Core Tor/Tor Tor: unspecified 14 months ago nickm karsten, asn, dgoulet
#13792 HS statistics for private tor network to gather info on services, clients and relays Core Tor/Tor Tor: unspecified 13 months ago dgoulet karsten, asn, rob.g.jansen@…
#13466 Collect aggregate stats of ntor-using hidden service interactions vs tap-using interactions Core Tor/Tor Tor: unspecified 13 months ago arma #13195
#13195 Collect aggregate stats around hidden service descriptor publishes and fetches Core Tor/Tor Tor: unspecified 13 months ago arma
#13194 Track time between ESTABLISH_RENDEZVOUS and RENDEZVOUS1 cell Core Tor/Tor Tor: unspecified 13 months ago arma
#8786 Add extra-info line that tracks the number of consensus downloads over each pluggable transport Core Tor/Tor Tor: unspecified 6 months ago asn
#7509 Publish and use circuit success rates in extrainfo descriptors Core Tor/Tor Tor: unspecified 2 months ago mikeperry arma, aagbsn@… #5456

Related pages