Changes between Version 1 and Version 2 of org/teams/NetworkTeam/PrivCountInTor


Timestamp:
Jul 4, 2018, 1:30:45 AM (5 months ago)
Author:
teor
Comment:

Summarise noise section, and add a list of tickets that are next

  • org/teams/NetworkTeam/PrivCountInTor

    v1 v2  
    77Optimise for "simplest possible decisions at first" so that we can deploy it.
    88
     9=== Revise Tickets ===
     10
     11[[TicketQuery(order=id,desc=1,format=table,col=resolution|summary|component|milestone|modified|owner|reporter|cc|parent,id=25669&or&id=26637&or&id=23061&or&id=25381,status!=closed)]]
     12
    913=== Noise ===
    1014
    1115* need to design an API for allocating noise using an optimisation method that Aaron created. For that we need an action bound and an estimated value. The action bound is a security parameter; the estimated value is not
     16  * See https://github.com/privcount/privcount/blob/master/privcount/statistics_noise.py
    1217
    1318* we'll need to do measurements with an actual client implementation to discover an appropriate action bound for our desired anonymity set size/security bounds.
     
    1823* versioning for stats when we want to change and/or tweak noisiness
    1924  * With PrivCount mixed versioning is tricky because the total noise across all statistics on all reporting relays determines user privacy
    20   * if a statistic's version is too old or we believe its noise to be insufficient to maintain privacy, we should have a mechanism for telling those clients to simply not report that data
    21   * or we could increase the noise on old statistics
     25    - do we just pick the latest counter version, as long as enough relays support it? (it's not safe to report multiple copies of counters)
     26  * if a statistic's version is too old or we believe its noise to be insufficient to maintain privacy:
     27    * we should have a mechanism for telling those clients to simply not report that data
     28    * or we could increase the noise on old statistics
    2229  - how do we impose a delay when the noise parameters change? (this delay ensures differential privacy even when the old and new counters are compared)
    2330    - or should we try to monotonically increase counter noise?
    24   - what happens in networks where some relays report some counters, and other relays report other counters?
    25     - do we just pick the latest counter version, as long as enough relays support it? (it's not safe to report multiple copies of counters)
    2631
    2732* we still need to specify how to allocate noise between counters, between relay partitions (to avoid outliers), between relays, and between consensuses
    2833  * need threat modelling and decisions on potential bad relays that decide to stop adding noise to their collected statistics
    2934    * the proposed attack is that a relay could omit its own noise in order to learn more from the data collected by other relays
    30       * with N relay partitions, we can resist O(N) malicious or broken relays destroying stats for a day, but we probably need better resistance than that
    3135      * we could choose not to care, because any relay that wanted to be malicious could more effectively do so by exposing its own users
    3236      * we could add additional noise based on consensus weight
    3337      * we can allocate noise based on the number of relays in the network, such that each of R relays gets 1/Rth of the noise
    3438        * if the noise budget is X, then each relay adds X*1/R to the noise, where R is the number of relays that support PrivCount in Tor. (Technically it's X*sqrt(1/R), because noise standard deviations aren't additive; variances are.)
    35   * optimise noise across counters, for example:
    36     * https://github.com/privcount/privcount/blob/master/privcount/statistics_noise.py
     39    * another proposed attack is that a relay adds infinite noise to destroy the statistic for the day
     40      * with N relay partitions, we can resist O(N) malicious or broken relays destroying stats for a day, but we probably need better resistance than that
     41      * the tally reporters can run a large-noise round where they add large additional noise to each relay (multiple times the total network noise), total each relay individually, and eliminate the ones that are larger than the expected noise. This leaks an extra bit of information (large/not large) to a malicious tally reporter.
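
The X*sqrt(1/R) split above can be sketched as follows. This is a minimal illustration, not PrivCount code: it assumes each relay adds independent zero-mean Gaussian noise, and the names per_relay_sigma and combined_sigma are hypothetical.

```python
import math


def per_relay_sigma(total_sigma: float, relay_count: int) -> float:
    """Standard deviation each relay adds so that the sum of relay_count
    independent noise values has standard deviation total_sigma.

    Assumption: noise is independent across relays, so variances add.
    """
    if relay_count < 1:
        raise ValueError("relay_count must be at least 1")
    return total_sigma * math.sqrt(1.0 / relay_count)


def combined_sigma(sigmas) -> float:
    """Standard deviation of a sum of independent noise values."""
    return math.sqrt(sum(s * s for s in sigmas))


if __name__ == "__main__":
    X = 100.0  # total noise budget (standard deviation), illustrative value
    R = 4      # relays supporting PrivCount, illustrative value
    share = per_relay_sigma(X, R)
    print(share, combined_sigma([share] * R))
```

If some of the R relays stop adding noise, combined_sigma over the remaining honest relays shows how far the total falls below X.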
    3742
    3843* run a simulation of splitting the noise from 6000 relays, and work out if the integer truncation makes our noise too low. (Or we could just use ceil() to be safe.)
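
A rough sketch of such a simulation is below. It assumes Gaussian noise, a per-relay standard deviation of X*sqrt(1/R), and truncation toward zero of each relay's sampled noise value; the function name, the noise budget, and the trial count are all hypothetical choices for illustration.

```python
import math
import random
import statistics


def simulate_truncation(total_sigma, relay_count=6000, trials=100, seed=0):
    """Return (exact_sigma, truncated_sigma): the observed standard
    deviation of the network-wide noise total, without and with each
    relay truncating its noise sample to an integer."""
    rng = random.Random(seed)
    relay_sigma = total_sigma * math.sqrt(1.0 / relay_count)
    exact_totals = []
    truncated_totals = []
    for _ in range(trials):
        exact = 0.0
        truncated = 0
        for _ in range(relay_count):
            noise = rng.gauss(0.0, relay_sigma)
            exact += noise
            truncated += int(noise)  # int() truncates toward zero
        exact_totals.append(exact)
        truncated_totals.append(truncated)
    return statistics.pstdev(exact_totals), statistics.pstdev(truncated_totals)


if __name__ == "__main__":
    exact, truncated = simulate_truncation(total_sigma=100.0)
    print("target 100.0, exact", exact, "truncated", truncated)
```

When the per-relay standard deviation is close to 1, most samples truncate to 0, so the truncated total is noticeably less noisy than the target. Rerunning with a different rounding rule in place of int() would show whether that rule restores the intended noise level.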