I've got implementations that match the current prop280, but we'll need to make some additional fixes, including:
- Whatever changes we make in prop280.
- Actually initializing the privcount subsystem.
- Collecting initial statistics.
The current implementations are in privcount_nm_v2_032 and privcount_nm_v2_shake_032.
T2. sample_unit_gaussian() can't use both r * sin(theta) and r * cos(theta) unless they are independent samples. And I'm not sure if they are.
T3. We should update the comments to say that y must be strictly less than 1.0, or log() produces infinity.
T4. We should update the comments to say that x must be strictly less than 1.0, or sin(TWO_PI*x) would produce three zeroes, and two of every other value.
Replying to teor:
T4. We should update the comments to say that x must be strictly less than 1.0, or sin(TWO_PI*x) would produce three zeroes, and two of every other value.
(This isn't strictly true, because of the granularity of floating-point numbers. But using x < 1.0 is correct.)
Replying to teor:
T2. sample_unit_gaussian() can't use both r * sin(theta) and r * cos(theta) unless they are independent samples. And I'm not sure if they are.
In order to guarantee differential privacy, we need to:
- sample at the scale of the noise (not unit scale)
- add the noise to the signal
- round the noisy signal
This is the "snapping" mitigation from "On Significance of the Least Significant Bits For Differential Privacy" by Ilya Mironov
https://pdfs.semanticscholar.org/2f2b/7a0d5000a31f7f0713a3d20919f9703c9876.pdf
I think we're ok here, because the results are the same as the ones we'd get by snapping.
But if there's a transform that takes stddev and yields more precision, we should probably use it (rather than just multiplying stddev * r * sin(theta)).
See also https://trac.torproject.org/projects/tor/ticket/23061#comment:33 for the output values from this function (if it used crypto_rand_double()).
T5. The function documentation of sample_unit_gaussian() should answer the following questions:
- what is the range of outputs?
- what is the precision of the outputs?
sample_gaussian() should document, and possibly enforce:
- what is the maximum stddev value that can be used within the limits of double precision arithmetic?
  - to preserve differential privacy, the low bits have to be obscured by the noise. So this can be at most 2^{53}.
- what is the minimum stddev value that can be used within the limits of double precision arithmetic?
  - (zero provides no differential privacy)
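
One way the stddev limits above might be enforced is sketched below. The function names and the MAX_GAUSSIAN_STDDEV constant are hypothetical. The upper bound is the 2^{53} figure given above; the lower bound only rejects zero and negative values, since the true minimum is still an open question here.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* 2^53, the upper limit suggested above (hypothetical constant name). */
#define MAX_GAUSSIAN_STDDEV 9007199254740992.0

/* Return true if stddev lies in (0.0, 2^53]. */
bool
gaussian_stddev_is_usable(double stddev)
{
  if (isnan(stddev))
    return false;
  /* Infinity fails the upper-bound comparison; zero and negative
   * values fail the lower-bound comparison. */
  return stddev > 0.0 && stddev <= MAX_GAUSSIAN_STDDEV;
}

/* Scale a unit-scale sample, refusing out-of-range stddev values. */
double
scale_unit_sample(double stddev, double unit_sample)
{
  assert(gaussian_stddev_is_usable(stddev));
  return stddev * unit_sample;
}
```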
T6. sample_unit_gaussian() and sample_gaussian() belong with sample_laplace() (or whatever it's called). I may end up doing this when I fix all the random double stuff.
This is now prop288.
I think our current plan is that I will end up (re)writing PrivCount in Tor in Rust.
This is now planned for Rust, via #25669.
T1: sample_unit_gaussian() should use crypto_rand_double() to generate x and y. See #23061.
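
For context on why the source of x and y matters, here is a sketch of one common way to map 64 random bits to a double in [0.0, 1.0). This is not crypto_rand_double() and makes no claim about how that function works; uniform_from_random_bits() is a hypothetical name, and the caller is assumed to supply bits from a cryptographically strong generator.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: map 64 random bits to a double in [0.0, 1.0). */
double
uniform_from_random_bits(uint64_t random_bits)
{
  /* Keep the top 53 bits. A double has a 53-bit significand, so every
   * value k * 2^-53 with 0 <= k < 2^53 is represented exactly. */
  const uint64_t k = random_bits >> 11;
  const double u = (double)k * (1.0 / 9007199254740992.0);

  /* The largest result is 1.0 - 2^-53, so 1.0 is never returned. */
  assert(u >= 0.0 && u < 1.0);
  return u;
}
```

A generator built this way satisfies the "strictly less than 1.0" requirement for both x and y by construction.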