frac value computation for clients.csv doesn't seem to match its descriptions
Maybe I'm missing something obvious, but the calculation for frac in clients.csv doesn't seem to compute what is stated in the website description, nor in the SQL comment. Website:
"frac: Fraction of relays or bridges in percent that the estimate is based on."
...
-- Estimated fraction of nodes reporting directory requests, which is
-- used to extrapolate observed requests to estimated total requests in
-- the network. The closer this fraction is to 1.0, the more precise
-- the estimation.
CAST(a.frac * 100 AS INTEGER) AS frac,
-- Finally, the estimate number of users.
CAST(a.rrx / (a.frac * 10) AS INTEGER) AS users
-- Implement the estimation method in a subquery, so that the ugly
-- formula only has to be written once.
FROM (
SELECT date, node, country, transport, version, rrx, nrx,
(hrh * nh + hh * nrh) / (hh * nn) AS frac <--------------------<<<
FROM aggregated WHERE hh * nn > 0.0) a
-- Only include estimates with at least 10% of nodes reporting directory
-- request statistics.
WHERE a.frac BETWEEN 0.1 AND 1.0
...
The expression marked with the arrow computes the fraction of reported directory requests (or responses, for bridges) out of the estimated total of directory requests (or responses, for bridges). It does not compute the fraction of nodes reporting directory requests out of the total number of nodes, which is what both the website description and the SQL comment say.
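To make the point concrete, here is a minimal Python sketch (not taken from the metrics code base) that evaluates the marked expression and an algebraically equivalent rewrite of it. All input values are made up for illustration. Reading the h* columns as request/byte-like weights and the n* columns as node counts is my assumption based on the names; the quoted snippet does not define them.

```python
import math


def frac_as_computed(hrh, nh, hh, nrh, nn):
    """Mirror of the quoted SQL expression: (hrh * nh + hh * nrh) / (hh * nn)."""
    return (hrh * nh + hh * nrh) / (hh * nn)


def frac_decomposed(hrh, nh, hh, nrh, nn):
    """Same value after dividing through: (hrh / hh) * (nh / nn) + nrh / nn."""
    return (hrh / hh) * (nh / nn) + nrh / nn


# Hypothetical inputs; only hrh differs between the two cases,
# every n* value (assumed to be node-related) stays the same.
low = dict(hrh=20.0, nh=80.0, hh=100.0, nrh=10.0, nn=100.0)
high = dict(low, hrh=60.0)

for label, values in (("low hrh", low), ("high hrh", high)):
    computed = frac_as_computed(**values)
    decomposed = frac_decomposed(**values)
    # Both forms agree up to floating-point rounding.
    assert math.isclose(computed, decomposed)
    print(label, round(computed, 4))
```

If frac were purely the share of reporting nodes, changing only hrh should leave it unchanged. In this sketch it moves from 0.26 to 0.58 with all n* inputs held fixed, which is why the value looks like a weighted share of requests rather than a share of nodes.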