Provide per-bridge usage statistics in Onionoo
Bridge operators often wonder if their bridge is useful and if they're actually helping users in censoring countries. In theory, we have data available that would tell them how many users connect to their bridge every day. We could aggregate these data and make them available in Onionoo, similar to how we aggregate bandwidth histories and path-selection weights histories. Atlas, Globe, and other web applications could then visualize these statistics.
Here's the data we have, coming from an extra-info descriptor (only partially shown here):
extra-info bzoum C19309EB35EBC06CFDD5B6E5BED937184DF7D10C
dirreq-stats-end 2013-11-21 16:56:40 (86400 s)
dirreq-v3-resp ok=744,not-enough-sigs=0,unavailable=0,not-found=0,not-modified=200,busy=0
bridge-ips ir=560,us=200,sy=192,??=184,my=64,ru=48,gb=40,in=40,de=24,fr=24,th=24,au=16,bd=16,br=16,ca=16,cn=16,eg=16,fi=16,it=16,jp=16,mx=16,nl=16,pk=16,ro=16,se=16,tr=16,a1=8,ae=8,af=8,at=8,be=8,bg=8,bh=8,bn=8,by=8,ch=8,cl=8,co=8,cr=8,cu=8,cy=8,cz=8,dk=8,dz=8,ee=8,es=8,ge=8,gr=8,gt=8,gy=8,hk=8,hu=8,id=8,ie=8,il=8,ke=8,kr=8,kz=8,lb=8,lu=8,lv=8,ma=8,mk=8,mu=8,ng=8,no=8,nz=8,om=8,pa=8,pe=8,ph=8,pl=8,pt=8,rs=8,sa=8,sg=8,sk=8,sn=8,tn=8,tw=8,ua=8,uz=8,ve=8,ye=8
bridge-ip-versions v4=1680,v6=0
bridge-ip-transports <OR>=8,obfs2=968,obfs3=712
The most relevant number is in the dirreq-v3-resp line, which says that this bridge gave out 744 consensuses to clients. We assume that clients make 10 such requests per day, so there were on average 74 clients connected to this bridge in the 24 hours until 2013-11-21 16:56:40 (from the dirreq-stats-end line).
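The estimate above can be sketched in a few lines of Python. This is only an illustration, not Onionoo code: the function names are made up for this example, and the constant of 10 requests per client per day is the assumption stated above.

```python
# Assumption from the text above: a client makes about 10 consensus
# requests per day. The function names are hypothetical.
REQUESTS_PER_CLIENT_PER_DAY = 10


def parse_key_value_line(line):
    """Parse 'keyword k1=v1,k2=v2,...' into (keyword, {key: int(value)})."""
    keyword, _, rest = line.strip().partition(" ")
    counts = {}
    for item in rest.split(","):
        key, _, value = item.partition("=")
        counts[key] = int(value)
    return keyword, counts


def estimate_daily_clients(dirreq_v3_resp_line):
    """Estimate daily clients from the number of successful responses."""
    _, responses = parse_key_value_line(dirreq_v3_resp_line)
    return responses["ok"] / REQUESTS_PER_CLIENT_PER_DAY


line = ("dirreq-v3-resp ok=744,not-enough-sigs=0,unavailable=0,"
        "not-found=0,not-modified=200,busy=0")
clients = estimate_daily_clients(line)
print(round(clients))  # 74
```

The sketch assumes a statistics interval of exactly 86400 seconds, as in the dirreq-stats-end line shown here; other interval lengths would need to be normalized to one day first.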
We can now multiply 74 by the fraction of connecting IP addresses coming from a given country, taken from the bridge-ips line. For example, 560 of 2104 IP addresses coming from Iran means 74 * 560 / 2104 = 20 clients. The same goes for 74 * 1680 / 1680 = 74 users using IP version 4 (from the bridge-ip-versions line) and 74 * 712 / 1688 = 31 users using transport obfs3 (from the bridge-ip-transports line).
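As a rough sketch of this breakdown, the following Python splits an estimated client total in proportion to the counts on one descriptor line. The helper names are hypothetical, and the total of 74 clients is the estimate derived above, not something read from the descriptor.

```python
def parse_counts(line):
    """Parse 'keyword k1=v1,k2=v2,...' into {key: int(value)}."""
    _, _, rest = line.strip().partition(" ")
    counts = {}
    for item in rest.split(","):
        key, _, value = item.partition("=")
        counts[key] = int(value)
    return counts


def split_clients(total_clients, counts):
    """Distribute total_clients proportionally to the given counts."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: total_clients * value / total
            for key, value in counts.items()}


TOTAL_CLIENTS = 74  # estimate derived from the dirreq-v3-resp line

by_transport = split_clients(
    TOTAL_CLIENTS,
    parse_counts("bridge-ip-transports <OR>=8,obfs2=968,obfs3=712"))
by_version = split_clients(
    TOTAL_CLIENTS,
    parse_counts("bridge-ip-versions v4=1680,v6=0"))

print(round(by_transport["obfs3"]))  # 31
print(round(by_version["v4"]))       # 74

# Iran's share, using the counts quoted in the text (560 of 2104):
print(round(TOTAL_CLIENTS * 560 / 2104))  # 20
```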
All the math here is the same as what we use to estimate total user numbers in the network. The only difference is that, for total numbers in the network, we have to compensate for bridges not reporting statistics.
Note that we cannot apply this approach to relays, because not every relay is a directory mirror. So, a directory mirror would show many more daily clients than it actually has, whereas other relays would show no clients at all. Directory guards probably make this even more difficult for directory mirrors. But the described approach should work fine for bridges.
One problem that remains is that I don't know a good data format for these statistics. There are so many countries in the world that we cannot include graph data for every single one of them in Onionoo documents. Ideally, the new documents containing these usage statistics shouldn't be significantly larger than bandwidth or weights documents. I have no clue yet how to achieve that.
Another problem is that Atlas, Globe, and other clients would have to visualize these new data. Cc'ing phw and rndm to get some feedback on possible data formats.