Calculate the fraction of dist.torproject.org traffic for Tor Browser downloads and updates
We'd like to know what fraction of dist.torproject.org traffic is caused by updates, because it would be easy to move that traffic elsewhere.
Here's one way to find out: We go through sanitized web server logs from dist.torproject.org written by aroides in March 2016. This time frame covers a large and a small (in terms of incremental update size) Tor Browser release on March 8 and 18 respectively. We sum up response bytes by file extension, where .exe, .tar.xz, and .dmg are counted as Tor Browser downloads, .mar as Tor Browser updates, and other extensions as being unrelated to Tor Browser.
$ cat dist.torproject.org-access.log-201603?? | cut -d" " -f7,10 | grep " [0-9]*$" > dist.torproject.org-access.log-201603-part
$ cat dist.torproject.org-access.log-201603-part | grep "\.exe " | cut -d" " -f2 | paste -sd+ - | bc
62650080788327
$ cat dist.torproject.org-access.log-201603-part | grep "\.tar\.xz " | cut -d" " -f2 | paste -sd+ - | bc
12497704216145
$ cat dist.torproject.org-access.log-201603-part | grep "\.dmg " | cut -d" " -f2 | paste -sd+ - | bc
8352205765328
$ cat dist.torproject.org-access.log-201603-part | grep "\.mar " | cut -d" " -f2 | paste -sd+ - | bc
29958084372393
$ cat dist.torproject.org-access.log-201603-part | cut -d" " -f2 | paste -sd+ - | bc
113689403444481
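The pipelines above re-read the intermediate file once per extension. As a rough alternative, a single awk pass could produce all sums at once, including the "other" bucket. This is only a sketch, not the method used for the numbers in this ticket: the function name `sum_by_ext` is made up for illustration, and it assumes the same two-column input (request path, response bytes) as the `-part` file.

```shell
# Hypothetical helper: reads "path bytes" lines on stdin and prints
# one "category bytes" line per category, followed by a total.
sum_by_ext() {
  awk '
    # Skip lines whose byte count is not purely numeric (e.g. "-").
    $2 !~ /^[0-9]+$/ { next }
    {
      if      ($1 ~ /\.exe$/)     ext = ".exe"
      else if ($1 ~ /\.tar\.xz$/) ext = ".tar.xz"
      else if ($1 ~ /\.dmg$/)     ext = ".dmg"
      else if ($1 ~ /\.mar$/)     ext = ".mar"
      else                        ext = "other"
      sum[ext] += $2
      total    += $2
    }
    END {
      # awk sums in double precision; totals around 1e14 are still
      # far below 2^53, so the integer results stay exact.
      n = split(".exe .tar.xz .dmg .mar other", order, " ")
      for (i = 1; i <= n; i++)
        printf "%s %.0f\n", order[i], sum[order[i]]
      printf "total %.0f\n", total
    }'
}
```

It would be called as `sum_by_ext < dist.torproject.org-access.log-201603-part`. Unlike the grep-based commands, it only looks at the end of the path field, so a file such as `foo.exe.asc` lands in "other".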
Results:
Extension | Bytes | Fraction |
---|---|---|
.exe | 62650080788327 | 55% |
.tar.xz | 12497704216145 | 11% |
.dmg | 8352205765328 | 7% |
.mar | 29958084372393 | 26% |
other | 231328302288 | 0% |
total | 113689403444481 | 100% |
In words, roughly 73% of dist.torproject.org traffic (measured in response bytes) is caused by Tor Browser downloads and 26% by updates; everything else adds up to about 0.2%.
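For anyone who wants to double-check the arithmetic, the "other" row and the rounded percentages can be re-derived from the byte counts. The following is a minimal sketch using plain shell arithmetic; the variable names and the `pct` helper are made up for illustration, and it assumes a shell with 64-bit integer arithmetic.

```shell
# Byte counts copied from the results table.
total=113689403444481
exe=62650080788327
tarxz=12497704216145
dmg=8352205765328
mar=29958084372393

# pct PART: PART as a percentage of $total, rounded to the nearest
# integer. Integer division truncates, so half the divisor is added
# first.
pct() {
  echo $(( (100 * $1 + total / 2) / total ))
}

downloads=$(( exe + tarxz + dmg ))
other=$(( total - downloads - mar ))

echo "downloads: $(pct "$downloads")%"
echo "updates:   $(pct "$mar")%"
echo "other:     $other bytes"
```

This prints 73% for downloads, 26% for updates, and 231328302288 bytes for "other", matching the table.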
Thoughts?