It would be great to have sybilhunter's churn and uptime visualisations on the Metrics website. The churn plots are time series, just like the ones we already have on Metrics. The uptime visualisations are JPEG images. We could have weekly or monthly uptime images, and daily churn diagrams.
Sybilhunter is a Go program that expects input files structured like CollecTor's archives. It should be straightforward to run it from cron.
Karsten, I don't know ggplot2. Could you help with plotting the churn values? The format is quite simple. Every line represents the churn for one consensus: it starts with a timestamp, followed by flag-specific churn values in the interval [0, 1].
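For illustration, here is a minimal Go sketch of reading one such line. The comma separator, the timestamp layout, and the names `ChurnRecord` and `parseChurnLine` are assumptions made for this example; the actual sybilhunter output may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// assumedLayout is a guess at the timestamp format, not taken from sybilhunter.
const assumedLayout = "2006-01-02T15:04:05Z"

// ChurnRecord (hypothetical) holds the churn values of one consensus.
type ChurnRecord struct {
	Time   time.Time
	Values []float64 // one value per flag-specific column, each in [0, 1]
}

// parseChurnLine parses "timestamp,value,value,..." and rejects values
// outside the interval [0, 1]. The comma separator is an assumption.
func parseChurnLine(line string) (ChurnRecord, error) {
	fields := strings.Split(strings.TrimSpace(line), ",")
	if len(fields) < 2 {
		return ChurnRecord{}, fmt.Errorf("expected a timestamp and at least one value, got %q", line)
	}
	ts, err := time.Parse(assumedLayout, fields[0])
	if err != nil {
		return ChurnRecord{}, fmt.Errorf("bad timestamp: %w", err)
	}
	rec := ChurnRecord{Time: ts}
	for _, f := range fields[1:] {
		v, err := strconv.ParseFloat(f, 64)
		if err != nil {
			return ChurnRecord{}, fmt.Errorf("bad churn value %q: %w", f, err)
		}
		// Written this way so that NaN is rejected as well.
		if !(v >= 0 && v <= 1) {
			return ChurnRecord{}, fmt.Errorf("churn value %v outside [0, 1]", v)
		}
		rec.Values = append(rec.Values, v)
	}
	return rec, nil
}

func main() {
	// Made-up sample line, not real churn data.
	rec, err := parseChurnLine("2016-01-01T00:00:00Z,0.012,0.034")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rec.Time.Format(assumedLayout), rec.Values)
}
```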
As I understand it, at least the following two steps are necessary to incorporate both visualisations:
Modify ./website/etc/metrics.json.
Write a shell script for the cron job to run.
Is there anything else we need?
Here's a CSV file for the churn values: https://nymity.ch/sybilhunting/churn-values/churn-all.csv.bz2 (3.6 MiB). For each flag, there are four columns, two of which are interesting to us: NewFLAG and GoneFLAG. NewFLAG denotes the churn for new relays while GoneFLAG denotes the churn for relays that left the network. If this is difficult to process for you, then I'm happy to change the output format.
The Go code is available here:
https://gitweb.torproject.org/user/phw/sybilhunter.git/
Go compiles directly to statically linked ELF binaries, so we can build a binary somewhere else and then copy it to the metrics machines. To build sybilhunter, run:
go get git.torproject.org/user/phw/sybilhunter.git
Hmm, I had some trouble fetching that .csv file. The server seems quite overloaded, and the downloaded file was partially corrupt. But I think I got the overall picture.
However, I noticed that you didn't implement the wide-to-long suggestion I mentioned a few months ago on metrics-team@, and I think that would make the graphing code somewhat easier. How likely is it that you'll find the time to work on that issue?
But we probably shouldn't block on that for putting stuff on Tor Metrics. How's this approach:
We start by adding two new "link" pages to your churn and uptime visualizations. Can you send me text similar to the "oxford-anonymous-internet" link page but for these two new link pages? We'll need this text anyway even when we take the next steps below.
We add a new page type to Metrics called "gallery" which displays image files from a local directory directly on the Tor Metrics site. We need this type anyway for the uptime visualizations even when we replace the churn visualizations by something more interactive below. We'd produce these images exactly how you're currently producing them on your server but on the metrics server. Once we deploy these gallery pages, we'll replace the corresponding link pages, though we'd keep the URLs unchanged.
We write some R/ggplot2 code to make the churn visualizations somewhat more interactive by letting users select start and end date, flag type, and displayed metric (absolute numbers, fractions, etc.).
However, I noticed that you didn't implement the wide-to-long suggestion I mentioned a few months ago on metrics-team@, and I think that would make the graphing code somewhat easier. How likely is it that you'll find the time to work on that issue?
I just added that feature. It's in the following branch:
We start by adding two new "link" pages to your churn and uptime visualizations. Can you send me text similar to the "oxford-anonymous-internet" link page but for these two new link pages? We'll need this text anyway even when we take the next steps below.
Here's what I would add to website/etc/metrics.json:
{
  "id": "uptimes",
  "title": "Monthly uptime of Tor relays",
  "tags": [ "Relays" ],
  "type": "Graph",
  "level": "Advanced",
  "description": "<p>The following image illustrates the uptime of Tor relays for the past month. Each row of pixels denotes one consensus (that is, one hour), and each column denotes one relay. Black pixels mean that a relay was online, and white means offline. So, each pixel denotes if a given relay was online or offline at a given hour. We use red pixels to highlight relays with identical uptime patterns.</p>",
  "function": "plot_uptimes",
  "parameters": [ "start", "end" ],
  "data": [ "servers-data" ],
  "related": [ "networkchurn" ]
},
{
  "id": "networkchurn",
  "title": "Network churn rate by relay flag",
  "tags": [ "Relays" ],
  "type": "Graph",
  "level": "Advanced",
  "description": "<p>The following graph shows the churn rate of the Tor network by <a href=\"about.html#relay\">relay</a> flag. The churn rate, a value in the interval [0,1], captures the rate of relays joining and leaving the network.</p>",
  "function": "plot_networkchurn",
  "parameters": [ "start", "end" ],
  "data": [ "servers-data" ],
  "related": [ "uptimes", "networksize", "relayflags" ]
},
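To make the pixel layout in the uptime description concrete, here is a small Go sketch that draws such an image from a boolean matrix, with one row per consensus and one column per relay. It is not sybilhunter's actual rendering code: `renderUptime` and the sample matrix are hypothetical, and the red highlighting of identical uptime patterns is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/jpeg"
)

// renderUptime draws one pixel per (consensus, relay) pair:
// black if the relay was online in that consensus, white otherwise.
// online[y][x] refers to consensus y (row) and relay x (column).
func renderUptime(online [][]bool) *image.Gray {
	cols := 0
	for _, row := range online {
		if len(row) > cols {
			cols = len(row)
		}
	}
	img := image.NewGray(image.Rect(0, 0, cols, len(online)))
	for y, row := range online {
		for x := 0; x < cols; x++ {
			c := color.Gray{Y: 255} // white: offline (or no data)
			if x < len(row) && row[x] {
				c = color.Gray{Y: 0} // black: online
			}
			img.SetGray(x, y, c)
		}
	}
	return img
}

func main() {
	// Made-up matrix: three consensuses (rows) and four relays (columns).
	online := [][]bool{
		{true, true, false, true},
		{true, false, false, true},
		{true, true, true, true},
	}
	img := renderUptime(online)

	// Encode as JPEG in memory; a real job would write to a file instead.
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: 90}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("image size:", img.Bounds().Dx(), "x", img.Bounds().Dy())
}
```

Note that JPEG is lossy, so the decoded pixels may not be exactly black or white; a lossless format such as PNG would preserve them.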
We add a new page type to Metrics called "gallery" which displays image files from a local directory directly on the Tor Metrics site. We need this type anyway for the uptime visualizations even when we replace the churn visualizations by something more interactive below. We'd produce these images exactly how you're currently producing them on your server but on the metrics server. Once we deploy these gallery pages, we'll replace the corresponding link pages, though we'd keep the URLs unchanged.
We write some R/ggplot2 code to make the churn visualizations somewhat more interactive by letting users select start and end date, flag type, and displayed metric (absolute numbers, fractions, etc.).
Sounds good to me. Please let me know if there's anything I can do to help.
Neat! Yes, looks great! I didn't start writing code for this, but I don't see any problems with your data format right now.
[...] Here's what I would add to website/etc/metrics.json:
I rewrote your text a bit to fit more seamlessly into the rest of Metrics (well, I hope). Please take a look at my task-19183 branch.
I also attached two screenshots of the new pages (which are not yet deployed on the main Metrics instance):
Please let me know if you spot any problems or want me to change something. Like, want me to pick a different month as example? Happy to make such changes.
Oh, would you be able to update your image galleries? The latest graphs there are from 2016-01, and I bet people will ask for recent months when these pages go online.
Please let me know if you spot any problems or want me to change something. Like, want me to pick a different month as example? Happy to make such changes.
It looks good to me. Thanks for your work.
Oh, would you be able to update your image galleries? The latest graphs there are from 2016-01, and I bet people will ask for recent months when these pages go online.
I did it for now, for the uptime images, but I don't have plans to do that in the future. I'm just providing code and past analyses, but I don't want to sign up for providing continuous visualisations.
Please let me know if you spot any problems or want me to change something. Like, want me to pick a different month as example? Happy to make such changes.
It looks good to me. Thanks for your work.
Thanks for looking. Pushed to master and deployed. Leaving this ticket open for the next steps.
Oh, would you be able to update your image galleries? The latest graphs there are from 2016-01, and I bet people will ask for recent months when these pages go online.
I did it for now, for the uptime images, but I don't have plans to do that in the future. I'm just providing code and past analyses, but I don't want to sign up for providing continuous visualisations.
Fair enough. Yet one more reason to get the next steps here done soon. :)
Trac: Owner: phw to karsten Status: new to assigned