Add sybilhunter's visualisations to Metrics website

added churn component::metrics/website owner::metrics-team priority::medium severity::normal status::assigned sybilhunter type::enhancement uptime visualization labels

Can you give me a sample input file for the churn values, so that I can write some R/ggplot2 for that?

And can you include a link to the Go code that's supposed to run on the metrics machine?

Here's a CSV file for the churn values: https://nymity.ch/sybilhunting/churn-values/churn-all.csv.bz2 (3.6 MiB). For each flag, there are four columns, two of which are interesting to us: NewFLAG and GoneFLAG. NewFLAG denotes the churn for new relays while GoneFLAG denotes the churn for relays that left the network. If this is difficult to process for you, then I'm happy to change the output format.
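
A minimal sketch of how such a wide file could be reshaped into one row per flag. Only the NewFLAG/GoneFLAG naming comes from the description above; the `Date` column name, the sample values, and the helper `wide_to_long` are assumptions made up for illustration, and the two uninteresting columns per flag are left out.

```python
import csv
import io

# Hypothetical wide-format sample: the New<FLAG>/Gone<FLAG> naming follows the
# description above, but the "Date" column and all values are made up.
SAMPLE = """Date,NewExit,GoneExit,NewGuard,GoneGuard
2016-06-01T01:00:00Z,0.004,0.003,0.005,0.002
"""


def wide_to_long(text):
    """Turn one wide row per hour into one (date, flag, new, gone) row per flag."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # Derive the flag names from the columns that start with "New".
        flags = [name[len("New"):] for name in record if name.startswith("New")]
        for flag in flags:
            rows.append((
                record["Date"],
                flag,
                float(record["New" + flag]),
                float(record["Gone" + flag]),
            ))
    return rows


if __name__ == "__main__":
    for row in wide_to_long(SAMPLE):
        print(row)
```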

The Go code is available here: https://gitweb.torproject.org/user/phw/sybilhunter.git/

Go compiles directly to statically linked ELF binaries, so we can build a binary somewhere else and then copy it to the metrics machines. To build sybilhunter, run:

go get git.torproject.org/user/phw/sybilhunter.git

To create churn values, run:

sybilhunter -data path/to/collector/archive/ -churn -startdate 2016-06-01 -enddate 2016-06-02 2>/dev/null

Trac:
Cc: karsten to karsten, phw

Hmm, I had some trouble fetching that .csv file. The server seems quite overloaded, and the downloaded file was partially corrupt. But I think I got the overall picture.

However, I noticed that you didn't implement the wide-to-long suggestion I mentioned a few months ago on metrics-team@, and I think that would make the graphing code somewhat easier. How likely is it that you'll find the time to work on that issue?

But we probably shouldn't block on that for putting stuff on Tor Metrics. How's this approach:

We start by adding two new "link" pages to your churn and uptime visualizations. Can you send me text similar to the "oxford-anonymous-internet" link page but for these two new link pages? We'll need this text anyway even when we take the next steps below.
We add a new page type to Metrics called "gallery" which displays image files from a local directory directly on the Tor Metrics site. We need this type anyway for the uptime visualizations even when we replace the churn visualizations with something more interactive below. We'd produce these images exactly how you're currently producing them on your server but on the metrics server. Once we deploy these gallery pages, we'll replace the corresponding link pages, though we'd keep the URLs unchanged.
We write some R/ggplot2 code to make the churn visualizations somewhat more interactive by letting users select start and end date, flag type, and displayed metric (absolute numbers, fractions, etc.).
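
To make the last step more concrete, here is a rough sketch of the selection logic only (start date, end date, flag type). The plan above is R/ggplot2; this Python version just illustrates the filtering. It assumes churn values are already available as (timestamp, flag, new churn, gone churn) tuples; the tuple layout, the sample values, and the names `parse_ts` and `select` are hypothetical.

```python
from datetime import datetime

# Hypothetical, made-up rows: (timestamp, flag, new churn, gone churn).
SAMPLE_ROWS = [
    ("2016-05-31T01:00:00Z", "Exit", 0.1, 0.2),
    ("2016-05-31T02:00:00Z", "Exit", 0.3, 0.1),
    ("2016-05-31T01:00:00Z", "Guard", 0.2, 0.2),
    ("2016-06-01T00:00:00Z", "Exit", 0.0, 0.0),
]


def parse_ts(ts):
    """Parse a UTC timestamp such as 2016-05-31T01:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def select(rows, start, end, flag):
    """Keep rows for `flag` whose timestamp lies in the half-open range [start, end)."""
    lo, hi = parse_ts(start), parse_ts(end)
    return [r for r in rows if r[1] == flag and lo <= parse_ts(r[0]) < hi]


if __name__ == "__main__":
    for row in select(SAMPLE_ROWS, "2016-05-31T00:00:00Z",
                      "2016-06-01T00:00:00Z", "Exit"):
        print(row)
```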

Replying to karsten:

However, I noticed that you didn't implement the wide-to-long suggestion I mentioned a few months ago on metrics-team@, and I think that would make the graphing code somewhat easier. How likely is it that you'll find the time to work on that issue?

I just added that feature. It's in the following branch:

git clone -b long-format https://git.torproject.org/user/phw/sybilhunter.git

The default output is now the long format. Here's an example:

Date,Authority,BadExit,Exit,Fast,Guard,HSDir,Named,Running,Stable,Unnamed,V2Dir,Valid,NewChurn,GoneChurn
2016-05-31T01:00:00Z,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00000,0.00000
2016-05-31T01:00:00Z,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00000,0.00000
2016-05-31T01:00:00Z,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00457,0.00457
2016-05-31T01:00:00Z,NA,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,0.00480,0.00315
2016-05-31T01:00:00Z,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,0.00552,0.00184
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NA,0.00300,0.00030
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NaN,NaN
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,0.00643,0.00514
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,0.00118,0.00067
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,NA,T,NA,NA,NaN,NaN
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,T,NA,0.00349,0.00501
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,T,0.00643,0.00514

Is that something you can work with?
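
For reference, a small sketch of how this long format could be read. The sample rows are copied from the example above. The sketch assumes that every row marks exactly one flag column with T, which holds for the example but is an assumption about the full output; the helper name `parse_long` is made up.

```python
import csv
import io

# Three rows copied from the example output above.
SAMPLE = """Date,Authority,BadExit,Exit,Fast,Guard,HSDir,Named,Running,Stable,Unnamed,V2Dir,Valid,NewChurn,GoneChurn
2016-05-31T01:00:00Z,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00000,0.00000
2016-05-31T01:00:00Z,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00457,0.00457
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NaN,NaN
"""


def parse_long(text):
    """Return (date, flag, new churn, gone churn) tuples, one per input row."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        # Assumption: exactly one flag column per row is set to "T".
        flag = next(name for name, value in record.items() if value == "T")
        rows.append((
            record["Date"],
            flag,
            float(record["NewChurn"]),   # float() also accepts "NaN"
            float(record["GoneChurn"]),
        ))
    return rows


if __name__ == "__main__":
    for row in parse_long(SAMPLE):
        print(row)
```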

We start by adding two new "link" pages to your churn and uptime visualizations. Can you send me text similar to the "oxford-anonymous-internet" link page but for these two new link pages? We'll need this text anyway even when we take the next steps below.

Here's what I would add to website/etc/metrics.json:

  {
    "id": "uptimes",
    "title": "Monthly uptime of Tor relays",
    "tags": [
      "Relays"
    ],
    "type": "Graph",
    "level": "Advanced",
    "description": "<p>The following image illustrates the uptime of Tor relays for the past month.  Each row of pixels denotes one consensus (that is, one hour), and each column denotes one relay.  Black pixels mean that a relay was online, and white means offline.  So, each pixel denotes if a given relay was online or offline at a given hour.  We use red pixels to highlight relays with identical uptime patterns.</p>",
    "function": "plot_uptimes",
    "parameters": [
      "start",
      "end"
    ],
    "data": [
      "servers-data"
    ],
    "related": [
      "networkchurn"
    ]
  },
  {
    "id": "networkchurn",
    "title": "Network churn rate by relay flag",
    "tags": [
      "Relays"
    ],
    "type": "Graph",
    "level": "Advanced",
    "description": "<p>The following graph shows the churn rate of the Tor network by <a href=\"about.html#relay\">relay</a> flag. The churn rate, a value in the interval [0,1], captures the rate of relays joining and leaving the network.</p>",
    "function": "plot_networkchurn",
    "parameters": [
      "start",
      "end"
    ],
    "data": [
      "servers-data"
    ],
    "related": [
      "uptimes",
      "networksize",
      "relayflags"
    ]
  },
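
To illustrate the uptime description in the "uptimes" entry above: the image can be thought of as a boolean matrix with one row per consensus and one column per relay, and the red highlighting as grouping relays whose columns are identical. The following is only a sketch of that idea, not sybilhunter's actual implementation; the input layout, the relay names, and the function `identical_uptime_groups` are hypothetical.

```python
from collections import defaultdict


def identical_uptime_groups(uptimes):
    """Group relays that share exactly the same uptime pattern.

    `uptimes` maps a relay identifier to a list of booleans, one per
    consensus (True = online). Only groups with two or more relays are
    returned, each as a sorted list of identifiers.
    """
    groups = defaultdict(list)
    for relay, pattern in uptimes.items():
        groups[tuple(pattern)].append(relay)
    return [sorted(members) for members in groups.values() if len(members) > 1]


if __name__ == "__main__":
    # Made-up example: three consensuses, three relays.
    example = {
        "relayA": [True, False, True],
        "relayB": [True, False, True],
        "relayC": [True, True, True],
    }
    print(identical_uptime_groups(example))
```

A real analysis would presumably have to ignore trivial patterns, for example relays that were online for the whole month, since those would all end up in one group; that filtering is left out here.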

We add a new page type to Metrics called "gallery" which displays image files from a local directory directly on the Tor Metrics site. We need this type anyway for the uptime visualizations even when we replace the churn visualizations with something more interactive below. We'd produce these images exactly how you're currently producing them on your server but on the metrics server. Once we deploy these gallery pages, we'll replace the corresponding link pages, though we'd keep the URLs unchanged.

We write some R/ggplot2 code to make the churn visualizations somewhat more interactive by letting users select start and end date, flag type, and displayed metric (absolute numbers, fractions, etc.).

Sounds good to me. Please let me know if there's anything I can do to help.

Replying to phw:

{{{
Date,Authority,BadExit,Exit,Fast,Guard,HSDir,Named,Running,Stable,Unnamed,V2Dir,Valid,NewChurn,GoneChurn
2016-05-31T01:00:00Z,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00000,0.00000
2016-05-31T01:00:00Z,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00000,0.00000
2016-05-31T01:00:00Z,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,NA,0.00457,0.00457
2016-05-31T01:00:00Z,NA,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,NA,0.00480,0.00315
2016-05-31T01:00:00Z,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NA,NA,0.00552,0.00184
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NA,0.00300,0.00030
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,NA,NaN,NaN
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,NA,0.00643,0.00514
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,T,NA,NA,NA,0.00118,0.00067
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,NA,T,NA,NA,NaN,NaN
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,T,NA,0.00349,0.00501
2016-05-31T01:00:00Z,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,T,0.00643,0.00514
}}}

Is that something you can work with?

Neat! Yes, looks great! I didn't start writing code for this, but I don't see any problems with your data format right now.

[...] Here's what I would add to website/etc/metrics.json:

I rewrote your text a bit to fit more seamlessly into the rest of Metrics (well, I hope). Please take a look at my task-19183 branch.

I also attached two screenshots of the new pages (which are not yet deployed on the main Metrics instance):

Please let me know if you spot any problems or want me to change something. Like, want me to pick a different month as example? Happy to make such changes.

Oh, would you be able to update your image galleries? The latest graphs there are from 2016-01, and I bet people will ask for recent months when these pages go online.

Please let me know if you spot any problems or want me to change something. Like, want me to pick a different month as example? Happy to make such changes.

It looks good to me. Thanks for your work.

Oh, would you be able to update your image galleries? The latest graphs there are from 2016-01, and I bet people will ask for recent months when these pages go online.

I did it for now, for the uptime images, but I don't have plans to do that in the future. I'm just providing code and past analyses, but I don't want to sign up for providing continuous visualisations.

Replying to phw:

Please let me know if you spot any problems or want me to change something. Like, want me to pick a different month as example? Happy to make such changes.

It looks good to me. Thanks for your work.

Thanks for looking. Pushed to master and deployed. Leaving this ticket open for the next steps.

Oh, would you be able to update your image galleries? The latest graphs there are from 2016-01, and I bet people will ask for recent months when these pages go online.

I did it for now, for the uptime images, but I don't have plans to do that in the future. I'm just providing code and past analyses, but I don't want to sign up for providing continuous visualisations.

Fair enough. Yet one more reason to get the next steps here done soon. :)

Trac:
Owner: phw to karsten
Status: new to assigned

Handing over to metrics-team, because I'm not currently working on this.

Trac:
Owner: karsten to metrics-team
