Opened 6 years ago

Closed 2 years ago

#13731 closed project (wontfix)

Brainstorm ideas for possible visualisations

Reported by: hellais
Owned by: hellais
Priority: Medium
Milestone:
Component: Archived/Ooni
Version:
Severity: Normal
Keywords: archived-closed-2018-07-04
Cc: asn, sysrqb, kudrom, aagbsn, infinity0, joelanders, otr, shidash, david415, dawuud
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:


In the last OONI dev meeting we said that we should start coming up with interesting visualisations for the data we have.

This ticket is for brainstorming those ideas.

Change History (14)

comment:1 Changed 6 years ago by kudrom


The first set of visualisations should be one that would allow us to think about all the reports collected thus far. Right now it's nearly impossible to have a general idea of the data recorded: the kind of reports that are collected, where and when. Therefore it's impossible to pick one (or a group of) nettest/s and build some thoughtful visualization with it.

So I'll work on a simple set of interactive visualizations that will first present a histogram of the reports recorded per nettest and...

1) if you click on a bar (a nettest), a pie with the distribution of countries and a timeline of when the data was recorded will be drawn.

1.1) if you click on a country, a timeline for that country will be drawn.

2) Links to the nettest specs will be provided when the user selects a nettest in point 1.

3) It would probably also be useful to show hints for querying the MongoDB database, updated with each user selection, for example the fields that filter the selected data.

The country pie could be replaced with a heat map, but at the moment it's simpler to draw a pie and see how the data behaves.

This way the user of this visualization (for the moment, us) can decide which nettest to use for their own visualization.

I'll start in a couple of days so any feedback is appreciated.

I need to know whether the MongoDB instance is up and running, and where, and I also need a fully populated reports table, so maybe this is a good opportunity to expand the sanitise and import scripts to work with all the OONI data and not only the bridge reachability reports. Nonetheless, I saw that a reports.{json,yaml} is in the root of the collector, so maybe I can play with that data until the MongoDB reports table is fully populated.

Also, the visualization is split into two halves. The first is the export phase of the current pipeline architecture, in which I aggregate the MongoDB data into a format useful for my visualization; the second is the visualization proper, in which I use the exported data to build some pretty graphics. I think these two phases are the same for every visualization built with OONI, so some documentation should be written to make it easier for future analysts to use OONI to extract and aggregate data.
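A minimal sketch of what the export phase could look like, assuming each report header is available as a dict with `test_name`, `probe_cc` and `start_time` (a UNIX timestamp in UTC). These field names and the `aggregate_reports` helper are illustrative assumptions, not the actual pipeline code. The output holds the three aggregates the histogram, country pie and timeline would need:

```python
import json
from collections import Counter, defaultdict
from datetime import datetime, timezone


def aggregate_reports(reports):
    """Aggregate report headers into counts per nettest, country and day.

    Field names (test_name, probe_cc, start_time) are assumptions about
    the report header layout; adjust them to the real schema.
    """
    per_test = Counter()                   # histogram: reports per nettest
    per_test_country = defaultdict(Counter)  # pie: countries per nettest
    per_test_day = defaultdict(Counter)      # timeline: days per nettest
    for report in reports:
        test = report["test_name"]
        country = report["probe_cc"]
        # start_time is assumed to be seconds since the epoch, in UTC.
        started = datetime.fromtimestamp(report["start_time"], tz=timezone.utc)
        day = started.date().isoformat()
        per_test[test] += 1
        per_test_country[test][country] += 1
        per_test_day[test][day] += 1
    return {
        "histogram": dict(per_test),
        "countries": {t: dict(c) for t, c in per_test_country.items()},
        "timeline": {t: dict(sorted(d.items())) for t, d in per_test_day.items()},
    }


if __name__ == "__main__":
    # Made-up sample headers, only to show the shape of the export.
    sample = [
        {"test_name": "http_requests", "probe_cc": "IT", "start_time": 0},
        {"test_name": "http_requests", "probe_cc": "ES", "start_time": 86400},
        {"test_name": "dns_consistency", "probe_cc": "IT", "start_time": 86400},
    ]
    print(json.dumps(aggregate_reports(sample), indent=2))
```

The same aggregation could probably be pushed into the database instead of done in Python; doing it in a plain function keeps the sketch usable against the reports.{json,yaml} file as well.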


comment:2 Changed 6 years ago by otr

Some ideas on the HTTP/DNS probe measurements that we have and how to correlate them into meaningful information about the censorship techniques employed by different providers:

  • charts about usage of blocking techniques (e.g. HTTP redirection, DNS spoofing, access lists, and so on)
  • vendor statistics (which software/appliances do the censors use?)
  • percentage of blocked sites from reference lists (pie charts...) - useful for people choosing a good provider that does not do filtering
  • statistics on CAPTCHAs (maybe a map of sites which use Cloudflare and ask you for a CAPTCHA when you visit them via Tor)

comment:3 Changed 6 years ago by otr

Something we did not mention here before (maybe because it is a self-evident thing to do) is to create per-country overviews (as in one page per country). Currently we list several countries on one page. We have a large amount of raw data on a per-country basis but no easily human-digestible information. Using the per-country format we could include more data dimensions that analyse the country with respect to the techniques listed above, and probably many more.

comment:5 Changed 6 years ago by hellais

Thanks for the valuable feedback. I made a mockup of what kudrom suggested and added a couple more parts to it.

You can see the mockup here:

In particular I added a calendar showing the measurements done in the month selected by the user, and a view next to the calendar that lists all the reports collected on a particular day. The user can then click on any item in that view to display the detailed information for that particular report.
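The data behind such a calendar could be prepared roughly as follows. This is only a sketch under assumptions: the `month_view` helper and the `day` and `report_id` fields are hypothetical and do not come from the mockup. Every day of the selected month gets a cell, so days without measurements still show up as empty:

```python
import calendar
from datetime import date


def month_view(reports, year, month):
    """Map every day of the given month to the IDs of reports from that day.

    Each report is assumed to be a dict with an ISO date string under
    "day" and an identifier under "report_id" (hypothetical field names).
    """
    # monthrange returns (weekday of the first day, number of days).
    _, days_in_month = calendar.monthrange(year, month)
    cells = {
        date(year, month, d).isoformat(): []
        for d in range(1, days_in_month + 1)
    }
    for report in reports:
        day = report["day"]
        if day in cells:  # ignore reports outside the selected month
            cells[day].append(report["report_id"])
    return cells
```

Clicking a calendar cell would then list `cells[day]`, and clicking one of those IDs would load the full report.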

comment:6 Changed 6 years ago by hellais

I am going to take note of some useful references:

Something similar with the number of reports per test type per region/country would be nice:

Examples of d3.js to generate pie charts:

Integrating d3.js and angular.js through directives:

Very powerful d3.js plugin for showing a timeline + data associated to that slice of the timeline:

This looks pretty epic, it's like the Joy Division album!

Very basic world map, made by the author of d3.js:

comment:7 Changed 6 years ago by kudrom

I like the calendar idea for refining selections made in the timeline, and also the crossfilter plugin for giving the user more capabilities when using the visualization. However, I'll work on the timeline later, because currently I'm working on the histogram and map interactions.
By the way, I've been working on this for the last four days. I expect to have a proof of concept in a week; right now it's in its early stages.
The Angular integration could be awesome for propagating updates between the data and the visualizations when the user interacts with them, but currently I'm more interested in understanding how best to use d3.js to provide these interactive visualizations. I'll work on the integration once I better understand how to use d3.js plus Angular in a way that is reusable and efficient.

comment:8 Changed 6 years ago by hellais

I made a list of the information that we may be interested in visualising:

They were grouped into 3 logical categories:

  • Overview: for high level information about the reports collected
  • Test specific
  • Country specific

comment:9 Changed 6 years ago by hellais

This is the main input list we use:

Last edited 6 years ago by hellais

comment:10 Changed 6 years ago by hellais

Some more information on the outcome of the workshop can be found in this email:

Here is the chart we produced that rationalises the questions users interested in exploring the data will be asking:

comment:11 Changed 6 years ago by joelanders

I messed around with the bridge reachability timeline a bit:

(the numbers at each date are the number of measurements we have at that point)

My mongo -> json export script is here:

I think this could be a nice visualization if, at each measurement point (day), we represented the successful connections as a (say) green box growing up from the centerline, and the failed connections as a red box growing down from the centerline.

Then, imagine you specify a few control countries. When you click a bridge's timeline, the timelines for that bridge's reachability from your control countries pop up right under where you clicked.

This avoids the difficulty of distilling {subject-,control-}{successes,failures} (4 numbers) into a single color.
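The per-day numbers for the green and red boxes could be computed along these lines. This is a rough sketch, not the export script mentioned above: the `diverging_bars` helper and the `day` and `success` fields are assumed names. Successes become a positive height (growing up from the centerline) and failures a negative one (growing down):

```python
from collections import defaultdict


def diverging_bars(measurements):
    """Return one entry per day with upward and downward bar heights.

    Each measurement is assumed to be a dict with an ISO date string
    under "day" and a boolean under "success" (hypothetical names).
    """
    counts = defaultdict(lambda: [0, 0])  # day -> [successes, failures]
    for m in measurements:
        counts[m["day"]][0 if m["success"] else 1] += 1
    return [
        {"day": day, "up": ok, "down": -failed, "total": ok + failed}
        for day, (ok, failed) in sorted(counts.items())
    ]
```

Running the same function once for the subject country and once per control country would give the stacked timelines described above, with each keeping its own pair of numbers instead of collapsing them into one color.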

(ed.) early sketch of the red/green idea:

Last edited 6 years ago by joelanders

comment:12 Changed 5 years ago by hellais

Much progress has been made on this.

Since there is a lot of information in this ticket I am not going to close it, but we have implemented a new visualisation on that shows the reports collected per country.

comment:13 Changed 3 years ago by teor

Severity: Normal

Set all open tickets without a severity to "Normal"

comment:14 Changed 2 years ago by teor

Keywords: archived-closed-2018-07-04 added
Resolution: wontfix
Status: new → closed

Close all tickets in archived components
