Opened 5 years ago

Last modified 3 months ago

#12547 needs_information task

Get analysed data from bridge reachability tests to tor-devs

Reported by: hellais
Owned by: sysrqb
Priority: Medium
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity: Normal
Keywords: ooni, bridge-reachability
Cc: asn, vmon, sysrqb, dcf, hellais, isis, phw
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

This means setting up a web server on the post-processing machine (the one running the collector) with some access control so that Tor devs can read the reports.

Change History (9)

comment:1 Changed 5 years ago by isis

Keywords: ooni bridge-reachability added

Correct me if I'm wrong, but I think we've all decided that (for now at least) OONI people will be running the collector mentioned in #12545, which means that this access-controlled web interface would also be on that machine rather than BridgeDB.

Does the OONI collector already have a builtin web interface like this? If so, does it come with its own access control mechanisms, or would it need to be run behind e.g. Apache?

comment:2 in reply to:  1 Changed 5 years ago by hellais

Replying to isis:

Correct me if I'm wrong, but I think we've all decided that (for now at least) OONI people will be running the collector mentioned in #12545, which means that this access-controlled web interface would also be on that machine rather than BridgeDB.

Yes that is correct. I replied also on that ticket, thanks for raising the question.

Does the OONI collector already have a builtin web interface like this? If so, does it come with its own access control mechanisms, or would it need to be run behind e.g. Apache?

Currently the collector does not publish the collected reports; it just writes them to a directory, and some additional logic is needed to expose them for use by, for example, the BridgeDB team.

I have set everything up so that it all runs inside isolated Docker containers, so this would be a matter of setting up another Docker container that exposes the directory where the reports are written in some way. I think running Apache or nginx is indeed the simplest solution. If we decide that we need something a bit more complicated we can also write an ad-hoc web application, but for the moment I would keep it as simple as possible.
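For illustration only, here is a minimal Python sketch of the idea: expose the report directory read-only over HTTP and require credentials before serving anything. It uses the standard-library HTTP server as a stand-in for Apache or nginx, and it is not what is actually deployed. The directory path, user name, password, and port are all placeholder assumptions, and a real setup would need TLS in front of it, since Basic authentication sends credentials in the clear.

```python
import base64
import functools
import hmac
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Placeholder values (assumptions, not taken from the real deployment).
REPORT_DIR = "/data/reports"
USERNAME = "tor-dev"
PASSWORD = "change-me"


def is_authorized(header, username, password):
    """Return True if an HTTP Basic Authorization header matches the credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    expected = f"{username}:{password}"
    # Constant-time comparison so timing does not reveal partial matches.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


class AuthReportHandler(SimpleHTTPRequestHandler):
    """Serve files and directory listings only to authenticated clients."""

    def _check(self):
        if is_authorized(self.headers.get("Authorization"), USERNAME, PASSWORD):
            return True
        self.send_response(401)
        self.send_header("WWW-Authenticate", 'Basic realm="reports"')
        self.end_headers()
        return False

    def do_GET(self):
        if self._check():
            super().do_GET()

    def do_HEAD(self):
        if self._check():
            super().do_HEAD()


def serve(port=8080):
    """Serve REPORT_DIR on localhost until interrupted."""
    handler = functools.partial(AuthReportHandler, directory=REPORT_DIR)
    ThreadingHTTPServer(("127.0.0.1", port), handler).serve_forever()
```

Calling `serve()` starts the server. Inside a container, the report directory would be mounted read-only so that the web-facing process cannot modify what the collector wrote.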

comment:3 Changed 5 years ago by asn

I assume that initially the data will be exposed as an HTTP dir
listing, similar to: https://ooni.torproject.org/reports/0.1/

That's fine since it allows Tor developers to quickly search for
bridget results (to see if/when bridges got blocked) and it can also
act as an initial interface for people to fetch data to test their
visualization ideas.

A question here is what the directory listing should look like, to make
it easier for humans to go through it. My use case is going over the
results to find whether a bridge was blocked on a given day. Here is a
directory listing format that could be helpful for this use case:

bridge_reachability/<country>/<day>/br_<timestamp>_<country>.yaml

for example:

bridge_reachability/CN/2014-09-03/br_2014-09-03T15:40:00_CN.yaml
bridge_reachability/US/2014-09-05/br_2014-09-05T22:10:00_US.yaml
bridge_reachability/US/2014-09-05/br_2014-09-05T23:10:00_US.yaml
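
To make the proposed naming scheme concrete, here is a rough Python sketch that builds a path in this layout and parses one back. The function names are hypothetical, and the sketch assumes that the country in the directory name and the country in the file name are always the same, and that timestamps have second resolution as in the examples above.

```python
from datetime import datetime
from pathlib import PurePosixPath

ROOT = "bridge_reachability"
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"
DAY_FORMAT = "%Y-%m-%d"


def report_path(country, timestamp):
    """Build <root>/<country>/<day>/br_<timestamp>_<country>.yaml."""
    day = timestamp.strftime(DAY_FORMAT)
    stamp = timestamp.strftime(STAMP_FORMAT)
    name = f"br_{stamp}_{country}.yaml"
    return str(PurePosixPath(ROOT, country, day, name))


def parse_report_path(path):
    """Return (country, timestamp) for a path in the layout above.

    Raises ValueError if the path does not follow the layout or if the
    directory names disagree with the file name.
    """
    parts = PurePosixPath(path).parts
    if len(parts) != 4 or parts[0] != ROOT:
        raise ValueError(f"unexpected layout: {path!r}")
    _, country, day, name = parts
    if not (name.startswith("br_") and name.endswith(".yaml")):
        raise ValueError(f"unexpected file name: {name!r}")
    stamp, _, file_country = name[len("br_"):-len(".yaml")].rpartition("_")
    timestamp = datetime.strptime(stamp, STAMP_FORMAT)
    if file_country != country or timestamp.strftime(DAY_FORMAT) != day:
        raise ValueError(f"directory and file name disagree: {path!r}")
    return country, timestamp


def reports_for(paths, country, day):
    """Keep only the paths for one country and one day (day as YYYY-MM-DD)."""
    prefix = f"{ROOT}/{country}/{day}/"
    return [p for p in paths if p.startswith(prefix)]
```

With this layout, finding all reports for one country on one day is a simple prefix match, which is what makes the listing easy to browse by hand.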

Any other ideas?

This use case will potentially be deprecated after visualization is
done. The visualization should make it easy to check whether a bridge
was blocked on a given date, without having to go over all the reports
of that day.

comment:4 Changed 5 years ago by isis

Status: new → needs_information

I think this ticket is mostly done? Could someone please point to the HTTP dir that we should be looking in for the most recent reports (or where they would be in the future)?

comment:5 in reply to:  4 Changed 5 years ago by hellais

Replying to isis:

I think this ticket is mostly done? Could someone please point to the HTTP dir that we should be looking in for the most recent reports (or where they would be in the future)?

We recently ran into a disk space issue on the main repository of OONI reports, so we are now migrating it to a new machine. It is taking a while to copy all the data and then re-run the pipeline on it.

When this process is complete, though, all the reports will be published:

http://reports.ooni.nu (near realtime updates, currently empty)
https://ooni.torproject.org/reports/0.1/ (updated daily, currently does not have the bridge reachability data)
https://213.138.109.232/ (mirror of the above site hosted by ORG)

So far we have data from Russia, China, and Iran, from October 2014 to February 2015 and counting.

I have noticed, however, that some of the bridges matt gave us have gone offline, so I believe we need tighter integration with BridgeDB so that dead bridges are periodically replaced with fresh ones.

Is there an ETA on #13570?

comment:6 Changed 4 years ago by isis

Cc: isis added

comment:7 Changed 21 months ago by teor

Severity: Normal

Set all open tickets without a severity to "Normal"

comment:8 Changed 3 months ago by phw

Cc: phw added
Parent ID: #12544 removed

hellais, is this something we should still be pursuing?

(Removing the closed parent ID because otherwise trac won't let me comment.)

comment:9 Changed 3 months ago by hellais

I think this would still be beneficial to do.

Currently we only run a reachability test for Tor bridges on desktop, based on the IPs listed in https://github.com/OpenObservatory/ooni-resources/blob/master/bridge_reachability/tor-bridges-ip-port.csv.
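
As a rough illustration of what an IP-and-port reachability check amounts to, here is a Python sketch that attempts a plain TCP connection to each endpoint. This is not the OONI probe's implementation. The `host:port` input format is an assumption (the actual layout of the CSV file is not described here), and the function names are hypothetical.

```python
import socket


def parse_endpoint(text):
    """Split an assumed "host:port" string into (host, port)."""
    host, _, port = text.strip().rpartition(":")
    host = host.strip("[]")  # tolerate bracketed IPv6 literals
    if not host or not port.isdigit():
        raise ValueError(f"not a host:port pair: {text!r}")
    return host, int(port)


def is_reachable(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_all(endpoints, timeout=10.0):
    """Map each "host:port" string to its reachability result."""
    return {
        endpoint: is_reachable(*parse_endpoint(endpoint), timeout=timeout)
        for endpoint in endpoints
    }
```

A failed connection from one vantage point does not by itself show blocking, since the bridge may simply be offline. Comparing against a control measurement from an unfiltered network helps tell the two cases apart.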

You can now fetch all of these in the OONI dataset (api.ooni.io) and figure out which ones are blocked where.
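
A hedged Python sketch of how such a query might be built follows. The endpoint path and the parameter names (`test_name`, `probe_cc`, `since`, `until`, `limit`) are assumptions about the API and should be checked against its documentation, as should the test name passed in.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the OONI API documentation before use.
API_BASE = "https://api.ooni.io/api/v1/measurements"


def measurements_url(test_name, country, since, until, limit=100):
    """Build a listing query for one test and one country.

    `since` and `until` are dates in YYYY-MM-DD form; all parameter
    names are assumptions about the API.
    """
    query = {
        "test_name": test_name,
        "probe_cc": country,
        "since": since,
        "until": until,
        "limit": limit,
    }
    return f"{API_BASE}?{urlencode(query)}"


def fetch_measurements(url, timeout=30):
    """Fetch the URL and decode the JSON body (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Which test name corresponds to the bridge measurements, and what the response looks like, are deliberately left open here.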

We also plan to support this on mobile; here are some relevant issues on the OONI side:
https://github.com/ooni/probe/issues/802
https://github.com/ooni/probe/issues/804
https://github.com/ooni/orchestra/issues/51

Last edited 3 months ago by hellais