Opened 8 years ago

Closed 6 years ago

#7435 closed task (fixed)

Devise strategy for getting inputs to the users that want to run tests

Reported by: hellais Owned by: hellais
Priority: Medium Milestone:
Component: Archived/Ooni Version:
Severity: Keywords: ooni_research
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Users that are interested in testing for censorship will need to run certain tests with inputs. These inputs can be lists of URLs or sets of keywords.

I see the following problems with making the lists of URLs to be tested public:

  • Hosting such URLs and keywords can be a liability
  • The censor will be aware of which URLs we are testing for censorship

Ideas?

Change History (4)

comment:1 Changed 8 years ago by naif

I think that users should be given the ability not just to retrieve URL lists and keywords, but also to propose new ones to be added for a given categorization (i.e. country/language sets).

To crowdsource the checking of censored keywords and URLs efficiently in this way, the lists have to be public.

Users who want to use private URLs/keywords should be given a very easy way to set up their own lists (how?).

comment:2 Changed 8 years ago by jonmtoz

Should the lists be crowdsourced and open to the public? I think doing so could have serious negative ramifications: if anyone can add any URL to a list, an attacker can set up a honeypot, post its URL to the publicly available list, wait for probers to connect to the honeypot, and then target them. An adversary capable of observing the internet connections of an entire country can perform a passive attack by simply reading the entire input list and then targeting probers from its own country, since they will be identifiable by their making all of these connections in a short period of time. Some alternatives to having publicly modifiable lists that are available to everyone could be:

  1. Allowing only members of the Tor Project to modify a list (this might make liability more of an issue but will improve prober safety).
  2. Restricting the number of items on the list that can be seen by one person. This can be done by using the same strategies used in the distribution of Tor bridges, such as rate limiting through CAPTCHAs, in order to prevent an attacker from learning exactly what a prober will do. The same technique might also be usable to create crowdsourced lists of keywords and URLs, since it mitigates the danger that an attacker can modify the lists enough to expose scanners.

It might be best to have multiple lists, each with a different known degree of risk, since someone probing in Germany faces far less risk from using a crowdsourced list than someone in Syria. Maybe this should be taken into account if there are different lists for different countries and languages.
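The second alternative above (showing each requester only a bounded slice of the list) could be sketched roughly as follows. This is only an illustration under stated assumptions, not how Tor bridge distribution or ooniprobe actually works: the function name `subset_for_requester`, the `requester_id` string (for example a network prefix or an account name), the shared `secret`, and the bucket count are all hypothetical.

```python
import hashlib
import hmac


def subset_for_requester(items, requester_id, secret, num_buckets=4):
    """Return the slice of `items` that one requester is allowed to see.

    Hypothetical sketch: every item and every requester is mapped to one of
    `num_buckets` buckets with a keyed hash (HMAC-SHA256), so the mapping is
    stable across requests but cannot be predicted without `secret`.
    """
    def bucket(label, value):
        message = f"{label}:{value}".encode()
        digest = hmac.new(secret, message, hashlib.sha256).digest()
        return int.from_bytes(digest[:8], "big") % num_buckets

    wanted = bucket("requester", requester_id)
    # Keep the original ordering of the list within the returned slice.
    return [item for item in items if bucket("item", item) == wanted]
```

Calling `subset_for_requester(urls, "198.51.100.0/24", b"demo-secret")` repeatedly returns the same slice, so asking again does not reveal more of the list. On its own this does not stop an attacker who controls many requester identities, which is why the comment above pairs it with rate limiting such as CAPTCHAs.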

comment:3 in reply to:  2 Changed 6 years ago by hellais

I guess it's better to reply late than never...

Replying to jonmtoz:

> Should the lists be crowdsourced and open to the public? I think doing so could have serious negative ramifications: If anyone can add any url to a list, an attacker can set up a honeypot, post the url to the publicly available url list, wait for probers to connect to the honeypot, and then target them. An adversary capable of observing the internet connections of an entire country can perform a passive attack by simply reading the entire input list and then targeting probers from its own country, since they will be identifiable for making all of these connections in a short period of time. Some alternatives to having publicly modifiable lists that are available to everyone could be:

I don't think it's an issue for the list of sites that probes scan to be public, since it will become public anyway once the report is published (this is a requirement for the reproducibility of ooniprobe tests), so an adversary could read it from there.

In general, I also think it's a bad idea to hold data that puts people in danger if somebody learns its content. Our goal with OONI is to never put ourselves in a situation where some data we have, if disclosed to the wrong parties, could increase the risk for users.

Adding honeypot URLs to the list of sites to scan in order to identify users is a risk, but I see a greater risk in somebody adding a site to the list that will get us in trouble for hosting its content (by publishing it in the report), or that can endanger users because it is illegal even to visit that URL.

>   1. Allowing only members of the Tor Project to modify a list (this might make liability more of an issue but will improve prober safety).
>   2. Restricting the amount of items on the list that can be seen by one person. This can be done by using the same strategies used in the distribution of Tor bridges, such as rate limiting through CAPTCHAs in order to prevent an attacker from learning what exactly a prober will do. The same technique might also be able to be used to create crowdsourced lists of keywords and URLs since it mitigates the danger that an attacker can modify the lists sufficiently enough to expose scanners.

> It might be best for there to be multiple lists each with different known degrees of risk, since one probing in Germany faces far less of a risk by using a crowdsourced list than someone in Syria. Maybe this should be taken into account if there are different lists for different countries and languages.

For the above reasons, we currently implement solution 1 of the two you described. The lists we use for testing are currently the Alexa top 1k sites and the URL lists provided by Citizen Lab: https://github.com/citizenlab/test-lists.

Our way of provisioning probes with new URLs to scan is to have them run the following two scripts:

ooniresources
oonideckgen

oonideckgen in particular builds a test deck with the appropriate URLs (and DNS resolvers) for the country of the user.
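To make the idea of a per-country input list concrete, here is a minimal sketch of merging a global URL list with a country-specific one. It is not oonideckgen's actual implementation; it only assumes that each list is a CSV file with a `url` column, as in the Citizen Lab test-lists repository, and the function names `load_urls` and `inputs_for_country` are hypothetical.

```python
import csv
import io


def load_urls(csv_text):
    """Return the `url` column of a test-list CSV, skipping blank entries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    urls = []
    for row in reader:
        url = (row.get("url") or "").strip()
        if url:
            urls.append(url)
    return urls


def inputs_for_country(global_csv, country_csv):
    """Merge the global and per-country lists, dropping duplicates in order."""
    seen = set()
    merged = []
    for url in load_urls(global_csv) + load_urls(country_csv):
        if url not in seen:
            seen.add(url)
            merged.append(url)
    return merged
```

The global entries come first and the country-specific ones are appended after them, so a URL that appears in both lists is tested only once.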

Closing this as implemented.

comment:4 Changed 6 years ago by hellais

Resolution: fixed
Status: new → closed