Opened 7 years ago

Closed 3 years ago

#6149 closed project (wontfix)

"Censorship-timeline" for Tor

Reported by: phw
Owned by:
Priority: Medium
Milestone:
Component: Circumvention/Censorship analysis
Version:
Severity: Normal
Keywords: dpi archive censorship block SponsorZ
Cc: asn, runa, arma, g.koppen@…, cass
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

It was briefly discussed on #tor-dev that some sort of "censorship-timeline" for Tor would be helpful. In particular, it should provide:

  • Detailed technical analyses of the censorship mechanisms in place (DPI fingerprints and manufacturers, traceroutes, ...)
  • Code and data to reproduce all experiments
  • Tor patches and standalone tools to evade the censorship devices

Overall, this timeline should serve as a comprehensive archive for everyone interested in how Tor is being blocked. It should make it easy to answer questions such as "What happened to Tor in country X back in Y?".

There are also some open questions:

  • How should the data be structured? In the form of a timeline? Or country-based? Something else?
  • What data should be published and when? Full disclosure too early in the process helps the censors.
  • How should it be presented? In a wiki page or a standalone web site?

Change History (27)

comment:1 Changed 7 years ago by runa

Packet captures can be sensitive and we probably don't want to publish them online for everyone to see. Maybe we should put them in a private git.tpo repo for now?

comment:2 in reply to:  1 ; Changed 7 years ago by hellais

Replying to runa:

Packet captures can be sensitive and we probably don't want to publish them online for everyone to see. Maybe we should put them in a private git.tpo repo for now?

It depends on what the packet captures contain. If they only show what a censorship event looks like, they should be fine as long as you strip the source IP.

comment:3 Changed 7 years ago by hellais

I recommend writing the tests as OONI tests and abstracting them so that they can be reused in other circumstances. I also recommend structuring the tests using the template on the OONI wiki: https://trac.torproject.org/projects/tor/wiki/doc/OONI/Tests/TestTemplate

comment:4 in reply to:  2 ; Changed 7 years ago by runa

Replying to hellais:

Replying to runa:

Packet captures can be sensitive and we probably don't want to publish them online for everyone to see. Maybe we should put them in a private git.tpo repo for now?

It depends what the packet captures contain. If they are the packet captures of what a censorship event looks like as long as you strip the src IP they should be fine.

I'd say that the source IP address is pretty useful to have. I don't know if there is a way to sanitize client and bridge pcap files without removing data that is useful to the person analyzing the files.

comment:5 Changed 7 years ago by hellais

I recommend collecting this data in the to-be-formed censorship wiki. This is a project started at RightsCon with other researchers in the field of censorship: https://trac.torproject.org/projects/tor/wiki/doc/OONI/censorshipwiki.

If keeping it in a wiki becomes impractical, we can then move it to a standalone website.

I think having both timeline and per-country indexes would be of great use. I don't see why one should exclude the other. The entries will end up being event-specific anyway, so there is no reason to pick one over the other.

comment:6 in reply to:  4 ; Changed 7 years ago by hellais

Replying to runa:

Replying to hellais:

Replying to runa:

Packet captures can be sensitive and we probably don't want to publish them online for everyone to see. Maybe we should put them in a private git.tpo repo for now?

It depends what the packet captures contain. If they are the packet captures of what a censorship event looks like as long as you strip the src IP they should be fine.

I'd say that the source IP address is pretty useful to have. I don't know if there is a way to sanitize client and bridge pcap files without removing data that is useful to the person analyzing the files.

Just put the ASN in place of the source IP. I don't think that makes the data any less useful.

comment:7 Changed 7 years ago by runa

Sebastian created a git repository for this project: https://gitweb.torproject.org/censorship-timeline.git

comment:8 in reply to:  6 ; Changed 7 years ago by arma

Replying to hellais:

Just put the ASN in place of the source ip. I don't think that makes the data at all less useful.

Be very careful when thinking you've anonymized data. For example, if you take out the IP address, but you leave in a checksum of the previous thing that included the IP address, it is not hard to recompute the IP address.
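To make the checksum concern concrete, here is a minimal sketch (not from the ticket) of how a leftover IPv4 header checksum narrows down a redacted source address. It assumes a 20-byte header without options, a redaction step that zeroes the source address but leaves the checksum untouched, and an attacker who already knows a candidate address range. The function names and the documentation-range addresses are hypothetical.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum with end-around carry."""
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ipv4_checksum(header: bytes) -> int:
    """Checksum of a header whose checksum field is set to zero."""
    return ~ones_complement_sum(header) & 0xFFFF


def build_header(src: str, dst: str) -> bytes:
    """Minimal 20-byte IPv4 header (no options), checksum field zeroed."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )


def make_redacted_example() -> bytes:
    """Header with a valid checksum but the source address zeroed out."""
    header = build_header("198.51.100.23", "192.0.2.1")
    checksum = struct.pack("!H", ipv4_checksum(header))
    with_sum = header[:10] + checksum + header[12:]
    return with_sum[:12] + b"\x00" * 4 + with_sum[16:]


def recover_source(redacted: bytes, candidates: ipaddress.IPv4Network):
    """Return every candidate address consistent with the leaked checksum."""
    leaked = struct.unpack("!H", redacted[10:12])[0]
    matches = []
    for addr in candidates:
        # Re-insert the candidate and zero the checksum field before hashing.
        trial = redacted[:10] + b"\x00\x00" + addr.packed + redacted[16:]
        if ipv4_checksum(trial) == leaked:
            matches.append(addr)
    return matches
```

The checksum only pins down 16 bits of information, so on its own it leaves many candidates across the whole address space. Combined with outside knowledge such as a country's or provider's address ranges, the candidate set can shrink to very few addresses.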

comment:9 Changed 7 years ago by hellais

Some good suggestions WRT sanitizing the pcap logs appeared on IRC:

< rransom> Runa, hellais: Keep in mind that country + IP header checksum is probably sufficient to recover redacted packet IP addresses.

< radii> hellais: then, it's important that in anonymized.pcap, all the frames for 192.168.1.100 map to a random key, say 3.4.5.6; while the frames for 192.168.1.101 map to a different random key, 8.7.6.5

< radii> if you just rand() for every packet, you lose way too much information and can't reconstruct TCP streams anymore (among many other problems)
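The consistent mapping radii describes could be sketched roughly as follows. This is an illustration only: the class name is hypothetical, it handles IPv4 addresses as strings rather than real pcap frames, and it does nothing about checksums or payload leaks, which the rest of this thread flags as separate problems.

```python
import ipaddress
import secrets


class AddressPseudonymizer:
    """Map each real address to one stable, randomly chosen replacement."""

    def __init__(self):
        self._mapping = {}   # real address -> pseudonym
        self._used = set()   # pseudonyms already handed out

    def pseudonym(self, address: str) -> str:
        if address not in self._mapping:
            # Draw random addresses until we find one not yet assigned,
            # so two different hosts never share a pseudonym.
            while True:
                candidate = str(ipaddress.IPv4Address(secrets.randbits(32)))
                if candidate not in self._used:
                    break
            self._used.add(candidate)
            self._mapping[address] = candidate
        return self._mapping[address]


def rewrite_flows(flows, pseudonymizer):
    """Rewrite (src, dst) pairs while keeping per-host identity intact."""
    return [
        (pseudonymizer.pseudonym(src), pseudonymizer.pseudonym(dst))
        for src, dst in flows
    ]
```

Because every packet from the same host gets the same replacement, TCP streams can still be reassembled from the rewritten trace. Drawing a fresh random value per packet would destroy that.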

comment:10 in reply to:  8 Changed 7 years ago by phw

Replying to arma:

Replying to hellais:

Just put the ASN in place of the source ip. I don't think that makes the data at all less useful.

Be very careful when thinking you've anonymized data. For example, if you take out the IP address, but you leave in a checksum of the previous thing that included the IP address, it is not hard to recompute the IP address.

Depending on how sensitive the data is, even port numbers can be a problem since we have to assume that data might be captured and stored by the censor for later analysis. Anonymizing traffic traces is a hard problem and in most cases it might be better to just provide the tools to quickly reproduce traffic traces.

comment:11 Changed 7 years ago by asn

We should also probably consider moving to a database design in the future, so that people can search by-country, or by-year, or by-DPI-box-manufacturer. But I guess that with the current amount of data, the wiki is a fine start.
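As a rough idea of what such a database design might look like, here is a small sketch using SQLite. The table layout, column names, and helper functions are assumptions made up for illustration; nothing like this exists in the ticket or the repository.

```python
import sqlite3

# Hypothetical schema: one row per censorship incident.
SCHEMA = """
CREATE TABLE incident (
    id               INTEGER PRIMARY KEY,
    country          TEXT NOT NULL,
    year             INTEGER NOT NULL,
    dpi_manufacturer TEXT,            -- NULL when the vendor is unknown
    summary          TEXT NOT NULL
);
CREATE INDEX incident_by_country ON incident (country);
CREATE INDEX incident_by_year ON incident (year);
CREATE INDEX incident_by_manufacturer ON incident (dpi_manufacturer);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_incident(conn, country, year, manufacturer, summary):
    conn.execute(
        "INSERT INTO incident (country, year, dpi_manufacturer, summary) "
        "VALUES (?, ?, ?, ?)",
        (country, year, manufacturer, summary),
    )
    conn.commit()


def search(conn, country=None, year=None, manufacturer=None):
    """Filter by any combination of country, year, and DPI manufacturer."""
    clauses, params = [], []
    for column, value in (
        ("country", country),
        ("year", year),
        ("dpi_manufacturer", manufacturer),
    ):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    query = "SELECT id, country, year, dpi_manufacturer, summary FROM incident"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return conn.execute(query + " ORDER BY year, country", params).fetchall()
```

A call such as `search(conn, country="Country X", year=2000)` would return only the matching incidents. The values here are placeholders, not real data.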

BTW, I think failsafe pcap sanitization is pretty much a lost cause, unless someone audits all packets by hand to make sure that no application-layer leaks exist (assuming that we plugged all the network/transport-layer leaks). I agree with 'phw' that providing the tools to quickly reproduce traffic traces is a good idea.

comment:12 Changed 7 years ago by runa

Milestone: Sponsor Z: March 1, 2013

comment:13 Changed 7 years ago by asn

I added some stuff to censorshipwiki (https://trac.torproject.org/projects/tor/wiki/doc/OONI/censorshipwiki). I tried to make up a general template for censorship incidents; it needs structure improvement, more data and polishing.

comment:14 Changed 7 years ago by gk

Cc: g.koppen@… added

comment:15 in reply to:  13 ; Changed 7 years ago by phw

Replying to asn:

I added some stuff to censorshipwiki (https://trac.torproject.org/projects/tor/wiki/doc/OONI/censorshipwiki). I tried to make up a general template for censorship incidents; it needs structure improvement, more data and polishing.

I gave it a little bit more structure and data. However, just one wiki page might not be the best way to organize all the data since it becomes confusing rather quickly.

I experimented a little bit with timeline software. You can see an example here: http://www.7c0.org/tldemo/index2.html. It's free, written in JavaScript, and relatively easy to extend using an XML file: http://www.7c0.org/tldemo/example1.xml

One possibility would be to use this timeline software for visualization and link to single trac pages which then cover all the censorship incidents in detail.

Any thoughts?

comment:16 in reply to:  15 Changed 7 years ago by hellais

Replying to phw:

Replying to asn:

I added some stuff to censorshipwiki (https://trac.torproject.org/projects/tor/wiki/doc/OONI/censorshipwiki). I tried to make up a general template for censorship incidents; it needs structure improvement, more data and polishing.

I gave it a little bit more structure and data. However, just one wiki page might not be the best way to organize all the data since it becomes confusing rather quickly.

You can use as many wiki pages as you want. I restructured the data to be on a country-by-country basis. If we end up having too much information per country, we can create subpages for the countries.

I experimented a little bit with timeline software. You can see an example here: http://www.7c0.org/tldemo/index2.html It's free, written in Javascript and relatively easy to extend using an XML file: http://www.7c0.org/tldemo/example1.xml

One possibility would be to use this timeline software for visualization and link to single trac pages which then cover all the censorship incidents in detail.

I think we can achieve something similar with just a master trac page that has this information. If we want to do it the right way we may want to find a good trac plugin that does it, but I would try not to depend too much on external infrastructure.

comment:17 Changed 7 years ago by ioerror

I think that we should not bother to anonymize the data - only post data where it's safe to share the entire payload of a pcap. That way, we don't have to deal with secret repositories or any weird bullshit.

comment:18 Changed 7 years ago by runa

Milestone: Sponsor Z: March 1, 2013

I've started collecting binaries, patches, notes etc in censorship-timeline.git.

comment:19 Changed 7 years ago by runa

Milestone: Sponsor Z: November 1, 2013

comment:20 Changed 7 years ago by karsten

Keywords: SponsorZ added
Milestone: Sponsor Z: November 1, 2013

Switching from using milestones to keywords for sponsor deliverables. See #6365 for details.

comment:21 Changed 7 years ago by runa

Throwing in this blurb relevant for SponsorZ-stuff: As of right now, we have logs and network captures from six or seven different blocking events. What I would like to do is to analyze the data we have and see if there are any similarities between them, heuristics on spoofed packets, number of TCP resets, and so on. This could help answer questions such as "Is Ethiopia using the same type of device as the Philippines?", "Does Kazakhstan have a filter similar to the one used in the UAE?", and will hopefully make future packet analysis projects a bit easier.
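One simple way to start on such a comparison would be to reduce each blocking event to a set of observed traits and rank event pairs by overlap. The sketch below uses Jaccard similarity. The trait labels and event names are hypothetical placeholders, and this is not the analysis actually performed for the ticket.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of traits two events have in common (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_pairs(events: Dict[str, Set[str]]) -> List[Tuple[float, str, str]]:
    """Return all event pairs, most similar first."""
    scored = [
        (jaccard(events[x], events[y]), x, y)
        for x, y in combinations(sorted(events), 2)
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    return sorted(scored, key=lambda item: (-item[0], item[1], item[2]))


# Placeholder input: trait sets an analyst would fill in from the captures.
example_events = {
    "event-A": {"tcp_rst_injection", "tls_fingerprint_match"},
    "event-B": {"tcp_rst_injection", "tls_fingerprint_match"},
    "event-C": {"ip_port_blocklist"},
}
```

A high score between two events would only hint at a shared device type or configuration. Confirming that would still require looking at the captures themselves.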

comment:22 Changed 7 years ago by phobos

We should not store pcaps any longer than necessary to determine how Tor is being blocked in a given country. Our systems will be cracked at some point, and we will lose control of the pcap files.

comment:23 Changed 7 years ago by asn

Note (to self):

Another thing I would like to see in the censorshipwiki is the set of changes Tor has made to its source code to dodge censorship, along with the Tor versions in which those changes were introduced.

This is interesting both from a history perspective and for understanding how a specific Tor version can be blocked.

comment:24 in reply to:  23 Changed 7 years ago by phw

Replying to asn:

Note (to self):

Another thing I would like to see in the censorshipwiki are the changes that Tor has done to its source code to dodge censorship. Also, the tor versions where the changes were introduced.

This is interesting both from a history perspective and for understanding how a specific Tor version can be blocked.

That's a good idea. I added the page "Changes in Tor" to the Censorship Wiki and started by covering the cipher list change introduced in version 0.2.3.17-beta.

comment:25 Changed 6 years ago by runa

Owner: runa deleted
Status: new → assigned

comment:26 Changed 6 years ago by runa

Status: assigned → new

comment:27 Changed 3 years ago by cass

Cc: cass added
Resolution: wontfix
Severity: Normal
Status: new → closed

It looks like the censorship wiki isn't being maintained (for a few years now) and the needs are now being addressed by OONI.
