TTL Walking. This means running a UDP, TCP, or ICMP traceroute to a certain destination that we hypothesize is being blocked, or whose traffic is being intercepted. If there is a noticeable discrepancy between the traceroutes to common ports (0, 53, 80, 123, 443), we presume that filtering is going on and that it is performed on an (IP, port) pair basis.
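To make the comparison concrete, here is a minimal sketch of per-port TTL walking using scapy (assumed available); the destination address is a placeholder, and crafting raw packets requires root:

```python
# Minimal sketch of per-port TTL walking with scapy; the destination
# address below is a placeholder and raw packet crafting requires root.
from scapy.all import IP, UDP, ICMP, sr1

def udp_traceroute(dst, dport, max_ttl=30):
    """Collect the hops seen on the path towards (dst, dport)."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        reply = sr1(IP(dst=dst, ttl=ttl) / UDP(dport=dport),
                    timeout=2, verbose=0)
        if reply is None:
            hops.append(None)                       # silent hop
            continue
        hops.append(reply.src)
        if not (reply.haslayer(ICMP) and reply[ICMP].type == 11):
            break                                   # not time-exceeded: done
    return hops

# A path that diverges for some destination ports suggests filtering or
# interception keyed on the (IP, port) pair.
paths = {port: udp_traceroute("203.0.113.1", port)
         for port in (0, 53, 80, 123, 443)}
if len(set(map(tuple, paths.values()))) > 1:
    print("paths differ across ports -> possible (IP, port) filtering")
```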
Keyword injection. This means injecting keywords into certain data or header fields of packets and detecting whether behavior changes between "good" keywords and "bad" keywords. We know for a fact that, for example, China is doing keyword detection in Skype, and it is trivial to obtain the list of "bad" keywords.
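A rough sketch of what such a probe could look like, with an illustrative keyword list and a placeholder server address (neither comes from any real blocklist or deployment):

```python
# Sketch: send the same request with "good" and "bad" keywords and see
# whether the connection behaves differently. Keywords and the server
# address are illustrative placeholders.
import socket

GOOD_KEYWORDS = [b"weather", b"football"]
BAD_KEYWORDS = [b"some-suspected-filtered-term"]

def probe(keyword, host="192.0.2.10", port=80):
    try:
        s = socket.create_connection((host, port), timeout=5)
        s.sendall(b"GET /?q=" + keyword +
                  b" HTTP/1.0\r\nHost: example.com\r\n\r\n")
        data = s.recv(4096)
        s.close()
        return ("ok", len(data))
    except (socket.timeout, ConnectionResetError) as exc:
        return ("anomaly", type(exc).__name__)

# If the "bad" keywords systematically trigger resets or timeouts while
# the "good" ones do not, on-path keyword detection is likely.
for kw in GOOD_KEYWORDS + BAD_KEYWORDS:
    print(kw, probe(kw))
```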
DNS Probing. This means taking a set of hostnames and trying to resolve them with a set of DNS resolvers. If the results for the same hostname differ across DNS resolvers, then something anomalous is happening. This technique has been used in Italy to detect and map censorship across the country.
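A minimal sketch with dnspython (assumed installed); the resolver addresses and hostnames are illustrative:

```python
# Sketch of cross-resolver DNS probing with dnspython; the resolver
# addresses and hostnames are illustrative placeholders.
import dns.resolver

RESOLVERS = ["8.8.8.8", "1.1.1.1", "192.0.2.53"]   # e.g. the ISP resolver
HOSTNAMES = ["example.com", "possibly-blocked.example.net"]

def lookup(hostname, nameserver):
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [nameserver]
    try:
        return sorted(rr.address for rr in resolver.resolve(hostname, "A"))
    except Exception as exc:
        return [type(exc).__name__]

for name in HOSTNAMES:
    answers = {ns: lookup(name, ns) for ns in RESOLVERS}
    if len(set(map(tuple, answers.values()))) > 1:
        print(name, "resolves inconsistently:", answers)
```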
HTTP requests. This means manipulating HTTP request headers and checking whether they are mangled by an intercepting proxy. One example is changing the capitalization of certain request fields: the back-end server that receives the request should check whether the capitalization survives or has been normalized by the proxy. Another method is to simply send requests and check for added headers in the response. This technique was used to detect the Squid proxy in use on Amtrak and VIARail.
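As a sketch of the capitalization trick, assuming a hypothetical cooperating backend at helper.example.org that echoes back, as JSON, the header names exactly as it received them:

```python
# Sketch of the capitalization trick over a raw socket. The helper host
# is hypothetical and is assumed to echo back, as JSON, the header names
# exactly as it received them.
import json
import socket

RAW_REQUEST = (b"GET /echo-headers HTTP/1.1\r\n"
               b"hOsT: helper.example.org\r\n"       # deliberately odd casing
               b"aCcEpT: */*\r\n"
               b"Connection: close\r\n\r\n")

s = socket.create_connection(("helper.example.org", 80), timeout=10)
s.sendall(RAW_REQUEST)
response = b""
while chunk := s.recv(4096):
    response += chunk
s.close()

# If the odd casing did not survive the round trip, something rewrote
# the request in transit.
headers_seen = json.loads(response.split(b"\r\n\r\n", 1)[1])
if "hOsT" not in headers_seen:
    print("header casing was normalized -> intercepting proxy suspected")
```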
URL lists. This simply means doing a GET request to a certain HTTP server and checking whether the returned content matches what is expected. This is basically what most censorship detection tools do (Herdict, alkasir, etc.).
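A minimal sketch of such a check, assuming requests is available; the URL and its control digest are placeholders, with the digest collected beforehand from an uncensored vantage point:

```python
# Sketch of a URL-list check with requests; the URL and its control
# digest are placeholders collected from an uncensored vantage point.
import hashlib
import requests

EXPECTED = {"http://example.com/": "control-sha256-digest-goes-here"}

for url, control_digest in EXPECTED.items():
    body = requests.get(url, timeout=10).content
    digest = hashlib.sha256(body).hexdigest()
    print(url, "consistent" if digest == control_digest else "MISMATCH")
```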
Network latency. This means checking whether the latency of the connection to a certain server is congruent with its geographic location. This method generally does not perform as well as the others, since it requires the discrepancy to be very visible, but it has been used successfully in countries such as Lebanon.
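A back-of-the-envelope version of this check might time a TCP handshake and compare it against a physical lower bound; the host and threshold below are illustrative:

```python
# Back-of-the-envelope latency sanity check; host and threshold are
# illustrative. Signals in fiber travel at roughly two thirds the speed
# of light, so a round trip to a server ~6000 km away needs ~60 ms at
# an absolute minimum.
import socket
import time

def connect_rtt_ms(host, port=443):
    start = time.monotonic()
    socket.create_connection((host, port), timeout=10).close()
    return (time.monotonic() - start) * 1000

MIN_PLAUSIBLE_MS = 60          # for a server claimed to be ~6000 km away
rtt = connect_rtt_ms("example.com")
if rtt < MIN_PLAUSIBLE_MS:
    print(f"RTT {rtt:.1f} ms is too low for the claimed location -> "
          "traffic may be terminated locally")
```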
The open questions of the HTTP Host scan also apply to this test: how do we verify, in an efficient manner, that a certain site is being blocked?
The naive way to do so is to make a connection over Tor and check whether the result matches the one made over the live network. This has some problems, though: for example, if the site serves geolocalized content, the page fetched over Tor will differ.
Another simple approach is to keep a database of the content lengths of websites, but this too will fail if the censored page is very similar in length to the real web page.
Another approach is to find a smart fuzzy matching algorithm for the test page.
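One possible starting point is the standard library's difflib; this sketch (with placeholder page bodies and an arbitrary threshold) shows the shape of such a comparison, where picking a good threshold is exactly the open problem:

```python
# One possible fuzzy-matching sketch using the standard library's
# difflib; the page bodies and the threshold are placeholders.
import difflib

def similarity(control_html: str, experiment_html: str) -> float:
    """Return a 0..1 similarity ratio between two page bodies."""
    return difflib.SequenceMatcher(None, control_html, experiment_html).ratio()

control = "<html>page body fetched over Tor</html>"           # placeholder
experiment = "<html>Access to this site is denied.</html>"    # placeholder

if similarity(control, experiment) < 0.8:
    print("pages differ substantially -> possible block page")
```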
This issue probably deserves its own ticket here: #6180 (closed)
Should we write a new state machine to overcome Alice<-->Switzerland communication problems with TCP?
Is there any other group who has collected more information about common alterations to packet headers done by various devices?
You should read the OONI Architecture and see how a test like Switzerland can be written using its components.
I would imagine that something useful would be the control channel.
I also have some questions on the test writeup you made:
An input is supposed to be a list of items. In this case you define input as "An active connection between two clients"; what do you mean exactly by this?
What are "asynchronous keys"?
Also answering some of your questions:
The goal of OONI is not visualization of data; that should be the task of others.
This should be implemented using TestHelpers and control channels.
I have been talking to Karsten and Roger, who asked whether I would be interested in setting up continuous testing of bridge reachability. If they are still interested in my doing so, I will propose, sometime this week, a timeline for now through November 1st. Anyone who is interested in testing bridge reachability should contact me; all feedback is appreciated.
As an OONI developer, I can either keep BridgeT up-to-date with the work I am doing, or I can create a new bridge test specifically for the Sponsor F deliverable.
I think that we have specified all generalized censorship detection tests, and though we may decide to specify further tests in the future, such future tests will probably be specific to less commonly used protocols, or constitute experiments on certain edge cases, which are beyond the scope of the deliverable. In other words, while it would be hella sweet to test for all of the things, we should be satisfied (at least for this deliverable) with testing for 99% of the things. We're done with this one, right?
> You should read the OONI Architecture and see how a test like Switzerland can be written using its components.
> I would imagine that something useful would be the control channel.
I have read it, and if I recall correctly the control channel will be implemented as a way for clients to make reports through Tor. Is that correct? If that is the case, that will likely work, assuming that both clients can run Tor.
> An input is supposed to be a list of items. In this case you define input as "An active connection between two clients"; what do you mean exactly by this?
Switzerland, as it was designed, was passive. I can either keep it that way, or include a traffic generator, in which case it would require no input.
> What are "asynchronous keys"?
Derp. That is a very stupid, late-night typo which should read "asymmetric keys". Thanks for catching that.
> Also answering some of your questions:
> The goal of OONI is not visualization of data; that should be the task of others.
Ah, but visualisation and comparison are not strictly the same thing. In this case, comparing two streams (which requires matching them up despite the fact that each viewpoint contains differently organized information) is integral to the actual test. This comparison doesn't necessarily need to be displayed visually, and while that would be rather neat, I wasn't planning to implement any data visualisation.
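To make the matching idea concrete, here is a minimal sketch (with placeholder packet lists) that reduces each viewpoint to a multiset of payload digests, so that differing capture order between the two clients does not matter:

```python
# Minimal sketch of the matching step: reduce each viewpoint to a
# multiset of payload digests so that differing capture order between
# the two clients does not matter. The packet lists are placeholders.
import hashlib
from collections import Counter

def payload_digests(payloads):
    return Counter(hashlib.sha256(p).hexdigest() for p in payloads)

alice_sent = [b"hello", b"world"]        # captured at Alice
bob_received = [b"world", b"hello!"]     # captured at Bob (one mangled)

sent = payload_digests(alice_sent)
received = payload_digests(bob_received)
missing = sent - received                # sent by Alice, never seen by Bob
injected = received - sent               # seen by Bob, never sent by Alice
print(sum(missing.values()), "packets altered or dropped;",
      sum(injected.values()), "packets injected")
```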
Trac: Resolution: N/A to fixed; Status: needs_review to closed
> I have been talking to Karsten and Roger, who asked whether I would be interested in setting up continuous testing of bridge reachability. If they are still interested in my doing so, I will propose, sometime this week, a timeline for now through November 1st. Anyone who is interested in testing bridge reachability should contact me; all feedback is appreciated.
> As an OONI developer, I can either keep BridgeT up-to-date with the work I am doing, or I can create a new bridge test specifically for the Sponsor F deliverable.
> I think that we have specified all generalized censorship detection tests, and though we may decide to specify further tests in the future, such future tests will probably be specific to less commonly used protocols, or constitute experiments on certain edge cases, which are beyond the scope of the deliverable. In other words, while it would be hella sweet to test for all of the things, we should be satisfied (at least for this deliverable) with testing for 99% of the things. We're done with this one, right?
I have started porting bridget to the new framework and I think most of it is done. Adding TCPConnect scanning and marco testing should not be hard.
Though you should create a separate ticket for this discussion, as it's not directly related to our July deliverable for OONI.
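For reference, the essence of the TCPConnect scanning mentioned above is small; this is only an illustrative sketch with placeholder bridge addresses, not the actual ooniprobe test API:

```python
# Illustrative sketch of what a TCPConnect scan boils down to (this is
# not the actual ooniprobe test API); bridge addresses are placeholders.
import socket

BRIDGES = [("192.0.2.1", 443), ("198.51.100.7", 9001)]

def tcp_connect(addr, timeout=10):
    try:
        socket.create_connection(addr, timeout=timeout).close()
        return "success"
    except OSError as exc:
        return "failure ({})".format(type(exc).__name__)

for bridge in BRIDGES:
    print(bridge, tcp_connect(bridge))
```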
>> You should read the OONI Architecture and see how a test like Switzerland can be written using its components.
>> I would imagine that something useful would be the control channel.
> I have read it, and if I recall correctly, the control channel will be implemented as a way for clients to make reports through Tor. Is that correct? If that is the case, that will likely work, assuming that both clients can run Tor.
The control channel will be basically implemented as a Tor Hidden Service on the backend that exposes a certain HTTP API. The clients need tor anyways so we are not introducing any added dependencies.
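From the client side, talking to such a control channel could look roughly like the following sketch; the onion address, endpoint, and payload are placeholders, and requests[socks] is assumed so that the .onion name is resolved through Tor via socks5h:

```python
# Sketch of a client submitting a report through the control channel:
# an HTTP API behind a Tor hidden service, reached via the local Tor
# SOCKS port. Requires requests[socks]; the onion address, endpoint and
# payload are placeholders. socks5h makes Tor resolve the .onion name.
import requests

TOR_PROXY = {"http": "socks5h://127.0.0.1:9050",
             "https": "socks5h://127.0.0.1:9050"}

resp = requests.post("http://placeholderonionaddr.onion/report",
                     json={"test": "switzerland", "report": {}},
                     proxies=TOR_PROXY, timeout=60)
print(resp.status_code)
```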
>> An input is supposed to be a list of items. In this case you define input as "An active connection between two clients"; what do you mean exactly by this?
> Switzerland, as it was designed, was passive. I can either keep it that way, or include a traffic generator, in which case it would require no input.
To run in OONI it should probably be generating traffic, as that is usually our style. If we want to start making passive tests as well, we should change a bit of the design and architecture of OONIProbe.
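A deterministic traffic generator would fit this style: since both endpoints can recompute the expected payload for every sequence number, any on-path mangling becomes directly visible. A minimal sketch, with a placeholder peer address:

```python
# Sketch of a deterministic traffic generator: because both endpoints
# can recompute the expected payload for every sequence number, any
# on-path mangling becomes directly visible. Peer address is a
# placeholder.
import hashlib
import socket

def expected_payload(seq):
    # Deterministic function of the sequence number alone, so the
    # receiver can verify each packet without any side channel.
    return hashlib.sha256(b"switzerland-%d" % seq).digest()

def generate(peer=("192.0.2.20", 4000), count=100):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for seq in range(count):
        sock.sendto(seq.to_bytes(4, "big") + expected_payload(seq), peer)

generate()
```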
>> Also answering some of your questions:
>> The goal of OONI is not visualization of data; that should be the task of others.
> Ah, but visualisation and comparison are not strictly the same thing. In this case, comparing two streams (which requires matching them up despite the fact that each viewpoint contains differently organized information) is integral to the actual test. This comparison doesn't necessarily need to be displayed visually, and while that would be rather neat, I wasn't planning to implement any data visualisation.