Reachability tests for obfuscated bridges

added bridgedb-dist component::circumvention/bridgedb owner::isis parent::8615 priority::high pt resolution::fixed status::closed tor-bridge type::task labels

Trac:
Cc: N/A to karsten

I think 'a' is a good start, and may well be good enough.

It's not clear to me that the bridge authority is the right place to do complex reachability testing. The bridge authority's main jobs are to a) receive bridge descriptors, b) give them out to people who know the identity key, and c) export stuff so bridgedb can work. The design is pretty flexible about what 'stuff' means exactly.

In a world where bridges can work from one place but not another, and where making direct connections to all bridge addresses from a central place is a poor security idea, it seems we should move toward "bridges test themselves somehow, and the bridge authority trusts that they work as described." Then the bridge authority does a simple test for "is the bridge present or not" (though I'd like to get away from even that), and anything more complex comes in the form of inputs to bridgedb from ooni-style reachability tests.

(Should we rename this ticket, to constrain it to the bridge authority piece of reachability tests for obfsbridges?)

I've created an OONI ticket #6804 (closed) for this, and it is being added to the bridge reachability test #6414 (closed).

The way I have designed it thus far, it will work by doing:

$ ./ooniprobe.py bridget --bridges=bridge_ip_orport_list.txt \
       --transport='obfs2 exec /usr/local/bin/obfsproxy --managed'

exactly as one would set the ClientTransportPlugin string in a torrc.

Will that work? Is there anything else which would be helpful, like any other torrc strings which should be set, or maybe parsing a file of extra torrc options?
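For reference, here is a minimal sketch of the torrc that such an invocation would correspond to. The function name `build_torrc` and the input format (one `ip:port` per entry) are assumptions for illustration, not bridget's actual code; `UseBridges`, `Bridge` and `ClientTransportPlugin` are the standard torrc options.

```python
def build_torrc(bridge_addrs, transport=None):
    """Return torrc text for a bridge test run (hypothetical helper).

    `transport` is the same string given to --transport, for example
    'obfs2 exec /usr/local/bin/obfsproxy --managed'.
    """
    lines = ["UseBridges 1"]
    method = None
    if transport:
        # The first word names the transport; the whole string is handed
        # to tor unchanged, exactly as it would appear in a torrc.
        method = transport.split()[0]
        lines.append("ClientTransportPlugin " + transport)
    for addr in bridge_addrs:
        addr = addr.strip()
        if not addr or addr.startswith("#"):
            continue  # skip blank lines and comments in the bridge list
        prefix = method + " " if method else ""
        lines.append("Bridge " + prefix + addr)
    return "\n".join(lines) + "\n"
```

Extra torrc options could be appended to `lines` in the same way if the test grows a file of additional settings.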

Also, because OONI and the bridge reachability tests are using txtorcon, and the spawned Tor calls exec, I am wondering what kinds of extra security checks I should use to make sure that exec doesn't get abused. If there isn't a way to make that safe, I will just include this option in the bridge reachability test in a separate repo, and not in the main OONI bridget test.

Trac:
Cc: karsten to karsten, isis@torproject.org

Replying to isis:

Also, because OONI and the bridge reachability tests are using txtorcon, and the spawned Tor calls exec, I am wondering what kinds of extra security checks I should use to make sure that exec doesn't get abused. If there isn't a way to make that safe, I will just include this option in the bridge reachability test in a separate repo, and not in the main OONI bridget test.

Does ‘OONI’ (I'm not sure what exactly that refers to) have a stated policy specifying which inputs to ooniprobe.py are allowed to be attacker-controlled, and which inputs must be received from a trusted source?

Replying to rransom:

Does ‘OONI’ (I'm not sure what exactly that refers to) have a stated policy specifying which inputs to ooniprobe.py are allowed to be attacker-controlled, and which inputs must be received from a trusted source?

OONI refers to ooniprobe, and all the other included code. We do not yet have such a policy, though we should. It is my understanding that ooniprobe.py should be able to be run by an unprivileged user, and including something which allows arbitrary code execution obviously allows a separate local privilege escalation exploit to be run, and then you know the rest.

I could do a check that the SHA1 of the PT binary file is correct for that architecture, but that seems extremely bulky and kludgy, and it wouldn't scale well as new PTs are developed. I'm leaning towards just commenting the PT test option out, with an explanation, so that people who want to use it can just go in and uncomment it.

Do you have any ideas or suggestions?
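To make the hash-check idea concrete, a rough sketch follows. The helper names and the set of known digests are hypothetical; SHA1 is used only because it is the hash named above. A check like this covers the one file it reads and nothing that file later loads.

```python
import hashlib

def file_digest(path, algorithm="sha1", chunk_size=65536):
    """Hex digest of a file, read in chunks so large binaries are fine."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_known_binary(path, known_digests, algorithm="sha1"):
    """True if the file's digest is in the caller-supplied set.

    `known_digests` is an assumed, manually maintained set of hex
    digests, one per PT binary and architecture.
    """
    return file_digest(path, algorithm) in known_digests
```

The need to keep `known_digests` current for every transport and architecture is the scaling problem described above.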

Replying to isis:

Replying to rransom:

Does ‘OONI’ (I'm not sure what exactly that refers to) have a stated policy specifying which inputs to ooniprobe.py are allowed to be attacker-controlled, and which inputs must be received from a trusted source?

OONI refers to ooniprobe, and all the other included code. We do not yet have such a policy, though we should. It is my understanding that ooniprobe.py should be able to be run by an unprivileged user, and including something which allows arbitrary code execution obviously allows a separate local privilege escalation exploit to be run, and then you know the rest.

Why would ooniprobe.py need to run all tests as root?

I could do a check that the SHA1 of the PT binary file is correct for that architecture, but that seems extremely bulky and kludgy, and it wouldn't scale well as new PTs are developed. I'm leaning towards just commenting the PT test option out, with an explanation, so that people who want to use it can just go in and uncomment it.

A hash of the main executable of a pluggable transport is not sufficient -- it might load and run scripts (as Vidalia 0.3.x and Firefox do).

Do you have any ideas or suggestions?

Don't make BridgeT setuid root.

Trac:
Keywords: pt deleted, pt tor-bridge added

Trac:
Component: Tor Bridge to Tor

Just discussed this with aagbsn. We decided, as a first step, to make a script that accepts a list of bridge descriptors, does a TCP connect scan on them, and spits out a filtered list of bridge descriptors containing only the alive ones. I'll put this in my TODO list.
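A minimal sketch of the TCP connect scan, assuming the bridges have already been reduced to `(host, port)` pairs. Parsing real bridge descriptors is left out, and the function names are made up for illustration.

```python
import socket

def is_listening(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def filter_alive(bridges, probe=is_listening):
    """Keep only the (host, port) pairs that the probe reports as alive."""
    return [b for b in bridges if probe(b[0], b[1])]
```

Passing the probe as a parameter makes it possible to swap in a different check later (for example one that goes through a pluggable transport) without touching the filtering step.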

Replying to asn:

Just discussed this with aagbsn. We decided, as a first step, to make a script that accepts a list of bridge descriptors, does a TCP connect scan on them, and spits out a filtered list of bridge descriptors containing only the alive ones. I'll put this in my TODO list.

I wrote some code for the above idea in branch obfsbridge_filtah at https://git.gitorious.org/bridgedb/bridgedb.git.

(Code was written fast and furiously, please don't shoot!)

Trac:
Status: new to needs_review

Ugh. Actually the correct code is in branch obfsbridge_filtah_take_2 at https://git.gitorious.org/bridgedb/bridgedb.git.

Trac:
Parent: N/A to #8615 (moved)
Component: Tor to BridgeDB

It might almost be better to create a file of unreachable/non-listening bridges. As far as I can think (right this second), with the current script, BridgeDB would have to parse this file occasionally (on some sort of schedule), determine which of the bridges are currently in a ring, add any bridges as unallocated if they are new, compute the non-intersecting set of bridges it currently has in its rings that were not in the file (could not be reached) and remove them - that last step being fairly expensive.

Was there an easier way that you were thinking of using?
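The two bookkeeping options above can be sketched with plain sets of bridge identifiers. This is only an illustration: the names are hypothetical, and it does not model how BridgeDB actually stores its rings, which is what determines the real cost.

```python
def stale_bridges(ring_ids, reachable_ids):
    """Bridges in the rings that are missing from the reachable file."""
    return set(ring_ids) - set(reachable_ids)

def new_bridges(ring_ids, reachable_ids):
    """Bridges in the reachable file that are not yet in any ring."""
    return set(reachable_ids) - set(ring_ids)

def stale_from_unreachable(ring_ids, unreachable_ids):
    """Alternative: with a file of unreachable bridges, the ones to
    remove are simply those that appear in both sets."""
    return set(ring_ids) & set(unreachable_ids)
```

With a file of unreachable bridges the removal step is an intersection and no new bridges can be learned from it, so additions would still have to come from the descriptors.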

(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?
2) Are we only testing ORPort reachability or are we aiming to test all PTs that a bridge's extra-info line lists? If we're doing the former then there isn't much to do for this ticket, the latter is more interesting.
3) If we choose the latter option in (2) and if some ports are reachable and others are not, do we not list it as running or just remove the transports for which their ports are not reachable? The second option sounds much more reasonable.
4) If we track which bridges we distribute, can we use geoip stats to help determine a bridge's blocked status and its reachability?

Trac:
Milestone: Tor: unspecified to N/A
Priority: normal to major
Cc: karsten, isis@torproject.org to karsten, isis@torproject.org, Matthew.Finkel@gmail.com
Status: needs_review to new

Replying to sysrqb:

(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?

Running the tests from BridgeDB seems easier to deploy. In the future, we might want to consider some kind of decentralized system (similar to #6414 (closed)), but I wouldn't worry too much about this in the beginning.

2) Are we only testing ORPort reachability or are we aiming to test all PTs that a bridge's extra-info line lists? If we're doing the former then there isn't much to do for this ticket, the latter is more interesting.

Yeah, we probably want to take the latter approach here. That's why we were thinking of using BridgeT. I think Arturo's latest BridgeT branch is: https://github.com/hellais/ooni-probe/tree/feature/bridget

3) If we choose the latter option in (2) and if some ports are reachable and others are not, do we not list it as running or just remove the transports for which their ports are not reachable? The second option sounds much more reasonable.

Yeah, going with the second behavior seems fine. If that ends up reporting too many broken bridges (for whatever reason), we can start doing the first behavior.
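As a sketch of that second behavior, assuming each bridge's transports are available as a mapping from transport name to `(host, port)`, and that some probe function reports whether a port answers (both are assumptions, not BridgeDB's data model):

```python
def prune_transports(transports, probe):
    """Drop transports whose port does not answer; keep the rest.

    transports: dict such as {"obfs2": ("192.0.2.1", 4001)}
    probe:      callable (host, port) -> bool, supplied by the caller
    """
    return {
        name: (host, port)
        for name, (host, port) in transports.items()
        if probe(host, port)
    }
```

Switching to the first behavior would only mean treating the bridge as not running whenever the pruned mapping is smaller than the original.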

4) If we track which bridges we distribute, can we use geoip stats to help determine a bridge's blocked status and its reachability?

You mean by using the GeoIP stats of the bridge? Have you seen George Danezis' work on an automatic censorship-detection system? https://lists.torproject.org/pipermail/tor-dev/2011-September/002923.html https://lists.torproject.org/pipermail/tor-dev/2013-May/004802.html https://metrics.torproject.org/users.html?graph=direct-users&start=2013-03-20&end=2013-06-17&country=ir&events=on&dpi=72#direct-users

Even though the system's results are not too bad, we probably shouldn't rely on an anomaly detection system too much, except if we greatly improve the model we use.

Replying to sysrqb:

It might almost be better to create a file of unreachable/non-listening bridges. As far as I can think (right this second), with the current script, BridgeDB would have to parse this file occasionally (on some sort of schedule), determine which of the bridges are currently in a ring, add any bridges as unallocated if they are new, compute the non-intersecting set of bridges it currently has in its rings that were not in the file (could not be reached) and remove them - that last step being fairly expensive.

As far as I know, BridgeDB currently does this, if you look here: https://gitweb.torproject.org/bridgedb.git/blob/HEAD:/lib/bridgedb/Bucket.py#l220

Though, I do not know where it is determining these bridges. (or if it is able to)

Replying to asn:

Replying to sysrqb:
(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?
Running the tests from BridgeDB seems easier to deploy. In the future, we might want to consider some kind of decentralized system (similar to #6414 (closed)), but I wouldn't worry too much about this in the beginning.

hellais' version of bridget.py should be suitable for running from BridgeDB. A distinction was not made in much of the discussion for #6414 (closed), and determining if a given bridge is blocked from a specific country is much more difficult than determining if the host is online or not from BridgeDB. From BridgeDB, if we take as an assumption that BridgeDB is not under surveillance, it is simple to test if the bridge is up. hellais' script will do that, though when I tested it, it did not run at that commit -- I do not recall if this was an error in ooni-probe, perhaps due to all the recent changes for director.py, or if it was an error in bridget.py.

Note, FWIW, my apprehension over merging hellais' script as "bridget" is largely due to the above confusion, as running it from a given country might tell you if the bridge is blocked from your country -- though it would likely get the bridge blocked if it was not already. Hence, putting it into ooni-probe labelled as a "bridge reachability test" would be an extreme misnomer, as it would function more as a "bridge blocker" in censoring countries.

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

Replying to isis:

Replying to sysrqb:

It might almost be better to create a file of unreachable/non-listening bridges. As far as I can think (right this second), with the current script, BridgeDB would have to parse this file occasionally (on some sort of schedule), determine which of the bridges are currently in a ring, add any bridges as unallocated if they are new, compute the non-intersecting set of bridges it currently has in its rings that were not in the file (could not be reached) and remove them - that last step being fairly expensive.

As far as I know, BridgeDB currently does this, if you look here: https://gitweb.torproject.org/bridgedb.git/blob/HEAD:/lib/bridgedb/Bucket.py#l220

Though, I do not know where it is determining these bridges. (or if it is able to)

Replying to asn:
Replying to sysrqb:
(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?
Running the tests from BridgeDB seems easier to deploy. In the future, we might want to consider some kind of decentralized system (similar to #6414 (closed)), but I wouldn't worry too much about this in the beginning.
hellais' version of bridget.py should be suitable for running from BridgeDB. A distinction was not made in much of the discussion for #6414 (closed), and determining if a given bridge is blocked from a specific country is much more difficult than determining if the host is online or not from BridgeDB. From BridgeDB, if we take as an assumption that BridgeDB is not under surveillance, it is simple to test if the bridge is up. hellais' script will do that, though when I tested it, it did not run at that commit -- I do not recall if this was an error in ooni-probe, perhaps due to all the recent changes for director.py, or if it was an error in bridget.py.

Note, FWIW, my apprehension over merging hellais' script as "bridget" is largely due to the above confusion, as running it from a given country might tell you if the bridge is blocked from your country -- though it would likely get the bridge blocked if it was not already. Hence, putting it into ooni-probe labelled as a "bridge reachability test" would be an extreme misnomer, as it would function more as a "bridge blocker" in censoring countries.

I understand this issue. Unfortunately, I don't know how to fix it easily. Looking at the chaos of #6414 (closed) and #5028 (closed), it seems that building a distributed bridge scanner is not easy to do and probably not worth spending our time on (BridgeDB has much more urgent tasks to be done).

Furthermore, even if we fix this on the BridgeDB layer, the bridge authority will keep on doing direct reachability tests. This has been the case for years and it's public knowledge (#8 of https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges) and it still has not been used by a censor.

If we are still worrying about this attack vector, doing our reachability tests over Tor might make it a bit harder to harvest bridge addresses and it's also easy to do.

(To be clear, I also believe that in the future we should look into some kind of distributed bridge scanning, but at the moment it kind of looks like a luxury item when at the same time BridgeDB misses some essential features.)

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

If Arturo's script is easy to refactor into a standalone tool, I find it a reasonable plan. However, if making it standalone requires non-negligible engineering time, I would spend it somewhere else and instead just use OONI in BridgeDB.

Replying to asn:

I understand this issue. Unfortunately, I don't know how to fix it easily. Looking at the chaos of #6414 (closed) and #5028 (closed), it seems that building a distributed bridge scanner is not easy to do and probably not worth spending our time on (BridgeDB has much more urgent tasks to be done).

Furthermore, even if we fix this on the BridgeDB layer, the bridge authority will keep on doing direct reachability tests. This has been the case for years and it's public knowledge (#8 of https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges) and it still has not been used by a censor.

If we are still worrying about this attack vector, doing our reachability tests over Tor might make it a bit harder to harvest bridge addresses and it's also easy to do.

Yep. Agreed.

If we get worried about exits collecting the addresses, we could also chain a bunch of bridges together by just sending RELAYEXTEND cells, then adding two public relays to the end of the chain to meet the default RefuseUnknownExit setting, and exiting to check.torproject.org or something else innocuous. But that is some pretty serious paranoia, and not necessary yet. :)

(To be clear, I also believe that in the future we should look into some kind of distributed bridge scanning, but at the moment it kind of looks like a luxury item when at the same time BridgeDB misses some essential features.)

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

If Arturo's script is easy to refactor into a standalone tool, I find it a reasonable plan. However, if making it standalone requires non-negligible engineering time, I would spend it somewhere else and instead just use OONI in BridgeDB.

Looking into it right now. I'll report back in a bit with either a script or a problem.

Trac:
Owner: N/A to isis
Status: new to accepted

Replying to isis:

Replying to asn:

I understand this issue. Unfortunately, I don't know how to fix it easily. Looking at the chaos of #6414 (closed) and #5028 (closed), it seems that building a distributed bridge scanner is not easy to do and probably not worth spending our time on (BridgeDB has much more urgent tasks to be done).

Furthermore, even if we fix this on the BridgeDB layer, the bridge authority will keep on doing direct reachability tests. This has been the case for years and it's public knowledge (#8 of https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges) and it still has not been used by a censor.

If we are still worrying about this attack vector, doing our reachability tests over Tor might make it a bit harder to harvest bridge addresses and it's also easy to do.

Yep. Agreed.

If we get worried about exits collecting the addresses, we could also chain a bunch of bridges together by just sending RELAYEXTEND cells, then adding two public relays to the end of the chain to meet the default RefuseUnknownExit setting, and exiting to check.torproject.org or something else innocuous. But that is some pretty serious paranoia, and not necessary yet. :)

(To be clear, I also believe that in the future we should look into some kind of distributed bridge scanning, but at the moment it kind of looks like a luxury item when at the same time BridgeDB misses some essential features.)

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

If Arturo's script is easy to refactor into a standalone tool, I find it a reasonable plan. However, if making it standalone requires non-negligible engineering time, I would spend it somewhere else and instead just use OONI in BridgeDB.

Looking into it right now. I'll report back in a bit with either a script or a problem.

Hm. Any news on this one?

This was fixed by sysrqb, asn, hellais, and several other volunteers at the recent OONI bridge reachability hackathon. Thanks to all of you who contributed to fixing this!

The development versions of the graphs and reports are currently available here, although this may change in the near future: https://transparencytoolkit.org/bridge-reachability/

Trac:
Status: accepted to closed
Resolution: N/A to fixed
Keywords: pt tor-bridge deleted, pt, tor-bridge, bridgedb-dist added

closed

mentioned in issue #6414 (closed)

mentioned in issue #6804 (closed)

mentioned in issue #7349 (moved)

mentioned in issue #17159 (moved)

moved to tpo/anti-censorship/bridgedb#6396 (closed)

mentioned in issue tpo/anti-censorship/pluggable-transports/trac#17159 (closed)
