Opened 7 years ago

Closed 5 years ago

#6396 closed task (fixed)

Reachability tests for obfuscated bridges

Reported by: asn
Owned by: isis
Priority: High
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity:
Keywords: pt, tor-bridge, bridgedb-dist
Cc: karsten, isis@…, Matthew.Finkel@…
Actual Points:
Parent ID: #8615
Points:
Reviewer:
Sponsor:

Description

Bridge authorities are supposed to run reachability tests on bridges.

This becomes problematic when pluggable transports are deployed since the bridge authority will need a pluggable transport to properly communicate with an obfuscated bridge.

Here are some solutions:

a) All the pluggable transports of a bridge will be considered reachable if the ORPort of the bridge is reachable.

b) The above + a TCP scan on each transport port to make sure that the port is open.

c) The bridge authority supports many of the known and widely used pluggable transports and does robust reachability tests on all the transport ports. When it does not recognize a transport, it falls back to a) or b).

As part of #4568 it was decided to go with a) for now. In the future, we should probably drift towards c) since it seems to be the right thing to do (Thandy deployment would also make it a bit easier).
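To make the relationship between a), b) and c) concrete, here is a minimal sketch of the decision logic only. Everything in it is hypothetical: the `checks` registry, the function names and the injected `tcp_check` callable are illustrative and are not part of Tor or BridgeDB.

```python
from typing import Callable, Dict

# Hypothetical type: a function that performs a real, transport-aware
# reachability check against (host, port) and returns True on success.
Check = Callable[[str, int], bool]


def transport_reachable(
    name: str,
    host: str,
    port: int,
    orport_ok: bool,
    checks: Dict[str, Check],
    tcp_check: Check,
) -> bool:
    """Option c): use a transport-specific check when one is registered.

    For an unrecognized transport, fall back to option b): the ORPort
    must be reachable (option a) and the transport port must accept a
    plain TCP connection.
    """
    check = checks.get(name)
    if check is not None:
        return check(host, port)
    return orport_ok and tcp_check(host, port)
```

Passing `tcp_check=lambda host, port: True` reduces the fallback to option a) alone, which is the behaviour chosen in #4568.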

Any ahas, opinions, thoughts or possible solutions?

Change History (20)

comment:1 Changed 7 years ago by karsten

Cc: karsten added

comment:2 Changed 7 years ago by arma

I think 'a' is a good start, and may well be good enough.

It's not clear to me that the bridge authority is the right place to do complex reachability testing. The bridge authority's main jobs are to a) receive bridge descriptors, b) give them out to people who know the identity key, and c) export stuff so bridgedb can work. The design is pretty flexible about what 'stuff' means exactly.

In a world where bridges can work from one place but not another, and where making direct connections to all bridge addresses from a central place is a poor security idea, it seems we should move toward "bridges test themselves somehow, and the bridge authority trusts that they work as described." Then the bridge authority does a simple test for "is the bridge present or not" (though I'd like to get away from even that), and anything more complex comes in the form of inputs to bridgedb from ooni-style reachability tests.

(Should we rename this ticket, to constrain it to the bridge authority piece of reachability tests for obfsbridges?)

comment:3 Changed 7 years ago by isis

Cc: isis@… added

I've created an OONI ticket #6804 for this, and it is being added to the bridge reachability test #6414.

The way I have designed it thus far, it will work by doing:

$ ./ooniprobe.py bridget --bridges=bridge_ip_orport_list.txt \
       --transport='obfs2 exec /usr/local/bin/obfsproxy --managed'

exactly as one would set the ClientTransportPlugin string in a torrc.

Will that work? Is there anything else which would be helpful, like any other torrc strings which should be set, or maybe parsing a file of extra torrc options?

Also, because OONI and the bridge reachability tests are using txtorcon, and the spawned Tor calls exec, I am wondering what kinds of extra security checks I should use to make sure that exec doesn't get abused. If there isn't a way to make that safe, I will just include this option in the bridge reachability test in a separate repo, and not in the main OONI bridget test.

comment:4 in reply to:  3 ; Changed 7 years ago by rransom

Replying to isis:

Also, because OONI and the bridge reachability tests are using txtorcon, and the spawned Tor calls exec, I am wondering what kinds of extra security checks I should use to make sure that exec doesn't get abused. If there isn't a way to make that safe, I will just include this option in the bridge reachability test in a separate repo, and not in the main OONI bridget test.

Does ‘OONI’ (I'm not sure what exactly that refers to) have a stated policy specifying which inputs to ooniprobe.py are allowed to be attacker-controlled, and which inputs must be received from a trusted source?

comment:5 in reply to:  4 ; Changed 7 years ago by isis

Replying to rransom:

Does ‘OONI’ (I'm not sure what exactly that refers to) have a stated policy specifying which inputs to ooniprobe.py are allowed to be attacker-controlled, and which inputs must be received from a trusted source?

OONI refers to ooniprobe, and all the other included code. We do not yet have such a policy, though we should. It is my understanding that ooniprobe.py should be able to be run by an unprivileged user, and including something which allows arbitrary code execution would obviously allow a separate local privilege escalation exploit to be run, and then you know the rest.

I could do a check that the SHA1 of the PT binary file is correct for that architecture, but that seems extremely bulky and kludgy, and it wouldn't scale well as new PTs are developed. I'm leaning towards just commenting the PT test option out, with an explanation, so that people who want to use it can just go in and uncomment it.

Do you have any ideas or suggestions?
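For reference, the hash check described above could look roughly like the sketch below. The function names and the `allowed_digests` set are hypothetical, and SHA-1 is used only because it is the hash named in this comment. Note that this covers only the one file it reads, not any scripts or libraries that file loads when it runs.

```python
import hashlib
from typing import Set


def file_sha1(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def binary_is_pinned(path: str, allowed_digests: Set[str]) -> bool:
    """True if the file's digest is in the (hypothetical) allow-list."""
    return file_sha1(path) in allowed_digests
```

The allow-list would need one entry per transport, version and architecture, which is the scaling problem mentioned above.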

comment:6 in reply to:  5 Changed 7 years ago by rransom

Replying to isis:

Replying to rransom:

Does ‘OONI’ (I'm not sure what exactly that refers to) have a stated policy specifying which inputs to ooniprobe.py are allowed to be attacker-controlled, and which inputs must be received from a trusted source?

OONI refers to ooniprobe, and all the other included code. We do not yet have such a policy, though we should. It is my understanding that ooniprobe.py should be able to be run by an unprivileged user, and including something which allows arbitrary code execution would obviously allow a separate local privilege escalation exploit to be run, and then you know the rest.

Why would ooniprobe.py need to run all tests as root?

I could do a check that the SHA1 of the PT binary file is correct for that architecture, but that seems extremely bulky and kludgy, and it wouldn't scale well as new PTs are developed. I'm leaning towards just commenting the PT test option out, with an explanation, so that people who want to use it can just go in and uncomment it.

A hash of the main executable of a pluggable transport is not sufficient -- it might load and run scripts (as Vidalia 0.3.x and Firefox do).

Do you have any ideas or suggestions?

Don't make BridgeT setuid root.

comment:7 Changed 7 years ago by nickm

Keywords: tor-bridge added

comment:8 Changed 7 years ago by nickm

Component: Tor Bridge → Tor

comment:9 Changed 6 years ago by asn

Just discussed this with aagbsn. We decided, as a first step, to make a script that accepts a list of bridge descriptors, does a TCP connect scan on them, and spits out a filtered list of bridge descriptors containing only the alive ones. I'll put this in my TODO list.
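A minimal sketch of such a filter, not the code from the branches mentioned in this ticket. It assumes each bridge has already been reduced to a (nickname, address, port) tuple; parsing real bridge descriptors is left out, and the names `Bridge`, `is_listening` and `filter_alive` are made up for illustration.

```python
import socket
from typing import Iterable, List, Tuple

# Simplified stand-in for a bridge descriptor: (nickname, address, port).
Bridge = Tuple[str, str, int]


def is_listening(address: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (address, port) succeeds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def filter_alive(bridges: Iterable[Bridge], timeout: float = 5.0) -> List[Bridge]:
    """Keep only the bridges whose port accepts a TCP connection."""
    return [b for b in bridges if is_listening(b[1], b[2], timeout)]
```

The scan is sequential, so a long list with many dead bridges takes up to `timeout` seconds per dead entry. A connect scan only shows that something is listening on the port, not that it speaks the expected protocol.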

comment:10 in reply to:  9 Changed 6 years ago by asn

Status: new → needs_review

Replying to asn:

Just discussed this with aagbsn. We decided, as a first step, to make a script that accepts a list of bridge descriptors, does a TCP connect scan on them, and spits out a filtered list of bridge descriptors containing only the alive ones. I'll put this in my TODO list.

I wrote some code for the above idea in branch obfsbridge_filtah at https://git.gitorious.org/bridgedb/bridgedb.git.

(Code was written fast and furiously, please don't shoot!)

comment:11 Changed 6 years ago by asn

Ugh. Actually the correct code is in branch obfsbridge_filtah_take_2 at https://git.gitorious.org/bridgedb/bridgedb.git.

comment:12 Changed 6 years ago by asn

Component: Tor → BridgeDB
Parent ID: #8615

comment:13 Changed 6 years ago by sysrqb

It might almost be better to create a file of unreachable/non-listening bridges. As far as I can think (right this second), with the current script, BridgeDB would have to parse this file occasionally (on some sort of schedule), determine which of the bridges are currently in a ring, add any bridges as unallocated if they are new, compute the non-intersecting set of bridges it currently has in its rings that were not in the file (could not be reached) and remove them - that last step being fairly expensive.

Was there an easier way that you were thinking of using?
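The bookkeeping described above amounts to two set differences. A minimal sketch, assuming bridges are identified by fingerprint strings and that both collections fit in memory as Python sets; the function names are hypothetical and this is not how BridgeDB's rings are actually stored.

```python
from typing import Set, Tuple


def diff_rings(in_rings: Set[str], reachable: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Compare the bridges held in rings against a list of reachable ones.

    Returns (to_add, to_remove): bridges that are new in the reachable
    list, and bridges in the rings that the scan could not reach.
    """
    to_add = reachable - in_rings
    to_remove = in_rings - reachable
    return to_add, to_remove


def remove_unreachable(in_rings: Set[str], unreachable: Set[str]) -> Set[str]:
    """Alternative: start from a list of unreachable bridges instead."""
    return in_rings - unreachable
```

With hashed sets each difference is roughly linear in the size of the left-hand set, so the cost depends mostly on how the rings are stored rather than on which of the two files is produced.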

comment:14 Changed 6 years ago by sysrqb

Cc: Matthew.Finkel@… added
Milestone: Tor: unspecified
Priority: normal → major
Status: needs_review → new

(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?

2) Are we only testing ORPort reachability or are we aiming to test all PTs that a bridge's extra-info line lists? If we're doing the former then there isn't much to do for this ticket; the latter is more interesting.

3) If we choose the latter option in (2) and if some ports are reachable and others are not, do we not list it as running or just remove the transports for which their ports are not reachable? The second option sounds much more reasonable.

4) If we track which bridges we distribute, can we use geoip stats to help determine a bridge's blocked status *and* its reachability?

comment:15 in reply to:  14 Changed 6 years ago by asn

Replying to sysrqb:

(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?

Running the tests from BridgeDB seems easier to deploy. In the future, we might want to consider some kind of decentralized system (similar to #6414), but I wouldn't worry too much about this in the beginning.

2) Are we only testing ORPort reachability or are we aiming to test all PTs that a bridge's extra-info line lists? If we're doing the former then there isn't much to do for this ticket; the latter is more interesting.

Yeah, we probably want to take the latter approach here. That's why we were thinking of using BridgeT.
I think Arturo's latest BridgeT branch is:
https://github.com/hellais/ooni-probe/tree/feature/bridget

3) If we choose the latter option in (2) and if some ports are reachable and others are not, do we not list it as running or just remove the transports for which their ports are not reachable? The second option sounds much more reasonable.

Yeah, going with the second behavior seems fine. If that ends up reporting too many broken bridges (for whatever reason), we can start doing the first behavior.
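A minimal sketch of the second behavior: keep the bridge and drop only the transports whose ports fail the check. The `transports` mapping and the injected `is_reachable` callable are assumptions made for illustration, not BridgeDB data structures.

```python
from typing import Callable, Dict, Tuple

# Hypothetical shape: transport name -> (address, port) from the extra-info.
Transports = Dict[str, Tuple[str, int]]


def prune_transports(
    transports: Transports,
    is_reachable: Callable[[str, int], bool],
) -> Transports:
    """Return only the transports whose (address, port) passes the check."""
    return {
        name: (address, port)
        for name, (address, port) in transports.items()
        if is_reachable(address, port)
    }
```

Switching to the first behavior would mean treating the bridge as not running whenever the returned mapping is smaller than the input.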

4) If we track which bridges we distribute, can we use geoip stats to help determine a bridge's blocked status *and* its reachability?

You mean by using the GeoIP stats of the bridge? Have you seen George Danezis' work on an automatic censorship-detection system?
https://lists.torproject.org/pipermail/tor-dev/2011-September/002923.html
https://lists.torproject.org/pipermail/tor-dev/2013-May/004802.html
https://metrics.torproject.org/users.html?graph=direct-users&start=2013-03-20&end=2013-06-17&country=ir&events=on&dpi=72#direct-users

Even though the system's results are not too bad, we probably shouldn't rely on an anomaly detection system too much, except if we greatly improve the model we use.

comment:16 in reply to:  13 ; Changed 6 years ago by isis

Replying to sysrqb:

It might almost be better to create a file of unreachable/non-listening bridges. As far as I can think (right this second), with the current script, BridgeDB would have to parse this file occasionally (on some sort of schedule), determine which of the bridges are currently in a ring, add any bridges as unallocated if they are new, compute the non-intersecting set of bridges it currently has in its rings that were not in the file (could not be reached) and remove them - that last step being fairly expensive.

As far as I know, BridgeDB currently does this, if you look here: https://gitweb.torproject.org/bridgedb.git/blob/HEAD:/lib/bridgedb/Bucket.py#l220

Though, I do not know where it is determining these bridges. (or if it is able to)

Replying to asn:

Replying to sysrqb:

(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?

Running the tests from BridgeDB seems easier to deploy. In the future, we might want to consider some kind of decentralized system (similar to #6414), but I wouldn't worry too much about this in the beginning.

hellais' version of bridget.py should be suitable for running from BridgeDB. A distinction was not made in much of the discussion for #6414, and determining if a given bridge is blocked from a specific country is much more difficult than determining if the host is online or not from BridgeDB. From BridgeDB, if we take as an assumption that BridgeDB is not under surveillance, it is simple to test if the bridge is up. hellais' script will do that, though when I tested it, it did not run at that commit -- I do not recall if this was an error in ooni-probe, perhaps due to all the recent changes for director.py, or if it was an error in bridget.py.

Note, FWIW, my apprehension over merging hellais' script as "bridget" is largely due to the above confusion, as running it from a given country might tell you if the bridge is blocked from your country -- though it would likely get the bridge blocked if it was not already. Hence, putting it into ooni-probe labelled as a "bridge reachability test" would be an extreme misnomer, as it would function more as a "bridge blocker" in censoring countries.

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

comment:17 in reply to:  16 ; Changed 6 years ago by asn

Replying to isis:

Replying to sysrqb:

It might almost be better to create a file of unreachable/non-listening bridges. As far as I can think (right this second), with the current script, BridgeDB would have to parse this file occasionally (on some sort of schedule), determine which of the bridges are currently in a ring, add any bridges as unallocated if they are new, compute the non-intersecting set of bridges it currently has in its rings that were not in the file (could not be reached) and remove them - that last step being fairly expensive.

As far as I know, BridgeDB currently does this, if you look here: https://gitweb.torproject.org/bridgedb.git/blob/HEAD:/lib/bridgedb/Bucket.py#l220

Though, I do not know where it is determining these bridges. (or if it is able to)

Replying to asn:

Replying to sysrqb:

(Not having looked at the current version of bridget: )

1) From which host do we want to run it? From the same host as BridgeDB? Can we assume, initially, that we're not extremely bothered by the fact that a few bridges may be blocked because we probe them and they are within a censoring zone?

Running the tests from BridgeDB seems easier to deploy. In the future, we might want to consider some kind of decentralized system (similar to #6414), but I wouldn't worry too much about this in the beginning.

hellais' version of bridget.py should be suitable for running from BridgeDB. A distinction was not made in much of the discussion for #6414, and determining if a given bridge is blocked from a specific country is much more difficult than determining if the host is online or not from BridgeDB. From BridgeDB, if we take as an assumption that BridgeDB is not under surveillance, it is simple to test if the bridge is up. hellais' script will do that, though when I tested it, it did not run at that commit -- I do not recall if this was an error in ooni-probe, perhaps due to all the recent changes for director.py, or if it was an error in bridget.py.

Note, FWIW, my apprehension over merging hellais' script as "bridget" is largely due to the above confusion, as running it from a given country might tell you if the bridge is blocked from your country -- though it would likely get the bridge blocked if it was not already. Hence, putting it into ooni-probe labelled as a "bridge reachability test" would be an extreme misnomer, as it would function more as a "bridge blocker" in censoring countries.

I understand this issue. Unfortunately, I don't know how to fix it easily. Looking at the chaos of #6414 and #5028, it seems that building a distributed bridge scanner is not easy to do and probably not worth spending our time on (BridgeDB has much more urgent tasks to be done).

Furthermore, even if we fix this on the BridgeDB layer, the bridge authority will keep on doing direct reachability tests. This has been the case for years and it's public knowledge (#8 of https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges) and it still has not been used by a censor.

If we are still worrying about this attack vector, doing our reachability tests over Tor might make it a bit harder to harvest bridge addresses and it's also easy to do.
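A rough sketch of what a TCP check over Tor could look like, speaking SOCKS5 (RFC 1928) to a local Tor client. It assumes Tor's SOCKS listener is at 127.0.0.1:9050 and that the exit policy of the chosen exit allows the target port; the function names are hypothetical.

```python
import socket
import struct
from typing import Tuple


def socks5_connect_request(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request using the domain-name address type."""
    encoded = host.encode("ascii")
    if not 0 < len(encoded) < 256:
        raise ValueError("host must be 1-255 ASCII bytes")
    # VER=5, CMD=CONNECT(1), RSV=0, ATYP=domain(3), length, host, port.
    return b"\x05\x01\x00\x03" + bytes([len(encoded)]) + encoded + struct.pack(">H", port)


def _recv_exact(sock: socket.socket, count: int) -> bytes:
    """Read up to `count` bytes, stopping early only if the peer closes."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            break
        data += chunk
    return data


def reachable_via_socks(
    host: str,
    port: int,
    proxy: Tuple[str, int] = ("127.0.0.1", 9050),  # assumed Tor SocksPort
    timeout: float = 30.0,
) -> bool:
    """Return True if the SOCKS5 proxy reports a successful CONNECT."""
    try:
        with socket.create_connection(proxy, timeout=timeout) as sock:
            sock.sendall(b"\x05\x01\x00")  # offer only "no authentication"
            if _recv_exact(sock, 2) != b"\x05\x00":
                return False
            sock.sendall(socks5_connect_request(host, port))
            reply = _recv_exact(sock, 2)  # VER, REP (0 means succeeded)
            return reply == b"\x05\x00"
    except OSError:
        return False
```

A failure here can also mean the circuit or the exit failed rather than the bridge, so a real scanner would probably want to retry before marking a bridge as down.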

(To be clear, I also believe that in the future we should look into some kind of distributed bridge scanning, but at the moment it kind of looks like a luxury item when at the same time BridgeDB misses some essential features.)

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

If Arturo's script is easy to refactor into a standalone script, I find it a reasonable plan. However, if making it standalone requires non-negligible engineering time, I would spend that time somewhere else and instead just use OONI in BridgeDB.

comment:18 in reply to:  17 ; Changed 6 years ago by isis

Owner: set to isis
Status: new → accepted

Replying to asn:

I understand this issue. Unfortunately, I don't know how to fix it easily. Looking at the chaos of #6414 and #5028, it seems that building a distributed bridge scanner is not easy to do and probably not worth spending our time on (BridgeDB has much more urgent tasks to be done).

Furthermore, even if we fix this on the BridgeDB layer, the bridge authority will keep on doing direct reachability tests. This has been the case for years and it's public knowledge (#8 of https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges) and it still has not been used by a censor.

If we are still worrying about this attack vector, doing our reachability tests over Tor might make it a bit harder to harvest bridge addresses and it's also easy to do.

Yep. Agreed.

If we get worried about exits collecting the addresses, we could also chain a bunch of bridges together by just sending RELAY_EXTEND cells, then adding two public relays to the end of the chain to meet the default RefuseUnknownExits setting, and exiting to check.torproject.org or something else innocuous. But that is some pretty serious paranoia, and not necessary yet. :)

(To be clear, I also believe that in the future we should look into some kind of distributed bridge scanning, but at the moment it kind of looks like a luxury item when at the same time BridgeDB misses some essential features.)

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

If Arturo's script is easy to refactor into a standalone script, I find it a reasonable plan. However, if making it standalone requires non-negligible engineering time, I would spend that time somewhere else and instead just use OONI in BridgeDB.

Looking into it right now. I'll report back in a bit with either a script or a problem.

comment:19 in reply to:  18 Changed 6 years ago by asn

Replying to isis:

Replying to asn:

I understand this issue. Unfortunately, I don't know how to fix it easily. Looking at the chaos of #6414 and #5028, it seems that building a distributed bridge scanner is not easy to do and probably not worth spending our time on (BridgeDB has much more urgent tasks to be done).

Furthermore, even if we fix this on the BridgeDB layer, the bridge authority will keep on doing direct reachability tests. This has been the case for years and it's public knowledge (#8 of https://blog.torproject.org/blog/research-problems-ten-ways-discover-tor-bridges) and it still has not been used by a censor.

If we are still worrying about this attack vector, doing our reachability tests over Tor might make it a bit harder to harvest bridge addresses and it's also easy to do.

Yep. Agreed.

If we get worried about exits collecting the addresses, we could also chain a bunch of bridges together by just sending RELAY_EXTEND cells, then adding two public relays to the end of the chain to meet the default RefuseUnknownExits setting, and exiting to check.torproject.org or something else innocuous. But that is some pretty serious paranoia, and not necessary yet. :)

(To be clear, I also believe that in the future we should look into some kind of distributed bridge scanning, but at the moment it kind of looks like a luxury item when at the same time BridgeDB misses some essential features.)

I would suggest that a non-ooni version of hellais' script go into BridgeDB instead of going into ooni-probe, as it would fit this function precisely. Some minor amount of work would need to be done to add in the bits of Twisted which are covered by ooni-probe, so that hellais' bridget would work standalone.

If Arturo's script is easy to refactor into a standalone script, I find it a reasonable plan. However, if making it standalone requires non-negligible engineering time, I would spend that time somewhere else and instead just use OONI in BridgeDB.

Looking into it right now. I'll report back in a bit with either a script or a problem.

Hm. Any news on this one?

comment:20 Changed 5 years ago by isis

Keywords: bridgedb-dist added
Resolution: fixed
Status: accepted → closed

This was fixed by sysrqb, asn, hellais, and several other volunteers at the recent OONI bridge reachability hackathon. Thanks to all of you who contributed to fixing this!

Development versions of the graphs and reports are currently available here, although this may change in the near future: https://transparencytoolkit.org/bridge-reachability/
