Opened 9 months ago

Last modified 2 months ago

#31701 needs_review defect

Reachability tests for new obfs4 bridges

Reported by: cohosh
Owned by: cohosh
Priority: Medium
Milestone:
Component: Circumvention/Obfs4
Version:
Severity: Normal
Keywords: reachability, measurement, s30-o23a2
Cc: phw, cohosh
Actual Points:
Parent ID: #31280
Points:
Reviewer: phw
Sponsor: Sponsor30-can

Description

As a follow-up to #29279, we can now set up some new reachability tests on a subset of the bridges we've gotten through our bridge campaign \o/

We probably don't want to test all of the new bridges in case these tests cause a bunch of bridges to get blocked when they otherwise wouldn't.

As mentioned in #29279:comment:9, we should sample bridges from our various distribution mechanisms (email, private, and HTTPS), and also from any finer-grained partitions we have (email provider, subnet, etc.).
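
The sampling idea above can be sketched as a stratified sample: group the bridges by distribution mechanism and finer-grained partition, then draw a small number from each group so that no single bucket dominates the test set. This is only an illustration. The dictionary keys `distributor` and `partition`, the function name, and the `per_stratum` parameter are assumptions made for this sketch and do not come from BridgeDB.

```python
import random
from collections import defaultdict


def sample_bridges(bridges, per_stratum, seed=None):
    """Draw up to `per_stratum` bridges from each (distributor, partition) group.

    `bridges` is a list of dicts. The keys "distributor" and "partition"
    are hypothetical names chosen for this sketch.
    """
    rng = random.Random(seed)

    # Group bridges by distribution mechanism and finer-grained partition.
    strata = defaultdict(list)
    for bridge in bridges:
        key = (bridge["distributor"], bridge.get("partition"))
        strata[key].append(bridge)

    # Sample each group independently; small groups contribute all members.
    sample = []
    for key in sorted(strata, key=str):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

Keeping `per_stratum` small limits how many bridges are exposed to the probe, which matters if the tests themselves could get bridges blocked.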

Attachments (3)

obfs4-reachability-2019-10-03.pdf (39.1 KB) - added by cohosh 8 months ago.
obfs4-reachability-2020-01-07.pdf (39.2 KB) - added by cohosh 5 months ago.
obfs4-reachability-2020-03-09.pdf (321.4 KB) - added by cohosh 3 months ago.

Change History (13)

comment:1 Changed 9 months ago by phw

Sounds good to me. I will extract a few and send them your way. I'm particularly interested in learning if our moat bucket is being scraped too.

comment:2 Changed 8 months ago by cohosh

Owner: set to cohosh
Status: new → assigned

Oof, okay, it looks like many of our BridgeDB bridges are already unreachable in China.

I have the most data for moat bridges due to accidentally leaving some blank lines in bridge_lines.txt for the first two days. So we can see some more interesting behaviour for those. However, it looks like some moat bridges were blocked even before we started the tests.

comment:3 Changed 5 months ago by cohosh

Well this is a bit weird. I just re-ran these tests and looks like at least 2 bridges that were previously unreachable in China are now reachable again.

So perhaps the block list populated by BridgeDB scraping is not static.

It also looks like there are some bridges that are no longer reachable in North America. Might be worth checking into that.

comment:4 Changed 5 months ago by sigvids

> I just re-ran these tests and looks like at least 2 bridges that were previously unreachable in China are now reachable again.

I have seen some reports saying that the GFW will unblock blocked IP addresses after a period of time. One report for Outline (i.e., Shadowsocks) says unblocking can happen after as little as three days. However, if you start reusing the server for the same purpose, it will be blocked again:

https://github.com/Jigsaw-Code/outline-server/issues/193#issuecomment-405042583

It's possible that this unblocking rule applies also to IP addresses scraped from web/email/moat.

> So perhaps the block list populated by BridgeDB scraping is not static.

Are the reachability tests based on a single connection, or on multiple connections with a realistic volume of traffic? It's possible that the GFW uses other detection methods in addition to scraping. A thread on GitHub suggests that blocking can be triggered by factors that include (1) volume of traffic, (2) traffic being fully encrypted, (3) very high entropy, and (4) use of popular VPS locations. The pattern is initially an IP/port ban, and then, if you change ports multiple times, you get a full IP ban:

https://github.com/shadowsocks/shadowsocks-libev/issues/2288

> It also looks like there are some bridges that are no longer reachable in North America. Might be worth checking into that.

Is it possible that the bridges that are no longer reachable in North America have been taken offline? I sometimes see complaints by volunteers that their bridges don't get any traffic. For example:

https://tor.stackexchange.com/questions/17398/no-traffic-on-obfs4-bridge

https://tor.stackexchange.com/questions/20216/why-is-my-tor-bridge-relay-not-getting-any-traffic

Are bridge operators giving up after a few months of minimal traffic?

comment:5 Changed 3 months ago by cohosh

Just analyzed some more data from the probe point in China.

Unfortunately, I hadn't started a cron job from a site in North America. However, if you compare the most recent results with the earlier ones, you'll notice that all bridges that are consistently blocked were also reported as down from a North American probe point. It's reasonable to assume that all bridges that were intermittently unavailable in China were reachable from North America at some point during that time.
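
The reasoning above (a bridge that also fails from an uncensored vantage point is probably down rather than blocked) can be written as a small classifier over per-run results. This is a minimal sketch, not the analysis behind the attached PDFs. The function name, the label strings, and the input format (one boolean per probe run, `True` meaning reachable) are assumptions made for illustration.

```python
def classify_bridge(china, north_america):
    """Label a bridge from two lists of per-run results (True = reachable).

    `china` holds results from the censored probe point, `north_america`
    holds results from an uncensored control point (may be empty).
    """
    # Control data exists and never succeeded: the bridge itself is likely down.
    if north_america and not any(north_america):
        return "down"
    if not china:
        return "no data"
    if all(china):
        return "reachable"
    if any(china):
        return "intermittent"
    # Never reachable from China. Without a successful control run we cannot
    # tell blocking apart from an offline bridge.
    return "blocked" if any(north_america) else "blocked or down"
```

For example, `classify_bridge([False, False], [])` returns `"blocked or down"`, which reflects the situation described above where no North American cron job was running.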

comment:6 in reply to: 4 Changed 3 months ago by cohosh

Replying to sigvids:

> > I just re-ran these tests and looks like at least 2 bridges that were previously unreachable in China are now reachable again.

> I have seen some reports saying that the GFW will unblock blocked IP addresses after a period of time. One report for Outline (i.e., Shadowsocks) says unblocking can happen after as little as three days. However, if you start reusing the server for the same purpose, it will be blocked again:

> https://github.com/Jigsaw-Code/outline-server/issues/193#issuecomment-405042583

> It's possible that this unblocking rule applies also to IP addresses scraped from web/email/moat.

Thanks! This is a useful link. Indeed, blocking seems to be very intermittent for all of our bridges.

> > So perhaps the block list populated by BridgeDB scraping is not static.

> Are the reachability tests based on a single connection, or on multiple connections with a realistic volume of traffic? It's possible that the GFW uses other detection methods in addition to scraping. A thread on GitHub suggests that blocking can be triggered by factors that include (1) volume of traffic, (2) traffic being fully encrypted, (3) very high entropy, and (4) use of popular VPS locations. The pattern is initially an IP/port ban, and then if you change ports multiple times, you get a full IP ban:

> https://github.com/shadowsocks/shadowsocks-libev/issues/2288

You can see the test script here. It runs approximately four times a day from our probe point.

We do download a large(ish) file, but it's possible it doesn't look like realistic traffic to a censor. As for blocking based on use or suspicious traffic patterns, that's possible, but as far as we know, private obfs4 bridges are still working in China, which leads us to believe that the GFW is not blocking based on traffic patterns.
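
For comparison with the file-download test described above, a much cheaper check is whether the bridge's TCP port accepts connections at all. This is a different, simpler technique than the test script referenced in this comment, and a successful handshake does not show that obfs4 itself works. It only separates "port unreachable" from "port open". The function name and the default timeout are assumptions for this sketch.

```python
import socket


def tcp_reachable(host, port, timeout=10.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

A check like this generates almost no traffic, so it says nothing about whether a censor reacts to volume or traffic patterns.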

> > It also looks like there are some bridges that are no longer reachable in North America. Might be worth checking into that.

> Is it possible that the bridges that are no longer reachable in North America have been taken offline? I sometimes see complaints by volunteers that their bridges don't get any traffic. For example:

Yes, I suspect it is because the bridges are misconfigured, unmaintained, or down.

comment:7 Changed 3 months ago by cohosh

Cc: cohosh added
Reviewer: phw
Status: assigned → needs_review

Should I still be running these tests? I can keep them going but we might be seeing diminishing returns here in terms of information. Especially if we're moving to OONI eventually.

comment:8 Changed 3 months ago by gaba

Keywords: s30-o23a3 added
Parent ID: #31280
Sponsor: Sponsor30-can

comment:9 Changed 3 months ago by gaba

Keywords: s30-o23a2 added; s30-o23a3 removed

comment:10 Changed 2 months ago by sigvids

A Twitter user says that if you get 10 obfs4 bridges from BridgeDB, you can likely find one that works:

https://twitter.com/yeahwu404/status/1241753415337701376

Google Translate: "The obfs4 bridge broadcast by Tor currently is still available. Fill in about 10 and basically you can connect, everyone can try. Get the bridge on the following page:"

This suggests the BridgeDB scraping is not perfect.
