If a bridge supports obfs4, don't give out its other flavors

added actualpoints::5 bridgedb component::circumvention/bridgedb owner::phw points::2 priority::high resolution::fixed severity::normal sponsor::19 status::closed type::defect labels

Replying to arma:

There's a FOCI 2018 paper looking at blocking of bridges inside China, and one of their conclusions is that China has moved from "block by IP:port" to "block to IP":

Second, it assumes that the FOCI paper is actually correct in its conclusions about how China has changed its blocking. I recall in the Q&A at the end of the presentation that some folks questioned the analysis, but I didn't follow it enough to form a solid opinion. But even if China isn't doing its censorship in this new way yet, now is a great time for bridgedb to become able to handle it.)

My, Lynn Tsai's, and Qi Zhong's monitoring of default Tor Browser bridges also reached this conclusion, that the GFW changed from single-port blocking to all-port blocking (at least for these special bridges). The change happened in October 2016.

https://www.bamsoftware.com/papers/thesis/#sec:china-perport https://www.bamsoftware.com/papers/thesis/#sec:china-allports

The blocking event of October 20, 2016 was noteworthy not only because it occurred before a release, but also because it affected more than one port on some bridges. See point ⓗ in Figure 5.2. When GreenBelt:7013 was blocked, so were GreenBelt:5881 (which had escaped blocking in the previous batch) and GreenBelt:12166 (which was awaiting deployment in the next batch). Similarly, when MaBishomarim:7920 and JonbesheSabz:4148 were blocked, so were the Orbot-reserved MaBishomarim:1984 and JonbesheSabz:1984 (point ⓚ), ending an eight-month unblocked streak.

Replying to arma:

Second, it assumes that the FOCI paper is actually correct in its conclusions about how China has changed its blocking.

Zhongjie Wang, Yue Cao, Zhiyun Qian, Chengyu Song, and Srikanth V. Krishnamurthy observed the same in their INTANG paper, §7.3:

Meanwhile, any hidden bridge nodes requested by the remaining 7 vantage points triggers active probing [‌13, 31‌] and are immediately blocked by the GFW, i.e., any node in China can no longer connect to this IP via any port. This is very different from what was previously reported i.e., the GFW only blocks the Tor port on that hidden bridge [‌31‌], and could cause collateral damage as the Amazon EC2 IPs are recycled. We test 5 different hidden bridge IPs and find no exceptions so far.

This test was done between May 10 and May 18, 2017, according to my correspondence with the authors.

Trac:
Keywords: N/A deleted, bridgedb added

Trac:
Points: N/A to 2
Status: new to assigned
Priority: Medium to High
Owner: sysrqb to N/A

Trac:
Owner: N/A to dgoulet

Linking #7349 (moved), which is like this ticket, but more comprehensive in that it would allow bridges not to expose their ORPort at all, not merely prevent BridgeDB from advertising it. That would also prevent mass-scanning discovery like

Replying to dcf:

Linking #7349 (moved), which is like this ticket, but more comprehensive

I would say differently comprehensive, but not a superset of this one.

In this ticket, I want us to e.g. stop giving out the obfs3 port when there is an obfs4 port.

More broadly, if a bridge supports both active-probing-resistant and active-probing-vulnerable options, then we should give out only the active-probing-resistant ones.

Trac:
Cc: N/A to phw

Trac:
Owner: dgoulet to phw

I made a first attempt at fixing this issue in my bug28655 branch. It's not yet ready to merge. I'd first like to hear what our seasoned BridgeDB veterans think.

Trac:
Status: assigned to needs_review

Small update: My patch broke several unit tests because they rely on an obfs4 bridge being willing to hand out its vanilla descriptor. A proper fix will require a minor change to leekspin (the tool that creates test descriptors for BridgeDB as part of its unit tests), which I'm working on now.

So far, all of leekspin's generated descriptors included obfs2, obfs3, obfs4, and scramblesuit. This broke BridgeDB's unit tests because all descriptors included a probing-resistant PT (obfs4 and scramblesuit), so BridgeDB wouldn't hand out its obfs2, obfs3, and vanilla bridges. To fix this, I made leekspin generate descriptors with various combinations of pluggable transports: https://gitweb.torproject.org/user/phw/leekspin.git/commit/?id=9af6c71b3b4aeb56c509df9ae6a16650f9b58dd2 Let me know if this patch looks good to you.

Unfortunately, the HTTPS unit tests still break -- apparently randomly. Here's one of my recent, failed builds: https://travis-ci.org/NullHypothesis/bridgedb/jobs/520411314

Since my leekspin patch creates pluggable transport combinations deterministically, I believe that the issue is somewhere in BridgeDB's HTTPS distribution mechanism.

I wanted to understand how many bridges would be affected by this patch. I took an assignments.log file that BridgeDB created on 2019-04-19. Here's what the file can tell us:

= Description =	= # =	= % =	= Command =
All bridges	968	100.0	`wc -l assignments.log`
Bridges that have a transport protocol	258	26.7	`grep -c transport assignments.log`
Bridges that have obfs4	249	25.7	`grep -c obfs4 assignments.log`
Bridges that have obfs4 or scramblesuit	251	26.0	`grep -c '\(obfs4\|scramblesuit\)' assignments.log`
Bridges with obfs4 or scramblesuit and obfs2, obfs3, or fte	38	3.9	`grep '(obfs4\|scramblesuit)' assignments.log

The numbers show that we have 251 bridges that are active probing-resistant. 38 (15%) of these bridges also run a transport that is not active probing resistant---obfs2, obfs3, and/or fte. After fixing this ticket, we will stop handing out these transports.

Replying to phw:

The numbers show that we have 251 bridges that are active probing-resistant. 38 (15%) of these bridges also run a transport that is not active probing resistant---obfs2, obfs3, and/or fte. After fixing this ticket, we will stop handing out these transports.

If I understand right, then all 251 (100%) will also stop handing out their vanilla ORport?

Replying to dcf:

Replying to phw:

The numbers show that we have 251 bridges that are active probing-resistant. 38 (15%) of these bridges also run a transport that is not active probing resistant---obfs2, obfs3, and/or fte. After fixing this ticket, we will stop handing out these transports.

If I understand right, then all 251 (100%) will also stop handing out their vanilla ORport?

Yes, that is correct.

I now have a working patch in my bug28655 branch: https://gitweb.torproject.org/user/phw/bridgedb.git/log/?h=bug28655

The patch also required a change in leekspin: https://gitweb.torproject.org/user/phw/leekspin.git/commit/?id=3bc9c660e8df80fe89693c8e4fad38955011bf20

Sorry this took so long to review. It looks good! I only have a few small nitpicks.

Reusing bridges is a little confusing (used first as a method parameter), can you use a different variable name?
Ugh, python. Sure, I guess that's correct. (no changes needed)
I agree extending SUPPORTED_TRANSPORTS or creating a new list config option like PROBING_RESISTANT_TRANSPORTS is a good idea. Hard-coding the list of probing resistant PTs in one place is not great, but hard-coding them in two places is asking for bugs :)

Overall, I think the bridgedb patch is good. I worry about making SUPPORTED_TRANSPORTS more complex. I think a simple str -> boolean mapping is easy to understand. The two ways I see doing this are not as obvious:

dict -> tuple as str -> (boolean, boolean) dict -> dict -> boolean as str -> str ({supported, probing resistant}) -> boolean)

Maybe you had another idea for this, as well :) but with these options, I think adding a new config variable would be more readable.

For the leekspin patch, I think it looks good. My only concern is in the description of the new argument. It says m out of n, but it's not immediately obvious what m is here. n is an actual argument (-n, --descriptors), but m is not a valid argument. Replacing <m> with <xp> would make it more readable, or somehow note m is xp: "make <m> (xp) out of all <n>".

Trac:
Status: needs_review to needs_revision

Replying to sysrqb:

Reusing bridges is a little confusing (used first as a method parameter), can you use a different variable name?

Yes, good point.

I agree extending SUPPORTED_TRANSPORTS or creating a new list config option like PROBING_RESISTANT_TRANSPORTS is a good idea. Hard-coding the list of probing resistant PTs in one place is not great, but hard-coding them in two places is asking for bugs :)

I created a separate PROBING_RESISTANT_TRANSPORTS in bridgedb.conf, right under SUPPORTED_TRANSPORTS. I was a bit undecided if this is something we should expose in the BridgeDB config because it's not meant to be configurable unless you really know what you're doing. That said, I agree that a separate config options seems to be the cleanest solution.

For the leekspin patch, I think it looks good. My only concern is in the description of the new argument. It says m out of n, but it's not immediately obvious what m is here. n is an actual argument (-n, --descriptors), but m is not a valid argument. Replacing <m> with <xp> would make it more readable, or somehow note m is xp: "make <m> (xp) out of all <n>".

Good point, also fixed.

The latest commit in my branch addresses your review: https://gitweb.torproject.org/user/phw/bridgedb.git/log/?h=bug28655 And here's the leekspin fix: https://gitweb.torproject.org/user/phw/leekspin.git/commit/?id=d34c804cd0f01af5206833e62c0dedec8565b235

Trac:
Status: needs_revision to needs_review

Replying to phw:

Replying to sysrqb:

I agree extending SUPPORTED_TRANSPORTS or creating a new list config option like PROBING_RESISTANT_TRANSPORTS is a good idea. Hard-coding the list of probing resistant PTs in one place is not great, but hard-coding them in two places is asking for bugs :)

I created a separate PROBING_RESISTANT_TRANSPORTS in bridgedb.conf, right under SUPPORTED_TRANSPORTS. I was a bit undecided if this is something we should expose in the BridgeDB config because it's not meant to be configurable unless you really know what you're doing. That said, I agree that a separate config options seems to be the cleanest solution.

Yeah, I agree with you about this not being a bridgedb-specific attribute. But, bridgedb must learn this somehow, so either "hard-coding the list in a (constant) variable" or "adding this as a config option" seem like the best choices. I don't have a strong opinion on which is better, I simply didn't like seeing the list of PTs hard-coded in multiple places :)

For the leekspin patch, I think it looks good. My only concern is in the description of the new argument. It says m out of n, but it's not immediately obvious what m is here. n is an actual argument (-n, --descriptors), but m is not a valid argument. Replacing <m> with <xp> would make it more readable, or somehow note m is xp: "make <m> (xp) out of all <n>".

Good point, also fixed.

Great, looks good. I think the commits can be squashed before merging.

Trac:
Status: needs_review to merge_ready

Thanks for the review, sysrqb. I squashed the commits into a single commit and merged it into develop and master. I also updated the config file in our bridgedb-admin repository -- and cleaned up a few local changes, so polyanthum is in sync with the master branch again.

I also pulled the patch on polyanthum and restarted BridgeDB, so we should be done with this ticket.

Trac:
Actualpoints: N/A to 5
Resolution: N/A to fixed
Status: merge_ready to closed