Opened 5 months ago

Closed 3 months ago

#30441 closed defect (worksforme)

Stop BridgeDB from handing out offline bridges

Reported by: phw Owned by: phw
Priority: Very High Milestone:
Component: Circumvention/BridgeDB Version:
Severity: Major Keywords: user-feedback, blog, anti-censorship-roadmap
Cc: cohosh, phw, gk Actual Points:
Parent ID: Points: 2
Reviewer: Sponsor:

Description

BridgeDB currently hands out plenty of bridges (in all flavours) that are offline. We need to understand why this is the case, and stop it from doing that.

For example, I just got the obfs4 bridge 4C480695650EDB6BAB006DB9FD81F6173122E973 over HTTPS. Nothing responds on its obfs4 port and Metrics says that it's currently offline -- or used to be, a few hours ago, to be precise. The bridge's IP address is part of Serge's most recent networkstatus-bridges file, but the bridge does not have the Running flag and should not have been given out. Also, the bridge's fingerprint isn't part of BridgeDB's latest assignments.log file. According to all of this, I should not have been given that bridge.
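
A minimal sketch of the check described above, for anyone who wants to repeat it. It assumes that networkstatus-bridges uses the usual network status layout, in which each "r" line carries the unpadded base64 identity as its third field and is followed by an "s" line listing the flags. The function name is hypothetical and this is not BridgeDB's actual parser.

```python
import base64


def running_fingerprints(networkstatus_text):
    """Return the hex fingerprints of all entries whose 's' line has Running.

    Assumption: 'r' lines look like
    'r <nickname> <identity-b64> ...' and are followed by an 's' line.
    """
    running = set()
    current = None
    for line in networkstatus_text.splitlines():
        if line.startswith("r "):
            identity = line.split()[2]
            # The identity is base64 without '=' padding; restore it.
            identity += "=" * (-len(identity) % 4)
            current = base64.b64decode(identity).hex().upper()
        elif (line == "s" or line.startswith("s ")) and current is not None:
            if "Running" in line.split()[1:]:
                running.add(current)
            current = None
    return running
```

A fingerprint that BridgeDB hands out but that is missing from the returned set would match the symptom reported here.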

Child Tickets

Ticket | Status | Owner | Summary | Component
#30956 | closed | teor | Publish bridge ServerTransportPlugin lines, even when ExtraInfoStatistics are off | Core Tor/Tor

Change History (19)

comment:1 Changed 5 months ago by phw

In d15fe16c and 978f9be8 we improved log messages to get a better understanding of what's going on. The latest run produced these log messages:

Trying to insert 1291 bridges into hashring, 1062 of which have the 'Running' flag...
Tried to insert 1280 bridges into hashring. Resulting hashring is of length 1061.

comment:2 Changed 5 months ago by phw

We discussed this on IRC and figured that the ~330 snap bridges may be the culprit to some extent. There's quite a bit of churn among them, so Serge may deem a snap bridge running at hour t, yet by the time a user tries to use it at hour t+1 it may already be offline.

comment:3 Changed 5 months ago by phw

Roger got a non-snap obfs4 bridge from BridgeDB that was also offline. Its vanilla port worked (and hence it had the 'Running' flag and was distributed by BridgeDB) but its obfs4 port would just reset connections. It may be that the problem of "BridgeDB hands out offline bridges" is really just a lot of smaller problems that come together.

comment:4 Changed 5 months ago by gk

Cc: gk added

comment:5 Changed 4 months ago by phw

Points: 2

comment:6 Changed 4 months ago by gk

What is the plan here? Any updates? That's still an often reported issue on our blog.

comment:7 in reply to:  6 ; Changed 4 months ago by phw

Replying to gk:

> What is the plan here? Any updates? That's still an often reported issue on our blog.

  1. We need to understand if this affects all bridge types, or if it is limited to obfs4.
  2. In parallel, we should test if the TCP port of all of our obfs4 bridges is reachable. For those that aren't, we should contact the operator, or, as a last resort, remove them from BridgeDB.
  3. Make it easier for bridge operators to test if their obfs4 port is reachable. #30472 will help with this.

I'll try to make progress with this in the coming days.

comment:8 Changed 4 months ago by wayward

Keywords: user-feedback blog added

comment:9 Changed 4 months ago by phw

Keywords: anti-censorship-roadmap added

comment:10 in reply to:  7 ; Changed 4 months ago by phw

Replying to phw:

> 2. In parallel, we should test if the TCP port of all of our obfs4 bridges is reachable. For those that aren't, we should contact the operator, or, as a last resort, remove them from BridgeDB.

I built a tool that takes Serge's bridge files as input and scans the TCP port of obfs4 bridges: https://github.com/NullHypothesis/bridgeauth-obfs4-scanner
I believe one problem is that Serge's cached-extrainfo and cached-extrainfo.new do not contain all bridges that are in networkstatus-bridges, so the results only represent a lower bound of unreachable obfs4 bridges.

Here's the output for a Serge dump from 2019-05-31 00:34:50:

[+] 1,304 bridges in network status; 1,024 (78.5%) have 'Running' flag.
[+] 581 (56.7%) of 1,024 bridges with 'Running' flag support obfs4.
[+] 75 (12.9%) of 581 running obfs4 bridges fail to establish TCP connection.
[+] 47 (62.7%) of 75 unreachable obfs4 bridges have contact info.

I will send an email to the operators of these bridges and periodically re-run the script to catch new obfs4 bridges that are unreachable.
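
A minimal sketch of the kind of TCP probe described above. This is not the code from the linked scanner; the function name and the default timeout are assumptions made for illustration.

```python
import socket


def tcp_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be established.

    A successful handshake only shows that something accepts connections
    on the port; it does not prove that obfs4 itself works there.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Because a bridge can accept the TCP connection and still fail the obfs4 handshake, a probe like this can only undercount broken bridges.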

comment:11 Changed 4 months ago by phw

I sent an email to approximately 40 bridge operators whose obfs4 port is not reachable. About a dozen replied and took care of the issue. I will send another round of emails in a week or so. If we still don't hear back, we may have to add these bridges to BridgeDB's blacklisted-bridges file, along with the other ~30 unreachable obfs4 bridges that don't have contact information.

comment:12 Changed 3 months ago by phw

I emailed 25 operators again. I plan to blacklist these bridges if I don't hear back over the next few days.

I also emailed 11 operators whose obfs4 bridge advertised a private IP address.
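
A minimal sketch of how such addresses could be flagged, using Python's standard ipaddress module. The function name is hypothetical, and treating every address that is not globally routable as a problem is an assumption, not necessarily the criterion used here.

```python
import ipaddress


def is_publicly_routable(address):
    """Return True if the address is globally routable.

    Private (RFC 1918), loopback, link-local and similar reserved
    ranges all return False.
    """
    return ipaddress.ip_address(address).is_global
```

An obfs4 line that advertises, say, 192.168.1.10 fails this check and cannot be reached by users outside the operator's own network.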

comment:13 in reply to:  10 ; Changed 3 months ago by teor

Replying to phw:

> I believe one problem is that Serge's cached-extrainfo and cached-extrainfo.new do not contain all bridges that are in networkstatus-bridges, so the results only represent a lower bound of unreachable obfs4 bridges.

In tor, extrainfo descriptors are only created when statistics are on.
But we could change that so we create extrainfo descriptors that just contain the PT lines, even when statistics are off.

That would be a relatively easy fix to tor.
Would you like us to open a ticket for it?

comment:14 in reply to:  13 ; Changed 3 months ago by phw

Replying to teor:

> Replying to phw:
>
> > I believe one problem is that Serge's cached-extrainfo and cached-extrainfo.new do not contain all bridges that are in networkstatus-bridges, so the results only represent a lower bound of unreachable obfs4 bridges.
>
> In tor, extrainfo descriptors are only created when statistics are on.
> But we could change that so we create extrainfo descriptors that just contain the PT lines, even when statistics are off.

Does this mean that when a, say, obfs4 bridge turns off its statistics, we wouldn't know that it runs obfs4 because we never received the transport line in its extrainfo document? If so, this seems worth fixing.

Also, what config option controls these statistics?

comment:15 in reply to:  14 Changed 3 months ago by teor

Replying to phw:

> Replying to teor:
>
> > Replying to phw:
> >
> > > I believe one problem is that Serge's cached-extrainfo and cached-extrainfo.new do not contain all bridges that are in networkstatus-bridges, so the results only represent a lower bound of unreachable obfs4 bridges.
> >
> > In tor, extrainfo descriptors are only created when statistics are on.
> > But we could change that so we create extrainfo descriptors that just contain the PT lines, even when statistics are off.
>
> Does this mean that when a, say, obfs4 bridge turns off its statistics, we wouldn't know that it runs obfs4 because we never received the transport line in its extrainfo document?

Yes, we made this change in #29018 in 0.4.1.1-alpha, so it's quite a recent change.

In 0.4.0 and earlier, ServerTransportPlugin lines and bridge statistics were unconditionally published in extrainfo documents.

> If so, this seems worth fixing.
>
> Also, what config option controls these statistics?

ExtraInfoStatistics. Some statistics also have their own options.

I created #30956 for this issue.

comment:16 Changed 3 months ago by phw

I blacklisted 53 bridges whose obfs4 port was unreachable. We should also try to reject them via the bridge authority, because a rejection causes scary log messages to appear in the bridges' log files. Hopefully some operators will then get back to us.

Blacklisting these bridges won't make things worse because after implementing #28655 we don't hand out these bridges' vanilla line.

comment:17 Changed 3 months ago by phw

I added a log message to BridgeDB that tells us how many bridge requests resulted in 0, 1, 2, and 3 bridge lines. Here are the results for a few hours' worth of logs:

   # of bridges | # of requests
   -------------+--------------
              0 | 188 (14%)
              1 | 381 (29%)
              2 | 495 (38%)
              3 | 235 (18%)

(Interestingly, all requests that resulted in 0 bridges were HTTPS requests for obfs2, coming from Tor exit relays. BridgeDB no longer supports obfs2, which is why it responds with 0 bridges.)

Assuming that these numbers are correct, BridgeDB should be returning at least one bridge for every request it has seen over the last few hours. That clearly wasn't the case a few days ago but I wonder if it's the case now. The only thing that changed is that I added debug log messages and restarted BridgeDB a few times.
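
A minimal sketch of how a tally like the one above could be computed from per-request counts. The function name and the input format (one integer per request, giving the number of bridge lines returned) are assumptions; BridgeDB's actual log format is not shown in this ticket.

```python
from collections import Counter


def bridge_line_histogram(lines_per_request):
    """Map each bridge-line count to (number of requests, percentage)."""
    counts = Counter(lines_per_request)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        n: (counts[n], 100.0 * counts[n] / total)
        for n in sorted(counts)
    }
```

Filtering out the obfs2 requests before computing the histogram would show directly whether any supported request still comes back with 0 bridges.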

comment:18 Changed 3 months ago by phw

Status: assigned → needs_information

Setting the status to needs_information for now because BridgeDB is currently working fine as far as I can tell.

comment:19 Changed 3 months ago by phw

Resolution: worksforme
Status: needs_information → closed

Closing because BridgeDB is still working as intended.
