Automatically test the PTs of bridges

Trac:
Parent Ticket: #31280 (moved)

added component::circumvention owner::phw parent::31280 points::10 priority::medium s30-o23a3 severity::normal sponsor::30-must status::assigned type::defect labels

It wouldn't be that hard to teach Tor bridges to self-test their PT addresses via an Exit, like we already do with DirPorts. As a bonus step, we might even want to test that the port speaks the PT protocol. See #30477 (moved).

Web UI mockup.

Here are some thoughts on building a new service that would allow both BridgeDB and bridge operators to test bridges:

The service should expose a web interface and an API: The web interface is meant for bridge operators who want to test their new bridges, and can replace our obfs4 port scan tool. The API is meant for BridgeDB, allowing it to test the bridges it knows about. We need to be sure that there's no interference that would allow web interface users to learn what bridges were tested over the API, so we may want to run two instances of this service.
The API should be a simple JSON-based REST API. Requests should go to, say, bridges.torproject.org/api/test, and have the following request body: {{{ {"bridge_line": "1.2.3.4:443"} }}} The service should then respond with something along the lines of: {{{ {"functional": true, "error": null, "time": 4.215} }}} (Note that these are mere examples.)
The web interface can look like this.
The service can take bridge lines (vanilla or pluggable transport) as input, spawn a Tor process with the given bridge line, and parse Tor's output to determine whether the bridge bootstrapped correctly. This has the potential to be quite messy. What's a better design?
How should BridgeDB use this service? At the very least, it should periodically test all its bridges and log the ones that don't work, making it easy for BridgeDB's maintainers to contact the respective bridge operators.
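
To make the proposed API concrete, here is a minimal Go sketch of the request and response bodies from the example above, plus a handler for /api/test. The field names come from the example; everything else (`TestRequest`, `TestResult`, `checkBridge`, `handleTest`, `roundTrip`) is a hypothetical name chosen for illustration, and `checkBridge` is only a placeholder that does not contact any bridge.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// TestRequest mirrors the example request body from the proposal.
type TestRequest struct {
	BridgeLine string `json:"bridge_line"`
}

// TestResult mirrors the example response body. Error is a pointer so that
// "no error" is serialised as JSON null.
type TestResult struct {
	Functional bool    `json:"functional"`
	Error      *string `json:"error"`
	Time       float64 `json:"time"`
}

// checkBridge is a hypothetical placeholder for the real bridge test. It only
// rejects empty input and otherwise returns the values from the example.
func checkBridge(bridgeLine string) TestResult {
	if strings.TrimSpace(bridgeLine) == "" {
		msg := "empty bridge line"
		return TestResult{Functional: false, Error: &msg}
	}
	return TestResult{Functional: true, Time: 4.215}
}

// handleTest decodes the JSON request and encodes the result as JSON.
func handleTest(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "POST required", http.StatusMethodNotAllowed)
		return
	}
	var req TestRequest
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		http.Error(w, "malformed JSON", http.StatusBadRequest)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(checkBridge(req.BridgeLine))
}

// roundTrip sends one request body through the handler without opening a
// socket and returns the status code and response body.
func roundTrip(body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/api/test", strings.NewReader(body))
	rec := httptest.NewRecorder()
	handleTest(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := roundTrip(`{"bridge_line": "1.2.3.4:443"}`)
	fmt.Println(code, strings.TrimSpace(body))
}
```

In a real deployment the handler would be registered with `http.HandleFunc("/api/test", handleTest)`; the in-memory round trip is used here only so that the sketch runs without a network.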

Some additional feedback from dcf and cohosh after today's anti-censorship meeting:

There's potential for abuse. Exposing this service to the public means allowing anybody to use our machine to establish TLS connections (for vanilla Tor) and send garbage data (for obfs4) to arbitrary machines on the Internet. To prevent this, the service could first verify if the provided bridge is in BridgeDB, and only then proceed to test it.
If BridgeDB uses this service to test a bridge, and somehow propagates this information to CollecTor (so it can be listed on the bridge's status page), there may not be a need to expose it to the public.
BridgeDB should not hand out bridges that this service deems non-functional.
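
The abuse mitigation suggested above (only test bridges that BridgeDB already knows) could look roughly like the following Go sketch. It assumes the set of known bridges is available as an in-memory set of address:port pairs; `knownBridges`, `newKnownBridges`, `addrOf`, and `authorize` are hypothetical names, and how the service would actually query BridgeDB is left open.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
	"strings"
)

var errUnknownBridge = errors.New("bridge is not known to BridgeDB")

// knownBridges is a hypothetical stand-in for a BridgeDB lookup: a set of
// address:port pairs that the service is allowed to test.
type knownBridges map[netip.AddrPort]struct{}

// newKnownBridges builds the set from address:port strings. It panics on
// invalid input, which is acceptable for a sketch.
func newKnownBridges(addrs ...string) knownBridges {
	k := make(knownBridges)
	for _, a := range addrs {
		k[netip.MustParseAddrPort(a)] = struct{}{}
	}
	return k
}

// addrOf extracts the address:port from a bridge line. The address is the
// first field of a vanilla line ("1.2.3.4:443 ...") and the second field of
// a pluggable transport line ("obfs4 1.2.3.4:443 ...").
func addrOf(bridgeLine string) (netip.AddrPort, error) {
	fields := strings.Fields(bridgeLine)
	for i := 0; i < len(fields) && i < 2; i++ {
		if ap, err := netip.ParseAddrPort(fields[i]); err == nil {
			return ap, nil
		}
	}
	return netip.AddrPort{}, errors.New("no address:port found in bridge line")
}

// authorize returns nil only if the bridge line refers to a known bridge,
// so the service never connects to arbitrary machines.
func (k knownBridges) authorize(bridgeLine string) error {
	ap, err := addrOf(bridgeLine)
	if err != nil {
		return err
	}
	if _, ok := k[ap]; !ok {
		return errUnknownBridge
	}
	return nil
}

func main() {
	known := newKnownBridges("1.2.3.4:443")
	for _, line := range []string{"1.2.3.4:443", "obfs4 1.2.3.4:443", "192.0.2.1:443"} {
		fmt.Printf("%q: %v\n", line, known.authorize(line))
	}
}
```

Matching on address and port alone is a simplification; a stricter check could also compare the fingerprint and transport.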

I pushed a work-in-progress version of my code to this repository: https://dip.torproject.org/phw/bridgestrap

Here's what remains to be done:

Add tests.
Clean up code and make it more robust.
Strip down the torrc (in tor.go) to its bare minimum.
Improve the way we're spawning tor and interacting with it.
Improve log messages.
Write BridgeDB code that interacts with this service.
Figure out what to do with the Web frontend, to prevent abuse.

It's still work-in-progress but I would appreciate a preliminary review of the code.

Trac:
Status: new to needs_review

Just took a look, and it looks great so far. The code is well written and the design makes sense to me.

I'm still doubtful that we'd need or want to expose the API to the public. I can see some value in allowing operators to use the web interface as a self-check after they set up the bridge, but I'm curious about how much use it will get. Do you have any insights into this from setting up #30472 (moved) before the bridge campaign?

I think by far the most useful part of this is for bridgedb to auto detect that new bridges are unreachable or that existing bridges have gone offline.

Replying to cohosh:

> Just took a look, and it looks great so far. The code is well written and the design makes sense to me.

Thanks for your review.

> I'm still doubtful that we'd need or want to expose the API to the public. I can see some value in allowing operators to use the web interface as a self-check after they set up the bridge, but I'm curious about how much use it will get. Do you have any insights into this from setting up #30472 (moved) before the bridge campaign?

We don't have any numbers because our port test tool runs under systemd and nobody has figured out how to make systemd take the tool's logs (which are written to stderr) and store them somewhere else.

I agree that the primary purpose of this tool is to assist BridgeDB. I'm not married to the idea of exposing it to bridge operators but there's a clear need for operators to know if their bridge works. If BridgeDB tests new bridges and somehow communicates the result to metrics, operators can consult our relay search tool to get their answer.

Trac:
Status: needs_review to assigned
Owner: N/A to phw

Teor mentioned on OONI's bug tracker that there's a tor_api.h that makes it possible to start tor as a library. This may be helpful for bridgestrap, which currently uses golang's exec.CommandContext to start the tor binary.
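
For context, the exec.CommandContext approach mentioned above can be sketched as follows. This is not bridgestrap's actual code: the function names are hypothetical, the tor options passed on the command line are an assumption about a minimal configuration for a vanilla bridge, and a real service would also need a separate DataDirectory per instance and a ClientTransportPlugin line for pluggable transports.

```go
package main

import (
	"bufio"
	"context"
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// isBootstrapDone reports whether a tor log line announces that
// bootstrapping has completed.
func isBootstrapDone(line string) bool {
	return strings.Contains(line, "Bootstrapped 100%")
}

// testBridge starts the tor binary with the given bridge line and waits
// until tor either finishes bootstrapping or the timeout expires.
func testBridge(bridgeLine string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Assumed minimal option set; torrc options are given as --Option value.
	cmd := exec.CommandContext(ctx, "tor",
		"--UseBridges", "1",
		"--Bridge", bridgeLine,
		"--SocksPort", "auto",
		"--Log", "notice stdout")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}

	// Read tor's log until it bootstraps. If the timeout fires first, the
	// context kills tor, stdout closes, and the loop ends.
	done := false
	scanner := bufio.NewScanner(stdout)
	for scanner.Scan() {
		if isBootstrapDone(scanner.Text()) {
			done = true
			break
		}
	}
	cancel()   // stop tor; we no longer need it
	cmd.Wait() // reap the process; the "killed" error is expected here
	if !done {
		return errors.New("tor did not finish bootstrapping")
	}
	return nil
}

func main() {
	// Illustrative log lines; tor itself is not started in this demo.
	// A real call would look like: testBridge("1.2.3.4:443", time.Minute)
	lines := []string{
		"Jan 01 00:00:01.000 [notice] Bootstrapped 10% (conn_done): Connected to a relay",
		"Jan 01 00:00:05.000 [notice] Bootstrapped 100% (done): Done",
	}
	for _, l := range lines {
		fmt.Println(isBootstrapDone(l), l)
	}
}
```

Parsing log output this way is exactly the part that was described earlier as potentially messy; using tor's control port or tor_api.h would avoid depending on log message wording.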

Replying to phw:

> Teor mentioned on OONI's bug tracker that there's a tor_api.h that makes it possible to start tor as a library. This may be helpful for bridgestrap, which currently uses golang's exec.CommandContext to start the tor binary.

There are probably still a few relaunch issues in tor_api.h, but it seems to work pretty well. Let us know if you find any bugs, and we'll fix them :-)

cohosh cc'ing myself on sponsor 30 work

changed time estimate to 80h

mentioned in issue #32938 (moved)

mentioned in issue #31280 (moved)

mentioned in issue tpo/anti-censorship/pluggable-transports/snowflake#32938 (closed)

moved to tpo/anti-censorship/trac#31874 (closed)

mentioned in issue tpo/anti-censorship/team#112
