Opened 7 weeks ago

Closed 4 weeks ago

#30472 closed project (fixed)

Implement a mechanism for PT reachability testing

Reported by: phw
Owned by: phw
Priority: High
Milestone:
Component: Circumvention/Pluggable transport
Version:
Severity: Major
Keywords: reachability
Cc:
Actual Points: 2
Parent ID: #30471
Points:
Reviewer:
Sponsor: Sponsor19

Description

Non-vanilla bridges currently have no way to automatically test their reachability. Vanilla bridges self-test the reachability of their ORPort by creating a circuit that includes themselves, but we cannot do this for, say, obfs4. In practice, this is problematic because obfs4 operators won't know if their bridge is unreachable, for example because of NAT. In fact, BridgeDB is distributing obfs4 bridges that aren't actually reachable.

We need to build a mechanism that allows non-vanilla bridges to test their reachability. Ideally, something would create a circuit over the bridge while speaking its respective transport protocol, but even a simple TCP or UDP-based reachability test would already go a long way.

Looking at the discussion over in #30331, tor seems to be the right component to trigger the reachability test. In its log files, it can then yell at the operator if the test failed. The question is: how should we design the mechanism that implements the reachability test?

One solution would be a simple HTTP API that takes as input an address, a port, a transport type, and optional parameters, and then tells you if the given bridge is reachable. For example, a request to https://pt-reachable.torproject.org/obfs4/1.2.3.4/9002 may respond with something along the lines of obfs4_reachable: true. Ideally, if the reachability test fails, we should provide details to help the operator figure out what went wrong.
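
To make the proposal concrete, here is a minimal sketch of such an endpoint in Go. It is an illustration only, not code from any Tor project: the names parseProbePath, handleProbe, and dialTimeout, the JSON response shape, and the 3-second timeout are all assumptions. The sketch only attempts a TCP handshake regardless of the transport named in the path; it does not speak obfs4.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// dialTimeout bounds a single probe (assumed value).
const dialTimeout = 3 * time.Second

// probeRequest is the parsed form of a path such as /obfs4/1.2.3.4/9002.
type probeRequest struct {
	Transport string
	Addr      string // host:port, ready for net.Dial
}

// parseProbePath splits "/<transport>/<ip>/<port>" and validates each part.
func parseProbePath(path string) (probeRequest, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 3 || parts[0] == "" {
		return probeRequest{}, fmt.Errorf("expected /<transport>/<ip>/<port>")
	}
	if net.ParseIP(parts[1]) == nil {
		return probeRequest{}, fmt.Errorf("invalid IP address %q", parts[1])
	}
	port, err := strconv.Atoi(parts[2])
	if err != nil || port < 1 || port > 65535 {
		return probeRequest{}, fmt.Errorf("invalid port %q", parts[2])
	}
	return probeRequest{
		Transport: parts[0],
		Addr:      net.JoinHostPort(parts[1], parts[2]),
	}, nil
}

// handleProbe answers with e.g. {"obfs4_reachable":true}, plus an "error"
// field carrying the dial error when the probe fails.
// Serve it with http.HandleFunc("/", handleProbe).
func handleProbe(w http.ResponseWriter, r *http.Request) {
	req, err := parseProbePath(r.URL.Path)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	key := req.Transport + "_reachable"
	result := map[string]interface{}{key: true}
	conn, err := net.DialTimeout("tcp", req.Addr, dialTimeout)
	if err != nil {
		result[key] = false
		result["error"] = err.Error()
	} else {
		conn.Close()
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(result)
}

func main() {
	// Only demonstrates the path parsing; no server is started here.
	req, err := parseProbePath("/obfs4/1.2.3.4/9002")
	fmt.Printf("%+v %v\n", req, err)
}
```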

Child Tickets

Ticket | Status | Owner | Summary | Component
#30703 | closed | phw | How to modify Apache config on polyanthum? | Internal Services/Services Admin Team

Change History (13)

comment:1 Changed 7 weeks ago by arma

Summary: "Implement a mechanism for PT rechability testing" → "Implement a mechanism for PT reachability testing"

comment:2 Changed 7 weeks ago by phw

We had a short discussion on IRC and concluded the following:

  • We don't want another central service that collects all the data.
  • A bridge can self-test by having its tor client establish a TCP connection with its obfs4 port (see #30477). Tor can then warn the operator in its log file if the test fails. Unfortunately, this won't help if we ever deploy a PT that speaks UDP.
  • Some operators will ignore their log files, though, so we will still be collecting unreachable obfs4 bridges. BridgeDB should therefore learn how to test all of its bridges by speaking their respective transport protocol. It should not hand out bridges that are unreachable or otherwise broken.

We were left wondering what obfs4 operators should do in the short term, before #30477 is done, to figure out if their bridge is reachable. One way forward would be a simple web page, hosted by us, that asks for an IP address and a port as input. The service then tries to establish a TCP connection to the given tuple, and lets the user know if it succeeded or failed. The service doesn't need to log or remember anything, and we can run it on polyanthum, the host that also runs BridgeDB.
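
The core of such a service is a single TCP connection attempt. A minimal sketch in Go follows. The function name isTCPPortReachable appears later in this ticket, but its signature here, the timeout value, and the startLocalListener helper (used only so the example runs without touching the network) are assumptions.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// timeout is an assumed upper bound for one connection attempt.
const timeout = 3 * time.Second

// isTCPPortReachable reports whether a TCP handshake with addr ("host:port")
// completes within the timeout. On failure it also returns the dial error so
// the caller can show the operator what went wrong.
func isTCPPortReachable(addr string) (bool, error) {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false, err
	}
	conn.Close()
	return true, nil
}

// startLocalListener opens a throwaway listener on a free loopback port.
func startLocalListener() (net.Listener, error) {
	return net.Listen("tcp", "127.0.0.1:0")
}

func main() {
	ln, err := startLocalListener()
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	addr := ln.Addr().String()

	ok, err := isTCPPortReachable(addr)
	fmt.Println("while listening:", ok, err)

	ln.Close()
	ok, err = isTCPPortReachable(addr)
	fmt.Println("after closing:", ok, err)
}
```

A successful handshake only shows that the port is open from the scanner's vantage point; it says nothing about whether obfs4 itself works behind that port.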

comment:3 Changed 6 weeks ago by arma

Yes, good summary.

I was wary at first of a little web app to help with testing, because it's yet another place to go break into or watch; but I think so long as we know it is a short term fix until the proper reachability testing gets into the version of Tor that people have, the usability boost makes it an acceptable risk.

I expect we have quite a few obfs4 bridges right now that are firewalled off -- and if we do a campaign to get more people to run obfs4 bridges, without a good easy intuitive step for "then check if it works" we'll have even more.

comment:4 Changed 6 weeks ago by phw

Status: assigned → needs_review

I built a small golang service that lets bridge operators test their obfs4 port. For now, the code is available at https://github.com/NullHypothesis/obfs4PortScan.

I set up a demo at https://nymity.ch:8081. After entering your bridge's IP address and port, the service tells you if the port is reachable or not. If the port is unreachable, the service tells you the error message it got. The tool also has a simple rate limiter that limits requests to an average of one per second, with bursts of up to five per second.

What can we improve?
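
For readers unfamiliar with this kind of rate limiting, "an average of one per second, with bursts of up to five" describes a token bucket. The sketch below is a hand-written illustration using only the standard library; it is not the code in obfs4PortScan, and all names (tokenBucket, newTokenBucket, allowAt) are hypothetical. An existing package such as golang.org/x/time/rate offers the same behaviour.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket holds up to `burst` tokens and refills at `rate` tokens per
// second. Each allowed request consumes one token.
type tokenBucket struct {
	mu     sync.Mutex
	rate   float64   // tokens added per second
	burst  float64   // maximum number of stored tokens
	tokens float64   // tokens currently available
	last   time.Time // time of the last refill
}

// newTokenBucket returns a bucket that starts full.
func newTokenBucket(rate float64, burst int) *tokenBucket {
	return &tokenBucket{
		rate:   rate,
		burst:  float64(burst),
		tokens: float64(burst),
		last:   time.Now(),
	}
}

// allowAt reports whether a request arriving at time now may proceed.
// Taking the time as a parameter keeps the logic easy to test.
func (b *tokenBucket) allowAt(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.burst {
			b.tokens = b.burst
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	b := newTokenBucket(1, 5) // one request per second, bursts of five
	start := b.last
	for i := 1; i <= 7; i++ {
		fmt.Printf("request %d allowed: %v\n", i, b.allowAt(start))
	}
	fmt.Println("one second later:", b.allowAt(start.Add(time.Second)))
}
```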

comment:5 Changed 6 weeks ago by gaba

Sponsor: Sponsor19

comment:6 in reply to: 4; Changed 5 weeks ago by cohosh

Replying to phw:

I built a small golang service that lets bridge operators test their obfs4 port. For now, the code is available at https://github.com/NullHypothesis/obfs4PortScan.

I set up a demo at https://nymity.ch:8081. After entering your bridge's IP address and port, the service tells you if the port is reachable or not. If the port is unreachable, the service tells you the error message it got. The tool also has a simple rate limiter that limits requests to an average of one per second, with bursts of up to five per second.

Awesome! It worked for me :)

I just have a few minor comments:

  • A nicer way to express the timeout here would be timeout := 3 * time.Second, but even better would be to set a commented constant at the beginning of the file.
  • In main() you could have the certificate and key files passed in as specific arguments such as --cert or --key as the broker does here. The advantage of this is making sure they are passed in the correct order (which should be documented outside of the usage function).
  • Do you also want timestamps in your logs?
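
The first two suggestions could look roughly like the following sketch. It is not the code from the fix/30472 branch: the constant name dialTimeout, the config struct, the parseArgs helper, and the flag-set name are assumptions, and only the --cert and --key flag names come from the review comment.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"time"
)

// dialTimeout is how long the service waits for a TCP handshake before it
// reports the port as unreachable.
const dialTimeout = 3 * time.Second

// config holds the TLS file paths given on the command line.
type config struct {
	certFile string
	keyFile  string
}

// parseArgs reads named flags, so the certificate and key cannot be swapped
// by passing them in the wrong order.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("portscan", flag.ContinueOnError)
	var c config
	fs.StringVar(&c.certFile, "cert", "", "path to the TLS certificate (PEM)")
	fs.StringVar(&c.keyFile, "key", "", "path to the TLS private key (PEM)")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if c.certFile == "" || c.keyFile == "" {
		return config{}, errors.New("both --cert and --key are required")
	}
	return c, nil
}

func main() {
	// A real program would pass os.Args[1:]; fixed arguments keep the
	// example self-contained.
	c, err := parseArgs([]string{"--cert", "cert.pem", "--key", "key.pem"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("cert=%s key=%s timeout=%s\n", c.certFile, c.keyFile, dialTimeout)
}
```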

As a more general note, is this meant to be used in an automated way for bridge operators to log and report to themselves when their port isn't reachable? Or as an occasional manual check? I know this is something temporary so maybe not a large consideration, but returning something other than a 200 OK if the port is unreachable would make it easier to write a client-side go program that performs this check automatically.
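
To illustrate the point about status codes, here is a sketch of a handler that signals an unreachable port through the HTTP status, together with a client that needs to look at nothing else. The choice of 502 Bad Gateway, the addr query parameter, and every identifier below are assumptions made for this example; the ticket only asks for "something other than a 200 OK".

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// errRefused stands in for a real dial error in the demo below.
var errRefused = errors.New("connection refused")

// newScanHandler wraps a probe function. It answers 200 when the probe
// succeeds and 502 (an assumed choice) when it fails.
func newScanHandler(probe func(addr string) error) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		addr := r.URL.Query().Get("addr")
		if err := probe(addr); err != nil {
			http.Error(w, "unreachable: "+err.Error(), http.StatusBadGateway)
			return
		}
		fmt.Fprintln(w, "reachable")
	})
}

// portIsReachable is the client side: the status code alone carries the
// result, so the response body does not have to be parsed.
func portIsReachable(scanURL string) (bool, error) {
	resp, err := http.Get(scanURL)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

// demo runs the handler in-process with a stub probe that returns probeErr.
func demo(probeErr error) (bool, error) {
	srv := httptest.NewServer(newScanHandler(func(string) error { return probeErr }))
	defer srv.Close()
	return portIsReachable(srv.URL + "/?addr=192.0.2.1:9002")
}

func main() {
	ok, err := demo(nil)
	fmt.Println("probe succeeds:", ok, err)
	ok, err = demo(errRefused)
	fmt.Println("probe fails:", ok, err)
}
```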

comment:7 in reply to: 6; Changed 5 weeks ago by phw

Replying to cohosh:

  • A nicer way to express the timeout here would be timeout := 3 * time.Second, but even better would be to set a commented constant at the beginning of the file.


Good point, fixed in the following branch: https://github.com/NullHypothesis/obfs4PortScan/tree/fix/30472

  • In main() you could have the certificate and key files passed in as specific arguments such as --cert or --key as the broker does here. The advantage of this is making sure they are passed in the correct order (which should be documented outside of the usage function).


Also fixed in the same branch.

  • Do you also want timestamps in your logs?


I would like to keep timestamps because they tell us how much (ab)use the service is seeing. Do you see any issues with timestamps?

On a related note: I noticed that the http package can log error messages that include the client's IP address. I included snowflake's safe logger to prevent this from happening.
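
The idea behind such a logger can be sketched as an io.Writer that redacts addresses before they reach the log. The code below is a simplified stand-in written for this illustration, not snowflake's safe logger: it only handles IPv4 addresses, and the names scrubbingWriter, ipv4Pattern, and scrubString are hypothetical.

```go
package main

import (
	"bytes"
	"io"
	"log"
	"os"
	"regexp"
)

// ipv4Pattern matches dotted-quad addresses with an optional :port suffix.
// IPv6 is deliberately left out to keep the sketch short.
var ipv4Pattern = regexp.MustCompile(`\b\d{1,3}(\.\d{1,3}){3}(:\d{1,5})?\b`)

// scrubbingWriter replaces anything that looks like an IPv4 address before
// passing the data on to the underlying writer.
type scrubbingWriter struct {
	out io.Writer
}

func (s scrubbingWriter) Write(p []byte) (int, error) {
	clean := ipv4Pattern.ReplaceAll(p, []byte("[scrubbed]"))
	if _, err := s.out.Write(clean); err != nil {
		return 0, err
	}
	// Report the original length so callers do not see a short write.
	return len(p), nil
}

// scrubString runs one message through the writer and returns the result.
func scrubString(msg string) string {
	var buf bytes.Buffer
	scrubbingWriter{out: &buf}.Write([]byte(msg))
	return buf.String()
}

func main() {
	// The log package issues one Write call per message, so scrubbing each
	// call is enough. LstdFlags keeps the timestamps discussed above.
	logger := log.New(scrubbingWriter{out: os.Stderr}, "", log.LstdFlags|log.LUTC)
	logger.Printf("http: TLS handshake error from 203.0.113.7:51234: EOF")
}
```

Setting a logger like this as the ErrorLog field of an http.Server routes the http package's own error messages through the scrubber.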

As a more general note, is this meant to be used in an automated way for bridge operators to log and report to themselves when their port isn't reachable? Or as an occasional manual check? I know this is something temporary so maybe not a large consideration, but returning something other than a 200 OK if the port is unreachable would make it easier to write a client-side go program that performs this check automatically.


At this point it's meant for occasional manual checks. I plan to add a link to the service to our obfs4proxy installation guide. I originally intended this service to be used as an API (see the bottom paragraph of the ticket's description) but it's not clear if we want yet another service that deals with bridge data. The better way forward may be to improve BridgeDB.

comment:8 in reply to: 7; Changed 5 weeks ago by cohosh

Replying to phw:

Replying to cohosh:

  • A nicer way to express the timeout here would be timeout := 3 * time.Second, but even better would be to set a commented constant at the beginning of the file.


Good point, fixed in the following branch: https://github.com/NullHypothesis/obfs4PortScan/tree/fix/30472

I think the timeout input to isTCPPortReachable is redundant now.


  • Do you also want timestamps in your logs?


I would like to keep timestamps because they tell us how much (ab)use the service is seeing. Do you see any issues with timestamps?

As long as you're not logging IP addresses, this seems fine to me. You're also not exporting the data; it's mostly a consideration in the case that the machine or service is compromised. I don't see issues with an attacker getting ahold of the number of requests made and the times at which they are made. There are probably easier ways to find out whatever information they would hope to find out from these logs anyway.

On a related note: I noticed that the http package can log error messages that include the client's IP address. I included snowflake's safe logger to prevent this from happening.

Oh good point, I'm glad the package is useful here.

comment:9 in reply to: 8; Changed 5 weeks ago by phw

Replying to cohosh:

I think the timeout input to isTCPPortReachable is redundant now.


Yes, good catch.

As long as you're not logging IP addresses, this seems fine to me. You're also not exporting the data; it's mostly a consideration in the case that the machine or service is compromised. I don't see issues with an attacker getting ahold of the number of requests made and the times at which they are made. There are probably easier ways to find out whatever information they would hope to find out from these logs anyway.


Yes, agreed.

Ok, I'll move forward with setting up the service on polyanthum. Thanks for the reviews!

comment:10 Changed 4 weeks ago by phw

On IRC, we concluded that we should add another ProxyPass directive to our apache config on polyanthum, so bridge operators can access this service over a URL such as bridges.torproject.org/scan/.

We should also make the service start automatically at boot -- perhaps by creating a systemd script?

comment:11 in reply to: 10; Changed 4 weeks ago by arma

Replying to phw:

We should also make the service start automatically at boot -- perhaps by creating a systemd script?

If it's just run as a normal user, and doesn't need to be root or anything, you can start it on boot with a line in that user's crontab, like

@reboot (cd /home/tord/run; ../git/src/app/tor -f moria1-orrc)

That's much lighter-weight than a systemd script, and avoids having anything run as higher privileges.

comment:12 Changed 4 weeks ago by anarcat

i've documented how to start services with systemd in https://help.torproject.org/tsa/doc/services/

i prefer this to cron as it allows much more flexibility and i can restart services when there are security upgrades.

comment:13 Changed 4 weeks ago by phw

Actual Points: 2
Resolution: fixed
Status: needs_review → closed

I'm closing this because our short-term solution is now in place. Here's a summary:
