Opened 7 months ago
Closed 6 months ago
#30472 closed project (fixed)
Implement a mechanism for PT reachability testing
Reported by: | phw | Owned by: | phw |
---|---|---|---|
Priority: | High | Milestone: | |
Component: | Circumvention/Pluggable transport | Version: | |
Severity: | Major | Keywords: | reachability |
Cc: | Actual Points: | 2 | |
Parent ID: | #30471 | Points: | |
Reviewer: | Sponsor: | Sponsor19 |
Description
Non-vanilla bridges currently have no way to automatically test their reachability. Vanilla bridges self-test the reachability of their ORPort by creating a circuit that includes themselves, but we cannot do this for, say, obfs4. In practice, this is problematic because obfs4 operators won't know if their bridge is unreachable; for example due to NAT. In fact, BridgeDB is distributing obfs4 bridges that aren't actually reachable.
We need to build a mechanism that allows non-vanilla bridges to test their reachability. Ideally, something would create a circuit over the bridge while speaking its respective transport protocol but even a simple TCP or UDP-based reachability test would already go a long way.
Looking at the discussion over in #30331, tor seems to be the right component to trigger the reachability test. In its log files, it can then yell at the operator if the test failed. The question is: how should we design the mechanism that implements the reachability test?
One solution would be a simple HTTP API that takes as input an address, port, a transport type, and optional parameters, and then tells you if the given bridge is reachable, e.g.: the URL https://pt-reachable.torproject.org/obfs4/1.2.3.4/9002 may respond with something along the lines of obfs4_reachable: true. Ideally, if the reachability test fails, we should provide details, to help the operator figure out what went wrong.
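To illustrate the URL scheme proposed above, here is a minimal Go sketch of how a path such as /obfs4/1.2.3.4/9002 could be split and validated before any probe runs. The type probeTarget and the function parseProbePath are hypothetical names invented for this example; the ticket does not specify how such an API would parse its input.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// probeTarget is a hypothetical type describing one reachability request.
type probeTarget struct {
	Transport string // e.g. "obfs4"
	HostPort  string // e.g. "1.2.3.4:9002"
}

// parseProbePath splits a path of the form /<transport>/<ip>/<port>
// and validates the IP address and the port number.
func parseProbePath(path string) (probeTarget, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) != 3 {
		return probeTarget{}, fmt.Errorf("expected /<transport>/<ip>/<port>, got %q", path)
	}
	if net.ParseIP(parts[1]) == nil {
		return probeTarget{}, fmt.Errorf("invalid IP address %q", parts[1])
	}
	port, err := strconv.ParseUint(parts[2], 10, 16)
	if err != nil || port == 0 {
		return probeTarget{}, fmt.Errorf("invalid port %q", parts[2])
	}
	hostPort := net.JoinHostPort(parts[1], strconv.FormatUint(port, 10))
	return probeTarget{Transport: parts[0], HostPort: hostPort}, nil
}

func main() {
	target, err := parseProbePath("/obfs4/1.2.3.4/9002")
	fmt.Println(target.Transport, target.HostPort, err)
}
```

Rejecting malformed input early also gives the service a natural place to return the detailed error messages the description asks for.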
Child Tickets
Ticket | Status | Owner | Summary | Component |
---|---|---|---|---|
#30703 | closed | phw | How to modify Apache config on polyanthum? | Internal Services/Services Admin Team |
Change History (13)
comment:1 Changed 7 months ago by
Summary: | Implement a mechanism for PT rechability testing → Implement a mechanism for PT reachability testing |
---|
comment:2 Changed 7 months ago by
comment:3 Changed 7 months ago by
Yes, good summary.
I was wary at first of a little web app to help with testing, because it's yet another place to go break into or watch; but I think so long as we know it is a short term fix until the proper reachability testing gets into the version of Tor that people have, the usability boost makes it an acceptable risk.
I expect we have quite a few obfs4 bridges right now that are firewalled off -- and if we do a campaign to get more people to run obfs4 bridges, without a good easy intuitive step for "then check if it works" we'll have even more.
comment:4 follow-up: 6 Changed 7 months ago by
Status: | assigned → needs_review |
---|
I built a small golang service that lets bridge operators test their obfs4 port. For now, the code is available at https://github.com/NullHypothesis/obfs4PortScan.
I set up a demo at https://nymity.ch:8081. After entering your bridge's IP address and port, the service tells you if the port is reachable or not. If the port is unreachable, the service tells you the error message it got. The tool also has a simple rate limiter that limits requests to an average of one per second, with bursts of up to five per second.
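The rate limit described above (an average of one request per second, with bursts of up to five) matches the behaviour of a token bucket. The following is a hand-rolled sketch using only the standard library. It is an assumption about how such a limiter could work, not the code in the obfs4PortScan repository, and tokenBucket, newTokenBucket, and allow are made-up names. A real service might instead use an existing package such as golang.org/x/time/rate.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket is a hypothetical, minimal rate limiter. It refills at
// rate tokens per second and never holds more than burst tokens.
type tokenBucket struct {
	mu     sync.Mutex
	rate   float64
	burst  float64
	tokens float64
	last   time.Time
}

// newTokenBucket returns a bucket that starts full.
func newTokenBucket(ratePerSec float64, burst int) *tokenBucket {
	return &tokenBucket{
		rate:   ratePerSec,
		burst:  float64(burst),
		tokens: float64(burst),
		last:   time.Now(),
	}
}

// allow reports whether a request arriving now may proceed, and
// consumes one token if it may.
func (b *tokenBucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	now := time.Now()
	// Credit the tokens earned since the last call, capped at burst.
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now

	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	limiter := newTokenBucket(1, 5) // one per second on average, bursts of five
	for i := 1; i <= 7; i++ {
		fmt.Printf("request %d allowed: %v\n", i, limiter.allow())
	}
}
```

An HTTP handler would call allow() first and answer with 429 Too Many Requests when it returns false.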
What can we improve?
comment:5 Changed 7 months ago by
Sponsor: | → Sponsor19 |
---|
comment:6 follow-up: 7 Changed 7 months ago by
Replying to phw:
I built a small golang service that lets bridge operators test their obfs4 port. For now, the code is available at https://github.com/NullHypothesis/obfs4PortScan.
I set up a demo at https://nymity.ch:8081. After entering your bridge's IP address and port, the service tells you if the port is reachable or not. If the port is unreachable, the service tells you the error message it got. The tool also has a simple rate limiter that limits requests to an average of one per second, with bursts of up to five per second.
Awesome! It worked for me :)
I just have a few minor comments:
- A nicer way to express the timeout here would be timeout := 3 * time.Second, but even better would be to set a commented constant at the beginning of the file.
- In main() you could have the certificate and key files passed in as specific arguments such as --cert or --key as the broker does here. The advantage of this is making sure they are passed in the correct order (which should be documented outside of the usage function).
- Do you also want timestamps in your logs?
As a more general note, is this meant to be used in an automated way for bridge operators to log and report to themselves when their port isn't reachable? Or as an occasional manual check? I know this is something temporary so maybe not a large consideration, but returning something other than a 200 OK if the port is unreachable would make it easier to write a client-side go program that performs this check automatically.
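The status-code suggestion above can be sketched as follows. This is only an illustration: the handler name scanHandler, the form fields address and port, and the choice of 502 Bad Gateway for an unreachable port are all assumptions, and the probe is injected as a function so the example runs without touching the network.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// probeFunc stands in for the actual TCP check; a nil error means reachable.
type probeFunc func(hostPort string) error

// errRefused is a stand-in error used by the demo below.
var errRefused = errors.New("connection refused")

// scanHandler answers 200 OK when the probe succeeds and 502 Bad Gateway
// (an arbitrary choice for this sketch) when it fails, so that a client
// can decide based on the status code alone.
func scanHandler(probe probeFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		target := net.JoinHostPort(r.FormValue("address"), r.FormValue("port"))
		if err := probe(target); err != nil {
			http.Error(w, "unreachable: "+err.Error(), http.StatusBadGateway)
			return
		}
		fmt.Fprintln(w, "reachable")
	}
}

// demoStatus runs the handler once against an in-memory recorder and
// returns the status code a client would see.
func demoStatus(probe probeFunc) int {
	q := url.Values{"address": {"192.0.2.1"}, "port": {"9002"}}
	req := httptest.NewRequest(http.MethodGet, "/scan?"+q.Encode(), nil)
	rec := httptest.NewRecorder()
	scanHandler(probe)(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("reachable port:  ", demoStatus(func(string) error { return nil }))
	fmt.Println("unreachable port:", demoStatus(func(string) error { return errRefused }))
}
```

With a scheme like this, an automated client only needs to compare the response's status code against http.StatusOK instead of parsing the page body.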
comment:7 follow-up: 8 Changed 7 months ago by
Replying to cohosh:
- A nicer way to express the timeout here would be timeout := 3 * time.Second, but even better would be to set a commented constant at the beginning of the file.
Good point, fixed in the following branch: https://github.com/NullHypothesis/obfs4PortScan/tree/fix/30472
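For readers without access to the branch, the idea of a commented timeout constant could look roughly like the sketch below. The function name isTCPPortReachable appears in this ticket, but its signature here, the constant name dialTimeout, and the helper listenLocal are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// dialTimeout bounds how long we wait for the TCP handshake to complete
// before declaring the port unreachable.
const dialTimeout = 3 * time.Second

// isTCPPortReachable tries to open a TCP connection to addr:port and
// reports whether it succeeded. On failure it returns the dial error so
// the caller can show it to the operator.
func isTCPPortReachable(addr, port string) (bool, error) {
	conn, err := net.DialTimeout("tcp", net.JoinHostPort(addr, port), dialTimeout)
	if err != nil {
		return false, err
	}
	conn.Close()
	return true, nil
}

// listenLocal opens a listener on an ephemeral loopback port so the demo
// has something to connect to. It returns the port and a stop function.
func listenLocal() (port string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	_, port, err = net.SplitHostPort(ln.Addr().String())
	return port, func() { ln.Close() }, err
}

func main() {
	port, stop, err := listenLocal()
	if err != nil {
		fmt.Println("cannot listen:", err)
		return
	}
	ok, err := isTCPPortReachable("127.0.0.1", port)
	fmt.Println("while listening:", ok, err)
	stop()
	ok, err = isTCPPortReachable("127.0.0.1", port)
	fmt.Println("after close:", ok, err)
}
```

Because the timeout is a package-level constant, the function does not need a timeout parameter of its own.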
- In main() you could have the certificate and key files passed in as specific arguments such as --cert or --key as the broker does here. The advantage of this is making sure they are passed in the correct order (which should be documented outside of the usage function).
Also fixed in the same branch.
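As a rough illustration of named certificate and key arguments, the standard flag package can be used as sketched below. The flag-set name, the config struct, and parseArgs are hypothetical; the actual branch may be structured differently.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// config holds the command-line settings for this sketch.
type config struct {
	certFile string
	keyFile  string
}

// parseArgs reads --cert and --key from args. Named flags remove any
// dependence on argument order.
func parseArgs(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("portscan", flag.ContinueOnError)
	fs.StringVar(&cfg.certFile, "cert", "", "path to the TLS certificate file")
	fs.StringVar(&cfg.keyFile, "key", "", "path to the TLS private key file")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	if cfg.certFile == "" || cfg.keyFile == "" {
		return cfg, errors.New("both --cert and --key are required")
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("cert=%s key=%s\n", cfg.certFile, cfg.keyFile)
}
```

Go's flag package accepts both the single-dash and double-dash spellings, so -cert and --cert behave the same.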
- Do you also want timestamps in your logs?
I would like to keep timestamps because they tell us how much (ab)use the service is seeing. Do you see any issues with timestamps?
On a related note: I noticed that the http package can log error messages that include the client's IP address. I included snowflake's safe logger to prevent this from happening.
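To show the general idea of scrubbing addresses from log output, here is a simplified sketch. It is not snowflake's safe logger: the names scrubbingWriter and scrub are invented for this example, and the pattern only covers dotted IPv4 addresses, whereas a real scrubber would also need to handle IPv6.

```go
package main

import (
	"io"
	"log"
	"os"
	"regexp"
)

// ipv4Pattern is a deliberately simple pattern for dotted IPv4 addresses.
// It does not validate octet ranges and does not match IPv6.
var ipv4Pattern = regexp.MustCompile(`\b\d{1,3}(\.\d{1,3}){3}\b`)

// scrub replaces anything that looks like an IPv4 address.
func scrub(p []byte) []byte {
	return ipv4Pattern.ReplaceAll(p, []byte("[scrubbed]"))
}

// scrubbingWriter wraps another writer and scrubs each write. log.Logger
// emits every log entry in a single Write call, so an address is not
// split across writes.
type scrubbingWriter struct {
	out io.Writer
}

func (w scrubbingWriter) Write(p []byte) (int, error) {
	if _, err := w.out.Write(scrub(p)); err != nil {
		return 0, err
	}
	// Report the original length so callers do not see a short write.
	return len(p), nil
}

func main() {
	logger := log.New(scrubbingWriter{out: os.Stderr}, "", log.LstdFlags)
	// The same logger could be assigned to the ErrorLog field of an
	// http.Server so that the server's own error messages are scrubbed.
	logger.Println("http: TLS handshake error from 203.0.113.7:51234: EOF")
}
```

Timestamps added by log.LstdFlags pass through unchanged, so request volume stays visible while client addresses do not.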
As a more general note, is this meant to be used in an automated way for bridge operators to log and report to themselves when their port isn't reachable? Or as an occasional manual check? I know this is something temporary so maybe not a large consideration, but returning something other than a 200 OK if the port is unreachable would make it easier to write a client-side go program that performs this check automatically.
At this point it's meant for occasional manual checks. I plan to add a link to the service to our obfs4proxy installation guide. I originally intended this service to be used as an API (see the bottom paragraph of the ticket's description) but it's not clear if we want yet another service that deals with bridge data. The better way forward may be to improve BridgeDB.
comment:8 follow-up: 9 Changed 7 months ago by
Replying to phw:
Replying to cohosh:
- A nicer way to express the timeout here would be timeout := 3 * time.Second, but even better would be to set a commented constant at the beginning of the file.
Good point, fixed in the following branch: https://github.com/NullHypothesis/obfs4PortScan/tree/fix/30472
I think the timeout input to isTCPPortReachable is redundant now.
- Do you also want timestamps in your logs?
I would like to keep timestamps because they tell us how much (ab)use the service is seeing. Do you see any issues with timestamps?
As long as you're not logging IP addresses, this seems fine to me. You're also not exporting the data, it's mostly a consideration in the case that the machine or service is compromised. I don't see issues with an attacker getting ahold of the number of requests made and the times at which they are made. There are probably easier ways to find out whatever information they would hope to find out from these logs anyway.
On a related note: I noticed that the http package can log error messages that include the client's IP address. I included snowflake's safe logger to prevent this from happening.
Oh good point, I'm glad the package is useful here.
comment:9 Changed 7 months ago by
Replying to cohosh:
I think the timeout input to isTCPPortReachable is redundant now.
Yes, good catch.
As long as you're not logging IP addresses, this seems fine to me. You're also not exporting the data, it's mostly a consideration in the case that the machine or service is compromised. I don't see issues with an attacker getting ahold of the number of requests made and the times at which they are made. There are probably easier ways to find out whatever information they would hope to find out from these logs anyway.
Yes, agreed.
Ok, I'll move forward with setting up the service on polyanthum. Thanks for the reviews!
comment:10 follow-up: 11 Changed 6 months ago by
On IRC, we concluded that we should add another ProxyPass directive to our apache config on polyanthum, so bridge operators can access this service over a URL such as bridges.torproject.org/scan/.
We should also make the service start automatically at boot -- perhaps by creating a systemd script?
comment:11 Changed 6 months ago by
Replying to phw:
We should also make the service start automatically at boot -- perhaps by creating a systemd script?
If it's just run as a normal user, and doesn't need to be root or anything, you can start it on boot with a line in that user's crontab, like
@reboot (cd /home/tord/run; ../git/src/app/tor -f moria1-orrc)
That's much lighter-weight than a systemd script, and avoids having anything run as higher privileges.
comment:12 Changed 6 months ago by
i've documented how to start services with systemd in https://help.torproject.org/tsa/doc/services/
i prefer this to cron as it allows much more flexibility and i can restart services when there are security upgrades.
comment:13 Changed 6 months ago by
Actual Points: | → 2 |
---|---|
Resolution: | → fixed |
Status: | needs_review → closed |
I'm closing this because our short-term solution is now in place. Here's a summary:
- The service is now deployed at https://bridges.torproject.org/scan/.
- The code is available at https://github.com/NullHypothesis/obfs4PortScan (and, once #30715 is done, on our gitweb).
- We're using polyanthum's Apache reverse proxy as front for the service (see #30703), so we only need to listen on localhost.
- The service runs as user bridgescan (see #30714) and we're using a local systemd script to have it start at boot.
- Our sysmon deployment is monitoring the service.
We had a short discussion on IRC and concluded the following:
We were left wondering what obfs4 operators should do in the short term, before #30477 is done, to figure out if their bridge is reachable. One way forward would be a simple web page, hosted by us, that asks for an IP address, and a port as input. The service then tries to establish a TCP connection to the given tuple, and lets the user know if it succeeded or failed. The service doesn't need to log or remember anything, and we can run it on polyanthum, the host that also runs BridgeDB.