Opened 7 months ago

Last modified 5 weeks ago

#29285 assigned project

Improve the PT spec and how PTs interface with Tor

Reported by: cohosh
Owned by: phw
Priority: High
Milestone:
Component: Circumvention/Pluggable transport
Version:
Severity: Normal
Keywords: anti-censorship-roadmap-august
Cc: cohosh, arma, gaba, brade, mcs, msherr
Actual Points:
Parent ID:
Points: 15
Reviewer:
Sponsor: Sponsor28-must

Description (last modified by cohosh)

We want to make it easier for developers (and academics) to design and implement new pluggable transports and get them easily integrated with Tor so that we can have a well-functioning PT integration pipeline.

This is a large project that will consist of several things:

  • We need to assess pain points with the current PT spec and desired features from a variety of PT developers.
  • We might want to take a look at the PTv2 specification to see where features differ from our v1 and also which features seem to be liked or used by PT developers.
  • We should think about how bridge distribution should factor into the PT specification. For example, some transports such as meek and snowflake handle "bridge" information differently than transports whose bridges are distributed through BridgeDB. This results in a different interaction with Tor, and we might consider modifying the spec with the snowflake/broker model in mind (ticket #29296).

In general, we should improve our communication with the pluggable transports community to see what they need and figure out how to get more PTs integrated with Tor.

Change History (13)

comment:1 Changed 7 months ago by teor

Owner: asn deleted
Status: new → assigned

asn does not need to own any obfuscation tickets any more. Default owners are trouble.

comment:2 Changed 7 months ago by cohosh

Status: assigned → new

These tickets were assigned to asn; setting them as unassigned (new) again.

comment:3 Changed 6 months ago by cohosh

Description: modified (diff)
Summary: Improve the PT interface with Tor → Improve the PT spec and how PTs interface with Tor

comment:4 Changed 4 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

comment:5 Changed 3 months ago by phw

Cc: arma gaba added
Keywords: anti-censorship-roadmap added
Owner: set to phw
Points: 15
Priority: Medium → High
Sponsor: Sponsor19 → Sponsor28-must
Status: new → assigned

Here's an incomplete list of issues with our current spec:

  • iOS applications aren't allowed to fork a subprocess, which means that "obfs4proxy will likely run in the same process as the app and tor." Mike managed to work around this in iObfs.
  • Exposing a SOCKS proxy on Android is not future proof. In fact, even Unix domain sockets may be a problem in the future. Still, a domain socket would be better than SOCKS-based IPC and, according to Yawning, will facilitate sandboxing.
  • The PT should be able to communicate its bootstrap status to the invoking process.
  • The spec should incorporate the proposed dormant mode (see #28849).
  • Some PTs such as meek and snowflake don't rely on an IP address. The current workaround is to use awkward pseudo IP addresses (#18611).
  • Other transports may want to rely on multiple IP addresses, or at least listen on both an IPv4 and an IPv6 address. We need to reconsider the outdated notion of a bridge line. (#11211)
  • Transports are not allowed to emit bytes with the high bit set to stdout in messages such as PROXY-ERROR, but there is no guidance for how to handle/escape such bytes if they happen to appear in a user-provided message or filename, for example.
  • SOCKS args can only hold a maximum of about 512 bytes (#10671).
  • There is an ambiguity in encoding of SOCKS args that end in a NUL byte (comment:11:ticket:29627).
  • There are multiple incompatible and hard-to-implement dictionary encodings.
    • §3.3.3 SMETHOD ARGS: comma-separated, must escape backslash, equals, and comma.
      key1=value1,key2=value2
      
    • §3.5 client per-connection arguments: semicolon-separated, must escape backslash, equals, and semicolon.
      key1=value1;key2=value2
      
    • §3.2.3 TOR_PT_SERVER_TRANSPORT_OPTIONS: semicolon-separated, technically a nested dictionary with each element additionally colon-prefixed with the transport it pertains to, must escape backslash, colon, and semicolon (but not equals—in this encoding it's impossible for a key to contain an equals sign).
      transport1:key1=value1;transport2:key2=value2
      
    • §3.2.2 TOR_PT_SERVER_BINDADDR: comma-separated, uses - instead of = to separate key and value, no escaping necessary because of data types.
      transport1-1.2.3.4:1234,transport2-5.6.7.8:5678
      
    • §3.3.4 and §3.3.5 LOG and STATUS: space-separated with C-style escapes, ambiguous as to when quotes are required.
      key1="value1" key2="value2"
      
    • If we're including PT 2.0, then there is also UTF-8 JSON (§1.4. Pluggable PT Client Per-Connection Arguments).
      {"key1": "value1", "key2": "value2"}
      
  • There is no way to run multiple instances of the same server transport with different options. This is because both TOR_PT_SERVER_BINDADDR and TOR_PT_SERVER_TRANSPORT_OPTIONS are keyed by transport name, with nothing to distinguish multiple instances that use the same name. It's an annoyance when, for example, you want to run multiple copies of obfs4 with different certificates for access control, or with different iat-mode settings. The only way to do it is to (1) run multiple independent instances of tor with their own configuration files, or (2) hack the PT source so that it recognizes multiple synonymous method names, e.g. obfs4a, obfs4b, obfs4c. There is a similar problem with torrc, in that ServerTransportPlugin, ServerTransportListenAddr, and ServerTransportOptions are also all keyed by transport name (#31228 and #11211).
    • In comparison, the PT spec does support multiple instances of client transports with different options, because the options come in SOCKS args rather than an environment variable, so they are bound to a specific CMETHOD listener.
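
To make the incompatibility between the first three encodings in the list above concrete, here is a minimal Python sketch. The function names are hypothetical, and the escaping rules are taken from the descriptions in this comment rather than checked against the spec text, so treat it as an illustration and not a reference encoder.

```python
def escape(text, specials):
    """Backslash-escape the backslash itself and every character in `specials`."""
    out = []
    for ch in text:
        if ch == "\\" or ch in specials:
            out.append("\\")
        out.append(ch)
    return "".join(out)


def encode_smethod_args(args):
    # SMETHOD ARGS style: comma-separated; escape backslash, equals, comma.
    return ",".join(
        "%s=%s" % (escape(k, "=,"), escape(v, "=,"))
        for k, v in sorted(args.items())
    )


def encode_client_args(args):
    # Per-connection style: semicolon-separated; escape backslash, equals, semicolon.
    return ";".join(
        "%s=%s" % (escape(k, "=;"), escape(v, "=;"))
        for k, v in sorted(args.items())
    )


def encode_server_transport_options(options):
    # TOR_PT_SERVER_TRANSPORT_OPTIONS style: {transport: {key: value}},
    # each pair prefixed with "transport:"; escape backslash, colon, semicolon.
    parts = []
    for transport, args in sorted(options.items()):
        for k, v in sorted(args.items()):
            if "=" in k:
                # Equals is not escaped here, so a key cannot contain it.
                raise ValueError("key cannot contain '=' in this encoding")
            parts.append(
                "%s:%s=%s" % (transport, escape(k, ":;"), escape(v, ":;"))
            )
    return ";".join(parts)


args = {"cert": "a,b;c=d"}
print(encode_smethod_args(args))
print(encode_client_args(args))
print(encode_server_transport_options({"obfs4": args}))
```

The same value `a,b;c=d` comes out as `a\,b;c\=d`, `a,b\;c\=d`, and `a,b\;c=d` respectively, so a library needs a separate encoder and decoder for each context.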

And here's an incomplete list of existing library implementations:

  • A seemingly unnamed Swift implementation of the v2.1 specification, maintained by the Operator Foundation.
  • PLUTO2 is a Java implementation of the v2.x specification, maintained by the Guardian Project.
  • goptlib is a Go implementation of the v1.0 specification, maintained by the Tor Project.
  • pyptlib is a Python implementation of the v1.0 specification, (formerly) maintained by the Tor Project.

Edit: remove duplicate issues

Last edited 5 weeks ago by phw (previous) (diff)

comment:6 Changed 2 months ago by phw

We now have a discussion thread on tor-dev@ and I started pointing some implementers to this thread in the hope that they will share their experience.

comment:7 in reply to:  5 ; Changed 2 months ago by dcf

Replying to phw:

And here's an incomplete list of existing library implementations:

Really there are two types of PT implementations, or three if you count PT 2.0 additions. There aren't really standard names for these.

  1. IPC manager/dispatcher. As far as I know, tor and https://github.com/twisteroidambassador/ptadapter are the only two implementations of this. This is the thing that sets e.g. TOR_PT_MANAGED_TRANSPORT_VER and manages subprocesses of type (2).
  2. IPC transport/plugin. This is goptlib and pyptlib. It's a subprocess managed by an implementation of type (1). This is the thing that writes e.g. CMETHOD to stdout.
  3. From PT 2.0, there are also plugin/transport implementations that you are meant to link with directly in the same executable, without going through the IPC interface. There are Go and Swift API specs. From talking to Brandon Wiley, my understanding is that everything that uses PT other than tor and ptadapter uses such an API, or something like it, not the IPC model. shapeshifter-dispatcher converts implementations of type (3) into type (2).

The Pluggable Transports Base Spec v2.1 calls types (1) and (2) "IPC" and type (3) "API".
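
As a rough illustration of the type (1)/type (2) split, the following Python sketch shows the client side of a type (2) plugin: it reads the environment variables that a type (1) manager sets and produces the lines the plugin would write to stdout. The function name `client_handshake` and the `supported` mapping are hypothetical, the SOCKS listeners are assumed to be bound already, and error handling is reduced to the minimum.

```python
def client_handshake(environ, supported):
    """Return the stdout lines a type (2) client plugin would emit.

    `environ` is the environment set by the type (1) manager.
    `supported` maps a transport name to the "addr:port" of a SOCKS
    listener that is assumed to be bound already (hypothetical input).
    """
    versions = environ.get("TOR_PT_MANAGED_TRANSPORT_VER")
    if versions is None:
        return ["ENV-ERROR no TOR_PT_MANAGED_TRANSPORT_VER environment variable"]
    if "1" not in versions.split(","):
        return ["VERSION-ERROR no-version"]

    lines = ["VERSION 1"]
    requested = environ.get("TOR_PT_CLIENT_TRANSPORTS", "")
    for name in requested.split(","):
        # Requested transports that this plugin does not implement are skipped.
        if name in supported:
            lines.append("CMETHOD %s socks5 %s" % (name, supported[name]))
    lines.append("CMETHODS DONE")
    return lines


env = {
    "TOR_PT_MANAGED_TRANSPORT_VER": "1",
    "TOR_PT_CLIENT_TRANSPORTS": "obfs4,meek",
}
for line in client_handshake(env, {"obfs4": "127.0.0.1:40000"}):
    print(line)
```

A type (3) implementation skips this exchange entirely, because the application calls the transport through a library API instead of parsing stdout.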

comment:8 in reply to:  7 ; Changed 2 months ago by mcs

Cc: brade mcs added

Replying to dcf:

Replying to phw:

And here's an incomplete list of existing library implementations:

Really there are two types of PT implementations, or three if you count PT 2.0 additions. There aren't really standard names for these.

  1. IPC manager/dispatcher. As far as I know, tor and https://github.com/twisteroidambassador/ptadapter are the only two implementations of this. This is the thing that sets e.g. TOR_PT_MANAGED_TRANSPORT_VER and manages subprocesses of type (2).

The module within Tor Launcher that implements Moat (interactive bridge retrieval) is another example of the above. We did that so we could reuse your Meek PT implementation. See https://gitweb.torproject.org/tor-launcher.git/tree/src/modules/tl-bridgedb.jsm?h=maint-0.2.18#n188

comment:9 in reply to:  8 Changed 2 months ago by dcf

Replying to mcs:

The module within Tor Launcher that implements Moat (interactive bridge retrieval) is another example of the above. We did that so we could reuse your Meek PT implementation. See https://gitweb.torproject.org/tor-launcher.git/tree/src/modules/tl-bridgedb.jsm?h=maint-0.2.18#n188

Oh, good call. IIRC, on the server side of Moat there is also some half-baked shell script managing the meek-server process, which could be replaced with ptadapter (#29096).

comment:10 Changed 2 months ago by phw

#21816 (Add support for Pluggable Transports 2.0) is related.

comment:11 Changed 2 months ago by teor

We also want to be able to assign multiple addresses to each pluggable transport. For example, an IPv4 and an IPv6 address.
See #30953 for details.
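
A small Python sketch of why the current format gets in the way: TOR_PT_SERVER_BINDADDR is a list of name-address pairs keyed by transport name, so a natural parser ends up with one address per name. The function `parse_bindaddr` is hypothetical, and rejecting duplicates is an assumption about how an implementation would treat a repeated name, not a quotation from the spec.

```python
def parse_bindaddr(value):
    """Parse "name-addr:port,name-addr:port" into {name: (host, port)}."""
    result = {}
    for item in value.split(","):
        # The first "-" separates the transport name from the address.
        name, _, addr = item.partition("-")
        # The last ":" separates host from port, which also works for "[::1]:1234".
        host, _, port = addr.rpartition(":")
        if name in result:
            # One slot per name: a second address for the same transport
            # has nowhere to go (assumed behaviour, see lead-in).
            raise ValueError("duplicate transport name: %r" % name)
        result[name] = (host.strip("[]"), int(port))
    return result


print(parse_bindaddr("transport1-1.2.3.4:1234,transport2-5.6.7.8:5678"))

try:
    parse_bindaddr("obfs4-1.2.3.4:1234,obfs4-[::1]:1234")
except ValueError as err:
    print("rejected:", err)
```

Supporting an IPv4 and an IPv6 listener for the same transport would need either a list of addresses per name or some instance identifier in the key.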

comment:12 Changed 6 weeks ago by msherr

Cc: msherr added

comment:13 Changed 6 weeks ago by gaba

Keywords: anti-censorship-roadmap-august added; network-team-roadmap-2019-Q1Q2 anti-censorship-roadmap removed