Opened 5 years ago

Last modified 3 months ago

#12930 new defect

Someone, somewhere needs to unescape pluggable transport "SMETHOD ARGS" arguments.

Reported by: yawning Owned by:
Priority: Medium Milestone:
Component: Obfuscation/Pluggable transport Version:
Severity: Normal Keywords: goptlib
Cc: dcf, catalyst Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Per pt-spec.txt:

      - ARGS:K=V,K=V,K=V

        If this option is set, the K=V arguments are added to Tor's
        extrainfo document. Equal signs and commas must be escaped
        with a backslash.

All of obfs4's server (extra info) document arguments end with a number of equal signs because they are Base64 strings.

goptlib does the right thing here and escapes the args, so the trailing Base64 padding passed to tor as part of SMETHOD ARGS ends with \\=. The fun here is that, tor does not unescape the ARGS line, so \\= is what ends up in the extrainfo document on BridgeDB.

The arguments that appear on obfs4 bridge lines should not be escaped, so someone, somewhere between little-t tor, and the place where the arguments appear on whatever BridgeDB frontend the end user sees, needs to unescape the arguments.

Child Tickets

Change History (14)

comment:1 Changed 5 years ago by yawning

Cc: dcf added

There's also a goptlib bug here, where it escapes twice. Once in encodeSmethodArgs() (fine), and once in pt.go:formatline() (not so fine).

If the output was to be compliant with the pt-spec the problematic entries will end in \=.

Last edited 5 years ago by yawning (previous) (diff)

comment:2 Changed 5 years ago by yawning

Parent ID: #12130

Removing this from being a child of the obfs4 deployment bug because obfs4 uses base16 for all the arguments now to work around this problem.

comment:3 in reply to:  1 ; Changed 5 years ago by dcf

Replying to yawning:

There's also a goptlib bug here, where it escapes twice. Once in encodeSmethodArgs() (fine), and once in pt.go:formatline() (not so fine).

If the output was to be compliant with the pt-spec the problematic entries will end in \=.

Good find. The escape function isn't specified in pt-spec; it's just a safety valve I cooked up to handle the case when someone tries to pass a string containing '\n' or '\x00', which would otherwise trash the stdout interaction. But clearly using a backslash escape is the wrong thing, because backslashes are used in other parts of the protocol. Maybe we should just strip the problematic bytes, or even panic. As far as I'm concerned it's undefined behavior, so we can do what we want.

Could you write (and commit) a failing test that demonstrates the error? I.e., in TestEscape, add test cases including backslashes and other bytes we care about, and modify the check function to verify that if stringIsSafe(input) is true, that escape(input) == input.

comment:4 in reply to:  3 Changed 5 years ago by dcf

Replying to dcf:

Replying to yawning:

There's also a goptlib bug here, where it escapes twice. Once in encodeSmethodArgs() (fine), and once in pt.go:formatline() (not so fine).

If the output was to be compliant with the pt-spec the problematic entries will end in \=.

Good find. The escape function isn't specified in pt-spec; it's just a safety valve I cooked up to handle the case when someone tries to pass a string containing '\n' or '\x00', which would otherwise trash the stdout interaction. But clearly using a backslash escape is the wrong thing, because backslashes are used in other parts of the protocol. Maybe we should just strip the problematic bytes, or even panic. As far as I'm concerned it's undefined behavior, so we can do what we want.

Could you write (and commit) a failing test that demonstrates the error? I.e., in TestEscape, add test cases including backslashes and other bytes we care about, and modify the check function to verify that if stringIsSafe(input) is true, that escape(input) == input.

I made #13370 for the goptlib bug.

comment:5 Changed 5 years ago by dcf

Keywords: goptlib added

comment:6 Changed 2 years ago by catalyst

Severity: Normal

There are multiple conflicting definitions of pluggable transport arguments that probably cannot be made consistent in a backward-compatible way. https://gitweb.torproject.org/torspec.git/tree/pt-spec.txt#n368 defines the general BNF syntax for a managed transport process-to-parent communication, which excludes NUL and NL. The SMETHOD syntax at https://gitweb.torproject.org/torspec.git/tree/pt-spec.txt#n544 looks more ad-hoc, but it implies that the options are space-separated words. The ARGS option is k=v pairs separated by commas, and only has provisions for escaping commas and equals signs (but not NUL, NL, SP, or backslash).

Ultimately, the SMETHOD ARGS will end up in a Bridge config line, described at https://gitweb.torproject.org/torspec.git/tree/proposals/180-pluggable-transport.txt#n146. This config line has space-separated k=v pairs. The syntax has provisions for escaping backslash and semicolon but not spaces, equals signs, commas, newlines, or NUL characters. The tor(1) manual page is out of date and doesn't reflect the prop180 config line syntax. (This is bug #20341.)

https://gitweb.torproject.org/torspec.git/tree/bridgedb-spec.txt#n93 says the arguments are comma-separated k=v pairs, but https://gitweb.torproject.org/torspec.git/tree/bridgedb-spec.txt#n338 then recommends displaying them as space-separated k=v pairs. (This is consistent with the prop180 config line syntax.)

After IRC discussion with arma and Yawning, it seems that the best solution may be to amend the specs to disallow certain characters from keys or values in transport args, such as NUL, NL, SP, backslash, equals sign, comma, and maybe semicolon.

comment:7 Changed 2 years ago by dcf

If it helps, here is how goptlib handles args (goptlib being the primary implementation of the PT protocol other than tor itself). It's mostly based on my own interpretation of what the spec says, but it at least has a lot of tests. The comments that quote pt-spec.txt are taken from an earlier version of the spec, probably 4dcd7e94f1 from July 2014.

args.go: the main functions are parseClientParameters, parseServerTransportOptions, and encodeSmethodArgs (we haven't yet needed a function to encode client parameters). encodeSmethodArgs escapes only these three bytes: \ = ,. Other byte values such as \n and \x00 are handled instead in pt.go.

args_test.go: is test code for the functions in args.go. Please let me know if you have additional test cases or if any of the existing ones seem wrong to you.

pt.go: interacts a little bit with argument syntax in the formatline function, which is responsible for formatting stdout lines like SMETHOD. formatline panics (i.e. crashes) on any \n, \x00, or byte value greater than \x7f (see the argIsSafe function). Formerly, goptlib didn't panic but applied an additional backslash encoding to these bytes, which Yawning noted in comment:1 and has since been removed in favor of panicking.

comment:8 Changed 2 years ago by catalyst

Cc: catalyst added

comment:9 Changed 2 years ago by catalyst

Related ticket: #20341.

comment:10 Changed 2 years ago by catalyst

If there are to be any code changes, they should probably be in the BridgeDB implementation and maybe the pluggable transport client of tor. I like the idea of using something like Bourne shell variable syntax, e.g., a=b=c sets $a to the string b=c. Variable (key) names may not contain = but values may, without needing quoting or escaping. I know obfs4 worked around the = escaping problem by using unpadded base64, but could other transports need = in argument values in the future? (maybe as query parameters to URLs?)

Here's how I understand the existing data flow:

  1. SMETHOD ARGS
    • comma-separated key=value pairs
    • , and = (but not \ -- at least not consistently, e.g. goptlib) escaped by \ (pt-spec.txt#n566)
    • no provision for escaping whitespace
  2. transport lines in extra-info
  3. BridgeDB (as output to users)
  4. Bridge line in torrc or Tor Browser config dialog
    • space-separated key=value pairs
    • no specification for escaping any characters
  5. encoded in SOCKS username/password

Proposed spec changes will go in a separate ticket.

comment:11 in reply to:  10 ; Changed 2 years ago by dcf

Replying to catalyst:

  1. SMETHOD ARGS
    • comma-separated key=value pairs
    • , and = (but not \ -- at least not consistently, e.g. goptlib) escaped by \ (pt-spec.txt#n566)
    • no provision for escaping whitespace

Is there a place where goptlib doesn't escape \? If so it is probably a bug. \ has to be escaped in order to make the escaping reversible, even though the spec doesn't explicitly call for it ("Equal signs and commas MUST be escaped with a backslash"); my comment in goptlib interpolates "[and backslashes]".

The encodeSmethodArgs function has

// "Equal signs and commas [and backslashes] must be escaped with a backslash."
        escape := func(s string) string {
                return backslashEscape(s, []byte{'=', ','})
        }

but that doesn't mean \ isn't itself escaped; backslashEscape escapes \ in addition to the other bytes that are listed.

  1. encoded in SOCKS username/password

Here, I felt that the lack of backslashing equals signs was a bug in the spec and interpolated into a comment above parseClientParameters:

// "If a key or value value must contain [an equals sign or] a semicolon
// or a backslash, it is escaped with a backslash."                                   

comment:12 in reply to:  11 Changed 2 years ago by catalyst

Replying to dcf:

Replying to catalyst:

  1. SMETHOD ARGS

Is there a place where goptlib doesn't escape \? If so it is probably a bug. \ has to be escaped in order to make the escaping reversible, even though the spec doesn't explicitly call for it ("Equal signs and commas MUST be escaped with a backslash"); my comment in goptlib interpolates "[and backslashes]".

I think we're actually in agreement. I meant that goptlib does something (escaping \) that the spec doesn't call for. I think the escaping is reversible even if \ isn't escaped, as long as everyone is consistent (which I think they aren't). It's more robust to have a syntax where \ gets escaped though, particularly if characters can be optionally escaped.

  1. encoded in SOCKS username/password

Here, I felt that the lack of backslashing equals signs was a bug in the spec and interpolated into a comment above parseClientParameters:

// "If a key or value value must contain [an equals sign or] a semicolon
// or a backslash, it is escaped with a backslash."                                   

I would tend to agree, but see #22088 for a possible way to update the specs to avoid escaping =. Note that to make tor conform to the current spec (which requires = to be escaped), it might need to do additional parsing of the PT arguments (to split them into pairs of keys and values) beyond what it does now (treating them as a sequence of space-separated words each of which contains an = character).

comment:13 Changed 3 months ago by teor

Owner: asn deleted
Status: newassigned

asn does not need to own any obfuscation tickets any more. Default owners are trouble.

comment:14 Changed 3 months ago by cohosh

Status: assignednew

tickets were assigned to asn, setting them as unassigned (new) again.

Note: See TracTickets for help on using tickets.