- ARGS:K=V,K=V,K=V If this option is set, the K=V arguments are added to Tor's extrainfo document. Equal signs and commas must be escaped with a backslash.
All of obfs4's server (extra info) document arguments end with a number of equal signs because they are Base64 strings.
goptlib does the right thing here and escapes the args, so the trailing Base64 padding passed to tor as part of SMETHOD ARGS ends with \\=. The fun here is that, tor does not unescape the ARGS line, so \\= is what ends up in the extrainfo document on BridgeDB.
The arguments that appear on obfs4 bridge lines should not be escaped, so someone, somewhere between little-t tor, and the place where the arguments appear on whatever BridgeDB frontend the end user sees, needs to unescape the arguments.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
There's also a goptlib bug here, where it escapes twice. Once in encodeSmethodArgs() (fine), and once in pt.go:formatline() (not so fine).
If the output was to be compliant with the pt-spec the problematic entries will end in \=.
Good find. The escape function isn't specified in pt-spec; it's just a safety valve I cooked up to handle the case when someone tries to pass a string containing '\n' or '\x00', which would otherwise trash the stdout interaction. But clearly using a backslash escape is the wrong thing, because backslashes are used in other parts of the protocol. Maybe we should just strip the problematic bytes, or even panic. As far as I'm concerned it's undefined behavior, so we can do what we want.
Could you write (and commit) a failing test that demonstrates the error? I.e., in TestEscape, add test cases including backslashes and other bytes we care about, and modify the check function to verify that if stringIsSafe(input) is true, that escape(input) == input.
There's also a goptlib bug here, where it escapes twice. Once in encodeSmethodArgs() (fine), and once in pt.go:formatline() (not so fine).
If the output was to be compliant with the pt-spec the problematic entries will end in \=.
Good find. The escape function isn't specified in pt-spec; it's just a safety valve I cooked up to handle the case when someone tries to pass a string containing '\n' or '\x00', which would otherwise trash the stdout interaction. But clearly using a backslash escape is the wrong thing, because backslashes are used in other parts of the protocol. Maybe we should just strip the problematic bytes, or even panic. As far as I'm concerned it's undefined behavior, so we can do what we want.
Could you write (and commit) a failing test that demonstrates the error? I.e., in TestEscape, add test cases including backslashes and other bytes we care about, and modify the check function to verify that if stringIsSafe(input) is true, that escape(input) == input.
There are multiple conflicting definitions of pluggable transport arguments that probably cannot be made consistent in a backward-compatible way. https://gitweb.torproject.org/torspec.git/tree/pt-spec.txt#n368 defines the general BNF syntax for a managed transport process-to-parent communication, which excludes NUL and NL. The SMETHOD syntax at https://gitweb.torproject.org/torspec.git/tree/pt-spec.txt#n544 looks more ad-hoc, but it implies that the options are space-separated words. The ARGS option is k=v pairs separated by commas, and only has provisions for escaping commas and equals signs (but not NUL, NL, SP, or backslash).
Ultimately, the SMETHOD ARGS will end up in a Bridge config line, described at https://gitweb.torproject.org/torspec.git/tree/proposals/180-pluggable-transport.txt#n146. This config line has space-separated k=v pairs. The syntax has provisions for escaping backslash and semicolon but not spaces, equals signs, commas, newlines, or NUL characters. The tor(1) manual page is out of date and doesn't reflect the prop180 config line syntax. (This is bug #20341 (moved).)
After IRC discussion with arma and Yawning, it seems that the best solution may be to amend the specs to disallow certain characters from keys or values in transport args, such as NUL, NL, SP, backslash, equals sign, comma, and maybe semicolon.
Trac: Reviewer: N/AtoN/A Sponsor: N/AtoN/A Severity: N/Ato Normal
If it helps, here is how goptlib handles args (goptlib being the primary implementation of the PT protocol other than tor itself). It's mostly based on my own interpretation of what the spec says, but it at least has a lot of tests. The comments that quote pt-spec.txt are taken from an earlier version of the spec, probably 4dcd7e94f1 from July 2014.
args.go: the main functions are parseClientParameters, parseServerTransportOptions, and encodeSmethodArgs (we haven't yet needed a function to encode client parameters). encodeSmethodArgs escapes only these three bytes: \=,. Other byte values such as \n and \x00 are handled instead in pt.go.
args_test.go: is test code for the functions in args.go. Please let me know if you have additional test cases or if any of the existing ones seem wrong to you.
pt.go: interacts a little bit with argument syntax in the formatline function, which is responsible for formatting stdout lines like SMETHOD. formatline panics (i.e. crashes) on any \n, \x00, or byte value greater than \x7f (see the argIsSafe function). Formerly, goptlib didn't panic but applied an additional backslash encoding to these bytes, which Yawning noted in comment:1 and has since been removed in favor of panicking.
If there are to be any code changes, they should probably be in the BridgeDB implementation and maybe the pluggable transport client of tor. I like the idea of using something like Bourne shell variable syntax, e.g., a=b=c sets $a to the string b=c. Variable (key) names may not contain = but values may, without needing quoting or escaping. I know obfs4 worked around the = escaping problem by using unpadded base64, but could other transports need = in argument values in the future? (maybe as query parameters to URLs?)
Here's how I understand the existing data flow:
SMETHOD ARGS
comma-separated key=value pairs
, and = (but not \ -- at least not consistently, e.g. goptlib) escaped by \ (pt-spec.txt#n566)
currently, transports.c doesn't escape =, contrary to spec (transports.c#n1668)
on the other hand, 180-pluggable-transport.txt#n157 doesn't specify escaping of = when sending to the transport's SOCKS proxy, so maybe this can remain unchanged
Proposed spec changes will go in a separate ticket.
, and = (but not \ -- at least not consistently, e.g. goptlib) escaped by \ (pt-spec.txt#n566)
no provision for escaping whitespace
Is there a place where goptlib doesn't escape \? If so it is probably a bug. \ has to be escaped in order to make the escaping reversible, even though the spec doesn't explicitly call for it ("Equal signs and commas MUST be escaped with a backslash"); my comment in goptlib interpolates "[and backslashes]".
The encodeSmethodArgs function has
// "Equal signs and commas [and backslashes] must be escaped with a backslash." escape := func(s string) string { return backslashEscape(s, []byte{'=', ','}) }
but that doesn't mean \ isn't itself escaped; backslashEscape escapes \in addition to the other bytes that are listed.
currently, transports.c doesn't escape =, contrary to spec (transports.c#n1668)
on the other hand, 180-pluggable-transport.txt#n157 doesn't specify escaping of = when sending to the transport's SOCKS proxy, so maybe this can remain unchanged
Here, I felt that the lack of backslashing equals signs was a bug in the spec and interpolated into a comment above parseClientParameters:
// "If a key or value value must contain [an equals sign or] a semicolon// or a backslash, it is escaped with a backslash."
Is there a place where goptlib doesn't escape \? If so it is probably a bug. \ has to be escaped in order to make the escaping reversible, even though the spec doesn't explicitly call for it ("Equal signs and commas MUST be escaped with a backslash"); my comment in goptlib interpolates "[and backslashes]".
I think we're actually in agreement. I meant that goptlib does something (escaping \) that the spec doesn't call for. I think the escaping is reversible even if \ isn't escaped, as long as everyone is consistent (which I think they aren't). It's more robust to have a syntax where \ gets escaped though, particularly if characters can be optionally escaped.
encoded in SOCKS username/password
Here, I felt that the lack of backslashing equals signs was a bug in the spec and interpolated into a comment above parseClientParameters:
{{{
// "If a key or value value must contain [an equals sign or] a semicolon
// or a backslash, it is escaped with a backslash."
}}}
I would tend to agree, but see #22088 (moved) for a possible way to update the specs to avoid escaping =. Note that to make tor conform to the current spec (which requires = to be escaped), it might need to do additional parsing of the PT arguments (to split them into pairs of keys and values) beyond what it does now (treating them as a sequence of space-separated words each of which contains an = character).