(Re-using text from Zack Weinberg for this description.)
Some pluggable transport proxies don't actually act as SOCKS proxies. For example, StegoTorus has its own configuration; it ignores everything it is told in the SOCKS dialogue and always connects to the bridge that it knows about. If you want multiple StegoTorus bridges accessible to your Tor client, you need multiple "ClientTransportPlugin ... exec" specifications. This is only going to get worse when they move away from having everything set up on StegoTorus' command line, a change that has been direly needed for some time now.
Theoretically all of StegoTorus' configuration could be encapsulated in the SOCKS key-value-pairs-in-the-password hack that's described in 180-pluggable-transport.txt, but they never implemented that and they don't want to. They want to rip out all of the SOCKS code, in fact. The way they want it to work is
Bridge storus1 direct [keyid=...]
ClientTransportPlugin storus1 direct 127.0.0.1:8888
In this case, 'storus1' is not a "method", it's a human-readable identifier for the bridge that Tor will be connected to if it starts talking the OR protocol -- with no initial SOCKS exchange! -- on 127.0.0.1:8888.
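To make the proposed semantics concrete, here is a minimal Python sketch of how such lines could be resolved into "bridge name -> local port". The `direct` keyword is only a proposal in this ticket, not something Tor implements; `parse_direct_config` and the shape of its return value are hypothetical, and the key ID used in the test values is made up.

```python
def parse_direct_config(text):
    """Map each bridge name to its local endpoint and optional key ID.

    Handles only the two hypothetical line forms from the description:
      Bridge <name> direct [keyid=...]
      ClientTransportPlugin <name> direct <host>:<port>
    """
    keyids = {}
    endpoints = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[2] != "direct":
            continue  # not a "direct" line; ignored in this sketch
        name = parts[1]
        if parts[0] == "Bridge":
            keyid = None  # the key ID is optional
            for extra in parts[3:]:
                if extra.startswith("keyid="):
                    keyid = extra[len("keyid="):]
            keyids[name] = keyid
        elif parts[0] == "ClientTransportPlugin" and len(parts) == 4:
            host, _, port = parts[3].rpartition(":")
            endpoints[name] = (host, int(port))
    # A bridge is usable only if both of its lines are present.
    return {
        name: {"endpoint": endpoints[name], "keyid": keyids[name]}
        for name in keyids
        if name in endpoints
    }
```

With a mapping like this, Tor would open a plain TCP connection to the endpoint and start the OR protocol immediately, with no SOCKS exchange.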
"direct" should also be valid in CMETHOD/SMETHOD lines for the proxy-management protocol, with the same semantics. Zack says he hasn't really thought through how the server side of this stuff ought to work.
In my new found role as human mail-to-Trac proxy, quoting Roger here:
"I'm not clear on whether we want to change the interface to be the one Stegotorus tried to use. But it does seem clear that Tor's current interface is not the one that everybody wants. We should try to learn our lesson from how Stegotorus tries to use Tor, and see if it's something we should change on Tor's side, or instead write up a "recommendation for how to interface with Tor correctly if you're in Stegotorus's situation".
I think this is something that Nick and Andrea would have good opinions about."
Trac: Cc: zwol, asn, nickm to zwol, asn, nickm, andrea
Rephrased yet again, in case the above is ambiguous:
One of the issues is that Stegotorus expects a one-to-many interface ("for this tor connection, please contact the following 10 stegotorus listeners"), whereas our API was more designed for a one-to-one interface ("here's your obfsproxy server address; I'll tell it to you using the socks handshake").
Is "ignore the address in the socks handshake, and cram all the actual addresses into the socks username/password" really the best hack we can do?
Alternatively, "Stick something useful in the address" instead. I had never intended that the address would have to be for a one-to-one interface like the one you describe.
Myself, I'm not sure yet which way of solving this issue is better:
a) Use the SOCKS5 username/password covert channel to transfer multiple addresses.
Is this hack sufficiently future-proof? Will there be pluggable transports in the future that won't be satisfied by such a solution? How do we pass other parameters to the pluggable transport if the username/password covert channel is occupied by addresses? Are 512 bytes enough?
b) Implement Zack's direct suggestion.
This seems like an OK-ish solution, if the pluggable transport proxy always has a way of acquiring bridge addresses. Will this be true in the future?
c) Other solutions.
Like leaving the situation as it currently is, or sticking metadata in the SOCKS handshake that allows the pluggable transport proxy to find the bridge addresses on its own.
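To get a feel for the size question raised under option (a), here is a rough Python sketch of packing parameters into the SOCKS5 username/password fields. RFC 1929 gives each field a one-byte length, so each holds at most 255 bytes, which is where the roughly 512-byte ceiling comes from. The semicolon-separated `key=value` encoding and the NUL padding of an empty field are assumptions for illustration; 180-pluggable-transport.txt is the place to check the actual rules. `encode_args` is a hypothetical helper name.

```python
SOCKS5_FIELD_MAX = 255  # RFC 1929: ULEN and PLEN are single bytes


def escape(value):
    """Backslash-escape the characters used as separators (assumed rule)."""
    return value.replace("\\", "\\\\").replace(";", "\\;")


def encode_args(args):
    """Pack key=value pairs into a (username, password) pair of byte strings."""
    blob = ";".join(
        f"{escape(key)}={escape(value)}" for key, value in args.items()
    ).encode("utf-8")
    limit = 2 * SOCKS5_FIELD_MAX
    if len(blob) > limit:
        raise ValueError(f"{len(blob)} bytes of arguments exceed the {limit}-byte limit")
    username = blob[:SOCKS5_FIELD_MAX]
    password = blob[SOCKS5_FIELD_MAX:]
    # RFC 1929 does not allow zero-length fields, so pad an empty one
    # with a single NUL byte (an assumed convention).
    return username or b"\x00", password or b"\x00"
```

Anything beyond the combined limit simply cannot be sent this way, which is the core of the "are 512 bytes enough?" worry.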
Trac: Cc: zwol, asn, nickm, andrea to zwol, asn, nickm, andrea, dcf
This is true. In the configuration we use an IP address that tries to be obviously fake: 0.0.1.0:1. (The address 0.0.0.0 doesn't work because tor uses it as a reserved value. I'm guessing I also tried 0.0.0.1 but it didn't work because SOCKS4a uses addresses of that form. Similarly a port of 0 is problematic.)
I have thought about, but haven't tested, using multiple Bridge lines with different fake addresses. They would in fact all be going to the same websocket bridge, but they would be going through different flash proxies (unknown in advance).
I can think of two things that would make things easier for flash proxy. The first would be a way for flashproxy-client to advise tor of how many virtual bridges (flash proxies) are currently available. We aim to keep about five of these, but only use one at a time. We could get better performance by running different circuits over different proxies.
The second is a solution for #3292 (moved). tor currently freaks out and refuses to connect if its bridge at 0.0.1.0:1 changes fingerprint. This can be expected to happen because 0.0.1.0:1 is not the bridge's real address, but a virtual address which can refer to different bridges at different times. We currently work around this by having only one websocket bridge the facilitator knows about.
c) Other solutions.
Like leaving the situation as it currently is, or sticking metadata in the SOCKS handshake that allows the pluggable transport proxy to find the bridge addresses on its own.
What metadata would be passed here? Am I correct in assuming that stegotorus takes a list of addresses on the command line, and that this information would be passed as part of a SOCKS handshake instead? This wouldn't make any difference for flash proxy, because we don't know any flash proxy addresses a priori, so we couldn't give them to the transport even if we wanted to.
Some time ago I promised to write up a bit of an explanation for why "cram[ming] all the actual addresses into the socks username/password" isn't great for StegoTorus.
A full configuration for the StegoTorus client [assuming the presence of a bunch of things that still haven't actually been implemented] is pretty heavyweight. It looks something like this -- format not to be taken as set in stone, but contents pretty much all necessary --
bridge example_name {
  tor_id "ca8a aab2 05b3 5b13 5afc 2782 b891 b555 c6fa f824"
  key "IQt2ejUA1gD927mLFXGN1daxojRa,qcjh1z8mduWejiM6IT8PdQle/UJq"
  window [1356824532 1356910932]
  ticket "gRT3RayKU4epkB7xPJV/p1Pm0G7faYA09vG4FAhkrqU="
  addresses {
    "192.0.2.1:80" { steg http formats [html js css pdf swf] maxconn 6 maxrate 8000 }
    "192.0.2.2:443" { steg embed template spdy maxconn 2 maxrate 65535 }
    "192.0.2.3:6666" { steg irc maxconn 1 maxrate 1000 }
    // five to ten more of these
  }
}
As written, that's 586 bytes long; obviously it could be more compact, but as a ballpark estimate of how much space a good wire format would take, zlib only manages to compress it down to 369 bytes. So we are already uncomfortably close to the 512-byte upper limit on a SOCKS5 username/password pair; I suspect adding the anticipated five to ten more listener addresses would put it over the limit. SOCKS4a doesn't have that limit but I'm worried about compatibility issues; most of the SOCKS code out there probably blithely assumes short usernames and passwords. (We do control both sides of the SOCKS dialogue, so maybe this concern is unwarranted.)
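The estimate above is easy to repeat for any candidate configuration. The following Python sketch measures a blob raw and zlib-compressed and compares both against the SOCKS5 username/password capacity. It assumes the capacity is 2 x 255 bytes (the RFC 1929 field limits, slightly under the 512 figure used in this thread); `socks5_fit_report` is a hypothetical helper name.

```python
import zlib

SOCKS5_USERPASS_MAX = 2 * 255  # two fields, one-byte length each (RFC 1929)


def socks5_fit_report(config_text):
    """Report raw and zlib-compressed sizes against the SOCKS5 limit."""
    raw = config_text.encode("utf-8")
    packed = zlib.compress(raw, 9)  # level 9 = best compression
    return {
        "raw_bytes": len(raw),
        "zlib_bytes": len(packed),
        "raw_fits": len(raw) <= SOCKS5_USERPASS_MAX,
        "zlib_fits": len(packed) <= SOCKS5_USERPASS_MAX,
    }
```

Running this on a configuration with the full set of listener addresses would show directly whether the compressed form still fits.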
In addition, having SOCKS support in StegoTorus at all is inconvenient. It's only about 800 lines of code (not counting tests), but they add complexity to the connection setup and teardown logic, which has, over the lifetime of the project, been by far the most troublesome component to get right. There are still several known bugs in there and no doubt more I haven't even noticed yet. Removing SOCKS would allow (in my opinion) a significant and worthwhile simplification of that component.
Configuring StegoTorus independently of Tor may also prove more convenient in the larger picture. The "DEFIANCE" architecture that was presented at FOCI last summer contemplates packaging up pluggable-transport configurations as "network entry tickets" and delivering them to clients in a semi-automated manner. So there would be a controller process that would receive basically the above configuration blob from the network. In the current pluggable transport design, it would have to be parsed, repackaged as a bridge descriptor in Tor's dynamic configuration, and then repackaged again to hand off to the transport plugin. Feeding it directly to the pluggable transport instead (as a file in the filesystem, presumably) saves two data-reformatting operations. Furthermore, it means StegoTorus only needs one configuration parser, and the "ACS dance" client needs none at all.
Hm. It seems there are probably two directions we could go in here -- more complex and more minimal. I'd like to aim for minimal if we can.
Proposal 199 looks pretty complex on its surface; but maybe it'll turn out to be less so once I read it again more carefully.
Zack, David -- I'm not getting a great sense here what interface you actually want, beyond "Not the current one!" Is there a flashproxy/stegotorus document that explains your preferred solution here? (Please forgive me if it's described above and I'm just not seeing it.)
Right -- the main topic of this ticket is to see if it's time to revise the set of specified (supported) ways that Tor can delegate connections to bridges.
When we started, we were generalizing from the obfsproxy example, where each pluggable transport has a local address it listens on, and you connect and tell it your desired bridge address in a socks handshake, and if there's more you want to say, you sneak it in via the socks username/password field of the socks handshake.
Now we have two new data points, neither of which fit that model well. The first is Stegotorus, where the parameters we want to sneak in don't even fit in the username/password field, and also Stegotorus doesn't want to hear the address we tell it. The second is Flashproxy, where there are no parameters, but also it ignores the address we tell it.
So the question for this ticket is: now that we have these three data points, is the socks-handshake-with-extra-stuff-in-username/password still the best model to recommend? If so, should we also recommend (aka specify and support) how best to behave if you're one of those two other data points?
Stegotorus has been developed under the model that something else would start it as an external proxy, with its own configuration. It has never fit well in the managed proxy model. Zack proposes one hack to make it fit a bit better. Can we think of better ones? Note that we're still starting the Stegotorus process with its parameters, rather than telling it parameters for a given connection around the time we make the connection; that seems klunky to me, since I think it means tearing down the Stegotorus process and launching another one, when you want to talk differently to your bridges (i.e. when you want to start using a new network entry ticket).
David wants to go a step in a different direction, where a given bridge address is actually many bridges, and he doesn't know which ones they are because his pluggable transport will only find out once it receives the connections from them. His "ignore the address" hack sounds less crazypants than Stegotorus's, mostly I guess since he doesn't have any other parameters he wants to pass in, but also because his flashproxy client stub shouldn't need a restart when you change bridges.
I'm not getting a great sense here what interface you actually want, beyond "Not the current one!" Is there a flashproxy/stegotorus document that explains your preferred solution here? (Please forgive me if it's described above and I'm just not seeing it.)
I think you're looking at this from the opposite direction: they hope we will think up a protocol that better (less hackily) supports what they're trying to wrestle our current protocol into doing. This isn't a "please build the following alternate interface" ticket. This ticket is a step before that.
Hm. Okay, in that case I probably need to get much more up to speed about "what they're trying to wrestle our current protocol into doing" so I can understand what it means to support it less hackily.
Zack, David -- I'm not getting a great sense here what interface you actually want, beyond "Not the current one!" Is there a flashproxy/stegotorus document that explains your preferred solution here? (Please forgive me if it's described above and I'm just not seeing it.)
The current model mostly works fine for flash proxy. Being SOCKS works fine. I don't know how flash proxy would use something like the Bridge storus1 direct example in the description.
What flash proxy could use is a way to inform Tor dynamically of the bridges currently available. The way it works now is:
1. You put
{{{
Bridge websocket 0.0.1.0:1
ClientTransportPlugin websocket exec ./flashproxy-client --register
}}}
in your torrc and start Tor.
2. The flash proxy client transport plugin does its rendezvous, but no proxy connections come in immediately (it takes about a minute).
3. Tor makes a SOCKS request for the fake bridge address 0.0.1.0:1.
4. The flash proxy client transport plugin accepts the SOCKS request even though it doesn't have any proxies yet. Tor sends a few hundred bytes at this point, which the transport plugin buffers.
5. A minute later, about 10 proxy connections come in, mostly all at once. The transport plugin picks one of them, flushes its buffer over the proxy, and proxies as long as the proxy exists. When the proxy disconnects, the transport plugin switches to another already existing connection, but Tor doesn't notice this fact and thinks it's the same bridge at 0.0.1.0:1.
Note in the above how there is only ever one SOCKS connection open at a time. Nine out of ten proxies sit idle while one of them is being used.
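The current behaviour described above can be sketched in a few lines of Python. This is only a model of the buffering and switch-over logic, not the real flashproxy-client code; `BufferingRelay` and `RecordingProxy` are hypothetical names, and a real proxy would be a socket rather than an object that records what it was sent.

```python
class RecordingProxy:
    """Stand-in for a proxy connection; records what would be sent."""

    def __init__(self):
        self.sent = []

    def send(self, data):
        self.sent.append(data)


class BufferingRelay:
    """Hold Tor's early bytes until a proxy exists; use one proxy at a time."""

    def __init__(self):
        self.pending = bytearray()  # bytes from Tor with nowhere to go yet
        self.idle = []              # connected proxies that are not in use
        self.active = None          # the single proxy carrying traffic

    def from_tor(self, data):
        if self.active is None:
            self.pending += data    # no proxy yet: buffer
        else:
            self.active.send(data)

    def proxy_connected(self, proxy):
        if self.active is None:
            self._activate(proxy)
        else:
            self.idle.append(proxy)  # sits unused while another is active

    def proxy_disconnected(self, proxy):
        if proxy is self.active:
            self.active = None
            if self.idle:
                # Tor is not told about this switch.
                self._activate(self.idle.pop(0))
        elif proxy in self.idle:
            self.idle.remove(proxy)

    def _activate(self, proxy):
        self.active = proxy
        if self.pending:
            proxy.send(bytes(self.pending))  # flush what Tor sent early
            self.pending.clear()
```

The `idle` list is exactly the wasted capacity: every proxy in it is connected but carries nothing.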
This is what would be better:
1. You put
{{{
ClientTransportPlugin websocket exec ./flashproxy-client --register
}}}
in your torrc and start Tor. Notice you don't configure any bridges (because none exist yet).
2. The transport plugin does its rendezvous and waits.
3. Tor doesn't make a SOCKS connection because it knows of no bridges yet.
4. About 10 proxy connections come in. The transport plugin informs Tor of each of them, over a control port or something. Think of it as effectively dynamically adding new bridge lines:
{{{
Bridge websocket X.X.X.X:X
Bridge websocket Y.Y.Y.Y:Y
Bridge websocket Z.Z.Z.Z:Z
...
}}}
Where X.X.X.X:X etc. are the actual proxy IP addresses and ports that have just been learned. (Or they could be made up or hashed, it doesn't really matter.)
5. Tor now knows of 10 bridges it can make SOCKS requests for. If Tor asks for a connection to X.X.X.X:X, for example, the transport plugin would match the address to one of its existing proxy connections, and report SOCKS success to Tor. Having an address as a name for the connection allows Tor to manage bandwidth per flash proxy or whatever.
6. As flash proxies disconnect, the transport plugin can dynamically remove those bridge lines that it added previously, again over some control channel. Tor notices when a bridge goes away.
So the number of parallel SOCKS connections could be as high as the current number of flash proxies, and Tor would be able to manage bandwidth and keyids independently for each of them.
In summary, if there were a way for a transport plugin to tell Tor that there is a new bridge, and tell Tor when a bridge no longer exists, it would be all we need. All the SOCKS infrastructure wouldn't have to change.
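As a sketch of the bookkeeping this would need on the transport side, the Python class below tracks one virtual Bridge line per connected proxy and says what to announce to Tor. How the announcement reaches Tor is deliberately left out, since that channel is the open question here; `VirtualBridgeRegistry` and its method names are hypothetical.

```python
class VirtualBridgeRegistry:
    """Track one Bridge line per live proxy connection."""

    def __init__(self, transport="websocket"):
        self.transport = transport
        self.lines = {}  # proxy address -> Bridge line announced to Tor

    def proxy_connected(self, addr):
        """Return the action and Bridge line to announce for a new proxy."""
        line = f"Bridge {self.transport} {addr}"
        self.lines[addr] = line
        return ("add", line)

    def proxy_disconnected(self, addr):
        """Return the action and Bridge line to withdraw for a lost proxy."""
        line = self.lines.pop(addr)
        return ("remove", line)

    def lookup(self, socks_target):
        """True if Tor's SOCKS target names a proxy that is still connected."""
        return socks_target in self.lines
```

When Tor later makes a SOCKS request for one of the announced addresses, `lookup` is what lets the plugin match it to an existing proxy connection and report success.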
Notice you don't configure any bridges (because none exist yet).
2. The transport plugin does its rendezvous and waits.
3. Tor doesn't make a SOCKS connection because it knows of no bridges yet.
Oh oh oh! This reminds me of a third data point we have, for a pluggable transport that doesn't work well with our current approach: Skypemorph.
See ticket #5483 (moved): the Skypemorph people want exactly this feature too. They're thinking of working around it by starting Tor with DisableNetwork set to 1, and then when their Skypemorph transport is done bootstrapping, they set DisableNetwork back to 0. That sure is a hack though, since the pluggable transport isn't meant to have to be a Tor controller too.
Okay, so here are my suggestions for trying to make everybody a bit more happy here. If people like this, I'll turn it into a proposal and a set of tickets.
=== Nick's Proposal, v1 ===
Extract a variant of the easy parts of Proposal 199, so that pluggable transports can also act as bridge managers. The parts I propose to build are:
When being launched, a managed proxy can find out a tor controller port, along with a password or cookie location to use to authenticate to Tor. An external proxy can get this information too. Getting this information means that you're acting as a bridge manager.
After making a controller connection to Tor, the bridge managers can use SETCONF to tell Tor about bridge information.
Change the semantics of setting "UseBridges 1" when no bridges are configured. Right now, it's an error. I propose that instead it have the same effect as
Add a new __Bridge configuration option. It will have the same effect as Bridge, but (because it starts with __) its values won't be saved by a SAVECONF command.
Add a new ADDCONF / DELCONF command to help maintain the Bridge and __Bridge configuration options. It will only operate on a linelist. DELCONF will only remove lines when they're provided verbatim.
Clarify that it's okay to be a proxy that only supports SOCKS4a, so that nobody goes out of their way to build SOCKS5 support when they don't need to.
Add a new SOCKS reply code meaning "proxy not ready yet; try later."
Let's add a weight parameter to bridges.
Additionally, I'm suggesting some items for us not to do. I think most of these are unnecessary or unwise; I'm listing them only because I was considering them for at least a while, and I want to know if they're better than I thought.
Let's not add a new not-SOCKS connection mechanism. (Tor really needs to have some idea when it's making connections to the same bridge or not.)
Let's not add a new mechanism other than the control protocol for bridge managers to tell Tor what the bridges are. (More protocols make everybody sad. The subset of the Tor control protocol that's needed for this would seem relatively small.)
Let's not reduce the question of "what bridges are there" to "how many bridges can be accessed via plugin X". (Needless complexity.)
Let's not introduce a new variant of the control protocol where Tor connects to bridge manager and then they authenticate. (Needless complexity)
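For the SETCONF part of the proposal, a bridge manager would send control-protocol commands along these lines. The Python sketch below only builds the command text; it uses the existing `Bridge` option because `__Bridge`, ADDCONF and DELCONF are proposed here and do not exist. The helper names `quote` and `setconf_bridges` are hypothetical, and authentication and reply handling are omitted.

```python
def quote(value):
    """Wrap a value as a control-protocol quoted string."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def setconf_bridges(bridge_lines, option="Bridge"):
    """Build one SETCONF command carrying the complete bridge list.

    SETCONF replaces every existing value of the option, so the caller
    must pass the full list each time, not just the new entries.
    """
    if not bridge_lines:
        # A keyword with no value clears that option.
        return f"SETCONF {option}\r\n"
    pairs = " ".join(f"{option}={quote(line)}" for line in bridge_lines)
    return f"SETCONF {pairs}\r\n"
```

Having to resend the whole list on every change is the inconvenience that the proposed ADDCONF / DELCONF commands are meant to remove.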
What you describe sounds like it will work for flash proxy.
I guess there is a tension with #5018 (moved)--ideally we want the flashproxy-client transport plugin to start without having to configure any false bridges. We currently use the "fake" address 0.0.1.0:1, but it's not for the sake of getting the transport to start up--it's because we don't know any real bridge addresses at startup time.
Good point -- we'll need:
A flag to apply to a client plugin to tell Tor to launch the plugin whether any current bridges want it or not.
Nick's proposal appears almost entirely orthogonal to the problems StegoTorus has with doing SOCKS; it appears mostly about improving communication between Tor and a controller process, which ST isn't. The "configuration may be too large for a SOCKS connection request" issue, which is probably the most important, is only addressed by saying that it's okay to require use of SOCKS4a (which has no official upper limit) and I do not feel comfortable relying on that; the actual implementation in ST right now does have an upper limit (chosen arbitrarily, according to comments) and I am concerned that pluggable transports may all pick different arbitrary cutoffs and we'll have a big mess. My other concerns (number of marshal/unmarshal passes, additional implementation costs of having SOCKS code at all) do not seem to have been addressed at all.
I don't understand the stated objection to a "just start talking OR on this local port" bridge method:
(Tor really needs to have some idea when it's making connections to the same bridge or not.)
Any given ST local-port talks to one and only one bridge, whose key fingerprint is (optionally) specified on the "direct" Bridge line. This would seem to be a non-problem.
Nick's proposal appears almost entirely orthogonal to the problems StegoTorus has with doing SOCKS; it appears mostly about improving communication between Tor and a controller process, which ST isn't. The "configuration may be too large for a SOCKS connection request" issue, which is probably the most important, is only addressed by saying that it's okay to require use of SOCKS4a (which has no official upper limit) and I do not feel comfortable relying on that;
Why not? SOCKS4a support is supposed to be guaranteed. There's no reason I can see that a pluggable transport is required to support SOCKS5.
I'm happy allowing Tor to support an arbitrarily large upper limit for socks4a, or to define an upper limit of 128KB, or whatever makes sense.
the actual implementation in ST right now does have an upper limit (chosen arbitrarily, according to comments) and I am concerned that pluggable transports may all pick different arbitrary cutoffs and we'll have a big mess. My other concerns (number of marshal/unmarshal passes, additional implementation costs of having SOCKS code at all) do not seem to have been addressed at all.
There is no way for process A to communicate with process B without some marshalling/unmarshalling, of course, but things could be simpler. I'll scan the above to see if there are any ideas for a simpler proposal.
Also, SOCKS4a is, like, pretty darn simple. Pluggable transports are not required to support all versions of SOCKS; any version of socks is okay. Is the objection to all versions of SOCKS, or just SOCKS5?
I don't understand the stated objection to a "just start talking OR on this local port" bridge method:
(Tor really needs to have some idea when it's making connections to the same bridge or not.)
Any given ST local-port talks to one and only one bridge, whose key fingerprint is (optionally) specified on the "direct" Bridge line. This would seem to be a non-problem.
Ah, so this sounds like ST wants more to be treated like a bridge than like a pluggable transport. If it wants to get its parameters out-of-band, and it wants to have one local port per bridge, then it makes more sense to configure Tor with plain Bridge lines pointing at whatever local ports ST is listening on, for example Bridge 127.0.0.1:34325.
[...] it's okay to require use of SOCKS4a (which has no official upper limit) and I do not feel comfortable relying on that;
Why not? SOCKS4a support is supposed to be guaranteed. There's no reason I can see that a pluggable transport is required to support SOCKS5.
It is specifically the "no official upper limit" thing that I am not entirely comfortable relying on. It appears to be a thing left accidentally unspecified, rather than an intentional feature, as I read the SOCKS4a spec.
It is true that we control both ends of the communication here, so maybe this is not as much of a problem as it seems. At the least I think a minimum-maximum number should be written into the pluggable transport spec.
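To illustrate what a written-down limit would look like in practice, here is a Python sketch that builds a SOCKS4a CONNECT request with parameters in the USERID field and enforces an agreed maximum. The 128KB value is just the figure floated earlier in this thread, not anything specified; `MAX_USERID_BYTES` and `socks4a_connect` are hypothetical names.

```python
import struct

# Hypothetical agreed limit; 128KB is only the figure floated in this thread.
MAX_USERID_BYTES = 128 * 1024


def socks4a_connect(hostname, port, userid=b""):
    """Build a SOCKS4a CONNECT request carrying parameters in USERID."""
    if len(userid) > MAX_USERID_BYTES:
        raise ValueError("USERID exceeds the agreed maximum")
    if b"\x00" in userid:
        raise ValueError("USERID is NUL-terminated and cannot contain NUL")
    # VN=4, CD=1 (CONNECT), port, then the 0.0.0.x address (x nonzero)
    # that marks the request as SOCKS4a with a hostname to follow.
    header = struct.pack(">BBH4s", 4, 1, port, bytes([0, 0, 0, 1]))
    return header + userid + b"\x00" + hostname.encode("ascii") + b"\x00"
```

Both ends checking against the same constant is what would keep different transports from picking incompatible cutoffs.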
There is no way for process A to communicate with process B without some marshalling/unmarshalling, of course, but things could be simpler.
Lemme try to unpack that a little. We have (abstractly) three processes running on the same machine: Tor, the controller, and the transport. Right now, as I understand it, the intention is that the controller configures the transport by poking Bridge lines into Tor's in-memory configuration, and Tor then turns around and hands that configuration to the transport over SOCKS.
What I'm saying is that it is somewhat more convenient for ST (in particular, ST as a component of the "DEFIANCE" system) if the controller can configure the transport directly rather than having to use the Tor process as a registry. It's not a huge thing, but it is the difference between one or two configuration parsers in ST, and zero or one configuration reformatters in the controller. (Note that this is all somewhat hypothetical, as ST currently has no configuration parser and I never did get around to merging George's managed proxy implementation from obfsproxy. It is possible that this objection would evaporate if looked at more seriously. It is also possible that it would mushroom.)
Also, SOCKS4a is, like, pretty darn simple. Pluggable transports are not required to support all versions of SOCKS; any version of socks is okay. Is the objection to all versions of SOCKS, or just SOCKS5?
This objection is to having any SOCKS support, and is strictly on implementation-complexity grounds. As I said way up top, SOCKS is ~800 lines of code, which might not seem like a big deal, but it intrudes itself on the single most conceptually complicated and (consequently) most bug-ridden aspect of ST, namely connection setup.
I rate it as highly probable that SOCKS is part of why ST's reliability never got to where I could actually finish the crypto layer.
(Tor really needs to have some idea when it's making connections to the same bridge or not.)
Any given ST local-port talks to one and only one bridge, whose key fingerprint is (optionally) specified on the "direct" Bridge line. This would seem to be a non-problem.
Ah, so this sounds like ST wants more to be treated like a bridge than like a pluggable transport. If it wants to get its parameters out-of-band, and it wants to have one local port per bridge, then it makes more sense to configure Tor with
{{{
Bridge 127.0.0.1:34325
Bridge 127.0.0.1:34311
Bridge 127.0.0.1:34415
}}}
or whatever ports ST is listening on.
I don't think anyone has ever suggested this before. Off the top of my head I don't see any reason why it wouldn't work.
I have thought about, but haven't tested, using multiple Bridge lines with different fake addresses. They would in fact all be going to the same websocket bridge, but they would be going through different flash proxies (unknown in advance).
{{{
ClientTransportPlugin websocket exec ./flashproxy-client --register
UseBridges 1
Bridge websocket 0.0.1.0:1
Bridge websocket 0.0.1.1:1
Bridge websocket 0.0.1.2:1
Bridge websocket 0.0.1.3:1
Bridge websocket 0.0.1.4:1
}}}
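A configuration like the one above could be generated rather than typed by hand. This Python sketch produces the same kind of consecutive fake addresses; it is untested against Tor, like the idea itself, and `fake_bridge_lines` is a hypothetical helper.

```python
import ipaddress


def fake_bridge_lines(count, transport="websocket", base="0.0.1.0", port=1):
    """Return Bridge lines for `count` consecutive fake addresses."""
    start = ipaddress.IPv4Address(base)
    # Adding an integer to an IPv4Address yields the next addresses in order.
    return [f"Bridge {transport} {start + i}:{port}" for i in range(count)]
```

Calling `fake_bridge_lines(5)` gives the five Bridge lines shown in the block above.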