Opened 6 years ago

Closed 14 months ago

#3875 closed project (fixed)

TBB's Firefox should use optimistic data socks handshake variant

Reported by: arma Owned by: mikeperry
Priority: High Milestone: TorBrowserBundle 2.3.x-stable
Component: Applications/Tor Browser Version:
Severity: Normal Keywords: tbb-performance roundtrip MikePerry201304 tbb-bounty SponsorF20131101
Cc: iang@…, t55wang@…, karsten Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Tor Proposal 181 lets the Tor client save a round-trip if the application speaks socks in a special way. In short, the application needs to send its data before hearing that the socks connection was successful. It's supported as of Tor 0.2.3.3-alpha.

Ian originally had suggested to hack polipo to use a modified socks handshake. With polipo out of the picture for TBB, we should make Firefox itself do it.

Is this something Torbutton should (could) do, or should we patch the Firefox we include in TBB?

Child Tickets

Attachments (2)

torbrowser-optimistic-data.patch (2.7 KB) - added by t55wang 5 years ago.
Optimistic Data patch for Tor Browser.
tor-optimistic-data.patch (2.6 KB) - added by t55wang 5 years ago.
Optimistic Data patch for Tor.

Download all attachments as: .zip

Change History (37)

comment:1 Changed 6 years ago by mikeperry

Milestone: TorBrowserBundle 2.3.x-stable

We will likely need to patch firefox. Also, my guess is that various disparate pieces of Firefox networking stack are not going to like being told it can send data before getting certain types of error codes. This might be a huge pain in the ass.

Therefore, we should first test a simple hack that has Tor always respond with a connection success to see what the user experience is if SOCKS hangs up in the case of errors rather than mucking around with allowing data to be sent before an error comes back. If the UI doesn't reflect this difference in a way the user is likely to understand in the first place, it seems like there's little point in adding the complexity.

comment:2 Changed 6 years ago by arma

Parent ID: #3890

comment:3 Changed 6 years ago by iang

Cc: iang@… added

comment:4 Changed 6 years ago by mikeperry

Component: TorBrowserButtonTor Browser
Priority: normalmajor

comment:5 in reply to:  1 ; Changed 5 years ago by arma

Keywords: performance roundtrip added

Replying to mikeperry:

We will likely need to patch firefox. Also, my guess is that various disparate pieces of Firefox networking stack are not going to like being told it can send data before getting certain types of error codes. This might be a huge pain in the ass.

Ok. Might be a good argument for sticking an http proxy shim back in. Not that we want to maintain one or have it take up space in the bundle.

Therefore, we should first test a simple hack that has Tor always respond with a connection success to see what the user experience is if SOCKS hangs up in the case of errors rather than mucking around with allowing data to be sent before an error comes back. If the UI doesn't reflect this difference in a way the user is likely to understand in the first place, it seems like there's little point in adding the complexity.

I don't understand this last part. The expected gain is not in UI responsiveness. It's in actually getting the pages faster.

I wonder if a patch to polipo would be simpler, and a good way to test out how much of a difference the optimistic data feature makes.

comment:6 in reply to:  5 ; Changed 5 years ago by mikeperry

Replying to arma:

Therefore, we should first test a simple hack that has Tor always respond with a connection success to see what the user experience is if SOCKS hangs up in the case of errors rather than mucking around with allowing data to be sent before an error comes back. If the UI doesn't reflect this difference in a way the user is likely to understand in the first place, it seems like there's little point in adding the complexity.

I don't understand this last part. The expected gain is not in UI responsiveness. It's in actually getting the pages faster.

What I meant was instead of creating a shim or changing *anything* on the Firefox side, let's just test what happens with Tor Browser as it is today if the Tor SOCKS handshake always responds with success to connection attempts, only to close later if the data doesn't actually make it.

This test requires no change to Firefox to carry out.

If the user experience is not substantially different from what it is without the SOCKS change on the Tor side, there is no reason to devote any additional development effort either to change Firefox, or to introduce a shim. We can just deploy the feature immediately.

If there are differences in the user experience, we can perhaps just correct those to make them more uniform. In fact, the error messages Firefox currently gives on connection failure are already pretty uniform, regardless of the type of failure.

Is optimistic data already on by default for Tor 0.2.3.x? If I point my TBB at a 0.2.3.x tor client, can I try this test today?

I wonder if a patch to polipo would be simpler, and a good way to test out how much of a difference the optimistic data feature makes.

While I have a penchant for distrusting academic results, I see no reason to doubt Ian's own data in this case. It seems obvious to me there will be significant improvement in load time by eliminating the round trip for connection setup.

In fact, I was actually a little embarrassed I didn't think of it myself, having worked at a company pretty much exclusively devoted to removing round trips from braindead protocols.

comment:7 in reply to:  6 Changed 5 years ago by arma

Replying to mikeperry:

What I meant was instead of creating a shim or changing *anything* on the Firefox side, let's just test what happens with Tor Browser as it is today if the Tor SOCKS handshake always responds with success to connection attempts, only to close later if the data doesn't actually make it.

This test requires no change to Firefox to carry out.

Oh! So a Tor client patch you mean. I opened #5915 with what I think you meant.

comment:8 Changed 5 years ago by iang

Tao's actually working on this as we speak. He's trying both the "tiny change to FF" and the "just modify the OP" approaches. If either of you will be at Oakland, he'll be here too.

comment:9 in reply to:  6 Changed 5 years ago by arma

Replying to mikeperry:

Is optimistic data already on by default for Tor 0.2.3.x? If I point my TBB at a 0.2.3.x tor client, can I try this test today?

You'll need to set "OptimisticData 1" in your torrc. And I don't know how much it's been tested. But it is supposed to be working, yes.

comment:10 in reply to:  8 ; Changed 5 years ago by arma

Replying to iang:

Tao's actually working on this as we speak. He's trying both the "tiny change to FF" and the "just modify the OP" approaches. If either of you will be at Oakland, he'll be here too.

Ian:

I forgot to bring this up with him at Oakland.

Any news?

comment:11 in reply to:  10 ; Changed 5 years ago by t55wang

Replying to arma:

Replying to iang:

Tao's actually working on this as we speak. He's trying both the "tiny change to FF" and the "just modify the OP" approaches. If either of you will be at Oakland, he'll be here too.

Ian:

I forgot to bring this up with him at Oakland.

Any news?

I've implemented preliminary versions of both. Right now, the "tiny change to FF" approach works except for the SSL handshake, which is to say it does not work. The "modifying OP" approach works, but doesn't deal with some errors very well yet. I will keep you updated when either or both approaches do in fact work.

comment:12 in reply to:  11 ; Changed 5 years ago by mikeperry

Replying to t55wang:

I've implemented preliminary versions of both. Right now, the "tiny change to FF" approach works except for the SSL handshake, which is to say it does not work. The "modifying OP" approach works, but doesn't deal with some errors very well yet. I will keep you updated when either or both approaches do in fact work.

How exactly does the SSL handshake fail, and in what cases?

And for the OP case, what does "not dealing with some errors very well" mean in practice? The user experience for errors is already pretty bad. I think in practice we can ignore most UX issues with the errors since the perf benefit for the good case is so clear.

Also, can you attach patches as well? It will make it easier for us to make suggestions/decisions.

Changed 5 years ago by t55wang

Optimistic Data patch for Tor Browser.

Changed 5 years ago by t55wang

Attachment: tor-optimistic-data.patch added

Optimistic Data patch for Tor.

comment:13 in reply to:  12 ; Changed 5 years ago by t55wang

Replying to mikeperry:

Replying to t55wang:

I've implemented preliminary versions of both. Right now, the "tiny change to FF" approach works except for the SSL handshake, which is to say it does not work. The "modifying OP" approach works, but doesn't deal with some errors very well yet. I will keep you updated when either or both approaches do in fact work.

How exactly does the SSL handshake fail, and in what cases?

And for the OP case, what does "not dealing with some errors very well" mean in practice? The user experience for errors is already pretty bad. I think in practice we can ignore most UX issues with the errors since the perf benefit for the good case is so clear.

These things have been fixed. For the SSL handshake, I told it to stop polling after one round and I shouldn't have. For the errors, I meant that DNS resolution failure returns "connection has been reset" to the Browser, which seems confusing, so I wrote an error message explaining what happened (copying from the original "Tor is not an HTTP proxy" message) for that case.

Also, can you attach patches as well? It will make it easier for us to make suggestions/decisions.

I have included patches for both the Tor Browser and Tor. Both are working, but performance measurements are underway and should be done by tomorrow.

The Tor Browser fix is slightly more involved, as it introduces a new state to the connect process (marking where the GET request can be sent) and changes the behavior of the SOCKS handshake. I am not sure if there are unforeseen consequences of flipping the handshake on its head - everything I've tried has worked so far. The Tor fix is much simpler (the core is literally one line), but there are some potential problems. We did a "hack" where Tor checks if the port we're trying to talk to is port 80, and only displays the specific HTTP error message (for DNS resolution failure) in that situation. There's also the problem that Tor is essentially lying to Firefox by telling it that the socks connect has been accepted by the end server, when it hasn't been.

comment:14 Changed 5 years ago by t55wang

Cc: t55wang@… added

comment:15 in reply to:  13 ; Changed 5 years ago by mikeperry

Replying to t55wang:

Replying to mikeperry:

Replying to t55wang:

I've implemented preliminary versions of both. Right now, the "tiny change to FF" approach works except for the SSL handshake, which is to say it does not work. The "modifying OP" approach works, but doesn't deal with some errors very well yet. I will keep you updated when either or both approaches do in fact work.

How exactly does the SSL handshake fail, and in what cases?

And for the OP case, what does "not dealing with some errors very well" mean in practice? The user experience for errors is already pretty bad. I think in practice we can ignore most UX issues with the errors since the perf benefit for the good case is so clear.

These things have been fixed. For the SSL handshake, I told it to stop polling after one round and I shouldn't have. For the errors, I meant that DNS resolution failure returns "connection has been reset" to the Browser, which seems confusing, so I wrote an error message explaining what happened (copying from the original "Tor is not an HTTP proxy" message) for that case.

Ah yeah, for DNS typos that's pretty sad... Random thought: On the Firefox side, we could simply retry with optimistic data disabled before actually displaying any errors... The error path would then be like 2X as slow but correct, but the common case would be still fast. Or, we could sink the work into figuring out how to communicate the error belatedly through some ungodly side channel. I would not be opposed to either of these approaches, if one or both are easy...

However, looking at your change to Tor, it looks like we can still give the exact error condition to the user anyways.. Unless that's what you mean wrt the port 80 hack? I guess the error message gets eaten by Firefox if the DNS failure was for a 443 connect? Oh, ew, also, we'd have to localize that message in tor either way, and I think tor currently has no localization support at all for any messages :/.

Also, can you attach patches as well? It will make it easier for us to make suggestions/decisions.

I have included patches for both the Tor Browser and Tor. Both are working, but performance measurements are underway and should be done by tomorrow.

The Tor Browser fix is slightly more involved, as it introduces a new state to the connect process (marking where the GET request can be sent) and changes the behavior of the SOCKS handshake. I am not sure if there are unforeseen consequences of flipping the handshake on its head - everything I've tried has worked so far. The Tor fix is much simpler (the core is literally one line), but there are some potential problems. We did a "hack" where Tor checks if the port we're trying to talk to is port 80, and only displays the specific HTTP error message (for DNS resolution failure) in that situation. There's also the problem that Tor is essentially lying to Firefox by telling it that the socks connect has been accepted by the end server, when it hasn't been.

Do you know off-hand if we can grow the SOCKS state machine easily to do a (possibly synthetic?) retry-on-error, or is that likely to involve some higher-level Firefox logic or side channel?

comment:16 in reply to:  15 ; Changed 5 years ago by t55wang

Replying to mikeperry:

Replying to t55wang:

Replying to mikeperry:

Replying to t55wang:

I've implemented preliminary versions of both. Right now, the "tiny change to FF" approach works except for the SSL handshake, which is to say it does not work. The "modifying OP" approach works, but doesn't deal with some errors very well yet. I will keep you updated when either or both approaches do in fact work.

How exactly does the SSL handshake fail, and in what cases?

And for the OP case, what does "not dealing with some errors very well" mean in practice? The user experience for errors is already pretty bad. I think in practice we can ignore most UX issues with the errors since the perf benefit for the good case is so clear.

These things have been fixed. For the SSL handshake, I told it to stop polling after one round and I shouldn't have. For the errors, I meant that DNS resolution failure returns "connection has been reset" to the Browser, which seems confusing, so I wrote an error message explaining what happened (copying from the original "Tor is not an HTTP proxy" message) for that case.

Ah yeah, for DNS typos that's pretty sad... Random thought: On the Firefox side, we could simply retry with optimistic data disabled before actually displaying any errors... The error path would then be like 2X as slow but correct, but the common case would be still fast. Or, we could sink the work into figuring out how to communicate the error belatedly through some ungodly side channel. I would not be opposed to either of these approaches, if one or both are easy...

However, looking at your change to Tor, it looks like we can still give the exact error condition to the user anyways.. Unless that's what you mean wrt the port 80 hack? I guess the error message gets eaten by Firefox if the DNS failure was for a 443 connect? Oh, ew, also, we'd have to localize that message in tor either way, and I think tor currently has no localization support at all for any messages :/.

The error is sort of communicated to the user in this patch I posted - if DNS resolution fails with the Tor fix (not the Tor Browser fix), the user will see a message displayed in the Browser saying something like "Tor has terminated the connection because DNS resolution failed".

Oh, you're right, 443 connects were not considered. I could change it to "port 80 or port 443". Not sure if this is the best solution.

What do you mean by localization?

Also, can you attach patches as well? It will make it easier for us to make suggestions/decisions.

I have included patches for both the Tor Browser and Tor. Both are working, but performance measurements are underway and should be done by tomorrow.

The Tor Browser fix is slightly more involved, as it introduces a new state to the connect process (marking where the GET request can be sent) and changes the behavior of the SOCKS handshake. I am not sure if there are unforeseen consequences of flipping the handshake on its head - everything I've tried has worked so far. The Tor fix is much simpler (the core is literally one line), but there are some potential problems. We did a "hack" where Tor checks if the port we're trying to talk to is port 80, and only displays the specific HTTP error message (for DNS resolution failure) in that situation. There's also the problem that Tor is essentially lying to Firefox by telling it that the socks connect has been accepted by the end server, when it hasn't been.

Do you know off-hand if we can grow the SOCKS state machine easily to do a (possibly synthetic?) retry-on-error, or is that likely to involve some higher-level Firefox logic or side channel?

Currently the SOCKS state machine works this way, as far as I know. During the handshake stage, if anything arrives at the socket, a function catches it during polling and calls the handshake function. The handshake function tries to proceed to the next step of the handshake based on what arrived at the socket and what its current step is. If it succeeds, it will keep trying to go to the next step further until it can't, in which case it informs the polling function to call it again once anything else arrives at the socket. At some step the handshake is finished.

So there is a sort of retry-on-error in there, the handshake function will keep trying until the handshake function itself declares that the handshake is finished (either a complete success or a complete failure). There are only a few conditions which trigger a complete failure, like a bad SOCKS version number or an unreadable destination address. So, what the handshake function understands as a complete failure can indeed be tampered with. What are we considering?

I have done some straightforward measurements. I measured the time taken between the SOCKS request and the HTTP 200 OK reply for a given site. After ten measurements Tor with optimistic data had an average time of 0.56 seconds with a standard deviation of 0.14 seconds, whereas Tor without optimistic data had an average time of 0.80 seconds with a standard deviation of 0.11 seconds.

If I measure the time between the GET request and the HTTP 200 OK reply for the same site, the average would be 0.56 seconds with a standard deviation of 0.09 seconds for Tor without optimistic data, and the same measurement would be unchanged for Tor with optimistic data (logically).

comment:17 in reply to:  16 Changed 5 years ago by iang

Replying to t55wang:

Oh, you're right, 443 connects were not considered. I could change it to "port 80 or port 443". Not sure if this is the best solution.

You can't do it for 443, since FF is expecting to receive an SSL handshake, not an HTTP response.

What do you mean by localization?

Other languages.

In my opinion, the "patch the FF SOCKS state machine" approach is far superior to the "patch Tor to lie about it connectedness state and then do some hackery if it's caught out in that lie" approach.

comment:18 Changed 5 years ago by t55wang

Is there anything else I can do to get this into the latest browser bundle so that tor clients can start using optimistic data?

comment:19 Changed 5 years ago by mikeperry

Keywords: MikePerry201208 added

I guess I got confused by your above comments in response to the behavior of the browser patch (which given the localization and SSL issues does seem better than the Tor patch). Just to clarify your above comments: The DNS failure/typo case for the browser patch results in a "Connection reset" message with the current browser SOCKS patch?

If so, we might want to come up with some sort of way to retry failed requests like that without optimistic data... You say the SOCKS state machine already retries for that failure case, though? That seems weird to me...

comment:20 Changed 5 years ago by mikeperry

Keywords: MikePerry201210 added; MikePerry201208 removed

comment:21 Changed 5 years ago by mikeperry

Keywords: tbb-bounty added

comment:22 in reply to:  19 Changed 5 years ago by t55wang

Replying to mikeperry:

I guess I got confused by your above comments in response to the behavior of the browser patch (which given the localization and SSL issues does seem better than the Tor patch). Just to clarify your above comments: The DNS failure/typo case for the browser patch results in a "Connection reset" message with the current browser SOCKS patch?

If so, we might want to come up with some sort of way to retry failed requests like that without optimistic data... You say the SOCKS state machine already retries for that failure case, though? That seems weird to me...

In Tor Firefox, before my modification, you'd get a "Connection Reset" message. After my modification, you'd get a "Unable to Connect" message, after two or three seconds (when I tried it). In original Firefox, not the Tor browser, you'd get a "Server not found" message. (this makes a bit of sense when you consider that in optimistic data we sort of assume the server always agreed to the SOCKS connection, so it can't be "not found")

Retrying without optimistic data sounds interesting, but I'm not exactly sure why. Is there a situation under which Tor browser with optimistic data would not be able to find the server but Tor browser without it would?

I misunderstood when I replied earlier; what I meant is that the SOCKS state machine has a number of retry mechanisms when certain things fail (one of which I manipulated to get optimistic data in), For your particular case, it does not in my patch. The browser sends the GET request optimistically and any failure is simply reflected as a failure of the GET request.

I think it may be possible to both match the error messages perfectly and force a retry with optimistic data by playing with the state machine more cleverly. I'm not sure about this though. I haven't managed to do so, and this seems to work anyway.

comment:23 Changed 5 years ago by mikeperry

Keywords: MikePerry201211 added; MikePerry201210 removed

"Unable to connect" might be general enough that it can work. I guess I need to just test this patch in a TBB-alpha build and decide for myself. Hopefully I can find the time to do that soon.

comment:24 Changed 5 years ago by mikeperry

Keywords: MikePerry201212 added; MikePerry201211 removed
Status: newneeds_review

comment:25 Changed 5 years ago by mikeperry

Keywords: MikePerry201301 added; MikePerry201212 removed

comment:26 Changed 5 years ago by mikeperry

Keywords: MikePerry201302 added; MikePerry201301 removed

comment:27 Changed 5 years ago by arma

Keywords: SponsorF20131101 added

comment:28 Changed 5 years ago by arma

Type: enhancementproject

(Apparently only Projects get listed in trac tables. Move along, nothing to see here.)

comment:29 Changed 5 years ago by mikeperry

Keywords: MikePerry201304 added; MikePerry201302 removed

Man, hopefully by April things have calmed down enough for me to have time to look at this :/.

comment:30 Changed 5 years ago by karsten

Cc: karsten added

comment:31 Changed 5 years ago by mikeperry

arma says Optimistic Data should be on by default in the tor client shipped with TBB-stable (if used with 0.2.3.x exits). But, if it's easy to point this beast at a 0.2.4.x tor client for #8184, I should do that too.

comment:32 Changed 5 years ago by mikeperry

Ok, after much delay (sorry about that) I have a build working with this patch, which I have stored in mikeperry/2.4-next until we get the next TBB release out.

Based on Wireshark, it seems to be working just fine! The strange thing is that it *also* seemed to work just fine with an 0.2.2.x Tor client...

I'll drive this around for the next few days and see what I think about the various error messages. Any other tips to verify that Optimistic Data is actually happening all the way to the exit relay would be helpful too.

comment:33 Changed 5 years ago by mikeperry

Resolution: fixed
Status: needs_reviewclosed

I merged this for the next TBB alpha and stable release (I merged it to both because we don't have the ability to have separate patches for TBB-alpha and TBB-stable at the moment).

I've been running this for a while against an 0.2.4.x client and I've been getting successful optimistic data messages in my logs, and I haven't seen any strange errors.

I am a little worried because other things than just SOCKS use nsISocketTransport (including Torbutton's control port activity). I haven't seen any obvious issues so far, though.

I'll keep an eye on it in debug builds for a while longer, and look into the overall behavior a bit closer for #8184.

But it seems good so far! Thanks!

comment:34 Changed 15 months ago by cypherpunks

Component: Firefox Patch IssuesApplications/Tor Browser
Parent ID: #3890
Resolution: fixed
Severity: Normal
Status: closedreopened
Summary: TBB's Firefox should use optimistic data socks handshake variantTBB's Firefox shouldn't use optimistic data socks handshake variant
Type: projectdefect

Tor Browser shouldn't use optimistic data socks handshake variant as totally incompatible with Firefox code and total broken.

comment:35 Changed 14 months ago by bugzilla

Keywords: tbb-performance added; performance removed
Resolution: fixed
Status: reopenedclosed
Summary: TBB's Firefox shouldn't use optimistic data socks handshake variantTBB's Firefox should use optimistic data socks handshake variant
Type: defectproject

You've inverted the original meaning of this ticket! Please, don't do that. If you think it's a defect, file a new ticket with some proofs.

Note: See TracTickets for help on using tickets.