Opened 21 months ago

Last modified 5 weeks ago

#18859 merge_ready defect

A new SOCKS connection should use a pre-built circuit for its first stream

Reported by: arthuredelstein Owned by: arthuredelstein
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tbb-performance, tor-client, performance, 031-backport
Cc: intrigeri Actual Points:
Parent ID: Points:
Reviewer: nickm Sponsor:

Description

Since #3455, we use SOCKS auth isolation in Tor Browser to separate URL bar domains to different tor circuits. When the user browses to a new URL bar domain, a new SOCKS connection is opened with a SOCKS username/password unique to the site's domain.

By telneting to the tor control port, I observed that immediately after I entered a new URL bar domain in a Tor Browser tab, a new circuit was built and assigned the SOCK_USERNAME and SOCKS_PASSWORD for that URL bar domain.

It seems there would be better performance if we could use an existing, pre-built (yet-unused) circuit when a new SOCKS connection opens, and assign the SOCKS_USERNAME and SOCKS_PASSWORD to the pre-built circuit. That way the user wouldn't have to wait for a circuit to be established after requesting a new website.

I don't know yet whether this is something that can be adjusted by config settings or if we would need to patch tor somehow.

Child Tickets

Change History (27)

comment:1 Changed 21 months ago by arthuredelstein

Summary: A new SOCKS connection should use a pre-built circuit for its streamA new SOCKS connection should use a pre-built circuit for its first stream

comment:2 Changed 21 months ago by cypherpunks

Do you mean the same as #18122? (Mind the original summary/component.)

comment:3 Changed 21 months ago by arma

Arthur: can you attach an event log (setevents circ stream) here to show the problem?

That way when other people here try to reproduce it, they'll know whether they're seeing what you see or not.

Incidentally, it *is* expected behavior that Tor tries to launch a circuit right around the time you make your stream request. That is because Tor should attach the new stream to the preemptively built circuit, and then very soon after it will notice that it doesn't have enough preemptively built circuits hanging around anymore, and launch a new one.

So one of the first debugging steps is to make sure you're not seeing and misinterpreting that behavior.

comment:4 in reply to:  3 Changed 21 months ago by arthuredelstein

Replying to arma:

Arthur: can you attach an event log (setevents circ stream) here to show the problem?

Here's an example session:

setevents circ stream
250 OK
getinfo circuit-status
250+circuit-status=
1 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings BUILD_FLAGS=ONEHOP_TUNNEL,IS_INTERNAL,NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:03.769809
2 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$A47CF4F0B9AFD005C0A5B67A503158923202BE90~dragon1993,$EC116BCB80565A408CE67F8EC3FE3B0B02C3A065~orion BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:04.809515 SOCKS_USERNAME="--unknown--" SOCKS_PASSWORD="0"
3 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$EF29F69F49FF1C17CBDCFF8E11E10CD9F6B9DF95~raincat,$6207FC9DDE4EC78F45BB24C53C2EEE63DCC2E2B6~PrivacyRepublic BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:05.790793
4 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$385FE4D32D51C03766F3FBD24B108D6FB82C9E36~CDR,$2CF5110C8F31F45737F807F080ABA90A7AE50781~dragon1993 BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY,NEED_UPTIME PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:06.789676
5 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$F60E3B747FE017346005243058FFFE72AA7D59A2~torpoc666,$5937800DEB5B3FDF68BD3392F2AE3A084254FE5F~tor6kryptonit BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY,NEED_UPTIME PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:07.790465
.
250 OK

Now I opened a new tab and entered "https://arthuredelstein.github.io" in the URL bar domain:

650 STREAM 14 NEW 0 arthuredelstein.github.io:443 SOURCE_ADDR=127.0.0.1:52129 PURPOSE=USER
650 STREAM 14 SENTCONNECT 3 arthuredelstein.github.io:443
650 STREAM 14 REMAP 3 23.235.43.133:443 SOURCE=EXIT
650 STREAM 14 SUCCEEDED 3 23.235.43.133:443

Then I waited a few seconds, and opened a new tab and entered "example.com":

650 STREAM 15 NEW 0 example.com:80 SOURCE_ADDR=127.0.0.1:52130 PURPOSE=USER
650 CIRC 6 LAUNCHED BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644
650 CIRC 6 EXTENDED $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
650 CIRC 6 EXTENDED $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$408807EE2ED6C87F921139ABE0C07E84203A8621~MysticOrb BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
650 CIRC 6 EXTENDED $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$408807EE2ED6C87F921139ABE0C07E84203A8621~MysticOrb,$B0964415A5380080570845E7CBFCADF87FDCCE5A~Necto7 BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
650 CIRC 6 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$408807EE2ED6C87F921139ABE0C07E84203A8621~MysticOrb,$B0964415A5380080570845E7CBFCADF87FDCCE5A~Necto7 BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
650 STREAM 15 SENTCONNECT 6 example.com:80
650 STREAM 15 REMAP 6 93.184.216.34:80 SOURCE=EXIT
650 STREAM 15 SUCCEEDED 6 93.184.216.34:80
[snipped more circuit and stream events]

Finally I inspected the list of open circuits again.

getinfo circuit-status
250+circuit-status=
1 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings BUILD_FLAGS=ONEHOP_TUNNEL,IS_INTERNAL,NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:03.769809
2 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$A47CF4F0B9AFD005C0A5B67A503158923202BE90~dragon1993,$EC116BCB80565A408CE67F8EC3FE3B0B02C3A065~orion BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:04.809515 SOCKS_USERNAME="--unknown--" SOCKS_PASSWORD="0"
3 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$EF29F69F49FF1C17CBDCFF8E11E10CD9F6B9DF95~raincat,$6207FC9DDE4EC78F45BB24C53C2EEE63DCC2E2B6~PrivacyRepublic BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:05.790793 SOCKS_USERNAME="arthuredelstein.github.io" SOCKS_PASSWORD="0"
4 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$385FE4D32D51C03766F3FBD24B108D6FB82C9E36~CDR,$2CF5110C8F31F45737F807F080ABA90A7AE50781~dragon1993 BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY,NEED_UPTIME PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:06.789676
5 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$F60E3B747FE017346005243058FFFE72AA7D59A2~torpoc666,$5937800DEB5B3FDF68BD3392F2AE3A084254FE5F~tor6kryptonit BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY,NEED_UPTIME PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:07.790465
6 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$408807EE2ED6C87F921139ABE0C07E84203A8621~MysticOrb,$B0964415A5380080570845E7CBFCADF87FDCCE5A~Necto7 BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
7 BUILT $50586E25BE067FD1F739998550EDDCB1A14CA5B2~Jans BUILD_FLAGS=ONEHOP_TUNNEL,IS_INTERNAL,NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:44:05.795443
.
250 OK

So I saw two behaviors:

  1. arthuredelstein.github.io used a pre-built circuit (3)
  2. example.com did not use a pre-built circuit, but instead a new circuit (6) was launched

I tried browsing to a series of new domains, and my observation is that the majority of the time Behavior 2 happens. (Behavior 1 is of course better for performance.) I guess this could be that the rate of production of pre-built circuits is slower that the rate at which I was browsing to new sites in my tests. So I guess my question is, what determines the rate at which pre-built circuits are built, and could we make a performance improvement if we increased that rate?

Last edited 21 months ago by arthuredelstein (previous) (diff)

comment:5 Changed 21 months ago by bugzilla

Owner: set to nickm
Status: newassigned

Setting Owner temporarily to nickm, because, in general, tickets like this (#18122, #18138, #18229, #18310) are about Integration (e.g. between Tor & TB), and only Nick has architect "ability" to organize this process. (Currently there's even no such option in Component on Trac, and teams prefer to ping-pong such tickets to each other with no progress.)

comment:6 Changed 21 months ago by nickm

Owner: nickm deleted

comment:7 Changed 21 months ago by nickm

Status: assignednew

Removing myself as owner. I can't do anything here until we have an analysis of what to actually try.

what determines the rate at which pre-built circuits are built, and could we make a performance improvement if we increased that rate?

The code is in circuit_predict_and_launch_new, which is called once once per second. We could ramp that up, but it's not a super-cheap function. We could also schedule it to get invoked shortly after the first time any clean circuit becomes dirty.

Also, circuit_predict_and_launch_new seems to have quite a few magic numbers in it to determine the number of circuits to have for hidden services. Adjusting those might get better results.

comment:8 in reply to:  7 Changed 21 months ago by arthuredelstein

Replying to nickm:

Removing myself as owner. I can't do anything here until we have an analysis of what to actually try.

what determines the rate at which pre-built circuits are built, and could we make a performance improvement if we increased that rate?

The code is in circuit_predict_and_launch_new, which is called once once per second. We could ramp that up, but it's not a super-cheap function. We could also schedule it to get invoked shortly after the first time any clean circuit becomes dirty.

Thanks for point out this function. I'll try logging from that function and see if I can understand what is happening.

I doubt that increasing the rate of circuit_predict_and_launch_new calls to more than once per second will help, as I wasn't browsing to new sites more than once every several seconds. I think the issue is more likely related to the algorithm in that function that decides whether a new circuit is needed.

comment:9 Changed 12 months ago by arthuredelstein

Keywords: tbb-performance added

comment:10 Changed 11 months ago by nickm

Component: Core TorCore Tor/Tor

comment:11 Changed 11 months ago by nickm

Milestone: Tor: unspecified
Status: newneeds_information

Did you find anything out?

comment:12 in reply to:  11 Changed 11 months ago by arthuredelstein

Replying to nickm:

Did you find anything out?

I haven't yet -- I was investigating this issue further by using the "ts" command to get timing and realized that the worst bottleneck seems to be not this issue, but #21394. But I will have another look at this one as well.

comment:13 Changed 7 months ago by nickm

Keywords: tor-client needs-diagnosis needs-analysis performance added

comment:14 Changed 6 weeks ago by arthuredelstein

Status: needs_informationneeds_review

It appears the problem here is caused by a bug in handling isolated circuits. Here's a possible patch (please review):

https://github.com/arthuredelstein/tor/commit/18859

comment:15 Changed 6 weeks ago by nickm

Keywords: review-group-27 added
Milestone: Tor: unspecifiedTor: 0.3.2.x-final

comment:16 Changed 6 weeks ago by nickm

Owner: set to arthuredelstein
Status: needs_reviewassigned

setting owner

comment:17 Changed 6 weeks ago by nickm

Status: assignedneeds_review

comment:18 Changed 6 weeks ago by nickm

Reviewer: nickm

comment:19 Changed 6 weeks ago by nickm

Good detective work here figuring this one out!

Two quick questions:

1) How far do you think it makes sense to backport this?

2) What if we allowed circuits with isolation flags set, so long as the isolation flags were compatible with the stream we wanted to attach? In other words,

diff --git a/src/or/circuituse.c b/src/or/circuituse.c
index aa0df95652246a..497d93923ee278 100644
--- a/src/or/circuituse.c
+++ b/src/or/circuituse.c
@@ -1038,6 +1038,9 @@ circuit_stream_is_being_handled(entry_connection_t *conn,
         continue;
       if (origin_circ->unusable_for_new_conns)
         continue;
+      if (origin_circ->isolation_values_set &&
+          !connection_edge_compatible_with_circuit(conn, origin_circ))
+        continue;
 
       exitnode = build_state_get_exit_node(build_state);
       if (exitnode && (!need_uptime || build_state->need_uptime)) {

comment:20 Changed 6 weeks ago by nickm

Status: needs_reviewneeds_revision

(Other code review notes, which I'm totally willing to clean up if you don't feel like it: We make a changes file for every user-visible change in Tor, and we try to base backportable changes on the oldest maint-* branch that they should be backported to.)

comment:21 in reply to:  20 ; Changed 6 weeks ago by arthuredelstein

Status: needs_revisionneeds_review

Replying to nickm, comment 19:

1) How far do you think it makes sense to backport this?

Tor Browser stable is currently using 0.3.1.x, so I think that would be useful.

2) What if we allowed circuits with isolation flags set, so long as the isolation flags were compatible with the stream we wanted to attach? In other words,

Thanks for the suggestion. I added that check, plus a NULL check, because conn is set to NULL when the code path looks like:

#0  circuit_stream_is_being_handled (conn=0x0, port=443, min=2) at src/or/circuituse.c:993
#1  0x00000453de87e352 in circuit_remove_handled_ports (needed_ports=0x453e056b6e0) at src/or/circuituse.c:974
#2  0x00000453de867cfb in circuit_get_unhandled_ports (now=1512599788) at src/or/circuitbuild.c:1674
#3  0x00000453de867d44 in circuit_all_predicted_ports_handled (now=1512599788, need_uptime=0x7ffe6698b060, 
    need_capacity=0x7ffe6698b064) at src/or/circuitbuild.c:1690
#4  0x00000453de87e7cf in needs_exit_circuits (now=1512599788, needs_uptime=0x7ffe6698b060, needs_capacity=0x7ffe6698b064)
    at src/or/circuituse.c:1082
#5  0x00000453de87ea8c in circuit_predict_and_launch_new () at src/or/circuituse.c:1183
#6  0x00000453de87ec98 in circuit_build_needed_circs (now=1512599788) at src/or/circuituse.c:1268
#7  0x00000453de7b0505 in run_scheduled_events (now=1512599788) at src/or/main.c:1443

Replying to nickm, comment 20:

(Other code review notes, which I'm totally willing to clean up if you don't feel like it: We make a changes file for every user-visible change in Tor, and we try to base backportable changes on the oldest maint-* branch that they should be backported to.)

OK! I've added a changes file (feel free to fix that up as needed) and I applied the patch to maint-0.3.1:
https://github.com/arthuredelstein/tor/commit/18859+1

Last edited 6 weeks ago by arthuredelstein (previous) (diff)

comment:22 in reply to:  21 Changed 6 weeks ago by arma

Replying to arthuredelstein:

OK! I've added a changes file (feel free to fix that up as needed)

Good stuff! I'd say this merits a "major bugfixes", not just minor, given the impact on actual users.

comment:23 Changed 6 weeks ago by nickm

(I had accidentally committed an earlier version of this along with my 046acf208bc53a3fa7ea9 commit, and just reverted it with c2c0f83c23986344c4f3fab03)

comment:24 Changed 6 weeks ago by nickm

Keywords: 031-backpot added
Milestone: Tor: 0.3.2.x-finalTor: 0.3.1.x-final
Status: needs_reviewmerge_ready

Thanks! I've re-done the changes file a bit to make it pass lintChanges, to make it "major" per arma's request, and to try to explain the problem from a user's POV. It's now a branch arthuredelstein_18859+1_031 which I'm merging into 0.3.2 and later. If it doesn't cause trouble, it should get backported.

comment:25 Changed 6 weeks ago by nickm

Keywords: needs-diagnosis needs-analysis review-group-27 removed

comment:26 Changed 5 weeks ago by intrigeri

Cc: intrigeri added

comment:27 Changed 5 weeks ago by cypherpunks

Keywords: 031-backport added; 031-backpot removed

Thanks, Arthur!

Note: See TracTickets for help on using tickets.