Opened 5 years ago

Closed 2 years ago

#13790 closed enhancement (implemented)

Refactor and add comments to new_route_len()

Reported by: dgoulet Owned by: catalyst
Priority: Low Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: 026-deferrable, tor-03-unspecified-201612
Cc: Actual Points:
Parent ID: Points:
Reviewer: dgoulet Sponsor:

Description

(function in src/or/circuitbuild.c)

At the moment, this function is kind of scary and it should be refactored and heavily documented to avoid future confusion. (This ticket exists because two different developers at different time span over a week expressed concerns about the behavior of HS and this function).

Bit of info on new_route_len(), this function is called to possibly extend a circuit with an extra hop that matches specific purpose(s). Right now it's testing the given purpose if it is *not* ESTABLISH_INTRO and TESTING, it will extend it to 4 hops (with exit information).

However, there are other purposes that do NOT need 4 hops. Fortunately, it seems that there are no code paths that end up calling this function with exit info and a purpose that should not be extended (investigated by special, sysrqb, asn and me). But, we all agree that this is very fragile thus the purpose condition in this function should be applied on only purposes that need 4 hops.

Furthermore, add more comments to make sure no more confusion happens with that fairly important function.

Child Tickets

Change History (31)

comment:1 Changed 5 years ago by nickm

Milestone: Tor: 0.2.6.x-final

comment:2 Changed 5 years ago by nickm

Keywords: 026-deferrable added
Priority: normalminor

comment:3 Changed 5 years ago by nickm

Milestone: Tor: 0.2.6.x-finalTor: 0.2.???

Defer some items from 0.2.6. They are mostly worth doing, but not going to happen super-fast.

comment:4 Changed 3 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:5 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:6 Changed 3 years ago by catalyst

Severity: Normal

Where can I find additional background information about what circuits require 4 hops? So far it seems that some but not all of the circuits used in hidden service rendezvous use 4-hop circuits, but most other circuits use 3 hops.

comment:7 Changed 3 years ago by nickm

To a first approximation: you should make 3 hops of your own. So, when the last hop is one that somebody else picked, you should pick 4 hops, so that there will be 3 hops that you picked.

I think this occurs:

  • when a hidden service client connects to an introduction point (which the hidden service picked)
  • when a hidden service itself connects to a rendezvous point (which the client picked).
  • when a hidden service or a hidden service client connects to the hidden service directory (which is determined by the current date and the HS public key)

(The above is from memory; if the code disagrees with me, I might be wrong!)

comment:8 Changed 3 years ago by catalyst

If I look at https://lists.torproject.org/pipermail/tor-dev/2015-October/009763.html I still sense some terminology ambiguity when referring to hops. Counting OR nodes is probably best rather than counting edges. In a normal circuit where a client connects to a public address, there are 3 OR nodes, right? In a normal hidden service circuit there are 2 OR nodes between the client and the rendezvous point (3 ORs total counting the RP) and 2 OR nodes between the hidden service and the RP (3 counting the RP itself) right? Or am I misinterpreting something?

comment:9 Changed 3 years ago by catalyst

Owner: set to catalyst
Status: newaccepted

comment:10 Changed 3 years ago by catalyst

I guess one thing I haven't completely figured out yet is what exactly the return value of new_route_len() is counting -- is it onion routers?

comment:11 Changed 3 years ago by catalyst

Which is worse: a path that should be 4 hops becoming 3 hops, or a path that should be 3 hops becoming 4 hops? I would think the first is worse because it could compromise anonymity, while the second wastes resources, increases latency, and decreases reliability. Does that sound right? Or does using a longer path than is necessary introduce some security risk that I'm missing?

comment:12 in reply to:  11 ; Changed 3 years ago by dgoulet

I can confirm that nickm is right, the 3 cases mentioned are the one where we need at least 3 hops that tor picked.

Replying to catalyst:

Which is worse: a path that should be 4 hops becoming 3 hops, or a path that should be 3 hops becoming 4 hops? I would think the first is worse because it could compromise anonymity, while the second wastes resources, increases latency, and decreases reliability. Does that sound right? Or does using a longer path than is necessary introduce some security risk that I'm missing?

A circuit that was suppose to be 4 hops but is 3 hops is _bad_. A 3 hops that instead become a 4 hops is "OK" but . Actually, that can happen in the code with circuit cannibalization which is when tor look for an existing circuits and then just re-extend to a 4 hops to the endpoint it's trying to connect (intro point, Exit, rendezvous point, ...).

I believe tor has a hard limit of 8 hops for a circuit (#define MAX_RELAY_EARLY_CELLS_PER_CIRCUIT 8). I'm not aware of security risk of having long circuits like that but it definitely is very bad on the user experience and overall load of the network.

Finally, to your question in comment:8, consider 1 hop to be 1 relay to go through. So if a client wants to open a circuit to an Exit node for instance (most common use case), you'll count Guard -> Middle -> Exit thus 3 hops.

For an hidden service connecting to a rendezvous point, you count Guard -> Middle -> Middle -> RP. The service picks 3 hops and then extends to a fourth one.

comment:13 Changed 3 years ago by catalyst

Below are my rough notes including analysis of each call chain to new_route_len(). The new code works and passes make check but I'm still working on making both the new and old new_route_len() unit testable.

Summary

Current behavior is for new_route_len() to return DEFAULT_ROUTE_LEN
if there is no chosen exit node (exit_ei == NULL). If there is a
chosen exit node (exit_ei != NULL), return DEFAULT_ROUTE_LEN + 1
unless purpose == CIRCUIT_PURPOSE_TESTING or
purpose == CIRCUIT_PURPOSE_S_ESTABLISH_INTRO,
in which case still return DEFAULT_ROUTE_LEN.

New behavior is the same except for some logging and a nonfatal
assertion if we get an unexpected purpose with a chosen exit.

Most of the complexity related to new_route_len() actually lies in
its callers

New behavior

When there's no chosen exit node (exit_ei == NULL), always return
DEFAULT_ROUTE_LEN.

When there is a chosen exit node (exit_ei != NULL), explicitly
return (DEFAULT_ROUTE_LEN + 1) for:

  • CIRCUIT_PURPOSE_C_GENERAL: connections to a hidden service directory
  • CIRCUIT_PURPOSE_C_INTRODUCING: client connecting to an introduction point -- the hidden service chose the introduction point so could possibly collude with it to unmask the client
  • CIRCUIT_PURPOSE_S_CONNECT_REND: hidden service connecting to a rendezvous point -- the client could collude with the rendezvous point to unmask the hidden service

but return DEFAULT_ROUTE_LEN for:

  • CIRCUIT_PURPOSE_S_ESTABLISH_INTRO (as in old code) -- the hidden service chose the introduction point, so doesn't need an extra hop
  • CIRCUIT_PURPOSE_TESTING (as in old code)

For purposes not explicitly handled, log a warning but return
(DEFAULT_ROUTE_LEN + 1). This defaults to a longer path for
purposes not explicitly handled, which is safer but possibly wasteful.

Other future work

circuit_get_open_circ_or_launch() could use refactoring: it's a
complicated 300+ line function that tail-calls itself from multiple
places.

Details

new_route_len() is a static function that currently returns
DEFAULT_ROUTE_LEN unless

     (exit_ei &&
      purpose != CIRCUIT_PURPOSE_TESTING &&
      purpose != CIRCUIT_PURPOSE_S_ESTABLISH_INTRO)

in which case it returns DEFAULT_ROUTE_LEN + 1.

The only caller of new_route_len() is static function whose only
caller is onion_pick_cpath_exit(), which doesn't call it unless
!state->onehop_tunnel.

The only caller of onion_pick_cpath_exit() is
circuit_establish_circuit() in src/or/circuitbuild.c.

The only apparent caller of circuit_establish_circuit() is
circuit_launch_by_extend_info() in src/or/circuituse.c

Callers of circuit_launch_by_extend_info() are:

  • circuit_launch() in src/or/circuituse.c, which passes exit_ei=NULL
  • circuit_get_open_circ_or_launch() static function in src/or/circuituse.c
  • rend_service_receive_introduction() in src/or/rendservice.c, which passes purpose=CIRCUIT_PURPOSE_S_CONNECT_REND, exit_ei=rp
  • rend_service_relaunch_rendezvous() in src/or/rendservice.c, which passes purpose=CIRCUIT_PURPOSE_S_CONNECT_REND, exit_ei=oldstate->chosen_exit
  • rend_service_launch_establish_intro() in src/or/rendservice.c, which passes purpose=CIRCUIT_PURPOSE_S_ESTABLISH_INTRO, exit_ei=launch_ei; the current new_route_len() explicitly excludes it
  • consider_testing_reachability() in src/or/router.c, which passes purpose=CIRCUIT_PURPOSE_TESTING, exit_ei=extend_info_from_router(me); the current new_route_len() explicitly excludes it

circuit_launch() always passes exit_ei=NULL.

This leaves as possible 4-hop purposes
CIRCUIT_PURPOSE_S_CONNECT_REND (from rendservice.c callers) and
whatever purposes circuit_get_open_circ_or_launch() passes in.

circuit_get_open_circ_or_launch() is a complicated 300+ line static
function and tail-calls itself from two places.

circuit_get_open_circ_or_launch() sets extend_info if
desired_circuit_purpose is CIRCUIT_PURPOSE_C_INTRODUCE_ACK_WAIT or
CIRCUIT_PURPOSE_C_GENERAL.

circuit_get_open_circ_or_launch() rewrites
CIRCUIT_PURPOSE_C_INTRODUCE_ACK_WAIT to
new_circ_purpose=CIRCUIT_PURPOSE_C_INTRODUCING

The only non-self caller of circuit_get_open_circ_or_launch() is
connection_ap_handshake_attach_circuit().

connection_ap_handshake_attach_circuit() uses a purpose of
CIRCUIT_PURPOSE_C_GENERAL, CIRCUIT_PURPOSE_C_REND_JOINED, or
CIRCUIT_PURPOSE_C_INTRODUCE_ACK_WAIT.

The only non-self caller of connection_ap_handshake_attach_circuit()
seems to be connection_ap_attach_pending(), which is dispatched by
some event-oriented stuff.

A purpose of CIRCUIT_PURPOSE_C_GENERAL with a chosen exit can happen
due to hsdir lookup: if use_begindir == 1 in
connection_ap_make_link(), conn->chosen_exit_name gets set.

The call chain for connection_ap_make_link() that sets
use_begindir=1 is

  • directory_get_from_hs_dir()
  • directory_initiate_command_routerstatus_rend()
  • directory_initiate_command_rend()
  • connection_ap_make_link()

comment:14 Changed 3 years ago by catalyst

Status: acceptedneeds_review

Pushed a proposed series of commits to branch bug13790 at https://gitlab.com/cdlxxvi/tor.git

comment:15 Changed 3 years ago by nickm

Milestone: Tor: unspecifiedTor: 0.3.1.x-final

comment:16 in reply to:  12 Changed 3 years ago by arma

Replying to dgoulet:

I'm not aware of security risk of having long circuits like that but it definitely is very bad on the user experience and overall load of the network.

For posterity: there actually is a subtle anonymity risk that using a longer-than-needed path can introduce. Let's say we have a choice between using 3-hop circuits in Tor and using 6-hop circuits in Tor, and we have an adversary who runs some relays and tries to learn what destinations each user goes to.

For illustration, let's assume we don't use the entry guard design. The adversary's strategy is "if any of your relays are on a circuit, but you don't own the first and last relay on that circuit, then fail the circuit (and hope they try building a new one)." In just a passive attack, an adversary who runs 10% of the network has a .1*.1=0.01 chance of being able to learn your destinations, but with this strategy, it's better than .01 -- and the longer the path length, the more likely the adversary is to be in a position to tear down circuits he can't beat, so the bigger the success rate gets.

See https://www.freehaven.net/anonbib/#ccs07-doa for details.

Now, the entry guard design complicates the attack, because no matter how many times the attacker tears down the circuit, the user isn't going to budge from their guard. Maybe entry guards resolve the attack entirely, because either the attacker doesn't own the guard and they can't win, or they do own the guard, and then they're on the circuit no matter how many hops the circuit has?

Actually, maybe using *one* entry guard resolves the attack, because if you have say three guards, then the attack still works, it's just much more limited? (If the attacker controls some but not all of your guards, then he can do the attack to shift your circuits more onto the guards he controls.)

In any case, there's your potential anonymity implication to longer circuits. So it isn't *just* higher latency for users and more load on the network that makes us want to avoid them.

comment:17 Changed 3 years ago by nickm

Keywords: review-group-16 added

comment:18 Changed 3 years ago by dgoulet

Reviewer: dgoulet

comment:19 Changed 3 years ago by dgoulet

Status: needs_reviewneeds_revision

Thanks arma! It should go in the "Why Tor is like that?" FAQ :)

Ok, I went over the patch. It's pretty solid. I made a small comment on the Gitlab about adding a BUG() macro in case a purpose can't be handled so we can get a stacktrace instead of a simple log line. With that, I'm confident this is merge_ready

comment:20 Changed 3 years ago by nickm

Keywords: review-group-16 removed

comment:21 Changed 2 years ago by catalyst

I responded to the GitLab comment and will soon push a rebased and revised patch series. Hopefully it won't make the comment thread vanish.

comment:22 Changed 2 years ago by catalyst

Comment thread is still visible at https://gitlab.com/cdlxxvi/tor/commit/11ffddf92a1aec121d6685d99b5306839f99b282#note_26287570 if it turns out to be hard to find after my recent force-push.

comment:23 Changed 2 years ago by catalyst

Status: needs_revisionneeds_review

comment:24 Changed 2 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

This looks good to me; I'm merging it. (I had to rebase first since it seems to have picked up a copy of ef4c10fb42e796a7a166 as 4c97157e8963934698d. No worries; worse things happen in C.)

comment:25 Changed 2 years ago by dgoulet

This test could benefit from having an expected msg modification (not failing just verbose):

circuitbuild/unhandled_exit: Apr 03 12:06:20.192 [warn] route_len_for_purpose(): Bug: Unhandled purpose 19 with a chosen exit; assuming routelen 4. (on Tor 0.3.1.0-alpha-dev 67c88fd10dd74d03)

comment:26 Changed 2 years ago by catalyst

dgoulet: Do you mean like using expect_log_msg_containing() to keep the test log cleaner? (and maybe to verify that those additional details get logged?)

comment:27 in reply to:  26 Changed 2 years ago by dgoulet

Replying to catalyst:

dgoulet: Do you mean like using expect_log_msg_containing() to keep the test log cleaner? (and maybe to verify that those additional details get logged?)

Yes! Precisely.

(Sorry, I wanted to offer a patch but I'm kind of physically unable right now so I noted at least the issue for later ref.)

comment:28 Changed 2 years ago by catalyst

Resolution: implemented
Status: closedreopened

comment:29 Changed 2 years ago by catalyst

Status: reopenedneeds_review

comment:30 Changed 2 years ago by dgoulet

Status: needs_reviewmerge_ready

lgtm! Thanks!

comment:31 Changed 2 years ago by nickm

Resolution: implemented
Status: merge_readyclosed

merged that too

Note: See TracTickets for help on using tickets.