Opened 6 years ago

Last modified 19 months ago

#7870 needs_revision defect

Retry on a new circuit for more reasons.

Reported by: arma Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-client, nickm-patch, needs-analysis, needs-design
Cc: nickm Actual Points:
Parent ID: Points: medium
Reviewer: Sponsor:

Description

In git commit 388ac4126 I added some code that will never be reached:

  if (rh->length > 0 && edge_reason_is_retriable(reason) &&
[...]
      case END_STREAM_REASON_CONNECTREFUSED:
        if (!conn->chosen_exit_optional)
          break; /* break means it'll close, below */
        /* Else fall through: expire this circuit, clear the
         * chosen_exit_name field, and try again. */
[...]
      case END_STREAM_REASON_TIMEOUT:

since those two reasons aren't included in edge_reason_is_retriable().

One fix would be just to remove those dead code lines.

But another fix would be to include both of those reasons in the is_retriable() list, to better tolerate broken exits -- for example, an exit that sets -j REJECT in an iptable rule, rather than putting everything in its exit policy; or an exit for which the routing to the destination has a loop, causing the ttl to expire and some internet router to send back an icmp unreachable.

The downsides are a) if will take many seconds longer before your connection fails, if it fails, and b) we throw away circuits even more often, since we'll now chew through several circuits every time you attempt a destination that is down or unreachable.

The upside is that users will see fewer false failures, in proportion to how many broken or misconfigured exits there are.

I'm not entirely sure which side of the fence I'm on here. I think I lean slightly towards retrying for these two cases.

Child Tickets

TicketStatusOwnerSummaryComponent
#10478newUse prepend_policy or similar in preference to mark_circuit_unusable_for_new_connsCore Tor/Tor

Change History (36)

comment:1 Changed 6 years ago by arma

If we keep the code, we should apply this patch:

diff --git a/src/or/relay.c b/src/or/relay.c
index f58c5c9..a80cebb 100644
--- a/src/or/relay.c
+++ b/src/or/relay.c
@@ -811,7 +811,7 @@ connection_ap_process_end_not_open(
         /* else, conn will get closed below */
         break;
       case END_STREAM_REASON_CONNECTREFUSED:
-        if (!conn->chosen_exit_optional)
+        if (conn->chosen_exit_name && !conn->chosen_exit_optional)
           break; /* break means it'll close, below */
         /* Else fall through: expire this circuit, clear the
          * chosen_exit_name field, and try again. */

comment:2 Changed 6 years ago by arma

bug (and suggested fix) reported by "oftc_must_be_destroyed" on oftc

comment:3 Changed 5 years ago by cypherpunks

Milestone: Tor: unspecifiedTor: 0.2.4.x-final
Priority: normalmajor
Type: enhancementdefect

It's bug. You never knows what operator tuned or what filters installed. Like all setups by Torservers tuned tcp/ip to try only 2 syn before giveup, depends kernel it means 11s or 3s of timeout.

comment:4 Changed 5 years ago by cypherpunks

Summary: retry on a new circuit if reason timeout or reason connectrefused?Retry on a new circuit if reason timeout or reason connectrefused

comment:5 Changed 5 years ago by nickm

What's the suggested fix on the Tor side? Still the patch above, or something else?

comment:6 Changed 5 years ago by cypherpunks

Need to include END_STREAM_REASON_TIMEOUT and END_STREAM_REASON_CONNECTREFUSED to edge_reason_is_retriable(), with addon of above patch.

comment:7 Changed 5 years ago by cypherpunks

Priority: majorcritical
Summary: Retry on a new circuit if reason timeout or reason connectrefusedRetry on a new circuit if reason timeout or reason connectrefused. ALARM!

connection_ap_process_end_not_open() is totally broken right now.

    if (reason == END_STREAM_REASON_TORPROTOCOL ||
        reason == END_STREAM_REASON_INTERNAL ||
        reason == END_STREAM_REASON_DESTROY) {
      /* All three of these reasons could mean a failed tag
       * hit the exit and it complained. Do not probe.
       * Fail the circuit. */
      circ->path_state = PATH_STATE_USE_FAILED;
      return -END_CIRC_REASON_TORPROTOCOL;

Client should to retry for any end reason, internal, external, any! Client have no abilities to know what and why exit relay sent that reason and distinguish case of valid (for example) refusing tcp connection by target host or filter or attack or anything else.
Keeping of throw away circuits is related but another task, that can't to stop fixing of this task. Right now exit relay fully controlling client, it can to stuck with it, to drop stream/circuit, and to force move to another circuit too anyway.

comment:8 Changed 5 years ago by cypherpunks

Summary: Retry on a new circuit if reason timeout or reason connectrefused. ALARM!Retry on a new circuit if any reason. ALARM!

comment:9 Changed 5 years ago by nickm

Cc: nickm added
Keywords: 024-backport added
Priority: criticalmajor
Summary: Retry on a new circuit if any reason. ALARM!Retry on a new circuit for more reasons.

The right solution here is in branch my branch bug7870_024, but arma's original comment is right: we shouldn't mark the whole circuit as unsuitable for new circuits just because of this failure. Fortunately, there's a not-too-hard way to fix that; see #10478.

comment:10 Changed 5 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: 0.2.5.x-final

comment:11 Changed 5 years ago by nickm

Status: newneeds_review

comment:12 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-finalTor: 0.2.6.x-final

Doing this right is contingent on #10478. The END_STREAM_REASON_INTERNAL business was part of #10777.

comment:13 Changed 4 years ago by nickm

Keywords: 025-backport 026-triaged added; 024-backport removed

comment:14 Changed 4 years ago by nickm

Keywords: 026-triaged-0 added; 026-triaged removed

comment:15 Changed 4 years ago by nickm

Keywords: nickm-patch added

Apply a nickm-patch keyword to tickets in needs_review in 0.2.6 where I wrote the patch, so I know which ones I can('t) review myself.

comment:16 Changed 4 years ago by nickm

Keywords: andrea-review added

comment:17 Changed 4 years ago by rl1987

The patch in bug7870_024 looks okay to me.

comment:18 Changed 4 years ago by mikeperry

Status: needs_reviewneeds_revision

I believe that this change is unsafe for Tor Browser - it enables a website to include content elements that continually sends TCP resets which will cause the client to retry on new circuits until the point where a middle node is selected that enables guard discovery. See also #13669.

I would suggest a limit on the number of retries, but really, any number of automatic retries on a new circuit is bad for the web. A web page gets to include as many content elements as it likes, so even if each one is limited to one retry, you can still perform guard discovery with lots of them :/.

comment:19 Changed 4 years ago by mikeperry

Perhaps a possible solution for this and related problems (#13669, others?) is to make SOCKS username+password isolation optionally much stronger. If I set a torrc option like "StrongSocksIsolation", and I send a username+password that is already in use for a request, perhaps Tor should fail that request if it cannot re-use a circuit that was already used for that SOCKS username+password?

comment:20 Changed 4 years ago by andrea

Keywords: andrea-review removed

This patch looks fine to me, but mikeperry's comment is concerning. I'm nervous about this behavior.

comment:21 Changed 4 years ago by nickm

Milestone: Tor: 0.2.6.x-finalTor: 0.2.7.x-final

comment:22 Changed 3 years ago by nickm

Milestone: Tor: 0.2.7.x-finalTor: 0.2.8.x-final

comment:23 Changed 3 years ago by nickm

Keywords: 028-triaged added

comment:24 Changed 3 years ago by nickm

Keywords: needs-analysis added
Points: medium

comment:25 Changed 3 years ago by nickm

Priority: HighMedium

comment:26 Changed 3 years ago by nickm

Keywords: pre028-patch added

comment:27 Changed 3 years ago by nickm

Milestone: Tor: 0.2.8.x-finalTor: 0.2.9.x-final

It is impossible that we will fix all 188 currently open 028 tickets before 028 releases. Time to move some out. This is my second pass through the "needs_revision" tickets, looking for things to move to 0.2.9.

Note that if the requested revisions happen in advance of the 0.2.8 freeze, they can get considered for 0.2.8.

comment:28 Changed 3 years ago by isabela

Milestone: Tor: 0.2.9.x-finalTor: 0.2.???

tickets market to be removed from milestone 029

comment:29 Changed 2 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:30 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:31 Changed 19 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:32 Changed 19 months ago by nickm

Keywords: 026-triaged-0 removed

comment:33 Changed 19 months ago by nickm

Keywords: 028-triaged removed

comment:34 Changed 19 months ago by nickm

Keywords: pre028-patch removed

comment:35 Changed 19 months ago by nickm

Keywords: 025-backport removed

These are not ripe for 0.2.5, even if they do get fixed now

comment:36 Changed 19 months ago by nickm

Keywords: needs-design added
Severity: Normal
Note: See TracTickets for help on using tickets.