Opened 6 years ago

Closed 6 years ago

Last modified 6 years ago

#6271 closed defect (fixed)

sendme windows don't stay synced between sides

Reported by: arma Owned by:
Priority: Medium Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-client
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

#6252 showed us that circwindows grow beyond their initial windows in normal operation. We can't implement the 6252 patch until we fix this bug.

Our friendly irc person suggests

--- relay.orig.c
+++ relay.c
@@ -1263,7 +1263,7 @@
                "'connected' received, no conn attached anymore. Ignoring.");
       return 0;
     case RELAY_COMMAND_SENDME:
-      if (!conn) {
+      if (!rh.stream_id) {
         if (layer_hint) {
           if (layer_hint->package_window + CIRCWINDOW_INCREMENT >
                 CIRCWINDOW_START_MAX) {
@@ -1292,6 +1292,11 @@
         }
         return 0;
       }
+      if (!conn) {
+        log_info(domain,"sendme cell dropped, unknown stream (streamid %d).",
+                 rh.stream_id);
+        return 0;
+      }
       conn->package_window += STREAMWINDOW_INCREMENT;
       log_debug(domain,"stream-level sendme, packagewindow now %d.",
                 conn->package_window);

Child Tickets

Change History (21)

comment:1 Changed 6 years ago by arma

Status: newneeds_review

comment:2 Changed 6 years ago by arma

The bug appears to be "if you send a stream-level sendme but the other side doesn't know about the stream anymore, the other side will misinterpret the cell as a circuit-level sendme".

Sounds plausible.

I *believe* that means that we can put #6271 and #6252 in together, rather than waiting for #6271 to get deployed everywhere first.

comment:3 Changed 6 years ago by arma

"#6271 happens for any such stream: CONNECTED, DATA X*50 times,END"

comment:4 Changed 6 years ago by arma

Status: needs_reviewneeds_revision

comment:5 Changed 6 years ago by karsten

This ticket might be related to #5397.

comment:6 Changed 6 years ago by nickm

A friendly person suggests that there are many other cases where we check for "conn" when we should be checking rh.stream_id. I think we should merge this in 0.2.3 and consider it for 0.2.2 backport once we have the other issues figured out.

comment:7 Changed 6 years ago by nickm

Status: needs_revisionneeds_review

Stuck this in a branch "bug6271_part_a" for inclusion in 0.2.3 and possible backport to 0.2.2

comment:8 Changed 6 years ago by nickm

Milestone: Tor: 0.2.3.x-final

comment:9 Changed 6 years ago by arma

Milestone: Tor: 0.2.3.x-final
Status: needs_reviewneeds_information

Merged that part. Still waiting on more info re what else this bug affects.

comment:10 Changed 6 years ago by karsten

Parent ID: #4465

Roger told me at the Florence hackfest that this ticket is blocking simulations which are needed, e.g., for deciding on a good token bucket refill interval (#4465). Making that ticket a parent of this.

comment:11 Changed 6 years ago by nickm

Milestone: Tor: 0.2.3.x-final

comment:12 Changed 6 years ago by nickm

I still need information here if we're going to make any progress on it.

comment:13 Changed 6 years ago by nickm

No info forthcoming apparently. So, what I can think to do here is: audit the code, looking for:

  • places in relay.c and elsewhere we check conn and make sure that none of them wants to be checking rh.stream_id instead.
  • every place where we adjust a package or deliver window, and make sure that we do so in a way that matches the spec exactly.

Any other ideas? Any other leads?

Roger, can you please give any more hints about how this is blocking simulations? If it's observable in simulation, then presumably I can provoke it and try to figure out how it's happening. Can you give me instructions for doing so, or tell me who can?

comment:14 Changed 6 years ago by arma

I think this ticket isn't blocking simulations anymore. The part we already merged was. We thought it was related to #5397, and it might be. Rob experienced #5397, then later (before we merged this fix) he stopped experiencing it. Mystery. I assume that means it'll be back sometime, unless the part of #6271 we already merged fixed it.

comment:15 in reply to:  13 Changed 6 years ago by arma

Replying to nickm:

audit the code

I did some of that earlier. I didn't find anything that looked out of place. Somebody who didn't write it should probably audit it too. :)

comment:16 Changed 6 years ago by nickm

Okay. I'll poke at this a little longer, see if I can find any more cases, then take it off my list of stuff that we need for mid-aug or else gotterdammerung.

comment:17 Changed 6 years ago by nickm

Didn't find any bogus handling of windows after going through relay.c slowly. So if there's more, it's subtle.

The handling of RELAY_EXTEND cells in relay.c is incorrect: it's checking for conn where it should be checking for stream_id. Doesn't appear to be harmful; should fix anyway. Still looking for more cases.

comment:18 in reply to:  14 Changed 6 years ago by karsten

Parent ID: #4465

Replying to arma:

I think this ticket isn't blocking simulations anymore.

Okay. Removing the parent ticket relationship with #4465.

comment:19 Changed 6 years ago by nickm

Resolution: fixed
Status: needs_informationclosed

Fixed that case where conn should have been rh.stream_id in cdd882ee71fb296.

Can't find any more instances of this bug or the mistake behind it. Please reopen if there are any more, or if the problem reappears.

comment:20 Changed 6 years ago by nickm

Keywords: tor-client added

comment:21 Changed 6 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.