Opened 9 years ago

Closed 9 years ago

Last modified 7 years ago

#1653 closed defect (fixed)

exit relays don't consider local cell queue when hearing sendme

Reported by: arma Owned by:
Priority: High Milestone: Tor: 0.2.2.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay
Cc: Actual Points:
Parent ID: #1298 Points:
Reviewer: Sponsor:

Description

When the exit relay gets a circuit-level sendme cell, it starts reading on the exit streams, even if we have 500 cells queued in our circuit queue already. So our circuit queue just grows and grows in some cases.

Reported by Mashael, Ian's grad student. She might even have a patch, but I haven't seen it yet so opening a ticket.

Child Tickets

Attachments (1)

diff (1.8 KB) - added by yetonetime 9 years ago.
For compare with exists patches.

Download all attachments as: .zip

Change History (19)

Changed 9 years ago by yetonetime

Attachment: diff added

For compare with exists patches.

comment:1 in reply to:  description Changed 9 years ago by yetonetime

Replying to arma:

When the exit relay gets a circuit-level sendme cell, it starts reading on the exit streams, even if we have 500 cells queued in our circuit queue already. So our circuit queue just grows and grows in some cases.

Reported by Mashael, Ian's grad student. She might even have a patch, but I haven't seen it yet so opening a ticket.

Any progress?

comment:2 Changed 9 years ago by yetonetime

No progress.

comment:3 Changed 9 years ago by nickm

Milestone: Tor: 0.2.2.x-final

comment:4 Changed 9 years ago by nickm

Status: newneeds_review

comment:5 Changed 9 years ago by nickm

The patch looks good to me. I'd change the implementation on circuit_queue_high() so that instead of checking for "queue->n >= HIGHWATER" it checks for "streams_blocked && queue->n >= LOWWATER". Being above the high-water mark is the condition to block, but the condition to resume is getting under the low-water mark again.

comment:6 Changed 9 years ago by nickm

(Relatedly, we should verify that new streams on a blocked circuit also become blocked.)

comment:7 Changed 9 years ago by nickm

See branch "bug1653" in my public repo; it is based on "yetonetime"'s original patch, and tries to fix the new-streams issue too.

It rebases cleanly onto maint-0.2.1 if we think it should go there. I think it should; this might also be the solution (or a solution) for bug #1298 .

comment:8 Changed 9 years ago by arma

Priority: normalmajor

Triage: this is blocking 0.2.2.x, and maybe even wants a backport once we've tested it.

comment:9 Changed 9 years ago by nickm

As noted on the #1298 comments by yetonetime, this fix in 8782dcf6a291b isn't right. To consider why, look at the two warning cases in 8782dcf6a291b:

+      log_info(LD_BUG, "Got a cell added to a cell queue when streams were "
+               "supposed to be blocked; found that %d streams weren't.", n);

This can happen when we get a new connection on a blocked circuit: the first CONNECTED relay cell will get added, since connections don't start out blocked (nor should they).

+      log_info(LD_BUG, "Got a cell added to a cell queue when streams were "
+               "all blocked. We should figure out why.");

This happens on every relay when a cell queue is high: we can't block incoming circuits (only edge connections), so we let them fill the circuit queue as much as they want. We sure don't want to be logging for every cell on a busy circuit, and we don't want every cell to make us do a loop over the streams.

I've tried to fix both of these on the latest commit on branch bug1653 (f89323afda).

comment:10 Changed 9 years ago by arma

Re f89323afdadadb8d, did you know we have a streamid_t type? We should either use it consistently or not have it anymore. (It looks like we're already not using it consistently, before this patch.)

comment:11 Changed 9 years ago by arma

bug1653 (296a7d83880d75) branch looks good to me. I don't know whether we missed any more edge cases, though. Sounds like we should merge it and find out if anything breaks? It probably also wants to be merged into maint-0.2.1 at some point.

comment:12 Changed 9 years ago by nickm

Milestone: Tor: 0.2.2.x-finalTor: 0.2.1.x-final

Seems sensible to me. I've make a backport branch that we can merge into 0.2.1: see bug1653_for_021 in my public. For now, I'm just merging this to master and reassigning this ticket to the 0.2.1.x-final milestone.

comment:13 Changed 9 years ago by nickm

Parent ID: #1298

comment:14 Changed 9 years ago by arma

I think we should focus on getting 0.2.2 stable rather than trying to backport #1653 and #1298 to 0.2.1 and hoping they get enough testing (they won't).

Or said another way, I think 0.2.1 is frozen enough that this doesn't count as a 'critical' bugfix -- stuff still works after all.

comment:15 Changed 9 years ago by Sebastian

Not merging this for 0.2.1 sounds like not too bad for me.

comment:16 Changed 9 years ago by nickm

Milestone: Tor: 0.2.1.x-finalTor: 0.2.2.x-final
Resolution: fixed
Status: needs_reviewclosed

Okay; leaving this out of 0.2.1.x-final.

comment:17 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:18 Changed 7 years ago by nickm

Component: Tor RelayTor
Note: See TracTickets for help on using tickets.