Opened 5 weeks ago

Closed 5 weeks ago

#28089 closed defect (fixed)

Exhausting our bandwidth write limit stops the connection from reading

Reported by: dgoulet Owned by:
Priority: Very High Milestone: Tor: 0.3.5.x-final
Component: Core Tor/Tor Version:
Severity: Blocker Keywords: regression, tor-relay, 034-backport
Cc: nickm Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

In commit 488e2b00bf881b97bcc8e4bbe304845ff1d79a03, we've refactored the block the connection on bandwidth logic and *one* typo got in, probably bad copy paste:

void
connection_write_bw_exhausted(connection_t *conn, bool is_global_bw)
{
  (void)is_global_bw;
  conn->write_blocked_on_bw = 1;
  connection_stop_reading(conn);
  reenable_blocked_connection_schedule();
}

Notice the connection_stop_reading() call where it should be a _stop writing_ ... This has the really bad side effect of making tor stop reading on the socket if the write limit is reached, and because read_blocked_on_bw is not set to 1, it is never reenabled through our mainloop callback.

This fix is *critical* else bytes accumulate in the kernel TCP buffers which can lead to OOM but also lost of connectivity with >= 0.3.4.x relays. One way to accumulate is the keepalive cell that bypasses KIST scheduler so tor sends it regardless if the kernel thinks it is OK. I'll open a ticket for this which is another problem.

This is most likely fixing #27813.

Appeared in 0.3.4.1-alpha.

Child Tickets

Change History (3)

comment:1 Changed 5 weeks ago by dgoulet

Status: newneeds_review

Branch: ticket28089_034_01
PR: https://github.com/torproject/tor/pull/411

FYI, so far so good on one of my test relay. No more bloating the kernel TCP buffers.

comment:2 Changed 5 weeks ago by nickm

Merged to 0.3.4 and forward.

comment:3 Changed 5 weeks ago by nickm

Resolution: fixed
Status: needs_reviewclosed

(Well hunted!)

Note: See TracTickets for help on using tickets.