Avoid needless wasted space in cells

changed milestone to %Tor: unspecified

added bwbug component::core tor/tor milestone::Tor: unspecified nagle priority::medium severity::normal sponsor8-removed status::needs-revision tor-relay type::defect labels

I'm going to try to write patches for these before I go to bed.

Trac:
Owner: N/A to nickm
Status: new to accepted

Branch bug7743_023 now has a patch for the first idea above.

Trac:
Status: accepted to needs_review

This is urgent to backport to 0.2.3 because of the shorter read bucket refill times. (Alternatively, the authorities could set the refill time to, say, 500 msec until 0.2.3 is obsolete, or tell 0.2.3 to use 500 msec intervals and give 0.2.4 Tors with this patch a new consensus variable to look at.)

The second fix I suggest above is less urgent, since the issue it fixes isn't a regression. It's also trickier to do, since the most obvious implementations would seem to allow a loud stream to starve a quiet stream if they share a circuit.

Trac:
Priority: major to critical
Keywords: backport-023 tor-relay deleted, backport-023 tor-relay regression added

Adding a bwbug tag for this tranche of bandwidth bugs.

Trac:
Keywords: backport-023 tor-relay regression deleted, backport-023 tor-relay regression bwbug added

Oops; there is no such consensus parameter; this is really going to need a backport, I think.

Also, we should probably do this ioctl hack at most once per cell, or else a trickle of incoming bytes might keep us from sending a cell in a timely way. Attaching a variant patch (submitted pseudonymously) to do that.

Trac:
216769.txt

Right, we never did the consensus param since our refill code couldn't handle being changed.

That said, do we have any evidence that this feared bug is for real much? connection_bucket_round_robin() tries pretty hard to fill its cell.

Replying to arma:

Right, we never did the consensus param since our refill code couldn't handle being changed.

That said, do we have any evidence that this feared bug is for real much? connection_bucket_round_robin() tries pretty hard to fill its cell.

It tries, but it can leave cells low when the per-tick bandwidth isn't huge. Please engage with the bug reporter if you don't believe me.

The current code also delays in the case where data arrives after recv but before the ioctl. That might not be the best choice; it might instead be a good idea to do the ioctl earlier, at the time when we do connection_bucket_read_limit

Then again, maybe not.

One way forward, pending a simulation or something in 0.2.3, would be to get this in 0.2.4, along with a patch to track how many wasted bytes it is preventing, and log that as part of the heartbeat.

Alternate version in bug7743_023_alt2. Basically untested, probably buggy.

20:20 <@nickm> I need somebody to try running master (as of 7a99d26c798a2223) 
               on an exit node.  It doesn't have to be a big one -- a smaller 
               one would be fine.
20:21 <@nickm> Ideal even, maybe.
20:21 <@nickm> There's a new heartbeat message of the form 
20:21 <@nickm> Dec 18 15:15:13.000 [notice] Average packaged cell fullness: 
               91.479%
20:22 <@nickm> If that number is low in practice, I want to know whether 
               running branch bug7743_024_test improves matters.

If that helps but not enough (bug7743_024_test is just bug7743_023_alt2 merged onto master), there are more things we can try to do in order to improve cell performance, including:

Remove the loop from connection_handle_read_impl. It doesn't seem to do anything useful.
If, when setting avail_to_package, we find that it is more than RELAY_PAYLOAD_SIZE, round it down to the nearest RELAY_PAYLOAD_SIZE.
When reading, round at_most down to avail.

Trac:
11734484.txt

Attaching an additional patch, pseudonymously contributed, to handle the two rounding ideas.

Trac:
Resolution: N/A to worksforme
Status: needs_review to closed

Trac:
Resolution: worksforme to N/A
Status: closed to reopened

Stack dump for SIGSEGV seen while testing:

#0 0x0000000000539321 in tor_strlower (s=0x0) at src/common/util.c:517 #1 0x00000000004d56c4 in connection_exit_begin_conn (cell=0x7fffffffde30, circ=0x1bec960) at src/or/connection_edge.c:2482 #2 (closed) 0x0000000000429143 in connection_edge_process_relay_cell (cell=0x7fffffffde30, circ=0x1bec960, conn=0x0, layer_hint=0x0) at src/or/relay.c:1190 #3 (closed) 0x0000000000425e66 in circuit_receive_relay_cell (cell=0x7fffffffde30, circ=0x1bec960, cell_direction=CELL_DIRECTION_OUT) at src/or/relay.c:193 #4 (closed) 0x00000000004b00a9 in command_process_relay_cell (cell=0x7fffffffde30, chan=0x1d9c0e0) at src/or/command.c:407 #5 (closed) 0x00000000004af32d in command_process_cell (chan=0x1d9c0e0, cell=0x7fffffffde30) at src/or/command.c:147 #6 (closed) 0x0000000000485fac in channel_queue_cell (chan=0x1d9c0e0, cell=0x7fffffffde30) at src/or/channel.c:2516 #7 (closed) 0x000000000048cb48 in channel_tls_handle_cell (cell=0x7fffffffde30, conn=0x22fc5e0) at src/or/channeltls.c:921 #8 (closed) 0x00000000004db608 in connection_or_process_cells_from_inbuf (conn=0x22fc5e0) at src/or/connection_or.c:1937 #9 (closed) 0x00000000004d80a6 in connection_or_process_inbuf (conn=0x22fc5e0) at src/or/connection_or.c:461 #10 (closed) 0x00000000004cbba9 in connection_process_inbuf (conn=0x22fc5e0, package_partial=1) at src/or/connection.c:3954 #11 (closed) 0x00000000004c95c9 in connection_handle_read_impl (conn=0x22fc5e0) at src/or/connection.c:2819 #12 (closed) 0x00000000004c96f4 in connection_handle_read (conn=0x22fc5e0) at src/or/connection.c:2860 #13 (closed) 0x000000000040a26f in conn_read_callback (fd=22, event=2, _conn=0x22fc5e0) at src/or/main.c:722 #14 (closed) 0x00007ffff772f930 in event_process_active (base=0x7e4c70, flags=) at event.c:395 #15 (closed) event_base_loop (base=0x7e4c70, flags=) at event.c:547 #16 (closed) 0x000000000040cc77 in do_main_loop () at src/or/main.c:1989 #17 (closed) 0x000000000040e237 in tor_main (argc=3, argv=0x7fffffffe658) at src/or/main.c:2701 #18 (closed) 0x0000000000408844 in main (argc=3, argv=0x7fffffffe658) at src/or/tor_main.c:30

Not sure if this is related to this patch or not yet.

Avoid needless wasted space in cells

Child items ...

Activity