Opened 8 years ago

Closed 4 years ago

#5552 closed defect (wontfix)

(bufferevents) Reported amount of uploaded data occasionally inflated by several gigabytes

Reported by: fk Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.2.3.13-alpha
Severity: Keywords: bufferevents tor-client
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

In December I started noticing in Arm that the total amount of uploaded data reported by my Tor client would occasionally jump by several gigabytes while I wasn't looking.

I'm currently using a very poor and easily saturated Internet connection through WCDMA, often I merely get around 4 kbytes/s in both directions, frequent packet loss and several seconds of RTT. It is thus rather unlikely that my Tor client will ever upload several gigabytes without being restarted.

My impression is that the pathetic connectivity has something to do with the problem. The frequency of the error message seems to be higher if the performance of the Internet connection is poor.

The inflated bandwidth seems to be the result of Tor casting a negative number to size_t, my current workaround (attached) is to let it detect this and reset the bytes_written to 0 instead.

In the log it looks like this:
Mar 31 17:41:57.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 2805925, stats_prev_n_written: 2806123
Mar 31 17:59:00.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 3387774, stats_prev_n_written: 3387972
Mar 31 17:59:05.977 [err] {GENERAL} Resetting bytes_written from -396/4294966900 to 0. cur_written: 3415804, stats_prev_n_written: 3416200
Mar 31 17:59:55.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 3581843, stats_prev_n_written: 3582041
Mar 31 17:59:59.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 3591359, stats_prev_n_written: 3591557
Mar 31 18:00:00.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 3591161, stats_prev_n_written: 3591359
Mar 31 18:25:03.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 4002667, stats_prev_n_written: 4002865
Mar 31 18:54:02.977 [err] {GENERAL} Resetting bytes_written from -198/4294967098 to 0. cur_written: 4519483, stats_prev_n_written: 4519681

The negative number seems to be -198 most of the time, but I've seen values ranging between -8 and -990.

I'm usually using the latest alpha release, currently Tor 0.2.3.13-alpha with libevent 2.0.16 on FreeBSD 10.0-CURRENT amd64.

I have (currently unsanitized) logs with log level [app]debug lying around, so if there are specific messages you are interested in I can probably look them up. I've seen no obvious correlations, though.

Child Tickets

Attachments (1)

tor-0.2.3.11-alpha-Reset-negative-bytes_written-values-to-0-to-work-aro.patch (1.2 KB) - added by fk 8 years ago.
Reset negative bytes_written values to 0 to work around bogus upload bandwidth values being sent to the control port

Download all attachments as: .zip

Change History (17)

Changed 8 years ago by fk

Reset negative bytes_written values to 0 to work around bogus upload bandwidth values being sent to the control port

comment:1 Changed 8 years ago by rransom

Possibly a duplicate of #3105.

comment:2 Changed 8 years ago by fk

While it may be obvious from the patch, I forgot to mention that I'm using bufferevents.

comment:3 Changed 8 years ago by rransom

This is definitely a libevent bug. The subtractions in decrement_buckets in bufferevent_openssl.c are wrapping around, and that's not okay.

comment:4 Changed 8 years ago by rransom

(They would be okay if unsigned long were exactly the same size as ev_ssize_t.)

comment:5 Changed 8 years ago by nickm

So, the wrap-around issue mentioned in the libevent comment is actually safe: It's for if the internal num_read or num_written field in openssl overflows past 4GB. Like, before the call, you have written 232 - 100. Then you write 200 bytes. The new value will be 100. So (100 - (232 - 100)) will overflow to 200, which is what we wanted.

rransom is right that we need to track down the root cause here: anything that can add 4GB to the total-written amount will thereby screw up other stuff too. One way to do this would be by adding appropriate asserts to bufferevent_ratelim's bufferevent_decrement_write_buckets function in ratelim.c, then getting a stack trace once the assertions hit. That would tell us what exactly is decrementing by the insane value, and maybe give us a clue on how to fix it.

comment:6 Changed 8 years ago by arma

Milestone: Tor: 0.2.3.x-final

comment:7 Changed 8 years ago by arma

Are we sure this only happens in the bufferevents case?

Also, it's quite plausible that 'read' can underflow just like 'write' can.

comment:8 Changed 8 years ago by Sebastian

I think this is a duplicate of #3105.

comment:9 Changed 8 years ago by Sebastian

(To answer your questions, looks like only bufferevents are affected, and only written bytes)

comment:10 Changed 8 years ago by fk

I started by adding an assert() in decrement_buckets().

The assert() was triggered right after starting Tor while fetching directory information.

(gdb) where
#0  0x000000000064476c in thr_kill () at thr_kill.S:3
#1  0x0000000000691320 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#2  0x00000000006796d0 in __assert (func=0x1891a <Address 0x1891a out of bounds>, file=0x6 <Address 0x6 out of bounds>, line=0, failedexpr=0x0)
    at /usr/src/lib/libc/gen/assert.c:54
#3  0x000000000055d129 in decrement_buckets (bev_ssl=0x800c7f300) at bufferevent_openssl.c:557
#4  0x000000000055dcad in do_handshake (bev_ssl=0x800c7f300) at bufferevent_openssl.c:960
#5  0x000000000055e071 in bufferevent_ssl_renegotiate (bev=0x800c7f300) at bufferevent_openssl.c:1057
#6  0x00000000004cd164 in connection_or_handle_event_cb (bufev=0x800c7f300, event=128, arg=0x80272c840) at connection_or.c:1262
#7  0x000000000056dd52 in bufferevent_run_deferred_callbacks_locked (_=0x800c7f4b0, arg=0x800c7f300) at bufferevent.c:144
#8  0x0000000000561755 in event_process_deferred_callbacks (queue=0x800c7c408, breakptr=0x800c7c3e0) at event.c:1364
#9  0x0000000000561873 in event_process_active (base=0x800c7c300) at event.c:1403
#10 0x0000000000561e18 in event_base_loop (base=0x800c7c300, flags=0) at event.c:1589
#11 0x00000000004056b4 in do_main_loop () at main.c:1933
#12 0x0000000000407048 in tor_main (argc=5, argv=0x7fffffffd660) at main.c:2628
#13 0x00000000004003eb in main (argc=5, argv=0x7fffffffd660) at tor_main.c:30
(gdb) f 3
#3  0x000000000055d129 in decrement_buckets (bev_ssl=0x800c7f300) at bufferevent_openssl.c:557
557             assert(num_w >= bev_ssl->counts.n_written);
(gdb) l
552
553             if (num_w >= bev_ssl->counts.n_written) {
554                       event_warnx("BUG: num_w is %d, bev_ssl->counts.n_written is %d",
555                               num_w, bev_ssl->counts.n_written);
556             }
557             assert(num_w >= bev_ssl->counts.n_written);
558             if (w)
559                     _bufferevent_decrement_write_buckets(&bev_ssl->bev, w);
560             if (r)
561                     _bufferevent_decrement_read_buckets(&bev_ssl->bev, r);
(gdb) p w
$1 = 18446744073709551283
(gdb) p num_w
$2 = 0
(gdb) p bev_ssl->counts.n_written
$3 = 333

Is this wrapping really expected, Nick?

If passing 18446744073709551283 to _bufferevent_decrement_write_buckets() isn't an issue, it's unclear to me what an appropriate assert() would look like. Any suggestions?

comment:11 Changed 8 years ago by nickm

Keywords: bufferevents added
Milestone: Tor: 0.2.3.x-finalTor: 0.2.4.x-final

comment:12 Changed 8 years ago by arma

Summary: Reported amount of uploaded data occasionally inflated by several gigabytes(bufferevents) Reported amount of uploaded data occasionally inflated by several gigabytes

comment:13 Changed 8 years ago by nickm

Keywords: tor-client added

comment:14 Changed 8 years ago by nickm

Component: Tor ClientTor

comment:15 Changed 8 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: unspecified

comment:16 Changed 4 years ago by nickm

Resolution: wontfix
Status: newclosed

The bufferevents code and corresponding build options have been removed in 0.2.9.2-alpha

Note: See TracTickets for help on using tickets.