Opened 9 years ago

Closed 9 years ago

Last modified 7 years ago

#1790 closed enhancement (fixed)

Count bytes spent on answering directory requests

Reported by: karsten Owned by: karsten
Priority: Medium Milestone:
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay
Cc: nickm, Sebastian, ln5, arma Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Mike wants to know for his bandwidth weights how many bytes we're writing and reading for directory requests as compared to all bytes. We could add two new lines in the style of write-history and read-history that declare how many bytes were spent on directory requests, including both direct connections to the Dir port and tunneled requests via BEGIN_DIR cells.

Here are some example numbers from my test relay's state file, together with the write-history and read-history lines for comparison:

BWHistoryReadEnds 2010-08-03 08:45:44
BWHistoryReadValues [...],185844736,205917184,207345664,17137133
BWHistoryWriteEnds 2010-08-03 08:45:44
BWHistoryWriteValues [...],223923200,240807936,239798272,21942640
BWHistoryDirReadEnds 2010-08-03 08:43:32
BWHistoryDirReadValues [...]38042624,32221184,32626688,8438197
BWHistoryDirWriteEnds 2010-08-03 08:43:32
BWHistoryDirWriteValues [...],684032,542720,642048,167731

The interval ends of read/written bytes and read/written dir bytes don't match, because the relay was measuring bytes before, but not dir bytes.

I'm wondering why we're reading that many more dir bytes than we are writing. Is something wrong with my patch?

Please find branch dirbytes in my public repository that has the patch. Note that the dirreq-{write,read}-history lines are not added to extra-info descriptors in that patch, but only written to the local state file. Once we're done reviewing the patch, we need to change the r<2 part in rep_hist_get_bandwidth_lines() to r<4 (search for "TODO" in the patch).

Child Tickets

Attachments (1)

dir-bytes-2010-08-10.png (46.4 KB) - added by karsten 9 years ago.
Analysis of written/read dir bytes vs. all bytes

Download all attachments as: .zip

Change History (15)

comment:1 Changed 9 years ago by karsten

Status: newneeds_review

comment:2 Changed 9 years ago by nickm

The check in connection_bytes_decrement is wrong; CONN_TYPE_DIR is not only directory request that we're answering, but also directory requests that we're making.

The calculation of "len" in rep_hist_get_bandwidth_lines assumed that the longest identifier it cared about was write-history; dirreq-write-history is longer. Also, are there any tors left that still care about getting "opt" ? I'm pretty sure "opt" is unnecessary now that Tor 0.1.2.5 and earlier are dead, so we might as well remove it in a followup patch.

comment:3 Changed 9 years ago by Sebastian

I'll do the opt removal once this gets resolved, there are a few more cases where we can get rid of it.

comment:4 in reply to:  2 Changed 9 years ago by karsten

Replying to nickm:

The check in connection_bytes_decrement is wrong; CONN_TYPE_DIR is not only directory request that we're answering, but also directory requests that we're making.

I added a check for conn->purpose == DIR_PURPOSE_SERVER. Look better?

The calculation of "len" in rep_hist_get_bandwidth_lines assumed that the longest identifier it cared about was write-history; dirreq-write-history is longer.

Right, I added 7 more bytes per line.

Also, are there any tors left that still care about getting "opt" ? I'm pretty sure "opt" is unnecessary now that Tor 0.1.2.5 and earlier are dead, so we might as well remove it in a followup patch.

Yes, it would be good to rid of "opt", but, as you say, not as part of this patch.

See new commit in my dirbytes branch that now also contains a changes file. I'm running this patch on my test node and will have some results later today. Other than the issues you noted here, does this patch look good to be merged? If so, and if my test goes well, I'll raise r<2 to r<4 (TODO line, noted above) and play the squash and rebase game.

Thanks for the review!

comment:5 Changed 9 years ago by nickm

Okay; looks improved to me. I say go for it.

comment:6 Changed 9 years ago by karsten

Still far more bytes read than written:

BWHistoryDirReadEnds 2010-08-06 06:48:10
BWHistoryDirReadValues 12288,3046400,8133632,14726144,19582976,22445056,
24411136,22467584,24254464,24261632,21853184,29230080,31928320,33859584,
26002432,35966976,31826944,30321664,38299648,30172160,30070784,29010944,
31027200,34312192,33003520,29108224,24442880,26138624,27737088,30773248,
20355072,21350400,21373952,21355520,18140160,16589824,18079744,14092288,
16784384,17646592,15721472,16871424,13120512,16904192,12439552,16291840,
14654464,14429184,14314496,15229952,10378240,12633088,13803520,17124352,
13905920,13348864,15665152,18242560,15900672,20761600,23213056,29303808,
23517184,20302848,16098304,19204096,17095680,15508480,17684480,16966656,
16137216,13243392,13696000,14605312,13999104,12584960,12645376,16165888,
13488128,14565376,15346688,19612672,15410176,16740352,18549760,15300608,
15569920,15515648,3056949
BWHistoryDirWriteEnds 2010-08-06 06:33:10
BWHistoryDirWriteValues 0,0,258048,359424,371712,338944,593920,696320,
855040,1366016,675840,771072,1018880,793600,1007616,750592,1458176,526336,
299008,763904,693248,636928,875520,494592,1142784,1075200,666624,787456,
842752,806912,864256,311296,642048,493568,1039360,126976,771072,668672,
419840,390144,805888,850944,452608,351232,274432,678912,370688,411648,
544768,168960,589824,550912,504832,228352,288768,371712,234496,258048,
178176,196608,338944,168960,66560,569344,385024,388096,229376,199680,
378880,455680,282624,482304,455680,409600,218112,363520,484352,286720,
455680,273408,407552,344064,500736,249856,796672,382976,253952,440754

Could it be that local directory connections are special in the sense that connection_buckets_decrement() doesn't get called when writing to them?

For example, in _connection_write_to_buf_impl(), there's a line "return; /* no need to try flushing */" before connection_handle_write() gets called which in turn calls connection_buckets_decrement().

Or am I missing something else?

comment:7 Changed 9 years ago by karsten

I think I found the problem:

https://gitweb.torproject.org/karsten/tor.git/commitdiff/c69c0983da4353bc4d5b3ad70ab9f1b8bd9bf5de

In that particular line we read n_read bytes on a connection and need to decrement the bucket counters of the linked connection. I think we need to consider these bytes as written bytes on the linked connection, not read bytes.

This logic seems to work for tunneled directory connections (OR conn -> Exit conn -> Dir conn; Exit and Dir conn are linked). Can someone confirm that this also works for linked connections in general?

Attached you'll find a graph with more plausible results.

See updated branch dirbytes in my public repository. If others agree, this patch can be merged into master.

Changed 9 years ago by karsten

Attachment: dir-bytes-2010-08-10.png added

Analysis of written/read dir bytes vs. all bytes

comment:8 Changed 9 years ago by nickm

This seems okay to me. One last thing: the bugfix in c69c0983da435 needs a changes entry (and should it get backported to 0.2.1.x?)

comment:9 Changed 9 years ago by karsten

Cc: arma added

Oh ho! When looking up when this bug was introduced, I found:

commit 1c62b9d5fa21a802f26579379bd7b5d0fc17af9c
Author: Roger Dingledine <arma@torproject.org>
Date:   Sat Oct 10 14:52:41 2009 -0400

    fix a bug where we were decrementing the wrong bucket
    
    i think this doesn't actually affect anything, since linked
    conns usually don't impact buckets

diff --git a/src/or/connection.c b/src/or/connection.c
index 18464d2..ca71373 100644
--- a/src/or/connection.c
+++ b/src/or/connection.c
@@ -2352,7 +2352,7 @@ loop_again:
 
     if (n_read) {
       /* Probably a no-op, but hey. */
-      connection_buckets_decrement(linked, approx_time(), 0, n_read);
+      connection_buckets_decrement(linked, approx_time(), n_read, 0);
 
       if (connection_flushed_some(linked) < 0)
         connection_mark_for_close(linked);

Roger, any objections if I undo your fix? See my comment above that makes me think we want to count n_read bytes as written bytes on the linked connection. We might even add a comment saying why we do what we do.

comment:10 Changed 9 years ago by arma

I think undoing 1c62b9d5fa2 is the right thing to do. We should also put in a comment as you say. Something like

/* Probably a no-op, since linked conns typically don't count for bandwidth rate limiting. But do it anyway so we can keep stats accurately. Note that since we read the bytes from conn, and we're writing the bytes onto the linked connection, we count these as <i>written</i> bytes. */

comment:11 Changed 9 years ago by karsten

Great! Then this branch is ready to be merged. See dirbytes2 in my public repository.

comment:12 Changed 9 years ago by nickm

Resolution: fixed
Status: needs_reviewclosed

Okay, merged.

comment:13 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:14 Changed 7 years ago by nickm

Component: Tor RelayTor
Note: See TracTickets for help on using tickets.