Opened 13 years ago

Last modified 7 years ago

#366 closed defect (Implemented)

Tor directory caches open too many connections to download descriptors

Reported by: nickm Owned by: nickm
Priority: Low Milestone: post 0.2.0.x
Component: Core Tor/Tor Version: 0.1.2.3-alpha
Severity: Keywords:
Cc: nickm Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

23:16 < armadev> btw, it looks like if our dirport is on, we prefer going to

dir authorities,

23:17 < armadev> but we still open the 9 or whatever connections
23:17 < armadev> which means we open multiple connections to a given authority
23:17 < nickm> that could well be a bug.
23:17 < nickm> I'll put it on flyspray.
23:17 < armadev> we might want to add a "but no more than the number of options

we have" to the spec

23:17 < nickm> right

[Automatically added by flyspray2trac: Operating System: All]

Child Tickets

Change History (11)

comment:1 Changed 13 years ago by nickm

Actually, if I'm reading the code right, caches assign descriptors at random to one
of the authorities that claims to have them, then launch connections to each
authority. We split this into multiple connections of no more than MAX_DL_PER_REQUEST
because of Squid's URL limit. From the code:

/* Max amount of hashes to download per request.

  • Since squid does not like URLs >= 4096 bytes we limit it to 96.
  • 4096 - strlen(http://255.255.255.255/tor/server/d/.z) == 4058
  • 4058/41 (40 for the hash and 1 for the + that separates them) => 98
  • So use 96 because it's a nice number. */

#define MAX_DL_PER_REQUEST 96

Arguably, we should pipeline these requests, especially if we're tunneling over TLS.

comment:2 Changed 12 years ago by nickm

Possible options for pipelining are:

  • Limit number of descriptor requests launched per minute. (Won't solve the issue, but may mitigate it.)
  • Limit number of descriptor requests inflight at once.
  • ???

comment:3 Changed 12 years ago by nickm

Another option:

  • If no Tor directory authorities are behind squid, then don't split requests into chunks of 96 at all.

comment:4 Changed 12 years ago by arma

It's not a matter of whether the directory authorities are behind squid -- if
there's a *user* who has set his httpproxy to squid, then he needs to split
the requests.

If this is in fact the reason we're breaking the requests apart (and not simply
a math bug), I suggest we leave this as a feature request for 0.2.0.x.

comment:5 Changed 12 years ago by nickm

Agreed. This is a little annoying, but mostly harmless, and can wait till we pull up the asphault over
the directory code anyway.

comment:6 Changed 12 years ago by nickm

For 0.2.0.x, we should also consider using the (slightly chatter but infinitely more cacheable) HTTP/1.1 facilities
for pipelining.

comment:7 Changed 11 years ago by nickm

Since we're trying to reduce load on authorities, this might be a good bug to work on.

I think it would suffice for launch_router_descriptor_downloads to ask each directory server for no more than
one share of the desired descriptors, and to ignore the other desired descriptors. Later, in directory_info_has_arrived
and/or in run_scheduled_events(), we'll call update_router_descriptor_downloads() again, which will launch descriptor
download requests for whatever we didn't get before.

Thoughts?

comment:8 Changed 11 years ago by nickm

21:37 < armadev> so re 366, the idea would be to get pipelining by only asking

one question per authority at a time?

21:39 < armadev> even if that means it takes a bit longer to get all of our

answers. that sounds like a fine solution.

21:40 < armadev> except, we only call update_router_descriptor_downloads()

immediately upon dir-has-arrived if we
don't-have-minimum-dir-info.

21:40 < armadev> so that means we are waiting way more than we need to, if we

already have some fraction of the directory. that means there
will be more caches that don't have the expected answers. is
that true?

21:41 < armadev> one solution would be to change directory_info_has_arrived()

so it calls update_router_descriptor_downloads() whether or
not we have-minimum-dir-info, if we're a dir mirror.

comment:9 Changed 11 years ago by nickm

Fixed/implemented in r17592 ... r17594. See comment in or.h on limitation of current approach:

  • [XXXX021 NOTE: This option is only implemented for pick_trusteddirserver,
  • not pick_directory_server. If we make it work on pick_directory_server
  • too, we could conservatively make it only prevent multiple fetches to
  • the same authority, or we could aggressively make it prevent multiple
  • fetches to _any_ directory server.]

comment:10 Changed 11 years ago by nickm

flyspray2trac: bug closed.

comment:11 Changed 7 years ago by nickm

Component: Tor RelayTor
Note: See TracTickets for help on using tickets.