Opened 12 years ago

Last modified 7 years ago

#509 closed enhancement (Fixed)

SMARTLIST not very smart for every purpose

Reported by: Safari Owned by:
Priority: Low Milestone: post 0.2.0.x
Component: Core Tor/Tor Version: 0.1.2.16
Severity: Keywords:
Cc: Safari, nickm Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

routers seem to be stored in a list, which is searched linearly.
this is not good for scalability.
look what I got:

CPU: P4 / Xeon, speed 2797.2 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 45000
Counted FSB_DATA_ACTIVITY events (DRDY or DBSY events on the front side bus) with a unit mask of 0x03 (multiple flags) count 45000
Counted BRANCH_RETIRED events (retired branches) with a unit mask of 0x05 (multiple flags) count 45000
Counted BRANCH_RETIRED events (retired branches) with a unit mask of 0x0a (multiple flags) count 45000
samples % samples % samples % samples % symbol name
40310 62.2558 999 64.8280 2186 54.9799 4 12.5000 router_get_by_nickname
4146 6.4032 160 10.3829 611 15.3672 2 6.2500 router_get_combined_status_by_nickname
3811 5.8858 65 4.2180 76 1.9115 2 6.2500 digestmap_get
2771 4.2796 35 2.2713 31 0.7797 1 3.1250 routerlist_assert_ok
2255 3.4827 54 3.5042 471 11.8461 1 3.1250 smartlist_string_isin
1872 2.8912 30 1.9468 47 1.1821 1 3.1250 digestmap_set
1168 1.8039 2 0.1298 26 0.6539 0 0 rijndaelEncrypt

here 85% of CPU time in router_get_by_nickname is spent on this code:

SMARTLIST_FOREACH(routerlist->routers, routerinfo_t *, router,
{

if (!strcasecmp(router->nickname, nickname)) {

would be nice to have routers in rbtree. no need to sort them all the time, too :)

my tor was pretty idle, I take new profile in 24h and see what's number one then...

it would be fun to simulate for example with 100000 routers and 10000 ExcludeNodes.. how'd I do that?
/me wants torbench :)

[Automatically added by flyspray2trac: Operating System: Redhat/CentOS Linux]

Child Tickets

Change History (6)

comment:1 Changed 12 years ago by nickm

Hi!

Yeah, we know linear searches are slow. We're already using hash tables to retrieve routres by identity digest
and by descriptor digest... we just never saw router_get_by_nickname() show up in a profile before. Wow.

By any chance, can your oprofile figure out which function is calling router_get_by_nickname so much?

Also, do by any chance have a really big huge number of nodes listed in your torrc?

comment:2 Changed 12 years ago by Safari

Unfortunately if I enable callgraph on my x86_64 system, system freezes hard and I need to cycle power.
Worked on IA-32...

Not many, around 1300 (ExcludeNodes). Most of them are outdated, too.
It would be much smaller if you did this: enable ECN on the system which does the tor server's reachability test.
Now I get lots of failed connection establishment attempts
(mostly to users running Microsoft products).
because the other side's TCP/IP stack is not compatible with RFC793 from year 1981.
http://www.icir.org/floyd/ecn.html

comment:3 Changed 12 years ago by nickm

Wow. I'm not sure that it's a very high priority to get good performance when the user has excluded more nodes
than actually exist. ;) If you're determined to do this, you should identify routers by fingerprint, not by
nickname. The lookup there is O(1).

I'd love a patch to make nickname lookup faster, but I don't see making this particular use case a priority for
0.2.0.x.

As for the ECN issue: could you walk me through the steps you followed to find out that this was the problem,
and so that I can double-check that this is what's up?

comment:4 Changed 11 years ago by nickm

Should be fixed in 0.2.1.3-alpha (r16061).

comment:5 Changed 11 years ago by nickm

flyspray2trac: bug closed.

comment:6 Changed 7 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.