Opened 5 years ago

Closed 5 years ago

#7831 closed defect (fixed)

Investigate consensus-tracker's memory usage

Reported by: atagar
Owned by: atagar
Priority: Medium
Milestone:
Component: Core Tor/Stem
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

The first script that I ported over to stem was the consensus-tracker script which provides the automated emails for the list by the same name...

https://gitweb.torproject.org/atagar/tor-utils.git/blob/HEAD:/consensusTracker.py
https://lists.torproject.org/cgi-bin/mailman/listinfo/consensus-tracker/

Porting it revealed some major issues with the memory usage of stem's ExitPolicy class. Those issues are now fixed and the script ran for several days without trouble, but then a new type of memory problem surfaced.

Each hour the consensus-tracker makes an instance of the Sampling class, storing up to 192 of them at a time. Individually these are fine, but as the script runs and approaches that threshold the memory starts to stack up.
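For reference, here's a minimal sketch of the kind of capped hourly history described above. The Sampling stand-in and its attributes are hypothetical (the real class lives in consensusTracker.py and stores more than this); the only thing taken from the ticket is the 192 entry cap.

```python
import collections

MAX_SAMPLINGS = 192  # cap mentioned in the ticket


class Sampling(object):
  """
  Hypothetical stand-in for the tracker's Sampling class. The attribute
  names are assumptions made for this sketch.
  """

  __slots__ = ('fetched_at', 'fingerprints')

  def __init__(self, fetched_at, fingerprints):
    self.fetched_at = fetched_at
    self.fingerprints = frozenset(fingerprints)


# A deque with maxlen discards the oldest entry once it is full, so the
# history never holds more than MAX_SAMPLINGS samplings.
history = collections.deque(maxlen = MAX_SAMPLINGS)


def record(sampling):
  history.append(sampling)


# Simulate 200 hourly fetches. Only the newest 192 are retained.
for hour in range(200):
  record(Sampling(hour, ['relay%i' % hour]))
```

The cap bounds the number of samplings but not their size, so total memory still scales with whatever each one holds on to (here, the router status entries and their exit policies).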

After a week the consensus-tracker instance on my system was using 75% of the system's memory and started failing to fetch new consensus information (I'm not positive that the memory usage is related to the failures, but it seems likely).

So first question, why is stem using more memory than TorCtl? At a guess there are two issues...

  1. TorCtl likely provided version 2 router status entries while stem provides version 3. A big difference between those two is that version 3 includes the microdescriptor exit policy.
  2. TorCtl's ExitPolicyLine class is far lighter than our ExitPolicy. All it stores is the binary representation of the address, subnet mask, and port range (i.e., the bare minimum to have a working match() method). Ours, however, includes IPv6 support and some additional data.
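To illustrate the second point, below is a rough sketch of a rule that keeps only the bare minimum for match(): the address and mask as integers, plus a port range. This is not TorCtl's actual ExitPolicyLine code. The LightRule name, its constructor arguments, and the IPv4-only handling are all assumptions for the sake of the example.

```python
import socket
import struct


def _to_int(address):
  # packs a dotted-quad IPv4 address into a 32 bit integer
  return struct.unpack('!I', socket.inet_aton(address))[0]


class LightRule(object):
  """
  Hypothetical IPv4-only rule storing just what match() needs.
  """

  # __slots__ avoids a per-instance __dict__, which adds up when there
  # are many thousands of rules in memory
  __slots__ = ('is_accept', 'address', 'mask', 'min_port', 'max_port')

  def __init__(self, is_accept, address, mask_bits, min_port, max_port):
    self.is_accept = is_accept
    self.mask = (0xffffffff << (32 - mask_bits)) & 0xffffffff
    self.address = _to_int(address) & self.mask
    self.min_port = min_port
    self.max_port = max_port

  def match(self, address, port):
    in_subnet = (_to_int(address) & self.mask) == self.address
    return in_subnet and self.min_port <= port <= self.max_port


private = LightRule(False, '10.0.0.0', 8, 1, 65535)  # reject 10.0.0.0/8:*
web = LightRule(True, '0.0.0.0', 0, 80, 80)          # accept *:80
```

Here private.match('10.1.2.3', 80) is True while private.match('11.0.0.1', 80) is False. A rule like this carries five small attributes and nothing else, which is the contrast with a richer class that also tracks IPv6 and string representations.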

I've made a little hack in my consensus-tracker to drop the exit policy from the router status entries (... actually, the script doesn't use them so this should have zero impact). After a week or so of running this'll confirm or deny that the ExitPolicy is the issue.

If it is then I'll likely make the microdescriptor policies become lighter weight. They only need a subset of the information of a normal policy.
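As a sketch of what "lighter weight" could mean here: the microdescriptor policy in a version 3 router status entry is just 'accept' or 'reject' followed by a port list (for instance "accept 80,443,8000-8100"), so a port-only representation is enough. The PortOnlyPolicy class and its can_exit_to() method below are hypothetical names for illustration, not necessarily what stem ends up with.

```python
class PortOnlyPolicy(object):
  """
  Hypothetical port-only policy parsed from a summary such as
  'accept 80,443,8000-8100'. No addresses or masks are stored.
  """

  __slots__ = ('is_accept', 'ranges')

  def __init__(self, summary):
    keyword, _, ports = summary.partition(' ')

    if keyword not in ('accept', 'reject') or not ports:
      raise ValueError('unrecognized policy summary: %r' % summary)

    self.is_accept = keyword == 'accept'
    ranges = []

    for entry in ports.split(','):
      # single ports become a range of one, ex. '80' => (80, 80)
      low, _, high = entry.partition('-')
      ranges.append((int(low), int(high or low)))

    self.ranges = tuple(ranges)

  def can_exit_to(self, port):
    listed = any(low <= port <= high for low, high in self.ranges)
    return listed if self.is_accept else not listed


policy = PortOnlyPolicy('accept 80,443,8000-8100')
```

With this, policy.can_exit_to(443) and policy.can_exit_to(8050) are True while policy.can_exit_to(22) is False. Each policy holds one boolean and a tuple of integer pairs.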

Change History (3)

comment:1 Changed 5 years ago by atagar

While in the shower I realized that there are a couple of pieces of low-hanging fruit for improving runtime and memory usage with regard to exit policies, so I went ahead and made the changes...

https://gitweb.torproject.org/stem.git/commitdiff/da65f282699d3e8e763e1f3ba3c7612cc970178a
https://gitweb.torproject.org/stem.git/commitdiff/d35b9c737e320500fa6cfdfa874534cfbe6f1b3a

Restarted the consensus-tracker with these changes. It hasn't run long enough to tell for certain whether the exit policies are to blame for its prior memory issues, but there's no point in not taking advantage of these.

comment:3 Changed 5 years ago by atagar

Resolution: fixed
Status: new → closed

Yup, 792b3d6 with some tweaks to add additional attributes did the trick. Memory usage of the consensus-tracker is now roughly 30% of that system's memory. Resolving.
