Opened 6 years ago

Closed 5 years ago

#9795 closed enhancement (invalid)

Increase the number of parallel scanners in a BandwidthAuthority from 4 to 8

Reported by: aagbsn
Owned by: aagbsn
Priority: Medium
Milestone:
Component: Core Tor/Torflow
Version:
Severity:
Keywords:
Cc: mikeperry
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:


In an attempt to measure the Tor network faster, double the number of concurrent scanners from 4 to 8.

Change History (10)

comment:2 Changed 6 years ago by aagbsn

Cc: mikeperry added

Here are the numbers of slices completed by each scanner with the current pct distribution:

scanner1: 82
scanner2: 86
scanner3: 75
scanner4: 87
scanner5: 87
scanner6: 10
scanner7: 43
scanner8: 6

Is it useful to have the bottom 60% of the network (covered by scanner6 through scanner8) scanned more frequently? Should I try to balance the pct distributions so that we have roughly equal coverage of the network?

comment:3 Changed 6 years ago by mikeperry

FYI: Because the dirauths use the median bw value during the voting process, it makes more sense to have an odd number of bw auths than an even one. We only have 4 right now because we lost one and haven't found a replacement.
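
To illustrate the point about odd versus even counts, here is a minimal sketch with made-up bandwidth votes. The vote values are hypothetical, and the sketch does not claim to show how the dirauths actually resolve the even case; it only shows that an even count leaves two middle candidates.

```python
import statistics

# Hypothetical bandwidth votes for one relay, one value per bwauth.
votes_odd = [900, 1000, 1100, 1200, 5000]
votes_even = [900, 1000, 1100, 1200]

# With an odd count, exactly one vote sits in the middle.
odd_median = statistics.median(votes_odd)

# With an even count, two votes are equally "in the middle",
# so some tie-breaking rule is needed to pick one of them.
even_low = statistics.median_low(votes_even)
even_high = statistics.median_high(votes_even)

print(odd_median, even_low, even_high)
```

With five votes the single outlier (5000) has no effect on the result; with four votes the outcome depends on which middle value is chosen.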

comment:4 Changed 6 years ago by arma

Mike: Aaron is talking about the number of parallel scanning threads launched by a single bwauth. He's not talking about how many bwauths we have.

comment:5 Changed 6 years ago by aagbsn

Oops, sorry that wasn't clearer from the topic. Indeed, I would like to increase the parallelism of each BandwidthAuthority so that relays get measured more often. My question above was regarding how best to map the Tor network to scanner threads -- should we scan the fastest relays in the network more often as is currently the case, or should we allocate more scanner threads to the bottom 60% in order to collect measurements more frequently?

My thought is that new relays end up in this latter group and it takes much longer to get measured than established fast relays. A better solution might be to add a configuration option to always pick an unmeasured relay from the consensus if one exists, and set up one scanner instance to use that option.
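
A rough sketch of that selection rule, assuming the scanner has a list of relay fingerprints from the consensus and a set of fingerprints that already have measurements. The function name `pick_relay` and both parameters are hypothetical and are not existing Torflow options.

```python
import random

def pick_relay(consensus, measured, rng=random):
    """Prefer an unmeasured relay; fall back to any relay in the consensus.

    consensus: list of relay fingerprints (hypothetical input format).
    measured: set of fingerprints that already have a measurement.
    """
    unmeasured = [fp for fp in consensus if fp not in measured]
    pool = unmeasured if unmeasured else consensus
    return rng.choice(pool)

# Only "B" lacks a measurement here, so it is always chosen.
print(pick_relay(["A", "B", "C"], {"A", "C"}))
```

A scanner instance dedicated to this rule would keep picking new relays until none are left unmeasured, then behave like a uniform random scanner.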


comment:6 Changed 6 years ago by arma

Summary: Increase the number of BandwidthAuthority scanners from 4 to 8 → Increase the number of parallel scanners in a BandwidthAuthority from 4 to 8

comment:7 Changed 6 years ago by mikeperry

aagbsn: The method I used to determine those percentages was to run it and tweak them until each scanner was completing its segment at roughly the same rate. The simplest way to monitor that rate is to watch the timestamps on the *done* files in each subprocess's data/scanner.N directory.

Then you just tweak the percentages until each scanner is producing *done* files at roughly the same rate.
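
One way to watch that rate without eyeballing timestamps is sketched below. It assumes the *done* files live somewhere under each data/scanner.N directory (the exact subdirectory is not stated here, so the search is recursive) and that file modification time is a good proxy for completion time. The helper names are hypothetical.

```python
import glob
import os

def done_file_intervals(scanner_dir):
    """Return seconds between consecutive *done* files, ordered by mtime."""
    pattern = os.path.join(scanner_dir, "**", "*done*")
    paths = glob.glob(pattern, recursive=True)
    mtimes = sorted(os.path.getmtime(p) for p in paths)
    return [later - earlier for earlier, later in zip(mtimes, mtimes[1:])]

def mean_interval(scanner_dir):
    """Average seconds between *done* files, or None with fewer than two files."""
    intervals = done_file_intervals(scanner_dir)
    return sum(intervals) / len(intervals) if intervals else None

if __name__ == "__main__":
    # Assumed layout: data/scanner.1, data/scanner.2, ... under the current directory.
    for scanner_dir in sorted(glob.glob(os.path.join("data", "scanner.*"))):
        print(scanner_dir, mean_interval(scanner_dir))
```

Scanners whose mean interval is much larger than the others would be candidates for a narrower percentage range.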

comment:8 Changed 6 years ago by aagbsn

I had a typo in my db_url for scanner6... here are accurate counts of slices completed:

scanner1: 55
scanner2: 59
scanner3: 51
scanner4: 54
scanner5: 55
scanner6: 55
scanner7: 49
scanner8: 49

Time to complete the full pct range for each scanner (hours:minutes:seconds):

scanner1: 4:31:37, 7:13:11, 3:12:59, 3:20:42, 5:01:16, 5:20:31, 2:25:38, 1:48:20, 1:45:53, 2:27:27, 2:34:19
scanner2: 3:54:59, 6:01:36, 3:05:24, 3:51:59, 5:20:05, 3:18:41, 3:04:29, 2:33:33, 2:13:36, 2:14:25, 1:59:47
scanner3: 10:16:12, 9:18:17, 13:37:24, 4:43:18, 5:15:28
scanner4: 8:54:53, 7:57:23, 8:44:05, 8:29:27, 5:12:53, 5:42:57
scanner5: 10:38:20, 13:21:54, 12:19:22, 11:14:21
scanner6: 12:59:37, 12:09:43, 11:48:27, 10:17:01
scanner7: 18:22:59, 19:08:37
scanner8: 16:31:27, 18:41:05

Interestingly, the scanners all produced roughly the same number of slice files each, which was not at all what I was expecting. Did I miss something?
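
For comparing scanners, the durations listed above can be averaged per scanner. This is a small sketch that parses one line in the format used in this comment; the function names are hypothetical.

```python
from datetime import timedelta

def parse_hms(text):
    """Parse 'H:MM:SS' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.strip().split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def mean_duration(line):
    """Return (scanner name, mean duration) for a line like
    'scanner7: 18:22:59, 19:08:37'."""
    name, _, values = line.partition(": ")
    durations = [parse_hms(value) for value in values.split(",")]
    return name, sum(durations, timedelta()) / len(durations)

name, mean = mean_duration("scanner7: 18:22:59, 19:08:37")
print(name, mean)
```

Running the same function over every line gives one mean pass time per scanner, which can be set against the slice counts.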

comment:9 Changed 5 years ago by arma

Seems related to #3440.

comment:10 Changed 5 years ago by cypherpunks

Resolution: invalid
Status: new → closed

This doesn't seem too well thought out, as each scanner instance must parse and discard tor events caused by every other scanner instance.
