Opened 9 years ago

Closed 9 years ago

#2071 closed defect (not a bug)

BridgeDB stuck at 500 bridges

Reported by: mikeperry
Owned by: mikeperry
Priority: Medium
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity:
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Why is 500 the magic number for max bridges we have ever seen?

https://metrics.torproject.org/network.html?graph=networksize&start=2009-01-01&end=2010-10-12#networksize

Is there some bug that is preventing us from getting more relays? Or is this just a reporting issue?

We could try to ask people to run 100 bridges on the mechanical turk to see if it changes:
https://www.mturk.com/mturk/preview?groupId=VZ2CYWR61R3MWRXH8S00

Verification of the job would be to provide a working, usable bridge line.


Attachments (1)

bridgedb_scraper.sh (538 bytes) - added by mikeperry 9 years ago.
Attaching the script as a proper attachment to avoid filtering…


Change History (13)

comment:1 Changed 9 years ago by mikeperry

That job I linked to is some weird relay job. Relay jobs on mturk are a bad idea because people don't know the risks of what they are getting into. We should make a bridge-specific job for this.

comment:2 Changed 9 years ago by karsten

Apart from the idea of running an additional 100 bridges, see #2053 for a somewhat closer investigation of the problem.

comment:3 in reply to:  description Changed 9 years ago by shamrock

Replying to mikeperry:
Provided mikeperry with one week of Bridge Authority info-level log files upon request.

comment:4 Changed 9 years ago by mikeperry

Ok, I ran this script over the logs:

for i in tonga_logs_20101*;
do
  echo; echo "$i"
  # Descriptors logged as accepted
  echo -n "New:"
  grep dirserv_add_descriptor "$i" | grep -c accepted
  # Distinct values of field 11 among descriptors logged as updated
  echo -n "Updated:"
  grep dirserv_add_descriptor "$i" | grep updated | awk '{ print $11; }' | sort | uniq | wc -l
done

This was the output:

tonga_logs_20101025
New:199
Updated:650

tonga_logs_20101026
New:136
Updated:631

tonga_logs_20101027
New:112
Updated:593

tonga_logs_20101028
New:101
Updated:643

tonga_logs_20101029
New:135
Updated:614

tonga_logs_20101030
New:122
Updated:613

tonga_logs_20101101
New:123
Updated:644

So it appears we are seeing about 250 more descriptors in a 24 hour period than tonga is publishing, give or take 20 or so?

comment:5 Changed 9 years ago by mikeperry

Owner: set to mikeperry
Status: new → accepted

Here is an updated script based on input from Roger:

for i in tonga_logs_20101*;
do
  echo; echo "$i"
  new=$(grep dirserv_add_descriptor "$i" | grep -c accepted)
  echo "New: $new"
  updated=$(grep dirserv_add_descriptor "$i" | grep updated | awk '{ print $11; }' | sort | uniq | wc -l)
  echo "Updated: $updated"
  # Raw count of unreachable notes, then scaled down by 3600*24/128/10
  unreachable=$(grep -c rep_hist_note_router_unreachable "$i")
  unreachable=$(echo "$unreachable/(3600*24/128/10)" | bc)
  echo "Unreachable: $unreachable"
  running=$(echo "$new + $updated - $unreachable" | bc)
  echo "Running: $running"
done

It produces output more in line with what tonga is reporting. For example:

tonga_logs_20101028
New: 101
Updated: 643
Unreachable: 291
Running: 453
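
For anyone checking the numbers, here is a minimal sketch of the arithmetic in that script. The function names are made up for illustration and are not part of the attached script. The divisor is read here as the number of full passes over the keyspace per day, assuming 1/128 of it is tested every 10 seconds; shell integer arithmetic truncates the same way bc does at its default scale.

```shell
#!/bin/sh
# Sketch of the script's arithmetic. Function names are hypothetical.

# Full passes over the keyspace per day, assuming 1/128 of the keyspace
# is tested every 10 seconds. Integer division truncates 67.5 to 67.
passes_per_day() {
    echo $(( 3600 * 24 / 128 / 10 ))
}

# Running estimate: new + updated - unreachable, where the unreachable
# count has already been divided by passes_per_day.
running_estimate() {
    echo $(( $1 + $2 - $3 ))
}

passes_per_day                 # divisor applied to the raw log count
running_estimate 101 643 291   # figures from tonga_logs_20101028
```

With the 20101028 figures this gives a divisor of 67 and a running estimate of 453, matching the output above.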

Changed 9 years ago by mikeperry

Attachment: bridgedb_scraper.sh added

Attaching the script as a proper attachment to avoid filtering...

comment:6 Changed 9 years ago by mikeperry

The question that still remains is whether tonga might be hitting some other issue that prevents it from successfully testing more than a certain number of bridges. Supposedly it tests 1/256 of the keyspace every 10 seconds, which means a connection every 3 seconds with the current bridge pool.

From the logs now, it looks like we're not hitting file descriptor limits, but it seems like eventually we might. We could also be hitting some other limit on the box that is causing it to fail reachability testing for half of these bridges.

comment:7 Changed 9 years ago by mikeperry

Err, two corrections. It's apparently 1/128 of the keyspace, and I believe all the connection attempts happen immediately in parallel, and are not spread out serially. So we've got a burst of 6-7 tcp connect attempts on tonga every 10 seconds...
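
A rough sketch of where those numbers come from, assuming the bridge pool is about the sum of new and updated descriptors from the 20101025 log (199 + 650 = 849). That pool size is my assumption, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: pool size and function names are assumptions.

# Average number of bridges tested per round, to one decimal place,
# when the keyspace is split into SLICES equal parts.
# usage: bridges_per_round POOL SLICES
bridges_per_round() {
    awk -v pool="$1" -v slices="$2" 'BEGIN { printf "%.1f\n", pool / slices }'
}

# Seconds needed to cover the whole keyspace once.
# usage: seconds_per_pass SLICES INTERVAL_SECONDS
seconds_per_pass() {
    echo $(( $1 * $2 ))
}

bridges_per_round 849 128   # burst size with 1/128 of the keyspace per round
bridges_per_round 849 256   # the earlier 1/256 guess, for comparison
seconds_per_pass 128 10     # time for one full pass
```

Under these assumptions a round covers about 6.6 bridges with 128 slices (about 3.3 with 256), and a full pass over the keyspace takes 1280 seconds.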

comment:8 Changed 9 years ago by mikeperry

I think this might mean the stat we should look at next is the failure rate for each round of these reachability tests. If it is always 2 or 3 out of 6, then we might be on to something.

comment:9 Changed 9 years ago by mikeperry

Hrmm. rep_hist_note_router_unreachable() is called only twice an hour. I guess this is because it's only called during NS/vote generation. Bleh. Info logs might not have what we need for this.

comment:10 Changed 9 years ago by karsten

This graph implies that there's no artificial limit at 500 bridges. See #2053 for the metrics part of this question.

comment:11 Changed 9 years ago by arma

Sounds like this one is closeable too.

comment:12 Changed 9 years ago by nickm

Resolution: not a bug
Status: accepted → closed

closing as notabug; reopen if needed.
