Opened 2 years ago

Last modified 6 weeks ago

#23251 assigned defect

Parsing a networkstatus-bridges with flags only causes BridgeDB to hang

Reported by: isis
Owned by:
Priority: Medium
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity: Major
Keywords: bridgedb-parsers, ex-sponsor-19
Cc:
Actual Points:
Parent ID:
Points: 3
Reviewer:
Sponsor: Sponsor30-can

Description

The following file:

bridgedb@polyanthum /srv/bridges.torproject.org/from-bifroest
 % cat networkstatus-bridges
published 2017-08-15 18:58:10
flag-thresholds stable-uptime=0 stable-mtbf=0 fast-speed=0 guard-wfu=0.000% guard-tk=0 guard-bw-inc-exits=0 guard-bw-exc-exits=0 enough-mtbf=1 ignoring-advertised-bws=0

causes the bridgedb process to hang. The last log lines output are:

bridgedb@polyanthum ~ % tail -f /srv/bridges.torproject.org/log/bridgedb.log
19:46:33 INFO     L465:Bridges.insert()         BridgeSplitter placing bridge $$C44836BF2F42DB5B1AD3CF6085626056593D846A~Shizuokalibelous into hashring https (via n=5, pos=0).
19:46:33 DEBUG    L547:Bridges.insert()         Inserting $$C44836BF2F42DB5B1AD3CF6085626056593D846A~Shizuokalibelous into hashring...
19:46:33 DEBUG     L78:geo.getCountryCode()     Looked up country code: NL
19:46:33 INFO     L465:Bridges.insert()         BridgeSplitter placing bridge $$2CC6A05D7F52D7B936ABEE13C782780E4B23B64F~conglutinatoreri into hashring email (via n=14, pos=1).
19:46:33 DEBUG    L547:Bridges.insert()         Inserting $$2CC6A05D7F52D7B936ABEE13C782780E4B23B64F~conglutinatoreri into hashring...
19:46:33 INFO     L174:Main.load()              Done inserting 1959 bridges into hashring.
19:46:33 DEBUG    L208:persistent.save()        Saving state to:        '/srv/bridges.torproject.org/run/bridgedb.state'
19:46:33 INFO      L80:Main.load()              Processing descriptors in ../from-bifroest directory...
19:46:33 INFO      L86:Main.load()              Opening networkstatus file: /srv/bridges.torproject.org/from-bifroest/networkstatus-bridges
19:46:33 INFO     L124:descriptors.parseNetwo() Parsing networkstatus file: /srv/bridges.torproject.org/from-bifroest/networkstatus-bridges

Further (and this might be a separate issue), when BridgeDB hangs in this state, the cronjob which calls bridgedb --reload launches an entirely new bridgedb process without killing the old one. The old process is stuck doing blocking IO, and the signal handlers are in the async code that it's supposed to come back to. I think there's not really any way to fix this, since Stem is doing the IO there, and Stem isn't async aware/capable.

Change History (6)

comment:1 Changed 2 years ago by atagar

Hi Isis. Added a quick test to see if that line's problematic for Stem. Seems to be just fine...

https://gitweb.torproject.org/stem.git/commit/?id=30990ab

As for Stem not being 'async aware' I'm not sure what you mean or why that's relevant. You're just using Stem as a descriptor parser. That has nothing to do with its controller functionality.

comment:2 in reply to:  1 Changed 2 years ago by isis

Replying to atagar:

> Hi Isis. Added a quick test to see if that line's problematic for Stem. Seems to be just fine...
>
> https://gitweb.torproject.org/stem.git/commit/?id=30990ab


Right, that should be fine for Stem. Also, BridgeDB's parsers actually skip the flags/headers entirely and tell Stem to start reading after them, so it wouldn't be a Stem issue anyway. :)
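
For illustration only, the header-skipping step described above might look roughly like the sketch below. This is not BridgeDB's actual parser: skip_headers and the sample router line are hypothetical, and the only thing assumed about the networkstatus format is that router entries begin with an "r " line. It shows that the flags-only file from the description leaves no entries to hand to Stem, which is the case a parser would need to handle explicitly.

```python
def skip_headers(lines):
    """Return the lines starting at the first router entry ("r " line).

    Hypothetical helper, not BridgeDB code. Returns an empty list when the
    file holds only header lines, so the caller can return early instead of
    passing an empty document on to the descriptor parser.
    """
    for index, line in enumerate(lines):
        if line.startswith("r "):
            return lines[index:]
    return []


# The flags-only file from the ticket description.
FLAGS_ONLY = (
    "published 2017-08-15 18:58:10\n"
    "flag-thresholds stable-uptime=0 stable-mtbf=0 fast-speed=0 "
    "guard-wfu=0.000% guard-tk=0 guard-bw-inc-exits=0 "
    "guard-bw-exc-exits=0 enough-mtbf=1 ignoring-advertised-bws=0\n"
)

# A made-up router line, used only to show the non-empty case.
SAMPLE_ENTRY = "r ExampleBridge AAAA BBBB 2017-08-15 18:00:00 192.0.2.1 9001 0\n"

header_only = skip_headers(FLAGS_ONLY.splitlines(True))
with_entry = skip_headers((FLAGS_ONLY + SAMPLE_ENTRY).splitlines(True))
```

Here header_only is empty, while with_entry starts at the sample "r " line.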

> As for Stem not being 'async aware' I'm not sure what you mean or why that's relevant. You're just using Stem as a descriptor parser. That has nothing to do with its controller functionality.


Oh, I mean that BridgeDB is Twisted Python, so its main loop is asynchronous. But while BridgeDB is running its bridgedb.parse.descriptors.* functions (which call Stem), the file handling is blocking, which isn't the way Twisted is happy with (see the FileWriter(proto.Protocol) class in the bridgedb.git/scripts/get-tor-exits script for an example of how Twisted wants IO to be done). So because this IO is all blocking, and because of the hang bug, it never returns to BridgeDB's main loop. That loop is where the handlers for the SIGUSR1 and SIGHUP signals are, and those are required to restart BridgeDB when it receives new descriptors from the BridgeAuth.

To be clear, none of this is a bug in Stem. It's all BridgeDB.
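
A minimal sketch of the blocking-IO problem described above, using the standard library's asyncio as a stand-in for Twisted's reactor (an assumption made to keep the example self-contained; blocking_parse, count_ticks, and run are hypothetical names, and the sleep stands in for a slow parser call). A blocking call made directly on the loop starves everything else scheduled there, while the same call handed to a worker thread leaves the loop free to run. In Twisted the analogous facility would be twisted.internet.threads.deferToThread. This would not fix the hang itself; it would only keep the main loop, and so the signal handlers, responsive while the parser is stuck.

```python
import asyncio
import time


def blocking_parse(seconds):
    """Stand-in for a blocking parser call; it only sleeps."""
    time.sleep(seconds)
    return "parsed"


async def count_ticks(stop):
    """Count how often the event loop gets a chance to run this task."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def run(offload):
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    ticker = asyncio.ensure_future(count_ticks(stop))
    await asyncio.sleep(0)  # let the ticker take its first step
    if offload:
        # The worker thread blocks; the loop keeps servicing the ticker.
        result = await loop.run_in_executor(None, blocking_parse, 0.2)
    else:
        # The loop itself blocks; nothing else runs until this returns.
        result = blocking_parse(0.2)
    stop.set()
    return result, await ticker


if __name__ == "__main__":
    print("offloaded:", asyncio.run(run(offload=True)))
    print("blocking: ", asyncio.run(run(offload=False)))
```

With offload=True the ticker runs many times during the parse; with offload=False it is starved until the blocking call returns.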

comment:3 Changed 2 years ago by atagar

Ahhh, got it. Thanks Isis. :P

If ya need anything from me just let me know.

comment:4 Changed 6 months ago by gaba

Owner: isis deleted
Points: 3
Sponsor: SponsorM-can → Sponsor19
Status: new → assigned

comment:5 Changed 6 weeks ago by gaba

Keywords: ex-sponsor-19 added

Adding the keyword to mark everything that didn't fit into the time for sponsor 19.

comment:6 Changed 6 weeks ago by phw

Sponsor: Sponsor19 → Sponsor30-can

Moving from Sponsor 19 to Sponsor 30.
