The extensions in #4097 (moved) and #5027 (moved) sort bridges into sub-rings determined by a set of filters. #4568 (moved) (pluggable transports) will function similarly.
For example, a bridge request may ask for bridges with an IPv6 address, bridges not blocked in a given country, bridges with a specified pluggable transport type, or some combination of these.
The changes in #4097 (moved) (and #5027 (moved)) cause BridgeDB to write all the assignments including subrings -- which are named by the set of applied filters -- into the assignments.log. This seems to confuse the bridge-assignments script(s) that Karsten runs.
A bridge may be present in several different sub-rings because it matches several filters. For example, a bridge may be in a ring of IPv6 bridges, in another ring of bridges not blocked in Iran, and in a third ring of bridges that have IPv6 addresses and are not blocked in Iran.
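To make the overlap concrete, here is a minimal sketch of how one bridge ends up in several sub-rings. It is not BridgeDB's actual code: the `Bridge` class, the `FILTERS` table, and the `+`-joined sub-ring names are all hypothetical, chosen only to illustrate the idea of naming a sub-ring by the set of filters applied.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Bridge:
    # Hypothetical, simplified view of a bridge.
    fingerprint: str
    has_ipv6: bool = False
    blocked_in: frozenset = frozenset()


# Hypothetical filters: sub-ring name component -> predicate.
FILTERS = {
    "ipv6": lambda b: b.has_ipv6,
    "not-blocked-ir": lambda b: "ir" not in b.blocked_in,
}


def subrings_for(bridge, filters=FILTERS):
    """Return the name of every sub-ring the bridge belongs to.

    One sub-ring exists per non-empty combination of filters; a bridge
    is in a combined sub-ring exactly when it matches every filter in it.
    """
    matching = sorted(name for name, pred in filters.items() if pred(bridge))
    names = []
    for size in range(1, len(matching) + 1):
        for combo in combinations(matching, size):
            names.append("+".join(combo))
    return names
```

An IPv6 bridge that is not blocked in Iran lands in three sub-rings here (`ipv6`, `not-blocked-ir`, and `ipv6+not-blocked-ir`), which is the situation the log format has to represent.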
We need to figure out what the bridge assignments.log should look like, and whether/how the scripts that parse this file should be updated.
Karsten, what are your thoughts/questions?
The current assignments.log format specifies three distributor rings, where a bridge can be contained in exactly one ring. If the general assumption holds that a bridge is distributed only via email, only via https, or via neither of them (unallocated), we should be able to extend this format.
The current format already allows assigning bridges to several subrings:
{{{
0103bb5b00ad3102b2dbafe9ce709a0a7c1060e4 https ring=2 port=443 flag=stable
}}}
We could add three new subrings for bridges that are blocked in certain countries (listing non-blocked countries might leave us with quite a long list) and are reachable via IPv4/v6 or via certain transports:
{{{
0103bb5b00ad3102b2dbafe9ce709a0a7c1060e4 https ring=2 port=443 flag=stable blocked=cn,ir ip=4,6 transport=or,obfs2
}}}
Note that I only made a guess at what the ip and transport subrings could look like. asn and ln5, please comment on the format there.
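As an illustration of how a parsing script could cope with this kind of extension, here is a rough sketch that reads one line in the format proposed above and skips any subring key it does not recognize. The function name, the `KNOWN_KEYS` and `LIST_KEYS` sets, and the returned structure are assumptions made for this example; they do not describe the existing bridge-assignments scripts, and the `blocked`, `ip`, and `transport` keys are themselves only a proposal.

```python
# Keys this hypothetical parser understands; anything else is skipped.
KNOWN_KEYS = {"ring", "port", "flag", "blocked", "ip", "transport"}
# Keys whose value is a comma-separated list in the proposed format.
LIST_KEYS = {"blocked", "ip", "transport"}


def parse_assignment(line, known_keys=KNOWN_KEYS):
    """Split one assignment line into (fingerprint, distributor, subrings).

    Tokens after the distributor are expected to look like key=value.
    Tokens with an unknown key, or without '=', are ignored rather
    than treated as errors.
    """
    parts = line.split()
    if len(parts) < 2:
        raise ValueError(f"expected fingerprint and distributor: {line!r}")
    fingerprint, distributor = parts[0], parts[1]
    subrings = {}
    for token in parts[2:]:
        key, sep, value = token.partition("=")
        if not sep or key not in known_keys:
            continue  # unknown subring: skip it
        subrings[key] = value.split(",") if key in LIST_KEYS else value
    return fingerprint, distributor, subrings
```

Skipping unknown keys means a script written against today's format keeps working when new subrings are added, at the cost of silently ignoring typos in key names.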
> In theory, any tool that parses assignment.log files should ignore the unknown subrings. If not, we should fix the tools.
With regard to pluggable transports, there should also be a way to specify on which port each pluggable transport is listening (even when a bridge supports multiple pluggable transports). In the managed proxy protocol, we specify this information using the following format:
{{{
TOR_PT_SERVER_TRANSPORTS=trebuchet,ballista
TOR_PT_SERVER_BINDADDR=trebuchet-127.0.0.1:1984,ballista-127.0.0.1:4891
}}}
It's not particularly nice, but it does its job.
Also, in the future we might like to specify an optional arguments field for each bridge, so that BridgeDB can pass pluggable transport shared secrets etc. to BridgeDB clients.
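For reference, a `TOR_PT_SERVER_BINDADDR` value like the one quoted above can be split into per-transport addresses with a few lines of Python. This is only a sketch: `parse_bindaddr` is a hypothetical helper, not part of Tor or BridgeDB, and it assumes transport names contain no `-` character.

```python
def parse_bindaddr(value):
    """Map each transport name to its (host, port) pair.

    Expects the 'name-host:port,name-host:port' layout shown above.
    The name ends at the first '-'; the port follows the last ':'.
    """
    result = {}
    for entry in value.split(","):
        name, dash, addr = entry.partition("-")
        host, colon, port = addr.rpartition(":")
        if not (name and dash and host and colon and port.isdigit()):
            raise ValueError(f"malformed bind address entry: {entry!r}")
        result[name] = (host, int(port))
    return result
```

Called on the example value, this yields `trebuchet` on `127.0.0.1` port 1984 and `ballista` on `127.0.0.1` port 4891.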
Makes sense. However, while this is information that BridgeDB needs to know to give out to clients, it's not information that BridgeDB needs to write to its assignments.log file. That file should only contain information that determines how BridgeDB gives out bridges to clients. So, if BridgeDB gets asked for bridges which speak trebuchet and it returns a different set of bridges than it would return for other requests, that information should go into assignments.log. But if there are no plans to specifically give out trebuchet bridges on port 1984 or to weight them any more than other bridges, that information shouldn't go into assignments.log. Shared secrets shouldn't go into assignments.log at all. The purpose of assignments.log is that we can later understand why certain bridges see more usage than others. See #2866 (moved) for example.
I realize that IPv6 and pluggable transport information is something that we could also learn from looking at sanitized bridge descriptors. So, in theory, we don't have to include it in assignments.log. But it's quite useful to have all information that BridgeDB uses to make decisions in a single place. It's going to reduce complexity of analyses similar to #2866 (moved), and that means that more people might want to dive into them.
So, is a subring of the form transport=or,obfs2 sufficient to explain why BridgeDB gives out a certain subset of bridges to certain BridgeDB clients?
This gets hairy; for example, consider the case where a bridge has an ipv4 address with port 443, and an ipv6 address without port 443.
True. This isn't only a problem of assignments.log, but of BridgeDB's operation in general. Is BridgeDB supposed to return at least 1 IPv6 bridge on port 443, or do we ignore ports until we have enough IPv6 bridges to be picky about the port? What if a bridge has port 8443 as primary OR port and uses a second IPv4 address (which isn't possible right now, but which is allowed by the spec) with port 443---would we put it in the "port 443" subring?
For simplicity, I'd say we should only put a bridge in the "port 443" subring and add a port=443 entry to assignments.log if the bridge's primary OR port is 443. Let's ignore "or-address" and "a" lines with respect to port 443 until we run into problems not giving out at least one bridge on that port. We can always make it more complex later.
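The rule suggested here is small enough to write down directly. The sketch below uses a hypothetical `BridgeAddrs` class and `port_subring` function (neither exists in BridgeDB under these names) and deliberately never looks at ports from additional addresses.

```python
from dataclasses import dataclass


@dataclass
class BridgeAddrs:
    # Hypothetical, simplified view of a bridge's listening ports.
    or_port: int             # primary OR port
    extra_ports: tuple = ()  # ports from "or-address"/"a" lines, ignored for now


def port_subring(bridge):
    """Return the assignments.log entry for the port subring, or None.

    Only the primary OR port counts; additional addresses are ignored.
    """
    if bridge.or_port == 443:
        return "port=443"
    return None
```

Under this rule the bridge from the earlier example, with primary OR port 8443 and a second address on port 443, gets no port=443 entry.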