Bridge operators often want to know what distribution bucket their bridge fell into. Since #29480 (moved), one can find out by inspecting our archived bridge pool assignments, but that's cumbersome and not user-friendly. We should instead show the bucket on the bridge's Relay Search page. How can we get this done?
I am excited about this one, because it's the culmination of all the back-end work: just having the data in a data set is a good step zero, but giving it to users (bridge operators) in a usable way is where it all needs to lead.
Here are the steps for getting this done (with very rough estimates with 1 point == 1 workday):
1. Decide where to add bridge distribution information on Relay Search. For example, would we simply want to display something like "https ip=4,6 ring=2 transport=websocket,fte,obfs3,scramblesuit,obfs4", or would we want to structure that information somehow for the user or leave out less relevant parts? (0.5 points)
2. Specify one or more fields to be added to Onionoo's bridge details documents, following requirements from the first step above. (0.25 points)
3. Extend Onionoo to fetch bridge pool assignments from CollecTor, store the latest bridge pool assignment for each bridge in its details status document, and write this information to the bridge's details document. (1 point)
4. Add bridge distribution information to Relay Search as specified in the first step, using the data from the updated Onionoo protocol version as specified in the second step. (0.25 points)
I can do steps 2 and 3, and irl should do (or review) steps 1 and 4.
Once this is done, we shouldn't forget to update our documentation. We want to tell people how they can learn their bridge's distribution bucket, and what they can expect for a given bucket.
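The example string from the first step can be split into a distributor name and its parameters. The following is a minimal Python sketch, not Onionoo's actual code; the names `Assignment` and `parse_assignment` are made up for illustration, and it assumes the string always has the shape "distributor key=value ..." shown in the example.

```python
from typing import Dict, NamedTuple


class Assignment(NamedTuple):
    distributor: str          # e.g. "https"
    params: Dict[str, str]    # e.g. {"ip": "4,6", "ring": "2", ...}


def parse_assignment(text: str) -> Assignment:
    """Split 'https ip=4,6 ring=2 ...' into the distributor and key=value parameters."""
    parts = text.split()
    if not parts:
        raise ValueError("empty assignment string")
    distributor, *rest = parts
    params: Dict[str, str] = {}
    for item in rest:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        params[key] = value
    return Assignment(distributor, params)


example = "https ip=4,6 ring=2 transport=websocket,fte,obfs3,scramblesuit,obfs4"
assignment = parse_assignment(example)
print(assignment.distributor)          # https
print(assignment.params["transport"])  # websocket,fte,obfs3,scramblesuit,obfs4
```

Keeping the parameters in a separate mapping makes it easy to display only the distributor and leave out the less relevant parts.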
This sounds like lots of things coming together nicely, we should close this loop.
On step 1, could an anti-censorship person come up with an MS Paint-quality mockup of what data should go where? I'm not familiar with how this works exactly or what context bridge operators may have.
Happy to do step 1 with that input, and step 4. I can also review 2 and 3, some of the input on step 1 should feed into step 2.
I agree, this mockup is what I was going to suggest too.
And then the "https" should be a link to an anchor in an external document that the anti-censorship team maintains, which has anchors for each distribution strategy.
(Actually phw, do you want the word 'bucket' or should we go with something like 'strategy' so we won't be explaining to everybody why it's a bucket?)
Decide where to add bridge distribution information on Relay Search. For example, would we simply want to display something like "https ip=4,6 ring=2 transport=websocket,fte,obfs3,scramblesuit,obfs4", or would we want to structure that information somehow for the user or leave out less relevant parts? (0.5 points)
I suggest only displaying the bucket, which would be https in your example. We're already displaying the supported transport protocols in a separate field and I don't think it's useful to expose the BridgeDB ring. That said, here's a simple mockup:
And then the "https" should be a link to an anchor in an external document that the anti-censorship team maintains, which has anchors for each distribution strategy.
That's a good idea. This could be a new page on BridgeDB, e.g., bridges.torproject.org/info.
(Actually phw, do you want the word 'bucket' or should we go with something like 'strategy' so we won't be explaining to everybody why it's a bucket?)
The bridge model will need to be extended in Relay Search, and an extra row added to the table. If you want to have a go at that, it should be an easy change.
Onionoo patch merged, released, and deployed. metrics-web patch rebased and extended by that easy change to put into Relay Search, and deployed. Please give it a try!
What remains is that "https" and the other distribution mechanisms link to an anchor in an external document. As far as I can see, that page would need anchors for "email", "https", "moat", and "unallocated". Ideally, anchors would be the exact strings as these distribution mechanisms, like https://bridges.torproject.org/info#unallocated.
Should that be a new ticket, or is creating that page a quick task on your side?
Is this the main https://bridges.torproject.org page? If so, the steps for adding bridges are gone and it's unclear to me what the process is to actually get bridges from this page. I'd suggest keeping the steps and adding this extra info at the very bottom of the page.
"Unallocated" isn't a very simple or descriptive word to describe that bucket. Can we use "private" instead? Perhaps this is too late in the game to change it, but it seems a bit contradictory since these bridges are allocated to the unallocated bucket.
This corresponds a bit to the point above, but we could change the description of the HTTPS bucket to be more clear and include a link to the page where you actually submit your request.
There's repeated information on this page between the description of the Email bucket and the section on "I need an alternative way of getting bridges!" below it. Can we condense these into the same section? And it would be great if the resulting section had a mailto: link.
This is a nit, but there is some mixing of second and third person between the old and new content on this page. I think this is fine, but should be done intentionally.
"Unallocated" isn't a very simple or descriptive word to describe that bucket. Can we use "private" instead? Perhaps this is too late in the game to change it, but it seems a bit contradictory since these bridges are allocated to the unallocated bucket.
How about "unreleased" or "unpublished" or "reserve"?
Is this the main https://bridges.torproject.org page? If so, the steps for adding bridges are gone and it's unclear to me what the process is to actually get bridges from this page. I'd suggest keeping the steps and adding this extra info at the very bottom of the page.
No, this page will live at bridges.torproject.org/info. For now, only Relay Search will link to it, so BridgeDB users won't see it. In the future, we can use the new /info page to add additional documentation.
"Unallocated" isn't a very simple or descriptive word to describe that bucket. Can we use "private" instead? Perhaps this is too late in the game to change it, but it seems a bit contradictory since these bridges are allocated to the unallocated bucket.
Yes, I see your point. I don't like "private" because we already use that term for bridges that don't report themselves to the authority. I like computer_freak's suggestion of "reserved" but I actually prefer keeping "unallocated" because the cost of changing this term seems to outweigh the benefit of using a somewhat more descriptive term.
I wonder what Karsten thinks?
This corresponds a bit to the point above, but we could change the description of the HTTPS bucket to be more clear and include a link to the page where you actually submit your request.
Good idea, done.
There's repeated information on this page between the description of the Email bucket and the section on "I need an alternative way of getting bridges!" below it. Can we condense these into the same section? And it would be great if the resulting section had a mailto: link.
Right, that's because BridgeDB includes a short FAQ section at the bottom of each page. I agree that we don't want that here, so I made the embedding of the FAQ conditional. I also added a mailto: link.
This is a nit, but there is some mixing of second and third person between the old and new content on this page. I think this is fine, but should be done intentionally.
I believe this is fixed, now that we removed the FAQ?
"Unallocated" isn't a very simple or descriptive word to describe that bucket. Can we use "private" instead? Perhaps this is too late in the game to change it, but it seems a bit contradictory since these bridges are allocated to the unallocated bucket.
Yes, I see your point. I don't like "private" because we already use that term for bridges that don't report themselves to the authority. I like computer_freak's suggestion of "reserved" but I actually prefer keeping "unallocated" because the cost of changing this term seems to outweigh the benefit of using a somewhat more descriptive term.
I wonder what Karsten thinks?
My initial thought was that we shouldn't change the term, because bridge pool assignment files contain it and because Onionoo includes it in its response.
Maybe we'll have to say "None" here rather than "Unallocated"?
Note that case doesn't matter when configuring this in the torrc file: "HTTPS" is accepted just like "https" or "hTtPs". So it's fine to write "HTTPS".
To make this even more complicated, it turns out that a non-zero number of bridges does not have BridgeDB distribution information:
553 moat
505 https
191 email
76 none
37 unallocated
The 37 "unallocated" bridges are the ones we're talking about above.
But I'm not yet sure why those 76 bridges are not included in any distributor, not even the "unallocated" distributor. It could be that they're too new (bridge pool assignment files are only synced once per day at UTC midnight). It could have other reasons like older tor versions.
In any case, it seems possible that a bridge will show up with "none" in Relay Search, and we might have to explain on BridgeDB's information page what that means. In a way these bridges are truly unallocated.
Thanks, these changes look good to me. I'd also suggest changing the HTTPS text to something like:
"... hands out bridges over this website. To get bridges, go to , enter your preferences, and solve the CAPTCHA."
This provides some more detail in the HTTPS instructions, and I find "this website" to be less ambiguous than "the site you're looking at". Maybe we could ask antonela for her thoughts on this.
If a bridge sets BridgeDistribution none in its config file, BridgeDB will discard the bridge's descriptor. Bridges may end up in the "unallocated" bucket if they set BridgeDistribution any (which is the default), in which case BridgeDB may toss them into "unallocated".
But I'm not yet sure why those 76 bridges are not included in any distributor, not even the "unallocated" distributor. It could be that they're too new (bridge pool assignment files are only synced once per day at UTC midnight). It could have other reasons like older tor versions.
We encourage people to set BridgeDistribution none if they want their bridge to show up on Relay Search, but don't want BridgeDB to distribute it. Most of our default bridges fall into that category.
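The torrc semantics described above can be summarized in a short Python sketch. This is not tor's or BridgeDB's code: the function names are hypothetical, and the set of accepted values is an assumption based only on the values mentioned in this thread.

```python
from typing import Optional

# Values discussed in this thread; the list that tor really accepts may differ.
KNOWN_VALUES = {"none", "any", "https", "email", "moat"}
DEFAULT_VALUE = "any"  # the thread states that "any" is the default


def normalize_bridge_distribution(value: Optional[str] = None) -> str:
    """Lowercase a BridgeDistribution value, since case does not matter in torrc."""
    if value is None:
        return DEFAULT_VALUE
    normalized = value.strip().lower()
    if normalized not in KNOWN_VALUES:
        raise ValueError(f"unknown BridgeDistribution value: {value!r}")
    return normalized


def bridgedb_keeps_descriptor(value: Optional[str] = None) -> bool:
    """Return False when the operator opted out with 'none' (descriptor is discarded)."""
    return normalize_bridge_distribution(value) != "none"


print(normalize_bridge_distribution("hTtPs"))  # https
print(bridgedb_keeps_descriptor("None"))       # False
print(bridgedb_keeps_descriptor())             # True
```

A bridge that keeps the default "any" is retained by BridgeDB, but it may still end up in the "unallocated" bucket.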
In any case, it seems possible that a bridge will show up with "none" in Relay Search, and we might have to explain on BridgeDB's information page what that means. In a way these bridges are truly unallocated.
Oh, you are right. That's a great point that I had not considered. Now that we have both "unallocated" and "none", it seems more important to rename "unallocated" to "reserved". It doesn't seem too difficult to change every occurrence of "unallocated" in BridgeDB. How is the Metrics side looking?
Okay, I agree that we should distinguish five bridge distribution mechanisms in Relay Search with links to BridgeDB's information page:
"HTTPS", "Email", and "Moat";
"Reserved": also known as "unallocated" in bridge pool assignment files which most bridge operators will never hear about; and
"None": either not distributed by BridgeDB as requested by the bridge operator, or distributed via one of the four other mechanisms but too new for Relay Search to know. (The info page should probably mention both possibilities.)
If this makes sense, we can tell Relay Search to display these terms (using this capitalization) rather than the raw strings it receives from bridge pool assignment files.
Regarding a possible change to BridgeDB to actually rename these strings in bridge pool assignment files, I'd rather avoid that. There's not really a spec for bridge pool assignment files where we could write down when we changed "unallocated" to "reserved" and why. Soon we'd forget why we renamed this string and whether "unallocated" and "reserved" are actually the same thing or not. It's a bit like onion service directories still using relay flag "HSDir" rather than "OSDir". Historically, "unallocated" was the correct term when the only alternatives were to allocate a bridge to the HTTPS or Email distributor. It's just a bit less correct since there's now another alternative to really not assign a bridge to any distributor and instead drop it.
Okay, I agree that we should distinguish five bridge distribution mechanisms in Relay Search with links to BridgeDB's information page:
"HTTPS", "Email", and "Moat";
"Reserved": also known as "unallocated" in bridge pool assignment files which most bridge operators will never hear about; and
"None": either not distributed by BridgeDB as requested by the bridge operator, or distributed via one of the four other mechanisms but too new for Relay Search to know. (The info page should probably mention both possibilities.)
If this makes sense, we can tell Relay Search to display these terms (using this capitalization) rather than the raw strings it receives from bridge pool assignment files.
Yes, this sounds good to me. BridgeDB's new info page will have anchors for all (https, moat, email, reserved, none), for example: bridges.torproject.org/info#https. Does this work for you?
Regarding a possible change to BridgeDB to actually rename these strings in bridge pool assignment files, I'd rather avoid that. There's not really a spec for bridge pool assignment files where we could write down when we changed "unallocated" to "reserved" and why. Soon we'd forget why we renamed this string and whether "unallocated" and "reserved" are actually the same thing or not. It's a bit like onion service directories still using relay flag "HSDir" rather than "OSDir". Historically, "unallocated" was the correct term when the only alternatives were to allocate a bridge to the HTTPS or Email distributor. It's just a bit less correct since there's now another alternative to really not assign a bridge to any distributor and instead drop it.
Ok, then I would suggest calling it "Reserved" on both Relay Search and on BridgeDB's info page but leaving it as "unallocated" in the bridge pool assignment. I'll explain this discrepancy on the info page to minimise confusion.
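The agreed mapping from raw strings to display terms and info page anchors could look roughly like this. It is a Python sketch for illustration only, not the Relay Search patch; `describe_distribution` is a hypothetical name, and treating missing or unknown values as "None" is an assumption.

```python
from typing import Optional, Tuple

INFO_PAGE = "https://bridges.torproject.org/info"

# raw string from bridge pool assignments -> (display term, info page anchor)
DISPLAY_NAMES = {
    "https": ("HTTPS", "https"),
    "email": ("Email", "email"),
    "moat": ("Moat", "moat"),
    "unallocated": ("Reserved", "reserved"),  # the file format keeps "unallocated"
    "none": ("None", "none"),
}


def describe_distribution(raw: Optional[str]) -> Tuple[str, str]:
    """Return (display term, link); missing or unknown values are shown as "None"."""
    key = (raw or "none").strip().lower()
    label, anchor = DISPLAY_NAMES.get(key, DISPLAY_NAMES["none"])
    return label, f"{INFO_PAGE}#{anchor}"


print(describe_distribution("unallocated"))
# ('Reserved', 'https://bridges.torproject.org/info#reserved')
print(describe_distribution(None))
# ('None', 'https://bridges.torproject.org/info#none')
```

Keeping the translation in one table means the raw "unallocated" string never has to change in bridge pool assignment files.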
"None": either not distributed by BridgeDB as requested by the bridge operator, or distributed via one of the four other mechanisms but too new for Relay Search to know. (The info page should probably mention both possibilities.)
Your latest screenshot doesn't say anything about that second possibility of assignment information not being propagated between services yet. I could imagine that impatient new bridge operators will ask why their bridge ended up in the None bucket. If you left this note out on purpose, maybe in order to keep things short, that's fine by me.
If this makes sense, we can tell Relay Search to display these terms (using this capitalization) rather than the raw strings it receives from bridge pool assignment files.
Yes, this sounds good to me. BridgeDB's new info page will have anchors for all (https, moat, email, reserved, none), for example: bridges.torproject.org/info#https. Does this work for you?
Yes, this works. I have a patch here that I can deploy when that page exists. Please let me know when that is the case, and I'll deploy it.
"None": either not distributed by BridgeDB as requested by the bridge operator, or distributed via one of the four other mechanisms but too new for Relay Search to know. (The info page should probably mention both possibilities.)
Your latest screenshot doesn't say anything about that second possibility of assignment information not being propagated between services yet. I could imagine that impatient new bridge operators will ask why their bridge ended up in the None bucket. If you left this note out on purpose, maybe in order to keep things short, that's fine by me.
Yes, that's an oversight. How about this:
Bridges whose distribution mechanism is "None" are not distributed by BridgeDB. It is the bridge operator's responsibility to distribute their bridges to users. Note that on Relay Search, a freshly set up bridge's distribution mechanism says "None" for a while. Be a bit patient, and it will then change to the bridge's actual distribution mechanism.
Do we have an approximate time frame within which Relay Search should go from "None" to the bridge's actual distribution mechanism?
The delay between BridgeDB assigning a new bridge to a distributor and Relay Search learning about it is roughly linearly distributed from 1 to 25 hours. For example, the bridge pool assignments file written by BridgeDB at 2020-03-16T00:01:45Z was archived by CollecTor at 2020-03-17T00:09:00Z and would be processed by Onionoo at about 2020-03-17T00:45:00Z. That's the worst case scenario, though. How about you write something vague like "usually within one day" and keep the "be patient" part? :)
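The worst-case delay in that example can be checked with a few lines of Python. This only redoes the arithmetic on the three timestamps quoted above; the helper name `parse_utc` is made up for this sketch.

```python
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' as a UTC datetime."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


written = parse_utc("2020-03-16T00:01:45Z")    # file written by BridgeDB
archived = parse_utc("2020-03-17T00:09:00Z")   # archived by CollecTor
processed = parse_utc("2020-03-17T00:45:00Z")  # processed by Onionoo (approximate)

archive_delay = archived - written
total_delay = processed - written

print(archive_delay)  # 1 day, 0:07:15
print(total_delay)    # 1 day, 0:43:15
print(round(total_delay.total_seconds() / 3600, 2))  # 24.72
```

The total of roughly 24.7 hours is consistent with the upper end of the 1 to 25 hour range.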
My only comment is a super unimportant nit that a few lines have broken weirdly.
Ah, right. The file bridgedb.pot is automatically generated by running python setup.py extract_messages. We can ignore that because we never edit bridgedb.pot manually.
Karsten, I'm leaving this ticket open for the remaining work on your side, ok?