Opened 8 months ago

Last modified 8 months ago

#30121 assigned task

Create authoritative, parseable list of Tor Browser's default bridges

Reported by: phw
Owned by: phw
Priority: Medium
Milestone:
Component: Applications/Tor Browser
Version:
Severity: Normal
Keywords: tbb-bridges
Cc: irl, mcs, mrphs
Actual Points:
Parent ID:
Points: 1
Reviewer:
Sponsor:

Description

The authoritative list of default bridges that Tor Browser ships with is part of the tor-browser-build.git repository. However, other projects also use this list, most importantly OONI, as part of its "TCP Connect" scan. Having the authoritative list in tor-browser-build.git is error-prone because whenever it changes, we need to manually sync OONI's list. (We now have a wiki page that maps our default bridges to their respective owners, but the page can be edited by anyone and therefore cannot be authoritative.)

To solve this problem, we could create a separate, authoritative list of these default bridges. We also need this list to be easy to parse, e.g., in the form of a simple CSV file. irl mentioned on IRC that our CI infrastructure can then notify us if repositories that include default bridges are out-of-date. In fact, we could even automate the inclusion of default bridges in other repositories: boklm mentioned that tor-browser-build.git could automatically generate the JavaScript file that includes default bridges, and OONI may be able to do the same.
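A minimal Python sketch of the kind of out-of-date check mentioned above. The CSV column names (transport, address, fingerprint), the sample rows, and the function names are assumptions made for illustration; nothing in this ticket defines a layout yet.

```python
import csv
import io

# Made-up fingerprints and documentation-range addresses, not real bridges.
FP_A, FP_B, FP_C = "A" * 40, "B" * 40, "C" * 40

# Hypothetical CSV layout; the ticket does not define the columns.
HEADER = "transport,address,fingerprint\n"
AUTHORITATIVE_CSV = HEADER + f"obfs4,192.0.2.10:443,{FP_A}\nobfs4,192.0.2.11:9001,{FP_B}\n"
DOWNSTREAM_CSV = HEADER + f"obfs4,192.0.2.10:443,{FP_A}\nobfs4,192.0.2.12:443,{FP_C}\n"


def load_bridges(text):
    """Return the set of (transport, address, fingerprint) rows in a CSV string."""
    reader = csv.DictReader(io.StringIO(text))
    return {(row["transport"], row["address"], row["fingerprint"]) for row in reader}


def diff_bridges(authoritative, downstream):
    """Return (missing, stale): rows the downstream copy lacks, and rows it should drop."""
    return authoritative - downstream, downstream - authoritative


missing, stale = diff_bridges(load_bridges(AUTHORITATIVE_CSV), load_bridges(DOWNSTREAM_CSV))
for row in sorted(missing):
    print("missing downstream:", ",".join(row))
for row in sorted(stale):
    print("stale downstream:", ",".join(row))
```

A CI job could exit with a non-zero status whenever either set is non-empty, which would flag the downstream copy as out of date.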

To get things started, here's what I propose:

  • Create a new git repository, say tor-browser-default-bridges.git, that contains our list of default bridges as CSV files.
  • This repository is public and can therefore be referenced and included by other projects.
  • This repository is maintained by the anti-censorship team, which takes responsibility for keeping its content up to date.

Does this sound reasonable? Is there a simpler solution?

Change History (9)

comment:1 Changed 8 months ago by irl

Cc: irl added

comment:2 Changed 8 months ago by mcs

Cc: mcs added

comment:3 Changed 8 months ago by mrphs

Cc: mrphs added

comment:4 Changed 8 months ago by gk

I don't really understand why "Having the authoritative list in tor-browser-build.git is error-prone" while this does not hold for repository $foo (e.g. tor-browser-default-bridges.git). The issue is _not_ in which repo the bridges are stored but how to get the manual syncing out of the way.

That said, I am fine with your plan, especially as it relieves us of dealing with the question of whether the bridges are up-to-date or not. We can easily adapt, even though I have not yet looked into getting the bridges from a CSV format into one that Tor Browser actually understands. But I guess that should not be too hard to do.

comment:5 Changed 8 months ago by hellais

I think that if the goal is to automate the sync of the bridge lists into third-party tooling (e.g. OONI CSV test lists), it's not strictly necessary to have a separate repo.

As long as there is some sort of assurance that the format in which the bridges are listed in tor-browser-build.git is well defined and guaranteed to work in the future, we should be ok.
If possible, it would be great if this format could be a simple text file, a JSON document or something else which is easily machine-readable with standard tooling (IIRC we were parsing the bridges in that list using some sketchy regexp).

In OONI we will soon be moving away from having the bridge lists vendored in git anyways and will probably be moving to a system which will automatically update them and serve them via some HTTP API (see: https://github.com/ooni/orchestra/issues/51).

comment:6 Changed 8 months ago by irl

If we wanted to scrape out of the tor-browser-build.git, we could define a subset of JavaScript with strict formatting like https://gitweb.torproject.org/torspec.git/tree/dir-list-spec.txt that allows that file to be easily parsed out.

If we did have one central authoritative location for stuff, maybe it makes sense that we also move the directory authority list and the fallback directory list there.

comment:7 in reply to:  6 Changed 8 months ago by dcf

Replying to irl:

If we wanted to scrape out of the tor-browser-build.git, we could define a subset of JavaScript with strict formatting

The bridge_prefs.js file already is that. It's not actually JavaScript code, but a configuration file with syntax that resembles JavaScript. Here is the grammar, from Firefox's parser:

https://dxr.mozilla.org/mozilla-central/rev/e152590056cc434823f354f149706d28b6127c66/modules/libpref/parser/src/lib.rs#11-33
https://dxr.mozilla.org/firefox/source/modules/libpref/src/prefread.cpp#184-202 (C++ parser, no longer used)

This is the reason why the bridge line for cymrubridge33 in #21917 is only lightly obfuscated. We initially wanted to try JS-based obfuscation, such as "extensions.torlau"+"ncher.default_br"+"idge", to confuse a censor's scraper, but the prefs file syntax doesn't allow it.
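To illustrate how little is needed to read such a file, here is a rough Python sketch. It is a simplification, not the Firefox parser: it only recognizes pref("name", "value"); lines with double-quoted string values and does not cover the full grammar linked above. The sample entries, the ".obfs4.1" suffix on the pref name, and the function name are made up for illustration.

```python
import re

# Simplified pattern: pref("name", "value"); with double-quoted strings only.
PREF_RE = re.compile(
    r'^\s*pref\(\s*"((?:[^"\\]|\\.)*)"\s*,\s*"((?:[^"\\]|\\.)*)"\s*\)\s*;\s*$'
)
PREFIX = "extensions.torlauncher.default_bridge."

FP = "A" * 40  # made-up fingerprint

# Made-up entries using a documentation-range address, not real bridges.
SAMPLE = f'''\
// comment lines are skipped because they do not match the pattern
pref("extensions.torlauncher.default_bridge.obfs4.1", "obfs4 192.0.2.10:443 {FP} cert=EXAMPLE iat-mode=0");
pref("extensions.torlau"+"ncher.default_br"+"idge.obfs4.2", "obfs4 192.0.2.11:443 {FP}");
pref("example.unrelated.pref", "value");
'''


def default_bridges(prefs_text):
    """Yield (pref_name, bridge_line) for each default-bridge pref line."""
    for line in prefs_text.splitlines():
        match = PREF_RE.match(line)
        if match and match.group(1).startswith(PREFIX):
            yield match.group(1), match.group(2)


for name, bridge_line in default_bridges(SAMPLE):
    print(name, "->", bridge_line)
```

The second sample line uses string concatenation and is not matched by this pattern, which mirrors the point that the prefs syntax has no such operator.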

comment:8 in reply to:  4 Changed 8 months ago by phw

Replying to gk:

I don't really understand why "Having the authoritative list in tor-browser-build.git is error-prone" while this does not hold for repository $foo (e.g. tor-browser-default-bridges.git). The issue is _not_ in which repo the bridges are stored but how to get the manual syncing out of the way.

My understanding was that the file is not trivial to parse. Anarcat wasn't particularly happy with it. I wasn't aware, however, that this file format is meant to be easy to parse, as dcf pointed out above.

If parsing is indeed not an issue, then I don't mind keeping the list in tor-browser-build.git. However, we should then add comments that explain who runs these bridges, and provide a link to the Trac ticket in which these bridges were added to the file.

comment:9 Changed 8 months ago by hellais

Is there a particular reason to use the current format of bridge_prefs.js as opposed to something which has more common parser implementations in most languages, e.g., JSON?
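For comparison, a rough Python sketch of what a JSON representation could look like. The field names (transport, address, fingerprint, params) and the helper names are assumptions, and the sketch assumes pluggable-transport bridge lines of the form "transport address fingerprint key=value ..."; the example line is made up.

```python
import json


def bridge_line_to_dict(line):
    """Split one pluggable-transport bridge line into named fields."""
    transport, address, fingerprint, *args = line.split()
    return {
        "transport": transport,
        "address": address,
        "fingerprint": fingerprint,
        # Remaining tokens are treated as key=value transport arguments.
        "params": dict(arg.split("=", 1) for arg in args),
    }


def bridges_to_json(lines):
    """Serialize a list of bridge lines as a JSON array of objects."""
    return json.dumps([bridge_line_to_dict(line) for line in lines], indent=2, sort_keys=True)


# Made-up bridge line with a documentation-range address and dummy fingerprint.
EXAMPLE_LINE = "obfs4 192.0.2.10:443 " + "A" * 40 + " cert=EXAMPLE iat-mode=0"
print(bridges_to_json([EXAMPLE_LINE]))
```

Any consumer with a standard JSON library could then read the list without a custom parser.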
