I attached a sample XML response that shows the kind of information the server will need to return. The format must match what the Firefox updater code expects to see. Some background information:
While Kathy Brade and I could write the update responder script, there are probably other Tor Project people who could do the job better and more quickly. Python might be a good choice for implementation.
A sketch of the required functionality follows.
The Tor Browser will periodically perform an HTTP GET request to check for updates (e.g., twice per day). At a high level, what is needed is a script that responds to an HTTPS GET request and returns an XML update manifest. No user-facing interface is required (that is, no HTML). We can decide what the GET URL looks like; maybe something like:
The browser will create the request URL by using string substitution to fill in various things such as browser version, platform, locale, and update channel (e.g., release vs. beta). The responder script will need to have access to information about the current recommended versions (e.g., the data that is in https://check.torproject.org/RecommendedTBBVersions) as well as file names and file hashes for the available updates. The responder script will then need to construct an XML response and return it.
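As a rough illustration of the last step, here is a minimal Python sketch of building the XML response. The element and attribute names reflect my understanding of the update manifest format used by the Firefox updater and should be checked against the attached sample. The function name, the shape of the `patches` argument, and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_update_manifest(version, build_id, patches, details_url):
    """Return an update manifest as an XML string.

    `patches` is a list of dicts with the keys type, url, hash, and size
    (a hypothetical structure chosen for this sketch).
    """
    updates = ET.Element("updates")
    if patches:
        # Assumption: an empty <updates> element tells the browser that
        # no update is available, so <update> is added only when needed.
        update = ET.SubElement(updates, "update", {
            "type": "minor",
            "displayVersion": version,
            "appVersion": version,
            "buildID": build_id,
            "detailsURL": details_url,
        })
        for patch in patches:
            ET.SubElement(update, "patch", {
                "type": patch["type"],  # "complete" or "partial"
                "URL": patch["url"],
                "hashFunction": "SHA512",
                "hashValue": patch["hash"],
                "size": str(patch["size"]),
            })
    return ET.tostring(updates, encoding="unicode")


if __name__ == "__main__":
    print(build_update_manifest(
        "3.6.4",
        "20140801000000",
        [{"type": "complete",
          "url": "https://example.invalid/tor-browser-linux64-3.6.4_en-US.mar",
          "hash": "0123abcd",
          "size": 38797312}],
        "https://example.invalid/release-notes",
    ))
```

A real responder would fill these values from the recommended-versions data and the file hashes mentioned above rather than from literals.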
One option would be to adapt the system Mozilla uses to meet our needs. But it seems like overkill, and it will probably not be easy to port to torproject.org's infrastructure (Mozilla is almost certainly the only organization that uses it). There is more info available here:
https://wiki.mozilla.org/Balrog
https://github.com/mozilla/balrog
Until recently, Kathy and I thought we could solve this entire problem using static XML files that we would create during the TBB build and release process. That could probably be done, but since we want to support incremental updates we would need a lot of XML files, and we would also need to update each XML file when we publish a new release of TBB. Therefore, a script that consults a config file, a small database, or one that caches info about available updates (gleaned from the file system) seems like the best solution.
Some info about the additional server resources that will be needed to support automatic updates:
If we use Mozilla's default setting of two update checks per day and we assume there are 500,000 active TBB users, that translates to (worst case) 1 million hits per day to retrieve the update metadata.
Does anyone have a good estimate for the number of TBB users?
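For reference, the arithmetic behind that request estimate can be written out in a few lines of Python. The user count is the assumed figure from above, not a measurement, and the per-second value is only a daily average that ignores peaks.

```python
ACTIVE_USERS = 500_000   # assumed number of active TBB users
CHECKS_PER_DAY = 2       # Mozilla's default update check frequency
SECONDS_PER_DAY = 24 * 60 * 60

hits_per_day = ACTIVE_USERS * CHECKS_PER_DAY
average_hits_per_second = hits_per_day / SECONDS_PER_DAY

print(f"{hits_per_day:,} hits/day")
print(f"{average_hits_per_second:.1f} hits/second on average")
```

With these inputs the average works out to about 11.6 requests per second.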
Ignoring incremental updates for now, the disk and download sizes for the full MAR files are approximately:
||=Platform=||=Size of full MAR (MB)=||=# of Locales=||=Total (MB)=||
||Linux32||34||15||510||
||Linux64||37||15||555||
||MacOS32||31||15||465||
||Win32||25||15||375||
That's a total of 1905 MB per release.
In round numbers, let's estimate 2GB of extra server disk space per release.
Note that the above data is slightly outdated; it was calculated before meek was added to the bundle.
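The per-release total can be reproduced with a short Python calculation, which also makes it easy to redo the estimate once updated sizes (for example, with meek included) are known. The sizes below are the approximate figures from the table above.

```python
# Approximate size of one full MAR file per platform, in MB.
FULL_MAR_MB = {"Linux32": 34, "Linux64": 37, "MacOS32": 31, "Win32": 25}
LOCALES = 15

# Disk space per platform: one full MAR per locale.
totals_mb = {platform: size * LOCALES for platform, size in FULL_MAR_MB.items()}
release_total_mb = sum(totals_mb.values())

for platform, total in totals_mb.items():
    print(f"{platform}: {total} MB")
print(f"Total per release: {release_total_mb} MB")
```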
And once a new release is made, the update responder script will no longer return metadata that references the older MAR files, so they could safely be deleted.
Also, after a new release is made, server network bandwidth will of course be consumed to download the MAR files, but it will mostly be offset by a reduction in traffic that comes from the download HTML page.
After looking at this, my idea is to use a YAML config file to store information about all our supported releases, and to write a script, run as part of the release process, that generates a set of static XML files containing all the possible update responses, plus an .htaccess file containing Apache mod_rewrite rules that redirect each request to the corresponding XML file.
I think we will have to generate as many XML files as we have .mar files, so a related question is how we will generate the set of .mar files for the incremental and non-incremental updates. Do we already have something for this?
It seems the tool that generates the .mar files will need to know which versions to support for partial and complete updates, and the responder script will need the same information, so maybe both should share the same config file?
The config.yml file contains information about the update channels
(stable and alpha in this example) and the current versions
(the information needed to generate the incremental .mar files could be
added here).
It expects to find .mar files for each release in the directory
release/${version}, and uses the filenames to recognize them.
The filename for a complete .mar file should be:
tor-browser-${operatingsystem}-${tb_version}_${language}.mar
(unless I made a mistake, it should be the same format you used in
user/brade/tor-browser-bundle.git on branch bug4234-01).
The filename for an incremental .mar file should be:
tor-browser-${operatingsystem}-${old_tb_version}-${new_tb_version}_${language}.mar
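To make the two naming schemes concrete, here is a minimal Python sketch of how filenames could be recognized. It is not the code from the update_responses script. The regular expressions and the function name are hypothetical, and they assume that the operating system name contains no "-" and that version strings contain neither "-" nor "_".

```python
import re

# Complete update: tor-browser-<os>-<version>_<language>.mar
COMPLETE_RE = re.compile(
    r"^tor-browser-(?P<os>[^-]+)-(?P<version>[^-_]+)_(?P<language>.+)\.mar$")

# Incremental update: tor-browser-<os>-<old>-<new>_<language>.mar
INCREMENTAL_RE = re.compile(
    r"^tor-browser-(?P<os>[^-]+)-(?P<old_version>[^-_]+)-(?P<new_version>[^-_]+)"
    r"_(?P<language>.+)\.mar$")


def parse_mar_filename(name):
    """Return a dict describing the .mar file, or None if the name is not recognized."""
    match = INCREMENTAL_RE.match(name)
    if match:
        return {"kind": "incremental", **match.groupdict()}
    match = COMPLETE_RE.match(name)
    if match:
        return {"kind": "complete", **match.groupdict()}
    return None


if __name__ == "__main__":
    for filename in ("tor-browser-linux64-3.6.4_en-US.mar",
                     "tor-browser-linux64-3.6.3-3.6.4_fr.mar"):
        print(filename, "->", parse_mar_filename(filename))
```

Because the language is matched last, locale codes that contain a hyphen, such as en-US, are still parsed as a single field.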
When you run the script update_responses, it generates the .xml files
for all the responses in the htdocs directory, removes any obsolete files,
and generates an .htaccess file containing rules to redirect requests to
the correct .xml file. When the script has finished running, we can rsync
the htdocs directory to the web server.
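As an illustration of the .htaccess generation step, here is a minimal Python sketch that emits one mod_rewrite rule per request path. It is not the actual update_responses code. The request URL layout (channel/platform/version/locale), the XML file names, and both function names are hypothetical, and the sketch assumes path components contain only letters, digits, ".", "-", and "_".

```python
def rewrite_rule(channel, platform, from_version, locale, xml_file):
    """Build one mod_rewrite rule mapping an update request path to a static XML file."""
    request_path = "/".join([channel, platform, from_version, locale])
    # Under the assumption above, "." is the only regex metacharacter to escape.
    pattern = "^" + request_path.replace(".", r"\.") + "$"
    return f"RewriteRule {pattern} {xml_file} [L]"


def build_htaccess(rules):
    """Join the rules into the text of an .htaccess file."""
    return "\n".join(["RewriteEngine On", *rules]) + "\n"


if __name__ == "__main__":
    rules = [
        rewrite_rule("stable", "linux64", "3.6.3", "en-US",
                     "stable-linux64-3.6.3-en-US.xml"),
        rewrite_rule("stable", "linux64", "3.6.3", "fr",
                     "stable-linux64-3.6.3-fr.xml"),
    ]
    print(build_htaccess(rules), end="")
```

Keeping one rule per known request path means that a request for an unknown combination falls through to whatever default the web server is configured to return.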
To test it, I added a few fake .mar files for releases 3.6.4 and 4.0a1.
I put .mar files for linux32, linux64, win32, osx32 updates in en-US,
linux32 and linux64 in fr, and an incremental update for 3.6.3 -> 3.6.4
on linux64 in en-US and fr. And I uploaded the generated files to
http://tmp.boklm.eu/updates/.
Ok, I can't wait to try this out for our next alpha release! Calling this closed until then.
Something else to ponder though: Longer-term (post 31ESR), we may want to have all locales as one bundle for our alpha and/or hardened users. We may want to augment this with a magical "all" locale or something, if it isn't already easy to hack. I guess the patches for #4234 might also need hacking for this.
Trac: Status: assigned to closed; Resolution: N/A to fixed