Opened 5 years ago

Closed 5 years ago

#12622 closed defect (fixed)

Automate update package distribution for TBB updater

Reported by: gk
Owned by: boklm
Priority: High
Milestone:
Component: Applications/Tor bundles/installation
Version:
Severity:
Keywords: TorBrowserTeam201408D
Cc: mcs, brade, intrigeri
Actual Points:
Parent ID: #4234
Points:
Reviewer:
Sponsor:

Description

In order for our updater to work properly we need to automate the package distribution.

Child Tickets

Attachments (1)

sample-update.xml (886 bytes) - added by mcs 5 years ago.
sample XML response (update manifest)

Change History (14)

comment:1 Changed 5 years ago by intrigeri

Cc: intrigeri added

Changed 5 years ago by mcs

Attachment: sample-update.xml added

sample XML response (update manifest)

comment:2 Changed 5 years ago by mcs

I attached a sample XML response that shows the kind of information the server will need to return. The format must match what the Firefox updater code expects to see. Some background information:

While Kathy Brade and I could write the update responder script, there are probably other Tor Project people who could do the job better and more quickly.  Python might be a good choice for implementation.

A sketch of the required functionality follows.

Tor Browser will periodically perform an HTTPS GET request to check for updates (e.g., twice per day).  At a high level, what is needed is a script that responds to that request and returns an XML update manifest.  No user-facing interface is required (that is, no HTML).  We can decide what the GET URL looks like; maybe something like:

https://torproject.org/tbupdate/4.0a1/Darwin_x86_64-gcc3-u-i386-x86_64/en-US/release/Darwin%2012.5.0/

The browser will create the request URL by using string substitution to fill in various things such as browser version, platform, locale, and update channel (e.g., release vs. beta).  The responder script will need to have access to information about the current recommended versions (e.g., the data that is in https://check.torproject.org/RecommendedTBBVersions) as well as file names and file hashes for the available updates.  The responder script will then need to construct an XML response and return it.
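
As a rough illustration of the request side, the sketch below splits a URL of the shape shown above into its substituted fields. The field order (version, build target, locale, channel, OS version) is inferred from the example URL only, and UpdateRequest, parse_update_url, and the /tbupdate/ prefix are hypothetical names, since the URL layout had not been decided at this point.

```python
from typing import NamedTuple
from urllib.parse import unquote, urlsplit


class UpdateRequest(NamedTuple):
    """Fields the browser substitutes into the update URL (assumed order)."""
    version: str
    build_target: str
    locale: str
    channel: str
    os_version: str


def parse_update_url(url: str, prefix: str = "/tbupdate/") -> UpdateRequest:
    """Split an update-check URL into its five substituted fields."""
    path = urlsplit(url).path
    if not path.startswith(prefix):
        raise ValueError(f"unexpected path: {path!r}")
    # Split first, decode afterwards, so an encoded "/" cannot add segments.
    parts = [unquote(p) for p in path[len(prefix):].strip("/").split("/")]
    if len(parts) != 5:
        raise ValueError(f"expected 5 path segments, got {len(parts)}")
    return UpdateRequest(*parts)


request = parse_update_url(
    "https://torproject.org/tbupdate/4.0a1/"
    "Darwin_x86_64-gcc3-u-i386-x86_64/en-US/release/Darwin%2012.5.0/"
)
print(request.version, request.locale, request.channel)  # 4.0a1 en-US release
```

A responder would then use these fields to look up the recommended version and the matching update files before building the XML manifest.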

One option would be to adapt the system Mozilla uses to meet our needs.  But it seems like overkill, and it will probably not be easy to port to torproject.org's infrastructure (Mozilla is almost certainly the only organization that uses it).  There is more info available here:

https://wiki.mozilla.org/Balrog
https://github.com/mozilla/balrog

Until recently, Kathy and I thought we could solve this entire problem using static XML files that we would create during the TBB build and release process.  That could probably be done, but since we want to support incremental updates we would need a lot of XML files, and we would also need to update each XML file when we publish a new release of TBB.  Therefore, a script that consults a config file, a small database, or one that caches info about available updates (gleaned from the file system) seems like the best solution.
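
The "info gleaned from the file system" option could look roughly like the sketch below, which walks a directory of MAR files and records the name, size, and hash of each one. The function names are hypothetical, and SHA-512 is used only as an example; the ticket does not say which hash function the manifest should carry.

```python
import hashlib
from pathlib import Path


def describe_mar(path: Path, chunk_size: int = 1 << 20) -> dict:
    """Return name, size in bytes, and SHA-512 hex digest of one MAR file."""
    digest = hashlib.sha512()
    size = 0
    with path.open("rb") as fh:
        # Read in chunks so large MAR files are never held in memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {"name": path.name, "size": size, "sha512": digest.hexdigest()}


def scan_updates(root: Path) -> list:
    """Describe every .mar file found below root, in a stable order."""
    return [describe_mar(p) for p in sorted(root.rglob("*.mar"))]
```

The result could be cached (for example as a small JSON file) and refreshed only when a new release is published, so that the hashes are not recomputed on every request.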

comment:3 Changed 5 years ago by mcs

Some info about the additional server resources that will be needed to support automatic updates:

If we use Mozilla's default setting of two update checks per day and we assume there are 500,000 active TBB users, that translates to (worst case) 1 million hits per day to retrieve the update meta information.

  • Does anyone have a good estimate for the number of TBB users?
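
For a sense of scale, the worst-case figure can be turned into an average request rate. The user count below is the assumption stated above, not a measurement, and real traffic would not be spread evenly over the day.

```python
active_users = 500_000   # assumed user count from the estimate above
checks_per_day = 2       # Mozilla's default update-check frequency

hits_per_day = active_users * checks_per_day
avg_hits_per_second = hits_per_day / (24 * 60 * 60)

print(hits_per_day)                   # 1000000
print(round(avg_hits_per_second, 1))  # 11.6
```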

Ignoring incremental updates for now, the disk and download sizes for the full MAR files are approximately:

Platform   Size of full MAR (MB)   # of Locales   Total (MB)
Linux32    34                      15             510
Linux64    37                      15             555
MacOS32    31                      15             465
Win32      25                      15             375

That's a total of 1905 MB per release.

  • In round numbers, let's estimate 2GB of extra server disk space per release.
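
The totals in the table can be re-derived as follows; the sizes are the per-platform figures quoted above, which predate the addition of meek.

```python
# Size of one full MAR file per platform, in MB (figures from the table above).
full_mar_mb = {"Linux32": 34, "Linux64": 37, "MacOS32": 31, "Win32": 25}
locales = 15

per_platform_mb = {name: size * locales for name, size in full_mar_mb.items()}
total_mb = sum(per_platform_mb.values())

print(per_platform_mb["Linux64"])  # 555
print(total_mb)                    # 1905
print(round(total_mb / 1024, 2))   # 1.86
```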

Note that the above data is slightly outdated; it was calculated before meek was added to the bundle.
Also, once a new release is made, the update responder script will no longer return metadata that references the older MAR files, so they could safely be deleted.

Also, after a new release is made, server network bandwidth will of course be consumed to download the MAR files, but it will mostly be offset by a reduction in traffic that comes from the download HTML page.

comment:4 Changed 5 years ago by erinn

Keywords: needs-triage added

comment:5 Changed 5 years ago by boklm

Keywords: TorBrowserTeam201407 added; needs-triage removed
Owner: changed from erinn to boklm
Status: changed from new to assigned

comment:6 Changed 5 years ago by boklm

After looking at this, my idea is to use a yaml config file to store information about all our supported releases, and write a script that we run as part of the release process. That script will generate a set of static XML files containing all the possible update responses, and an .htaccess file containing apache mod_rewrite rules to redirect the requests to the corresponding XML file.

I think we will have to generate as many XML files as .mar files we have, so a question related to this is how we will generate the set of .mar files for the incremental and non-incremental updates. Do we already have something for this?

It seems the tool to generate the mar files will need to know which versions to support for partial and complete updates, and the responder script will need the same information, so maybe both should share the same config file?

comment:7 in reply to:  6 ; Changed 5 years ago by intrigeri

Replying to boklm:

After looking at this, my idea is to use a yaml config file to store information about all our supported releases,

You might be interested in the format we use for Tails upgrade-description files: https://tails.boum.org/contribute/design/incremental_upgrades/#upgrade-description-files

comment:8 in reply to:  7 Changed 5 years ago by boklm

Replying to intrigeri:

Replying to boklm:

After looking at this, my idea is to use a yaml config file to store information about all our supported releases,

You might be interested in the format we use for Tails upgrade-description files: https://tails.boum.org/contribute/design/incremental_upgrades/#upgrade-description-files

Interesting. It seems to be an equivalent of the XML response used by the Firefox updater.

comment:9 Changed 5 years ago by mikeperry

Keywords: TorBrowserTeam201408 added; TorBrowserTeam201407 removed

comment:10 Changed 5 years ago by mikeperry

Keywords: TorBrowserTeam201408D added; TorBrowserTeam201408 removed

comment:11 Changed 5 years ago by mikeperry

Priority: changed from normal to major

comment:12 Changed 5 years ago by boklm

A first version of an update responses script is available at:

https://github.com/boklm/tb-update-response

Currently it can handle the following URL format:

http://something/${channel}/${operatingsystem}/${tb_version}/${language}

It works like this:

The config.yml file contains information about the update channels
(stable and alpha in this example), and the current versions
(information needed to generate the incremental .mar files could be
added here).
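
To make the idea concrete, the sketch below models such a configuration as a plain Python dict. It is a hypothetical stand-in: the key names are invented for illustration and are not taken from the actual config.yml, and only the channel-to-version mapping (stable at 3.6.4, alpha at 4.0a1) comes from the example in this comment.

```python
# Hypothetical stand-in for config.yml; the real keys and layout may differ.
CONFIG = {
    "channels": {"stable": "3.6.4", "alpha": "4.0a1"},
    # Versions for which an incremental .mar to each release would be built.
    "incremental_from": {"3.6.4": ["3.6.3"], "4.0a1": []},
}


def current_version(config: dict, channel: str) -> str:
    """Return the version currently offered on an update channel."""
    try:
        return config["channels"][channel]
    except KeyError:
        raise ValueError(f"unknown update channel: {channel!r}") from None


def incremental_sources(config: dict, version: str) -> list:
    """Return the older versions that get an incremental update to version."""
    return list(config["incremental_from"].get(version, []))


print(current_version(CONFIG, "alpha"))      # 4.0a1
print(incremental_sources(CONFIG, "3.6.4"))  # ['3.6.3']
```

Keeping the incremental-update list in the same structure is one way the .mar generation tool and the response generator could share a single config file.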

It expects to find .mar files for each release in the directory
release/${version}, and uses the filenames to recognize them.

The filename for a complete .mar file should be:

tor-browser-${operatingsystem}-${tb_version}_${language}.mar

(unless I made a mistake, it should be the same format you used in
user/brade/tor-browser-bundle.git on branch bug4234-01).

The filename for an incremental .mar file should be:

tor-browser-${operatingsystem}-${old_tb_version}-${new_tb_version}_${language}.mar
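
The two naming schemes can be told apart with a single pattern, as in the sketch below. It is not code from the tb-update-response repository; MarFile and parse_mar_name are hypothetical names, and the pattern assumes that operating system names and version strings contain no "-" and that language codes contain no "_".

```python
import re
from typing import NamedTuple, Optional

MAR_RE = re.compile(
    r"^tor-browser-(?P<os>[^-]+)-(?P<versions>[^_]+)_(?P<lang>[^_]+)\.mar$"
)


class MarFile(NamedTuple):
    os: str
    old_version: Optional[str]  # None for a complete update
    new_version: str
    language: str


def parse_mar_name(name: str) -> Optional[MarFile]:
    """Parse a .mar filename; return None if it follows neither scheme."""
    match = MAR_RE.match(name)
    if match is None:
        return None
    versions = match.group("versions").split("-")
    if len(versions) == 1:    # complete: only the target version is present
        old, new = None, versions[0]
    elif len(versions) == 2:  # incremental: old version, then new version
        old, new = versions
    else:
        return None
    return MarFile(match.group("os"), old, new, match.group("lang"))


print(parse_mar_name("tor-browser-linux64-3.6.3-3.6.4_fr.mar"))
# MarFile(os='linux64', old_version='3.6.3', new_version='3.6.4', language='fr')
```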

When you run the script update_responses, it generates the .xml files
for all the responses in the htdocs directory, removes any obsolete
files, and generates an .htaccess file containing rules to redirect
requests to the correct .xml file. Once the script has finished running,
we can rsync the htdocs directory to the web server.
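
The .htaccess generation step might be sketched as follows. This is an illustration only, not the script's actual output: the XML file names are made up, and the exact rules the real script writes may differ. It relies only on the standard mod_rewrite directives RewriteEngine and RewriteRule.

```python
import re


def escape_for_rewrite(path: str) -> str:
    """Backslash-escape regex metacharacters so the path matches literally."""
    return re.sub(r"([\\.^$*+?()\[\]{}|])", r"\\\1", path)


def build_htaccess(responses: dict) -> str:
    """Map each request path to its static XML file with one RewriteRule."""
    lines = ["RewriteEngine On"]
    for request_path, xml_file in sorted(responses.items()):
        pattern = escape_for_rewrite(request_path)
        lines.append(f"RewriteRule ^{pattern}$ {xml_file} [L]")
    return "\n".join(lines) + "\n"


# Hypothetical XML file names, for illustration only.
print(build_htaccess({
    "stable/linux64/3.6.3/en-US": "stable-linux64-3.6.3-en-US.xml",
    "stable/linux64/3.6.0/en-US": "stable-linux64-3.6.4-en-US.xml",
}))
```

Escaping the dots matters here because an unescaped "." in a version number would match any character.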

To test it, I added a few fake .mar files for releases 3.6.4 and 4.0a1.
I put .mar files for linux32, linux64, win32, osx32 updates in en-US,
linux32 and linux64 in fr, and an incremental update for 3.6.3 -> 3.6.4
on linux64 in en-US and fr. And I uploaded the generated files to
http://tmp.boklm.eu/updates/.

So we can check a few URLs.

If we update from 3.6.0 in en-US on linux64, linux32, osx32, win32 on
the stable channel, we get a complete update to version 3.6.4:
http://tmp.boklm.eu/updates/stable/linux64/3.6.0/en-US
http://tmp.boklm.eu/updates/stable/linux32/3.6.0/en-US
http://tmp.boklm.eu/updates/stable/osx32/3.6.0/en-US
http://tmp.boklm.eu/updates/stable/win32/3.6.0/en-US

If we select the alpha channel, we get the 4.0a1 release:
http://tmp.boklm.eu/updates/alpha/linux64/3.6.0/en-US
http://tmp.boklm.eu/updates/alpha/linux32/3.6.0/en-US
http://tmp.boklm.eu/updates/alpha/osx32/3.6.0/en-US
http://tmp.boklm.eu/updates/alpha/win32/3.6.0/en-US

We can get the fr version of the bundle for linux64:
http://tmp.boklm.eu/updates/stable/linux64/3.6.0/fr

The es-ES version does not exist (in this example), so we are redirected
to the en-US version:
http://tmp.boklm.eu/updates/stable/linux64/3.6.0/es-ES

If we are updating from version 3.6.3 on linux64 in en-US, an
incremental update is available:
http://tmp.boklm.eu/updates/stable/linux64/3.6.3/en-US
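
The behaviour shown by these test URLs (a complete update by default, an incremental update when one exists, and a fallback to en-US for a missing locale) can be summarised in a small selection function. This is a sketch of the observable behaviour, not the script's actual logic; pick_update and its data layout are hypothetical, and returning nothing for an up-to-date client, as well as offering the complete update alongside the incremental one, are assumptions.

```python
from typing import Optional

FALLBACK_LOCALE = "en-US"


def pick_update(available: set, target: str, os_name: str,
                current: str, locale: str) -> Optional[dict]:
    """Choose the update to offer.

    available holds (os, old_version, new_version, locale) tuples, with
    old_version set to None for a complete .mar file.
    """
    if current == target:
        return None  # assumed: nothing is offered to an up-to-date client
    for lang in (locale, FALLBACK_LOCALE):
        patches = []
        if (os_name, current, target, lang) in available:
            patches.append("incremental")
        if (os_name, None, target, lang) in available:
            patches.append("complete")
        if patches:
            return {"version": target, "locale": lang, "patches": patches}
    return None


available = {
    ("linux64", None, "3.6.4", "en-US"),
    ("linux64", None, "3.6.4", "fr"),
    ("linux64", "3.6.3", "3.6.4", "en-US"),
    ("linux64", "3.6.3", "3.6.4", "fr"),
}
print(pick_update(available, "3.6.4", "linux64", "3.6.0", "es-ES"))
# {'version': '3.6.4', 'locale': 'en-US', 'patches': ['complete']}
```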

comment:13 Changed 5 years ago by mikeperry

Resolution: fixed
Status: changed from assigned to closed

Ok, I can't wait to try this out for our next alpha release! Calling this closed until then.

Something else to ponder though: Longer-term (post 31ESR), we may want to have all locales as one bundle for our alpha and/or hardened users. We may want to augment this with a magical "all" locale or something, if it isn't already easy to hack. I guess #4234's patches might also need hacking for this.
