Opened 4 years ago

Last modified 6 months ago

#16028 new defect

Many users seem to be failing incremental updates to 4.5.1

Reported by: mikeperry
Owned by: tbb-team
Priority: High
Milestone:
Component: Applications/Tor Browser
Version:
Severity: Normal
Keywords: tbb-update
Cc: mcs, brade
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

More than half of our users appear to be failing the incremental updates. From one of our mirror's webserver logs for yesterday:

$ grep "incremental.mar" dist.torproject.org-access.log-20150514 | grep -v "GET /torbrowser/4.0.8/" | grep 4.5.1 -c
81575
$ grep ".mar " dist.torproject.org-access.log-20150514 | grep -v "GET /torbrowser/4.0.8/" | grep -v incremental | grep -c 4.5.1
50235

Could this be due to the HTTPS-Everywhere update in 5.0.4? I wonder if something about how it is being unpacked is causing the force-update in the incrementals to fail.

(The grep -v for the 4.0.8 update is due to another issue: we also have a lot of users trying to download 4.0.8 MARs for some reason.)
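The two grep pipelines above can be approximated in one pass. This is a minimal sketch, assuming the mirror writes Apache "combined"-style lines with a quoted `GET <path> HTTP/x.y` request field; the function name `count_mar_requests` and its parameters are hypothetical. Unlike the greps, it looks for the version string only in the request path, not anywhere on the line, so its counts may differ slightly from the ones reported here.

```python
import re

# Assumption: each log line contains a quoted request field like
# "GET /some/path HTTP/1.1" (Apache "combined"-style logging).
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def count_mar_requests(lines, version="4.5.1", skip_prefix="/torbrowser/4.0.8/"):
    """Return (incremental, full) MAR request counts for `version`."""
    incremental = full = 0
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a GET request we can parse
        path = match.group(1)
        # Mirror the grep -v: ignore requests under the 4.0.8 directory.
        if not path.endswith(".mar") or path.startswith(skip_prefix):
            continue
        if version not in path:
            continue
        if path.endswith(".incremental.mar"):
            incremental += 1
        else:
            full += 1
    return incremental, full
```

With the counts reported above, 50235 full requests against 81575 incremental requests is roughly 0.62 full MAR downloads per incremental download.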

Change History (9)

comment:1 Changed 4 years ago by mcs

Unfortunately it is very difficult to tell why the incremental updates might be failing. Is this happening across all platforms and languages? Kathy and I will try some more 4.0.8 -> 4.5.1 updates (with updates to HTTPS-Everywhere, etc.) to see if we can reproduce this problem.

comment:2 Changed 4 years ago by mcs

Another (small?) factor is that we did not generate Mac OS incremental MARs for the 4.0.8 -> 4.5.1 upgrade path.

comment:3 Changed 4 years ago by mikeperry

The ratios are just as bad for win32 users, and seem to be getting worse over time.

For Windows users today:

$ grep ".mar " dist.torproject.org-access.log | grep -v "GET /torbrowser/4.0.8/" | grep -v incremental | grep win32 | grep -c 4.5.1
20938
$ grep "incremental.mar" dist.torproject.org-access.log | grep -v "GET /torbrowser/4.0.8/" | grep win32 | grep 4.5.1 -c
37277

At a glance, it does not appear to be specific to any one locale. I see at least 6 locales in this list.

I think the fact that this ratio is getting worse points strongly to a concurrent update. Maybe test upgrading vs not upgrading HTTPS-Everywhere as your first experiment?

comment:4 Changed 4 years ago by mcs

We are still trying to reproduce the problem (trying on Windows 7 at the moment). It does not seem to matter whether we update H-E first, or after the browser update, or concurrently.

Are the failures with 4.0.8 -> 4.5.1 updates?
Are many of the 4.5 -> 4.5.1 or 4.5 -> 5.0a1 incremental updates failing?
Do the Apache logs tell us whether the entire incremental mar file was downloaded?

If I have access to the Apache logs I can check the above myself.

Since the 4.0.8 -> 4.5.1 incremental MAR effectively does rm -rf .../extensions/https-everywhere@… and then adds the new files, Kathy and I do not think the root cause of this ticket is the H-E update.

comment:5 Changed 4 years ago by mcs

Kathy and I have come up with a few scenarios that may be causing the TB updater to fall back to a full MAR:

  1. The user makes changes to torrc-defaults (the 4.0.8 -> 4.5.1 incremental MAR tries to patch that file). There are a bunch of files that the incremental MAR tries to patch, any of which would cause this same problem; torrc-defaults just seems more likely to be modified by users than the others.
  2. A network failure of the wrong kind occurs during download of the incremental MAR. In our testing on Windows, New Identity triggered a complete MAR download.
  3. The user exits the browser (or a crash occurs) during the download. When this happens, the update service is supposed to resume the incremental MAR download when the browser is restarted, but we have seen it treat this as a network error in some cases.

We should be able to distinguish scenario 1 from scenarios 2 and 3: if the failure occurs while applying the incremental MAR, the file should have been downloaded completely, whereas in the other two scenarios it would be only partially downloaded. So maybe check the size field in the Apache logs, if we have that info.
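The size check could be sketched as follows. It is only an illustration under several assumptions: the log uses the "combined" layout (status and byte count right after the request field), the logged byte count reflects what was actually sent, and `expected_sizes` (a hypothetical mapping from request path to the MAR's size on disk) is available from the mirror.

```python
import re

# Assumption: "combined"-style layout, i.e.
#   ... "GET <path> HTTP/x.y" <status> <bytes> ...
ENTRY_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3}) (\d+|-)')


def split_by_completeness(lines, expected_sizes):
    """Count complete vs. partial 200 responses for incremental MARs.

    expected_sizes: hypothetical dict of request path -> file size in bytes.
    Returns (complete, partial).
    """
    complete = partial = 0
    for line in lines:
        match = ENTRY_RE.search(line)
        if not match:
            continue
        path, status, size = match.groups()
        # Range (206) responses are ignored here; they need separate handling.
        if status != "200" or not path.endswith(".incremental.mar"):
            continue
        if path not in expected_sizes:
            continue  # unknown file, cannot judge completeness
        sent = 0 if size == "-" else int(size)  # "-" means no body bytes
        if sent >= expected_sizes[path]:
            complete += 1
        else:
            partial += 1
    return complete, partial
```

A high share of complete downloads would point toward scenario 1 (the MAR arrived but could not be applied), while many partial ones would point toward scenarios 2 and 3.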

comment:6 Changed 4 years ago by mikeperry

Wrt the size question, if I ignore 206 responses, I am not seeing any evidence of partial downloads of incrementals. This probably rules out 2 and 3, unless there is an issue in the 206 behavior that I can't see (we scrub IP addresses in logs, so even if concurrent exit IP usage wasn't common I still couldn't total 206 values across requests). Oddly, there are a few 416 responses (Range Not Satisfiable, i.e. range requests the server could not satisfy), but fewer than a hundred per day.
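To see how the 200, 206, and 416 responses are distributed, the status field can be tallied directly. This is a sketch assuming the same "combined"-style layout; `status_histogram` is a hypothetical helper. Parsing the status field avoids accidentally matching "206" elsewhere on the line, such as in a byte count or timestamp.

```python
import re
from collections import Counter

# Assumption: "combined"-style layout with the status code right after
# the quoted request field.
ENTRY_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3}) ')


def status_histogram(lines, suffix=".incremental.mar"):
    """Count responses per HTTP status for request paths ending in `suffix`."""
    counts = Counter()
    for line in lines:
        match = ENTRY_RE.search(line)
        if match and match.group(1).endswith(suffix):
            counts[match.group(2)] += 1
    return counts
```

Because IP addresses are scrubbed, the individual 206 responses still cannot be grouped per client; the histogram only shows how common each status is.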

There does appear to be at least one crawler involved. It is setting the referer header to the containing directory. It is also performing fewer than a hundred requests per day.

One more datapoint: The total counts of full update downloads exceeded the incremental download counts on May 17th, and there have been more full update downloads than incremental downloads ever since. I am now wondering if we may actually be seeing users failing the full update and retrying repeatedly, perhaps due to #15857 or some other issue? Here are the counts from today on one mirror:

$ grep ".mar " dist.torproject.org-access.log | grep -v "GET /torbrowser/4.0.8/" | grep -v 206 | grep incremental -c
13524
$ grep ".mar " dist.torproject.org-access.log | grep -v "GET /torbrowser/4.0.8/" | grep -v 206 | grep incremental -c -v
16605

Unless we have other ideas, my next plan is to create some munin scripts to monitor this for future releases, so we can get an idea what happens when we don't update the torrc or startup scripts, and don't have #15857 in the mix.
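A munin plugin for this could look roughly like the sketch below. It relies on the usual munin plugin convention (print graph settings when called with `config`, otherwise print `<field>.value <number>` lines). Everything else is an assumption made for illustration: the `MAR_LOG` environment variable, the field names, the "combined"-style log layout, and the choice to count only 200 responses.

```python
import os
import re
import sys

# Assumption: "combined"-style layout with status after the request field.
ENTRY_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+" (\d{3}) ')


def count_downloads(lines):
    """Return (incremental, full) counts of MAR requests answered with 200."""
    incremental = full = 0
    for line in lines:
        match = ENTRY_RE.search(line)
        if not match:
            continue
        path, status = match.groups()
        if status != "200" or not path.endswith(".mar"):
            continue
        if path.endswith(".incremental.mar"):
            incremental += 1
        else:
            full += 1
    return incremental, full


def render(mode, incremental, full):
    """Build munin output: graph config for 'config', values otherwise."""
    if mode == "config":
        return "\n".join([
            "graph_title MAR downloads",
            "graph_vlabel requests",
            "incremental.label incremental MAR",
            "full.label full MAR",
        ])
    return f"incremental.value {incremental}\nfull.value {full}"


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else ""
    log_path = os.environ.get("MAR_LOG")  # hypothetical setting
    lines = []
    if log_path:
        with open(log_path, errors="replace") as handle:
            lines = handle.readlines()
    print(render(mode, *count_downloads(lines)))
```

The values are totals for the current log file, so they reset at log rotation; a real plugin would need to decide how to handle that.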

comment:7 in reply to:  6 Changed 4 years ago by mcs

Replying to mikeperry:

One more datapoint: The total counts of full update downloads exceeded the incremental download counts on May 17th, and there have been more full update downloads than incremental downloads ever since. I am now wondering if we may actually be seeing users failing the full update and retrying repeatedly, perhaps due to #15857 or some other issue?

#15857 is possible, although one would think that if #15857 were a common problem Mozilla would have noticed and fixed it by now (but a TB installation does have a lot more files / more hierarchy than Firefox).

Unless we have other ideas, my next plan is to create some munin scripts to monitor this for future releases, so we can get an idea what happens when we don't update the torrc or startup scripts, and don't have #15857 in the mix.

That sounds like a good idea. Another thing we could do is to build a prompt and upload mechanism into TB to allow users to send us the updater log after failed incremental updates (but maybe most of our users would just click "No Thanks").

comment:8 Changed 2 years ago by arma

Severity: Normal

Is this still an issue? Is there anything to be learned now from our experiences with 4.5.1?

comment:9 Changed 6 months ago by gk

Keywords: tbb-update added; tbb-updater removed

Renaming keyword to make it a bit broader
