Opened 5 months ago

Closed 2 months ago

#26381 closed defect (fixed)

about:tor page does not load on first start on Windows and browser is stuck in endless reload cycle

Reported by: gk
Owned by: pospeselr
Priority: Very High
Milestone:
Component: Applications/Tor Browser
Version:
Severity: Normal
Keywords: tbb-torbutton, ff60-esr, TorBrowserTeam201809R, tbb-backport
Cc: mcs, brade, arthuredelstein, tbb-team, sysrqb, ezio, ma1
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by gk)

When testing the Windows nightly builds I got greeted with an error page on first start

     <title>&aboutTor.title;</title>
------------^

This happened with a 32bit de bundle and I can't reproduce this on Linux. Moreover if I open about:tor in a new tab it works. From the second start on the about:tor page is shown as well.

Furthermore, after the second session Tor Browser is stuck on an endless reload cycle when visiting websites and other extensions are broken.

Change History (37)

comment:1 Changed 5 months ago by mcs

Kathy and I can reproduce this on Windows (32-bit de build) but not on macOS (de build).

It almost seems like an e10s problem, but I am not sure why it would be Windows-specific.

We also see both the Firefox "spoof language?" prompt as well as the Torbutton one. Is there already a ticket open for that issue?

comment:2 Changed 5 months ago by arthuredelstein

Cc: arthuredelstein added

comment:3 in reply to:  1 Changed 5 months ago by gk

Replying to mcs:

Kathy and I can reproduce this on Windows (32-bit de build) but not on macOS (de build).

It almost seems like an e10s problem, but I am not sure why it would be Windows-specific.

We also see both the Firefox "spoof language?" prompt as well as the Torbutton one. Is there already a ticket open for that issue?

Now we have: #26409.

comment:4 Changed 5 months ago by mcs

Kathy and I spent a little time looking at this today. I am recording our findings here in case someone else has any idea what might be going on. Here is what we know:

  1. The error is due to an undefined entity in aboutTor.xhtml.
  2. We only see the problem on Windows.
  3. This problem does not occur when multiprocess mode is disabled via set MOZ_FORCE_DISABLE_E10S=1.
  4. This problem does not occur when most of Tor Launcher is disabled during startup via set TOR_SKIP_LAUNCH=1.
  5. This problem does not occur when we allow Tor Launcher to do all of its work but suppress the windows that it normally opens (we had to modify the Tor Launcher code to make this possible).

This is all quite strange and leads Kathy and me to think that there is a general Firefox bug in loading of extension DTD files, e.g., maybe when e10s is enabled there is a race during startup which puts things in a bad state after Tor Launcher uses a DTD file during startup. Of course Tor Launcher should not be able to interfere with Torbutton in this way.
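
For anyone retesting findings 3 and 4, here is a minimal Python sketch of building the environment for a diagnostic run. Only the two variable names come from the list above. The helper name `diagnostic_env`, the short switch keys, and the way the browser gets started are assumptions for illustration.

```python
import os

# Environment switches named in the findings above. Setting either one to "1"
# before starting the browser made the problem go away in our tests.
DIAGNOSTIC_SWITCHES = {
    "no_e10s": "MOZ_FORCE_DISABLE_E10S",  # disable multiprocess mode
    "skip_launch": "TOR_SKIP_LAUNCH",     # disable most of Tor Launcher
}


def diagnostic_env(*switches, base=None):
    """Return a copy of the environment with the chosen switches set to "1".

    `switches` are keys of DIAGNOSTIC_SWITCHES (hypothetical short names).
    `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    for name in switches:
        env[DIAGNOSTIC_SWITCHES[name]] = "1"
    return env
```

The result could be passed as `env=` to `subprocess.run([browser_path], ...)`, where `browser_path` is a placeholder for the local firefox.exe.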

comment:5 Changed 5 months ago by gk

One observation I had today while testing a Windows sv-SE bundle: it seems that NoScript is somehow not properly working as well? At least adjusting the security slider does not seem to work and I can't click on the NoScript icon. However, that might be worth a different bug as I have this issue with an en-US bundle, too. But on my Linux box it is working as expected.

comment:6 in reply to:  5 Changed 5 months ago by arthuredelstein

Replying to gk:

One observation I had today while testing a Windows sv-SE bundle: it seems that NoScript is somehow not properly working as well? At least adjusting the security slider does not seem to work and I can't click on the NoScript icon. However, that might be worth a different bug as I have this issue with an en-US bundle, too. But on my Linux box it is working as expected.

I opened #26506 to look at this issue further.

comment:7 in reply to:  4 Changed 5 months ago by brade

Replying to mcs:

Kathy and I spent a little time looking at this today. I am recording our findings here in case someone else has any idea what might be going on. ...

Removing prefs.js allows you to reproduce this bug without installing a new browser.

comment:8 in reply to:  4 Changed 5 months ago by mcs

Replying to mcs:

  5. This problem does not occur when we allow Tor Launcher to do all of its work but suppress the windows that it normally opens (we had to modify the Tor Launcher code to make this possible).

For testing, here is a patch that implements the above (no windows opened by Tor Launcher):
https://gitweb.torproject.org/user/brade/tor-launcher.git/commit/?h=bug26381-test-01&id=698e6fe038d2c8b22efe569b6e2bdd77dd2327b9

comment:9 Changed 5 months ago by mcs

Tracking down the root cause of this is proving to be challenging. My Windows debugging skills seem to be inadequate, especially when the browser is running in multiprocess mode. Here are some additional things that Kathy and I learned:

  • In addition to the undefined entity bug, sometimes a blank page is loaded instead of about:tor. When a blank page is loaded, that tab is useless (all pages silently fail to load, including internal pages such as about:config).
  • The problems with about:tor do not always occur. They seem to occur every time with a clean install or when running with TOR_FORCE_NET_CONFIG=1.
  • When about:tor does not load correctly, WebExtensions (all) do not work either. This points to some kind of race or other bug during initialization.
  • Reducing the sandbox level makes these problems disappear: if I set security.sandbox.content.level to 2 about:tor loads correctly every time, but a setting of 3 or higher causes problems.
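
To repeat the sandbox-level experiment without editing about:config by hand, something like the following Python sketch could pin the pref through a user.js file in the profile directory. The pref name and the value 2 are from the observation above. The helper names and the profile path are assumptions, and this is meant as a diagnostic aid only since it lowers the content sandbox level.

```python
from pathlib import Path

PREF_NAME = "security.sandbox.content.level"


def sandbox_pref_line(level):
    """Format a user.js line that pins the content sandbox level."""
    if not isinstance(level, int) or level < 0:
        raise ValueError("sandbox level must be a non-negative integer")
    return f'user_pref("{PREF_NAME}", {level});'


def pin_sandbox_level(profile_dir, level):
    """Append the pref line to user.js inside profile_dir.

    profile_dir is a placeholder: pass the browser profile directory of the
    installation under test.
    """
    user_js = Path(profile_dir) / "user.js"
    with user_js.open("a", encoding="utf-8") as handle:
        handle.write(sandbox_pref_line(level) + "\n")
    return user_js
```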

comment:10 Changed 5 months ago by gk

The sandbox related part is interesting. I wonder whether we could create a minimum PoC for that, so that the issue is visible in a vanilla Firefox, too, and then we could go with that to Mozilla's bugzilla to get further help...

comment:11 Changed 5 months ago by gk

Keywords: TorBrowserTeam201807 added; TorBrowserTeam201806 removed

Moving first batch of tickets to July 2018

comment:12 Changed 4 months ago by gk

Keywords: TorBrowserTeam201808 added; TorBrowserTeam201807 removed

Move our tickets to August.

comment:13 Changed 3 months ago by pospeselr

Owner: changed from tbb-team to pospeselr
Status: new → assigned

comment:14 Changed 3 months ago by boklm

Cc: tbb-team added

comment:15 Changed 3 months ago by gk

Cc: sysrqb ezio ma1 added
Description: modified
Summary: about:tor page does not load on first start in localized Windows bundle → about:tor page does not load on first start in localized Windows bundle and browser is stuck in endless reload cycle

So, it turns out that the endless reload cycle and broken extensions (#27261) are very likely caused by the same underlying issue as this bug: they don't appear once I set security.sandbox.content.level to 2 but reappear as soon as I set it to 3 or above.

So, I closed #27261 as duplicate.

comment:16 in reply to:  15 Changed 3 months ago by arthuredelstein

Replying to gk:

So, it turns out that the endless reload cycle and broken extensions (#27261) are very likely caused by the same underlying issue as this bug: they don't appear once I set security.sandbox.content.level to 2 but reappear as soon as I set it to 3 or above.

So, I closed #27261 as duplicate.

I closed #27291 as well for the same reason.

comment:17 Changed 3 months ago by gk

Summary: about:tor page does not load on first start in localized Windows bundle and browser is stuck in endless reload cycle → about:tor page does not load on first start on Windows and browser is stuck in endless reload cycle

comment:18 Changed 3 months ago by fixtbb

Never try to install portable software to "Windows-protected" folders!
(%PROGRAMFILES%, %USERPROFILE%\Desktop, etc)
(I've had a hard time to reproduce your issue on all my Windows 7-10 machines, because I never install something to Windows-specific folders without a reason.)
The first error is

08:08:53.543 uncaught exception: Error opening input stream (invalid filename?): chrome://browser/content/ext-c-browser.js 1 (unknown)

then there're some from extensions and the last is #27291.

Last edited 3 months ago by fixtbb

comment:19 Changed 3 months ago by fixtbb

Delete startupCache folder, and it will work again (until the next restart).
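
A small Python sketch of that workaround, for scripting it between test runs. It assumes the startupCache folder sits directly inside the profile directory passed in. The function name and that layout are illustrative, not taken from the browser sources.

```python
import shutil
from pathlib import Path


def clear_startup_cache(profile_dir):
    """Delete <profile_dir>/startupCache if it exists.

    Returns True when a folder was removed and False when there was nothing
    to delete. profile_dir is a placeholder for the profile under test.
    """
    cache_dir = Path(profile_dir) / "startupCache"
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

As noted above, this only helps until the next restart, so it would have to run before every launch.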

Last edited 3 months ago by fixtbb

comment:20 Changed 3 months ago by gk

FWIW: we opened a ticket in Mozilla's bug tracker (https://bugzilla.mozilla.org/show_bug.cgi?id=1485836). Maybe we can get some help that way.

comment:21 Changed 3 months ago by reportUrl

What first comes to mind about sandbox is that startupCache and the other code could use different path specifications: absolute and relative (which uses Shell Folders API for Special Folders). So, when content/extensions sandbox restricts file access to the current folder, it works for absolute or relative path only.

comment:22 Changed 3 months ago by reportUrl

Bugzilla has https://bugzilla.mozilla.org/show_bug.cgi?id=1453746
Also you should mention there that it happens only when you install Firefox into a Special Folder.

comment:23 Changed 3 months ago by pospeselr

Keywords: TorBrowserTeam201808R added; TorBrowserTeam201808 removed
Status: assigned → needs_review

Simple enough patch to set the sandbox level to 2 in the Windows build (until I can figure out the root issue):

https://gitweb.torproject.org/user/richard/tor-browser.git/commit/?h=bug_26381&id=f41857c98e0271c575b74924adca4f0988fa5f8f

Doing a full rbm-build to make sure nothing funky happens.

comment:24 Changed 3 months ago by scriptCache

It affects shitty scriptCache*.bin/urlCache*.bin (what does it cache?!) stuff only. They are not created in startupCache folder in *\Desktop location. Both 'omni.ja's are mapped successfully, and there is no visible problem with tokens (compared to the official build).
Maybe, some mess with paths/rights in https://bugzilla.mozilla.org/show_bug.cgi?id=1359653?
(BTW, why does TB have version 60.1.0.6609, but Firefox - 60.1.0.6746?)

comment:25 in reply to:  23 Changed 3 months ago by gk

Status: needs_review → new

Replying to pospeselr:

Simple enough patch to set the sandbox level to 2 in the Windows build (until I can figure out the root issue):

https://gitweb.torproject.org/user/richard/tor-browser.git/commit/?h=bug_26381&id=f41857c98e0271c575b74924adca4f0988fa5f8f

Doing a full rbm-build to make sure nothing funky happens.

Looks good to me. Cherry-picked the workaround to tor-browser-60.1.0esr-8.0-1 (commit b8dcb4f1ab5f06017cee025a2ad35cd17d869679). I guess we can use this ticket to track the issue further, thus setting state back to new.

comment:26 Changed 3 months ago by gk

Keywords: TorBrowserTeam201808 added; TorBrowserTeam201808R removed

comment:27 Changed 2 months ago by pospeselr

Keywords: TorBrowserTeam201808R added; TorBrowserTeam201808 removed
Status: new → needs_review

With some clues from Bob Owen, I've figured out what's happening and have a patch with a potential fix!

Basically our tor-launcher extension creates a window once tor.exe has launched, initialized and signaled back to the tor-launcher extension. However, this all can happen before XRE_mainRun completes and makes it to SandboxBroker::GeckoDependentInitialize (which is where the static paths are initialized). As a result, these various paths are null, which causes SandboxBroker::AddCachedDirs to fail until XRE_mainRun has had a chance to catch up.

The issue goes away if I move the call to SandboxBroker::GeckoDependentInitialize to before nsXREDirProvider::DoStartup (which is where various services and extensions are initialized). All the child content processes correctly go through the SandboxBroker::AddCachedDirs calls, about:tor loads and the tab works.

This patch moves up the call to SandboxBroker::GeckoDependentInitialize() and removes the temporary fix that reduced the sandbox level.
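
The ordering problem can be illustrated with a toy Python model. This is not the Gecko code: the class, the step names and the rule strings are made up for illustration, and the real SandboxBroker is C++. The sketch only shows why a window opened before the static paths are set ends up without the cached-directory rules, and why initializing the paths first avoids that.

```python
class SandboxBrokerModel:
    """Toy stand-in for the broker's static cached-directory paths."""

    def __init__(self):
        self.cached_dirs = None  # unset until the Gecko-dependent init runs

    def gecko_dependent_initialize(self, dirs):
        self.cached_dirs = list(dirs)

    def add_cached_dirs(self):
        """Return the read rules a newly started content process would get."""
        if self.cached_dirs is None:
            return []  # paths not initialized yet, so no rules are added
        return [f"allow-read {d}" for d in self.cached_dirs]


def run_startup(order, dirs=("profile", "app")):
    """Run the startup steps in `order`; return the rules for each window."""
    broker = SandboxBrokerModel()
    windows = []
    for step in order:
        if step == "init_sandbox_paths":
            broker.gecko_dependent_initialize(dirs)
        elif step == "extension_opens_window":
            windows.append(broker.add_cached_dirs())
        else:
            raise ValueError(f"unknown step: {step}")
    return windows
```

With `["extension_opens_window", "init_sandbox_paths"]` the window gets an empty rule list, which corresponds to the broken tab. With the two steps swapped it gets one rule per directory.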

https://gitweb.torproject.org/user/richard/tor-browser.git/commit/?h=bug_26381&context=9&ignorews=0&dt=1

comment:28 Changed 2 months ago by gk

Keywords: TorBrowserTeam201809 added

Moving our tickets to September 2018

comment:29 Changed 2 months ago by pospeselr

Bob Owen has some issues with this solution, and has suggested having tor-launcher init on the 'final-ui-startup' callback, rather than 'profile-after-change' (which is what we're doing now). I'll try this out and see how it goes.

comment:30 Changed 2 months ago by pospeselr

Keywords: TorBrowserTeam201809R added; TorBrowserTeam201809 removed

Patch for tor-launcher: https://gitweb.torproject.org/user/richard/tor-launcher.git/commit/?h=bug_26381&id=1c660a194b9be54671735db4316fe4bbfb0db4a3&context=10&ignorews=0&dt=1

Verified this change is working as expected, no more dead tabs!

EDIT: We also need to revert f41857c98e0271c575b74924adca4f0988fa5f8f in tor-browser as part of this change.

Last edited 2 months ago by pospeselr

comment:31 Changed 2 months ago by brade

Mark and I spent some time looking at potential problems with making this change. Tor Launcher intentionally hooks in early so it can temporarily block certain things such as network activity.

We specifically looked at the code that executes between profile-after-change and final-ui-startup. Here are the things we saw that concerned us:

  • Command line processing. This will not cause problems because the final processing that handles things such as URLs passed on the command line is deferred until after final-ui-startup.
  • Creation of the hidden window. As far as Mark and I know, the hidden window is mainly used to support event processing and Mac menu commands. Therefore, we don't think it will be a problem if the hidden window is created before Tor Launcher is loaded.
  • Startup of components that have the profile-after-change category. We think this one is an issue. For example, after applying the proposed fix we saw the update service attempt an update ping while the Tor Launcher initial configuration window was open.
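
The third concern can be sketched as a toy Python model of the two startup notifications. The topic names are the real ones discussed here. Everything else is assumed for illustration, in particular the simplification that Tor Launcher runs first among the observers of its topic and that the update service is the only other observer.

```python
STARTUP_TOPICS = ("profile-after-change", "final-ui-startup")
UPDATE_PING = "update service: update ping"
TOR_SETUP = "tor-launcher: network configured"


def simulate_startup(tor_launcher_topic):
    """Return the ordered event log for a toy startup sequence."""
    if tor_launcher_topic not in STARTUP_TOPICS:
        raise ValueError(f"unknown topic: {tor_launcher_topic}")
    observers = {topic: [] for topic in STARTUP_TOPICS}
    # Stand-in for any component in the profile-after-change category.
    observers["profile-after-change"].append(UPDATE_PING)
    # Simplification: Tor Launcher hooks in early and blocks, so it is put
    # at the front of whichever topic it observes.
    observers[tor_launcher_topic].insert(0, TOR_SETUP)
    log = []
    for topic in STARTUP_TOPICS:
        log.extend(observers[topic])
    return log


def ping_before_tor(log):
    """True if the update ping happens before Tor is configured."""
    return log.index(UPDATE_PING) < log.index(TOR_SETUP)
```

In this model `ping_before_tor(simulate_startup("final-ui-startup"))` is true, which matches the update ping we saw while the configuration window was still open, and it is false for `"profile-after-change"`.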

comment:32 in reply to:  31 Changed 2 months ago by gk

Keywords: TorBrowserTeam201809 added; TorBrowserTeam201808R TorBrowserTeam201809R removed
Status: needs_review → needs_revision

Replying to brade:

Mark and I spent some time looking at potential problems with making this change. Tor Launcher intentionally hooks in early so it can temporarily block certain things such as network activity.

We specifically looked at the code that executes between profile-after-change and final-ui-startup. Here are the things we saw that concerned us:

  • Command line processing. This will not cause problems because the final processing that handles things such as URLs passed on the command line is deferred until after final-ui-startup.
  • Creation of the hidden window. As far as Mark and I know, the hidden window is mainly used to support event processing and Mac menu commands. Therefore, we don't think it will be a problem if the hidden window is created before Tor Launcher is loaded.
  • Startup of components that have the profile-after-change category. We think this one is an issue. For example, after applying the proposed fix we saw the update service attempt an update ping while the Tor Launcher initial configuration window was open.

Nice find. It seems we need something else here. The two options proposed in the Mozilla ticket did not seem to make bobowen happy. Could we cheat the startupcache here? It seems getting rid of that one "solves" the problem as well (but I have not looked exactly why and whether we might be able to just remove the single piece in it that is troubling us)...

comment:33 in reply to:  31 Changed 2 months ago by pospeselr

Replying to brade:

  • Startup of components that have the profile-after-change category. We think this one is an issue. For example, after applying the proposed fix we saw the update service attempt an update ping while the Tor Launcher initial configuration window was open.

This does seem to be a problem, and it doesn't seem like we can fix it in the extension. Went through this today and can confirm, tor-launcher blocks on 'profile-after-change' until the network settings dialog is dismissed, which allows the update service and what-not to complete after a tor connection has been set up.

I've done some digging into the original proposed patch (moving the sandbox init up before DoStartup) and I can confirm that it does configure all of the directories correctly, except for the NS_APP_USER_PROFILE_50_DIR directory, which will probably cause some issues (though it seems to be init'd sometime between the entry into DoStartup and the first service initialization in there). I'll see if we can encourage NS_APP_USER_PROFILE_50_DIR to be defined earlier tomorrow.

comment:34 Changed 2 months ago by pospeselr

So after some investigation today, I've figured out where we can insert the call that initializes the whitelisted directory paths such that they are all initialized correctly:

Updated patch: https://gitweb.torproject.org/user/richard/tor-browser.git/commit/?h=bug_26381_v2&context=10&ignorews=0&dt=0

I've also reopened the Mozilla bug since their proposed solution is a non-starter, due to the blocking nature of the tor configuration window and the updater's dependency on tor to work properly.

comment:35 Changed 2 months ago by pospeselr

Keywords: TorBrowserTeam201809R added; TorBrowserTeam201809 removed
Status: needs_revision → needs_review

comment:36 Changed 2 months ago by brade

mcs and I do not see any problem with this approach or your patch, but it would be reassuring to receive some feedback from the Mozilla engineers.

Last edited 2 months ago by mcs

comment:37 Changed 2 months ago by gk

Keywords: tbb-backport added
Resolution: fixed
Status: needs_review → closed

Bob seems to be happy with it, so let's take it for the alpha (there are some "." missing at the end of several sentences in the code but I take this nevertheless to finish the 8.5a2 release prep). commit 62001bb019dbf0956454f8fc550ccc1e81ee8d3c has the fix on tor-browser-60.2.0esr-8.5-1. Nice work, Richard!
