Opened 5 weeks ago

Closed 3 weeks ago

#30239 closed defect (fixed)

TBA: Gracefully handle auto-restart after crash

Reported by: sysrqb Owned by: tbb-team
Priority: Medium Milestone:
Component: Applications/Tor Browser Version:
Severity: Normal Keywords: tbb-mobile tbb-8.5-must TorBrowserTeam201905R
Cc: sisbell Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Sometimes Android kills background/idle applications (arbitrarily) as a way of freeing memory needed by active/foreground apps or need by other system services. In some cases, the app is automatically restarted after some time period. Tor Browser on Android does not handle this well. When the app is opened, after it is "restarted", the user is shown a blank, purple screen. This leads me to think it is loading the TorBootstrap fragment, but it doesn't render (?).

13:59:19.189  3366  8097 I WindowManager: WIN DEATH: Window{bfd4312 u0 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp}
13:59:19.223  3366  3380 I ActivityManager: Process org.torproject.torbrowser_alpha (pid 18467) has died: prcp FGS 
13:59:19.223  3366  3380 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.mozilla.gecko.media.MediaControlService in 23746ms
13:59:19.223  3366  3380 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.torproject.android.service.TorService in 23746ms
13:59:43.190  3366  3383 I ActivityManager: Start proc 32619:org.torproject.torbrowser_alpha/u0a163 for service org.torproject.torbrowser_alpha/org.torproject.android.service.TorService

15:46:44.817  3366  5721 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=org.torproject.torbrowser_alpha/.App bnds=[237,1607][448,1807] (has extras)} from uid 10031
15:46:44.966  3366  5711 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x200000 cmp=org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp bnds=[237,1607][448,1807] (has extras)} from uid 10163

15:46:45.058 32619 32641 D GeckoSharedPrefs: Current version = 2, prefs version = 2

I can reproduce this by inducing a crash while the app is in the background (open torbrowser, then switch to another app, and then induce the crash, then reopen the torbrowser).

$ adb shell am crash org.torproject.torbrowser_alpha

Child Tickets

Change History (9)

comment:1 Changed 5 weeks ago by sysrqb

For comparison:

Inducing crash when app is in the foreground:

16:45:42.208  3366  3672 I ActivityManager: Killing 18191:org.torproject.torbrowser_alpha/u0a163 (adj 200): crash 
16:45:42.209  3366  3672 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.torproject.android.service.TorService in 1000ms
16:45:42.210  3366  3672 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.mozilla.gecko.media.MediaControlService in 11000ms
16:45:42.223  3366  3672 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10300000 cmp=org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (has extras)} from uid 10163 
16:45:42.393  3366  3385 I zygote64: Successfully killed process cgroup uid 10163 pid 18191 in 183ms 
16:45:42.401  3366  3852 I ActivityManager: Start proc 7967:org.torproject.torbrowser_alpha/u0a163 for activity org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp
16:45:42.421  3366  3383 W ActivityManager: Stopping service due to app idle: u0a163 -12h38m17s389ms org.torproject.torbrowser_alpha/org.torproject.android.service.TorService
16:45:42.557  7967  7967 I GeckoApplication: zerdatime 1133026572 - application start 
16:45:42.597  7967  7967 D GeckoMemoryMonitor: onTrimMemory() notification received with level 5
16:45:42.597  7967  7967 D GeckoMemoryMonitor: increasing memory pressure to 2
16:45:42.677  7967  7987 D GeckoSharedPrefs: Current version = 2, prefs version = 2
16:45:42.900  7967  7967 D GeckoThread: State changed to LAUNCHED
16:45:42.901  7967  7991 I GeckoThread: preparing to run Gecko 
16:45:42.902  7967  7991 D GeckoThread: State changed to MOZGLUE_READY
16:45:42.915  7967  7991 E GeckoLibLoad: Load sqlite start

Inducing crash when app is in the background:

16:48:45.775  3366 32292 I ActivityManager: Killing 7967:org.torproject.torbrowser_alpha/u0a163 (adj 200): crash 
16:48:45.776  3366 32292 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.mozilla.gecko.media.MediaControlService in 1000ms
16:48:45.776  3366 32292 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.torproject.android.service.TorService in 1000ms
16:48:45.854  3366  5727 I WindowManager: WIN DEATH: Window{117bf63 u0 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp}
16:48:45.865  3366  3385 I zygote64: Successfully killed process cgroup uid 10163 pid 7967 in 88ms
16:48:46.793  3366  3383 I ActivityManager: Start proc 8338:org.torproject.torbrowser_alpha/u0a163 for service org.torproject.torbrowser_alpha/org.torproject.android.service.TorService
16:48:46.883  8338  8338 I GeckoApplication: zerdatime 1133210899 - application start 
16:48:46.905  8338  8338 D GeckoMemoryMonitor: onTrimMemory() notification received with level 5
16:48:46.905  8338  8338 D GeckoMemoryMonitor: increasing memory pressure to 2
16:48:46.910  8338  8338 W orbrowser_alpha: type=1400 audit(0.0:467): avc: denied { setattr } for name="libTor.so" dev="dm-2" ino=392562 scontext=u:r:untrusted_app:s0:c512,c768 tcontext=u:object_r:apk_data_file:s0 tclass=file permissive=0
16:48:46.917  8338  8338 I TorService: onCreate end
16:48:46.930  8338  8338 D Orbot   : Got null onStartCommand() intent
16:48:47.039  8338  8354 D torResources: deleting existing torrc.custom
16:48:59.927  3946  3946 D ServiceStateProvider: subId=1


16:50:24.837  3366  1981 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x10200000 cmp=org.torproject.torbrowser_alpha/.App bnds=[237,1607][448,1807] (has extras)} from uid 10031 
16:50:24.897  3366  3379 I ActivityManager: START u0 {act=android.intent.action.MAIN cat=[android.intent.category.LAUNCHER] flg=0x200000 cmp=org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp bnds=[237,1607][448,1807] (has extras)} from uid 10163 
16:50:24.952  8338  8353 D GeckoSharedPrefs: Current version = 2, prefs version = 2
16:50:24.960  8338  8338 D GeckoThread: State changed to LAUNCHED
16:50:24.961  8338  8469 I GeckoThread: preparing to run Gecko 
16:50:24.962  8338  8469 D GeckoThread: State changed to MOZGLUE_READY

comment:2 Changed 5 weeks ago by sysrqb

Keywords: tbb-8.5-must TorBrowserTeam201904 added

Adding these as 8.5-must, because this is a very bad user experience.

comment:3 Changed 4 weeks ago by sysrqb

The correct solution involves handling the Fragment lifecycle correctly.

The simple explanation for why this is happening is Fennec starts its services (like MediaControlService) with the START_NOT_STICKY value which means the service is not automatically restarted when the system kills the app. On the other hand. Orbot's TorService is started with START_STICKY, so the system restarts it automatically. This results in part of the previous state being restored when it is restarted, but not the entire state. In this case, the Fragment instances are recreated, but they aren't correctly initialized. We can (and should) change how we implement TorBootstrapPager in the future, but the easy solution is tweaking TorService so it doesn't automatically restart after being killed. Automatically restarting the service doesn't provide any noticeable performance improvements and it is only causing problems right now.

comment:4 Changed 4 weeks ago by sysrqb

Cc: sisbell added

Shane, I'll create a patch for the tor-android-service project in tor-browser-build for this, but do you think this is reasonable? I haven't tested it yet (so this is a little bit of a guess). Essentially, I'm thinking about this:

diff --git a/service/src/main/java/org/torproject/android/service/TorService.java b/service/src/main/java/org/torproject/android/service/TorService.java
index a748241..80c1dfa 100644
--- a/service/src/main/java/org/torproject/android/service/TorService.java
+++ b/service/src/main/java/org/torproject/android/service/TorService.java
@@ -270,7 +270,7 @@ public final class TorService extends Service implements TorServiceConstants, Or
             mExecutor.execute(new IncomingIntentRouter(intent));
         else
             Log.d(OrbotConstants.TAG, "Got null onStartCommand() intent");
-        return Service.START_STICKY;
+        return Service.START_NOT_STICKY;
     }
 
     @Override

comment:5 in reply to:  4 ; Changed 4 weeks ago by sisbell

Replying to sysrqb:

Shane, I'll create a patch for the tor-android-service project in tor-browser-build for this, but do you think this is reasonable? I haven't tested it yet (so this is a little bit of a guess). Essentially, I'm thinking about this:

diff --git a/service/src/main/java/org/torproject/android/service/TorService.java b/service/src/main/java/org/torproject/android/service/TorService.java
index a748241..80c1dfa 100644
--- a/service/src/main/java/org/torproject/android/service/TorService.java
+++ b/service/src/main/java/org/torproject/android/service/TorService.java
@@ -270,7 +270,7 @@ public final class TorService extends Service implements TorServiceConstants, Or
             mExecutor.execute(new IncomingIntentRouter(intent));
         else
             Log.d(OrbotConstants.TAG, "Got null onStartCommand() intent");
-        return Service.START_STICKY;
+        return Service.START_NOT_STICKY;
     }
 
     @Override

Looking through the code, It looks like a restart is essentially a NOOP. A sticky restart will restart with a null intent, which essentially does nothing. This is not what we want, since it leaves the event handlers to still receive events from the main app, even though a tor process is not running. Good catch.

One alternative to removing the STICKY flag would be to startTor on a null intent so it also starts up with the Android service. This really depends on what the main app is expecting. Without the sticky flag the main app will need to detect the crash and restart the service.

comment:6 in reply to:  5 Changed 4 weeks ago by sysrqb

Replying to sisbell:

Looking through the code, It looks like a restart is essentially a NOOP. A sticky restart will restart with a null intent, which essentially does nothing. This is not what we want, since it leaves the event handlers to still receive events from the main app, even though a tor process is not running. Good catch.

Yeah, that was my thought, too - thanks.

One alternative to removing the STICKY flag would be to startTor on a null intent so it also starts up with the Android service. This really depends on what the main app is expecting. Without the sticky flag the main app will need to detect the crash and restart the service.

I can see this being useful in other apps, like Orbot, if they correctly handle this case, but in the case of Tor Browser we basically scrub all previous state on restart in any case (due to #28507), so restarting the service is not what we want right now. I wonder if this should be configurable by a preference (which can be checked in onCreate() or onStartCommand()).

I'll mention here, in the future I think we want to handle "the system killed us" in a different way than "the user killed us" because losing state in the app (and therefore all open tabs) when the system kills the app is a terrible user experience. So, it may be the case in the future we want to restart the service automatically when it is killed by the system (using the same configuration it used previously). But, we're not there yet.

comment:7 Changed 3 weeks ago by sysrqb

Status: newneeds_review

This required more debugging. TorService setting START_STICKY was a red herring and has no connection with this bug, as far as I can see. After stepping through the code with the debugger, I found this bug is caused because the UI is not created exactly the same when the app is first started compared with after the app is killed and it is restarted.

In the first instance, when the app is started the first time, the UI components are created in the logical order (started at the root View, each layer is created on top of the previous layer).

However, when the app is restarting after a crash or being killed, some UI components are created earlier than we expect. Specifically, the TorBootstrap{,Log}Panel Fragments are recreated by Android's FragmentManager early in the app startup process because the FragmentManager caches this information about the fragments previously existing. However, these fragments are created before the app creates the ViewPager (TorBootstrapPager) which is the underlying container of the Fragments. As a result, the Fragments are created without a parent, and the ViewPager isn't told about these fragments when it is created. Therefore, after all the Views and UI components are created, the ViewPager has zero children and the Fragments have a null parent reference.

This seems like a bug other app developers would experience too, so there may be a more Androidy-way of correctly handling this situation, but I didn't see it.

The first commit is only simplifying existing code, the second commit is similar and moves Fragment instantiation from getItem() to the Constructor. The third commit is the bugfix. When TorBootstrapPager.load() is called, we instantiate a new ViewPagerAdapter and call setAdapter() as we did before, but now we iterate over the Adapter's items and call addView() if that item's getView().getParent() is null.

Branch bug30239_02. This is tested on Android 19, 23, and 28.

The easiest steps for reproducing are:

  1. Launch app normally
  2. Press "Home" button so Tor Browser goes into the background and you see the your device's home screen
  3. run adb shell am kill org.torproject.torbrowser_alpha (substitute adb with whatever you need so it works)
  4. Open the "Recent Apps" menu and select Tor Browser

Without this proposed patch, you should see a blank purple screen, and with this patch you should see the expected bootstrapping screen.

In logcat, you should see something like this when you successfully kill the app:

(Android 19)

D/AndroidRuntime( 3996): Calling main entry com.android.commands.am.Am
D/AndroidRuntime( 3996): Shutting down VM                  
I/ActivityManager( 1603): Killing 3845:org.torproject.torbrowser_alpha/u0a62 (adj 7): kill background
D/jdwp    ( 3996): Got wake-up signal, bailing out of select
D/dalvikvm( 3996): Debugger has detached; object registry had 1 entries
W/InputDispatcher( 1603): channel '9d2c4198 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (server)' ~ Consumer closed input channel or an error occurred.  events=0x9   
E/InputDispatcher( 1603): channel '9d2c4198 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (server)' ~ Channel is unrecoverably broken and will be disposed!
W/InputDispatcher( 1603): Attempted to unregister already unregistered input channel '9d2c4198 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (server)'
I/WindowState( 1603): WIN DEATH: Window{9d2d2278 u0 SurfaceView}
I/WindowState( 1603): WIN DEATH: Window{9d2c4198 u0 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp}

(Android 23)

05-01 23:05:19.533  8002  8002 D AndroidRuntime: Calling main entry com.android.commands.am.Am                                                                                                              
05-01 23:05:19.534  1597  2804 I ActivityManager: Killing 7895:org.torproject.torbrowser_alpha/u0a143 (adj 7): kill background
05-01 23:05:19.546  1597  1637 W InputDispatcher: channel 'cf88dca org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (server)' ~ Consumer closed input channel or an error occurred.  events=0x9 
05-01 23:05:19.546  1597  1637 E InputDispatcher: channel 'cf88dca org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (server)' ~ Channel is unrecoverably broken and will be disposed!
05-01 23:05:19.548  1597  1609 D GraphicsStats: Buffer count: 4
05-01 23:05:19.548  1597  2649 I WindowState: WIN DEATH: Window{cf88dca u0 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp}
05-01 23:05:19.548  1597  2649 W InputDispatcher: Attempted to unregister already unregistered input channel 'cf88dca org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp (server)'
05-01 23:05:19.548  1597  2649 W WindowManager: Force-removing child win Window{8b0b2a3 u0 SurfaceView} from container Window{cf88dca u0 org.torproject.torbrowser_alpha/org.mozilla.gecko.BrowserApp}      
05-01 23:05:19.552  1597  2185 W WindowManager: Failed looking up window
05-01 23:05:19.552  1597  2185 W WindowManager: java.lang.IllegalArgumentException: Requested window android.os.BinderProxy@ff9bc96 does not exist
05-01 23:05:19.552  1597  2185 W WindowManager:         at com.android.server.wm.WindowManagerService.windowForClientLocked(WindowManagerService.java:8733)
05-01 23:05:19.552  1597  2185 W WindowManager:         at com.android.server.wm.WindowManagerService.windowForClientLocked(WindowManagerService.java:8724)
05-01 23:05:19.552  1597  2185 W WindowManager:         at com.android.server.wm.WindowState$DeathRecipient.binderDied(WindowState.java:1209)
05-01 23:05:19.552  1597  2185 W WindowManager:         at android.os.BinderProxy.sendDeathNotice(Binder.java:558)   
05-01 23:05:19.552  1597  2185 I WindowState: WIN DEATH: null
05-01 23:05:19.617  1597  2804 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.mozilla.gecko.media.MediaControlService in 1000ms
05-01 23:05:19.617  1597  2804 W ActivityManager: Scheduling restart of crashed service org.torproject.torbrowser_alpha/org.torproject.android.service.TorService in 1000ms
05-01 23:05:19.618  8002  8002 D AndroidRuntime: Shutting down VM
05-01 23:05:19.618  1597  1721 W ActivityManager: Spurious death for ProcessRecord{6bbe9a0 0:org.torproject.torbrowser_alpha/u0a143}, curProc for 7895: null

comment:8 Changed 3 weeks ago by gk

Keywords: TorBrowserTeam201905R added; TorBrowserTeam201904 removed

comment:9 Changed 3 weeks ago by gk

Resolution: fixed
Status: needs_reviewclosed

Thanks for the deatailed explanation and instructions. The changes look good to me. I've cherry-picked them onto tor-browser-60.6.1esr-8.5-1 (commits 7b6b87d4766fbf7642cd967d0c37dbe0ddef8fdd, 2665a59cc56b12e27ac050e31b2db65b66cd1878, and 8a9df95b91c15cd1923833acadceb0dce777a665).

I was not able to test the patch as much as I wanted, though. Using kill is not possible on my non-rooted device and force-stop does not trigger the bug for me...

Note: See TracTickets for help on using tickets.