Opened 12 months ago

Last modified 12 months ago

#25803 new defect

Infinite restart loop when daemon crashes

Reported by: tiejohg2sahth
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: systemd, tor-relay, security-low
Cc:
Actual Points:
Parent ID:
Points: 0.1
Reviewer:
Sponsor:

Description

On Ubuntu, using Tor from the official PPA, when the Tor daemon crashes (in my case because of this bug: https://trac.torproject.org/projects/tor/ticket/23693), it is automatically restarted.

As a result, Tor uses 100% CPU continuously because of its hopeless start->crash->start->crash... loop.

A better behavior would be to detect the crash (SIGSEGV, SIGABRT...) from the process return code and not restart in that case.

The crash-restart loop can also be dangerous security-wise: for example, exploits against ASLR that rely on brute force get as many attempts as they need if the process restarts automatically.

I could not work out where this behavior comes from in the code, because the systemd service file /lib/systemd/system/tor.service seems to be empty.

Change History (9)

comment:1 Changed 12 months ago by dgoulet

Component: - Select a component → Core Tor/Tor
Keywords: systemd? added
Milestone: Tor: unspecified

comment:2 Changed 12 months ago by tiejohg2sahth

It appears that the file causing this behavior is /lib/systemd/system/tor[AT]default.service (replace [AT] by @, Trac thinks it's an email), because of the line:

Restart=on-failure

Possible fixes for this:

  • remove the Restart= line, so the daemon won't restart after a crash
  • add RestartSec=X to increase the time systemd waits before restarting the daemon (the default is 100ms)
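
A minimal sketch of both options as a local drop-in override, which leaves the packaged unit file untouched. The drop-in path follows the standard systemd override layout but is an assumption for this package, and the 30-second value is purely illustrative.

```ini
# /etc/systemd/system/tor@default.service.d/override.conf
# (assumed path; systemd reads any *.conf file in this directory)
[Service]
# Option 1: never restart automatically, whatever the exit reason.
Restart=no

# Option 2 (alternative to option 1): keep the packaged Restart= policy
# but wait longer between attempts. Uncomment and drop the line above.
#RestartSec=30
```

After editing the drop-in, `systemctl daemon-reload` is needed before the change takes effect.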

comment:3 Changed 12 months ago by teor

Keywords: systemd tor-relay security-low added; systemd? removed
Points: 0.1

I think we might want to use Restart=on-success instead.

It doesn't make sense to restart in any of the listed failure modes:
https://www.freedesktop.org/software/systemd/man/systemd.service.html

Would you like to submit a patch for this?

The relevant file is:
https://gitweb.torproject.org/tor.git/tree/contrib/dist/tor.service.in

You may also need to submit separate bugs to the Debian and Ubuntu bug trackers to get them fixed.

I am marking this as security-low, because it could make exploitation easier.
But, it also makes maintaining availability harder, so it weakens our DoS resistance.

comment:4 in reply to:  3 ; Changed 12 months ago by arma

Replying to teor:

It doesn't make sense to restart in any of the listed failure modes:

I haven't learned much about systemd yet, so please ignore this if you have a better handle on things, but: in the past one of Tor's transient failure modes was that the system would start it before the system had set up its IP addresses (especially true with the world of ipv6), or before the system had set up its network interfaces, and if it just gave up right then, the system Tor would stay down. So retrying some times, especially at first boot, used to make sense.

comment:5 in reply to:  4 ; Changed 12 months ago by teor

Replying to arma:

Replying to teor:

It doesn't make sense to restart in any of the listed failure modes:

I haven't learned much about systemd yet, so please ignore this if you have a better handle on things, but: in the past one of Tor's transient failure modes was that the system would start it before the system had set up its IP addresses (especially true with the world of ipv6), or before the system had set up its network interfaces, and if it just gave up right then, the system Tor would stay down. So retrying some times, especially at first boot, used to make sense.

It still does, see #25182.

Here's what I suggest we do:

Restart after 60 seconds, rather than 0.1 seconds. Slowing the restart rate limits automated exploitation, and increases the likelihood that the network will be available.

RestartSec=60

We could also avoid restarting when Tor crashes, or exits badly. We would need to work out a list of signals and exit statuses that should prevent a restart. For example:

RestartPreventExitStatus= 1 6 SIGABRT SIGSEGV
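
Put together, the two suggestions might look like the [Service] fragment below. This is a sketch under assumptions, not the contents of tor.service.in: the list of statuses is still an open question, so only the two crash signals are listed here. The numeric codes from the example above are left out, because if Tor exits with status 1 when the network is not ready, listing it would also block the retry discussed in comment:4.

```ini
[Service]
Restart=on-failure
# Wait 60 seconds instead of the 100 ms default before each restart.
RestartSec=60
# Do not restart after these signals (an assumed list, not an agreed one).
# SIGSEGV and SIGABRT are typical of memory errors and failed assertions.
RestartPreventExitStatus=SIGABRT SIGSEGV
```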

comment:6 in reply to:  4 Changed 12 months ago by tiejohg2sahth

Replying to arma:

I haven't learned much about systemd yet, so please ignore this if you have a better handle on things, but: in the past one of Tor's transient failure modes was that the system would start it before the system had set up its IP addresses (especially true with the world of ipv6), or before the system had set up its network interfaces, and if it just gave up right then, the system Tor would stay down. So retrying some times, especially at first boot, used to make sense.

I am in no way a systemd expert either, but usually adding these two lines to the service file ensures the network is ready when the service starts:

After=network-online.target
Wants=network-online.target

See: https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/

The goal of a modern init system is precisely to handle these cases properly and remove the burden of writing retry loops by hand.
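
For context, a sketch of where these two lines would sit in a unit file. One caveat: network-online.target only delays startup if a wait-online service is enabled (for example systemd-networkd-wait-online.service or NetworkManager-wait-online.service); which one applies depends on the network stack in use.

```ini
[Unit]
# Start this unit only after the network is considered "online" ...
After=network-online.target
# ... and pull that target into the boot transaction so it is actually reached.
Wants=network-online.target
```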

comment:7 in reply to:  5 ; Changed 12 months ago by tiejohg2sahth

Replying to teor:

We could also avoid restarting when Tor crashes, or exits badly. We would need to work out a list of signals and exit statuses that should prevent a restart. For example:

RestartPreventExitStatus= 1 6 SIGABRT SIGSEGV

Why not use:

Restart=on-success

which would automatically prevent a restart for non-zero exit codes (tor error, or process killed by signal).
See the table at https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=
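
A sketch of what that would look like, with the semantics from the linked table summarized in comments. "Clean" follows the systemd.service man page: exit code 0, or termination by SIGHUP, SIGINT, SIGTERM or SIGPIPE.

```ini
[Service]
# Restart only after a clean exit. A non-zero exit code or a crash
# signal such as SIGSEGV or SIGABRT leaves the service stopped.
Restart=on-success
```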

comment:8 in reply to:  7 Changed 12 months ago by teor

Replying to tiejohg2sahth:

Replying to teor:

We could also avoid restarting when Tor crashes, or exits badly. We would need to work out a list of signals and exit statuses that should prevent a restart. For example:

RestartPreventExitStatus= 1 6 SIGABRT SIGSEGV

Why not use:

Restart=on-success

which would automatically prevent a restart for non-zero exit codes (tor error, or process killed by signal).
See the table at https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=

That will only work if network-online.target always ensures that IPv4 and IPv6 are up.

comment:9 Changed 12 months ago by teor

Which it does not:

https://trac.torproject.org/projects/tor/ticket/25182#comment:12

Looks like we're stuck with the restart, but we can delay it by 10-60 seconds.
