Opened 8 months ago

Last modified 5 days ago

#28113 needs_revision defect

notify systemd if shutdown will be longer than 30 seconds

Reported by: Hello71 Owned by: Hello71
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.2.6.2-alpha
Severity: Normal Keywords: tor-systemd, 029-backport-maybe, 033-backport-maybe, 034-backport-maybe, 035-backport-maybe, 033-backport-unreached, reviewer-was-teor-20190422
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

currently systemd just kills tor if the user sets ShutdownWaitLength more than 30 seconds. we should tell systemd not to kill tor.

Child Tickets

TicketStatusOwnerSummaryComponent
#28378reopenedteorAdd libsystemd-devel to the travis config, if availableCore Tor/Tor

Change History (24)

comment:1 Changed 8 months ago by Hello71

Status: assignedneeds_review

comment:2 Changed 8 months ago by teor

Keywords: tor-systemd 029-backport-maybe 033-backport-maybe 034-backport-maybe 035-backport-maybe added
Milestone: Tor: 0.3.5.x-final
Version: Tor: unspecifiedTor: 0.2.6.2-alpha

This has been a bug since we introduced systemd support in 0.2.6.2-alpha (or possibly 0.2.6.3-alpha).

It's a simple patch, so I think we might want to backport it. Tenatively assigning it to 0.3.5.

comment:3 Changed 8 months ago by teor

Reviewer: teor
Status: needs_reviewneeds_revision

I reviewed the pull request: a minor refactor and a comment would be helpful.

comment:4 Changed 8 months ago by Hello71

Status: needs_revisionneeds_review

comment:5 Changed 8 months ago by dgoulet

Failing Travis. Easy fix, on the changes file:

./changes/ticket28113:
	Missing subcategory on 'Minor features'

comment:6 Changed 8 months ago by teor

Status: needs_reviewneeds_revision

Please see my comments on the pull request.

We should test this pull request in an alpha before backporting it.

comment:7 Changed 8 months ago by teor

(It's pretty safe, because it only extends the systemd timeout. But we should check it doesn't extend the timeout too much.)

comment:8 Changed 8 months ago by nickm

I'd recommend that if we do backport, we backport no farther than 0.3.5.

comment:9 in reply to:  6 ; Changed 8 months ago by Hello71

Replying to teor:

Please see my comments on the pull request.

Replied there.

We should test this pull request in an alpha before backporting it.

Agreed.

Replying to teor:

(It's pretty safe, because it only extends the systemd timeout. But we should check it doesn't extend the timeout too much.)

As I explained on the PR, it doesn't matter too much if the number sent is too high, since systemd will continue immediately as long as tor actually exits before the timeout (it's a maximum limit, not a minimum or exact). If tor doesn't exit expeditiously, then that's a bug in tor. IMO it's worth trying to work around the (still hypothetical AFAIK) bug by waiting a little longer, rather than crashing and almost certainly corrupting cache files. Also, I figure nobody will care excessively if they set ShutdownWaitLength to 2 minutes and tor actually takes 2 minutes 15 seconds to quit, as long as we don't accidentally make it 20 minutes (but again, that would require another separate bug in tor).

comment:10 Changed 8 months ago by Hello71

Status: needs_revisionneeds_review

comment:11 in reply to:  9 Changed 8 months ago by teor

Status: needs_reviewneeds_revision

Thanks for these changes.

Let's fix to changes file, and add an extra comment explaining the default systemd timeout.

Version 0, edited 8 months ago by teor (next)

comment:12 Changed 7 months ago by teor

Status: needs_revisionmerge_ready

I pushed three small commits to your pull request to fix these issues:

  1. If tor's systemd TimeoutStopSec is less than the ShutdownWaitLength, plus the time it takes tor to shut down after its ShutdownWaitLength has expired, then systemd may terminate Tor abnormally. Let's explain that better in the changes file.
  1. Tor's systemd file in dist/ has a TimeoutSec of 30, which is too low:

https://gitweb.torproject.org/tor.git/tree/contrib/dist/tor.service.in
Debian has a TimeoutStopSec of 60, which seems like a good change to upstream:
https://gitweb.torproject.org/debian/tor.git/tree/debian/systemd/tor@default.service

  1. This is a bugfix, so it should get a bugfix changes file

See my branch ticket28113-029 for the squash and cherry-pick to 0.2.9:
https://github.com/torproject/tor/pull/474

This branch merges cleanly all the way to master:
https://github.com/torproject/tor/pull/475
(GitHub's pull requests can't do this merge, but git can.)

comment:13 Changed 7 months ago by nickm

Milestone: Tor: 0.3.5.x-finalTor: 0.3.4.x-final

Merging to 0.3.5, marking for possible backport.

comment:14 Changed 7 months ago by nickm

Hm. Apparently this broke with systemd-devel installed. 212bd9778b5c249f02f8fbdc1e8ccbe4c108f03a made it work for me, but I'm a little worried about how tested this could be.

comment:15 Changed 7 months ago by teor

Replying to nickm:

Hm. Apparently this broke with systemd-devel installed. 212bd9778b5c249f02f8fbdc1e8ccbe4c108f03a made it work for me, but I'm a little worried about how tested this could be.

I didn't test with libsystemd-devel, and travis doesn't have libsystemd-devel installed.

So I guess that just leaves any testing that Hello71 did.

comment:16 Changed 7 months ago by Hello71

Status: merge_readyneeds_information

I use Arch which includes devel files in the standard package. It turns out, though, that I either accidentally removed the patch from my local list or never included it in the first place, so it would seem that I haven't tested this at all. I may have assumed that Travis had systemd-devel already. Patch is already applied though, and seems like too much work to back it out, so let's say test it for a little while and see if anyone complains. :P

comment:17 Changed 6 months ago by Hello71

Status: needs_informationneeds_revision

unfortunately it seems that it does not actually work, I still get tor killed.

comment:18 Changed 4 months ago by teor

Milestone: Tor: 0.3.4.x-finalTor: unspecified

This ticket still needs changes and testing.

comment:19 Changed 4 months ago by teor

These open, non-merge_ready tickets can not get backported to 0.3.3, because 0.3.3 is now unsupported.

comment:20 Changed 4 months ago by teor

Keywords: 033-backport-unreached added

Hmm, I guess they should still get 033-backport-unreached

comment:21 Changed 8 weeks ago by teor

Keywords: reviewer-was-teor-20190422 added
Reviewer: teor

If these tickets go back in to needs_review, and I am on leave, they will need another reviewer.

comment:22 Changed 10 days ago by arma

Is this still a bug? Maybe somebody should open a Debian bug, and the systemd script there can get fixed?

comment:23 Changed 9 days ago by Hello71

well the code was committed, and as far as I can tell it just doesn't work. I haven't had time to figure out why, and I doubt I'll have time in the near to medium term future.

I think the systemd service doesn't require any (more) fixing. as far as I know, the TimeoutStopSec is sufficient in all distributions now. the problem is just that if you increase ShutdownWaitLength (in tor), you may need to also increase TimeoutStopSec in systemd. this is really a nice-to-have though; the code can be removed and this ticket closed if nobody wants to work on it.

comment:24 Changed 5 days ago by nickm

Removing 034-backport from all open tickets: 034 has reached EOL.

Note: See TracTickets for help on using tickets.