Opened 11 months ago

Closed 5 months ago

#31614 closed defect (fixed)

Implement clean_up_backtrace_handler()

Reported by: teor Owned by: teor
Priority: Medium Milestone: Tor: 0.4.1.x-final
Component: Core Tor/Tor Version: Tor: 0.3.5.1-alpha
Severity: Normal Keywords: consider-backport-after-042-stable, diagnostics, 042-should, 041-backport-maybe, regression, BugSmashFund
Cc: Actual Points: 0.4
Parent ID: Points: 0.2
Reviewer: nickm Sponsor:

Description (last modified by teor)

Split off #31594:

clean_up_backtrace_handler() doesn't do anything, but it should:

  • disable backtrace signal handlers
  • destroy the backtrace mutex (regression to #21788)

Child Tickets

TicketStatusOwnerSummaryComponent
#31594closedteorClose all the log fds before abortingCore Tor/Tor
#31734closedteorAdd accessor functions for cb_buf, which enforce locking and unlockingCore Tor/Tor
#31736closedteorStop using mutex_destroy(), when multiple threads can still access the mutexCore Tor/Tor

Change History (29)

comment:1 Changed 11 months ago by teor

Actual Points: 0.2
Cc: gaba removed
Keywords: BugSmash added; android macos removed
Points: 0.2
Status: newneeds_review
Version: Tor: 0.3.5.1-alpha

See my pull request:

The merge forward was clean.

Here are the test branches for merging forwards:

comment:2 Changed 11 months ago by teor

Description: modified (diff)
Keywords: regression added

comment:3 Changed 11 months ago by teor

Keywords: 035-backport-maybe 040-backport-maybe 041-backport-maybe added; 035-backport 040-backport 041-backport removed

comment:4 Changed 11 months ago by teor

Keywords: consider-backport-after-042-stable consider-backport-if-needed added

I don't think we should backport this fix, unless the bug is actually causing issues in older versions.

comment:5 Changed 11 months ago by teor

Owner: set to teor
Status: needs_reviewassigned

comment:6 Changed 11 months ago by teor

Status: assignedneeds_review

comment:7 Changed 11 months ago by teor

Keywords: BugSmashFund added; BugSmash removed

Fix bug smash fund spelling

comment:8 Changed 11 months ago by dgoulet

Reviewer: dgoulet

comment:9 Changed 11 months ago by teor

Parent ID: #31594

This ticket is independent of its parent now.

comment:10 Changed 11 months ago by dgoulet

Status: needs_reviewneeds_revision

Question that might result in no change but still there. One tiny nitpick also.

comment:11 Changed 11 months ago by teor

You're right, there are a bunch of nasty race conditions here.

This is what we need to fix:

  1. Always lock the mutex before using cb_buf (there are some places where we don't do this)
  2. Lock the mutex before adding or removing signal handlers (might not be strictly required, but it makes reasoning about locks easier)
  3. Abort if destroying the mutex fails (doesn't save us from undefined behaviour, but does notify us when it happens)

We could also

  1. Lock a mutex before logging anywhere in the err module (this could cause more issues than it solves)

The best way to do this might be to put cb_buf in its own file, and use accessor functions.

comment:12 Changed 11 months ago by teor

Status: needs_revisionneeds_review

I removed the commit that added undefined behaviour, and edited the changes file.

Please review my PR:

The merge forward is clean, the test branches are here:

comment:13 Changed 11 months ago by teor

Actual Points: 0.20.4

comment:14 Changed 11 months ago by nickm

Reviewer: dgouletnickm

comment:15 Changed 11 months ago by nickm

Status: needs_reviewneeds_revision

This mostly LGTM. Before merging, let's think make a conscious decision about what we should do about restart testing (see discussion on #31735).

Also, I worry that this could interfere with our sandbox code: I think that in order to allow us to call sigaction() on all of these values, we might need to add them to sb_rt_sigaction() in sandbox.c.

Please feel free to merge-ready once the above issues are taken care of.

[edited to include actual ticket number]

Last edited 11 months ago by nickm (previous) (diff)

comment:16 in reply to:  15 Changed 11 months ago by teor

Replying to nickm:

This mostly LGTM. Before merging, let's think make a conscious decision about what we should do about restart testing (see discussion on #31735).

(I think you meant #31736 - the quick fix mutex ticket.)

I think we should leave restart testing for #31735 - the eventual fix mutex ticket. I'm out of time on this issue, and the changes are smaller than I expected.

Also, I worry that this could interfere with our sandbox code: I think that in order to allow us to call sigaction() on all of these values, we might need to add them to sb_rt_sigaction() in sandbox.c.

Yes, the sandbox would also need to use these values after a re-initialise.

I made the following changes:

  • added the signals to the sandbox
  • rewrote one of the commits based on #31736
  • rebased on the latest maint-0.3.5 to fix a test-stem error

Here is the PR:

The merge forward is clean.

comment:17 Changed 11 months ago by teor

Status: needs_revisionneeds_review

comment:18 Changed 11 months ago by teor

Status: needs_reviewmerge_ready

comment:19 Changed 11 months ago by teor

Nick, do you think we should backport the commit that destroys the backtrace signal handler?
Otherwise we'll trigger undefined behaviour when Tor is re-initialised.
(But I'm not sure if we should backport, it seems risky.)

comment:20 Changed 11 months ago by nickm

Hm. I think we should hold ourselves ready to do a backport, but not do one unless we actually run into problems in practice.

comment:21 Changed 11 months ago by teor

Sure. Apparently that was my opinion about this whole branch 2 weeks ago: consider-backport-if-needed

comment:22 Changed 11 months ago by teor

Status: merge_readyneeds_review

I was wrong about the backtrace buffer mutex: we can't destroy it, because it is statically allocated.

I removed the call to mutex_destroy(), and explained why in a comment.

All the other commits are unchanged.

comment:23 Changed 11 months ago by teor

I also added a comment explaining why it's ok to not destroy the log mutex.

comment:24 Changed 11 months ago by nickm

Keywords: asn-merge added
Status: needs_reviewmerge_ready

I think this is okay.

comment:25 Changed 10 months ago by asn

Keywords: asn-merge removed
Milestone: Tor: 0.4.2.x-finalTor: 0.4.1.x-final

Merged. Leaving open for potential backports?

comment:26 Changed 10 months ago by nickm

Milestone: Tor: 0.4.1.x-finalTor: 0.4.0.x-final

Merged to 0.4.1. Marking for consideration for further backport.

comment:27 Changed 10 months ago by teor

I'm leaning towards "no backport" on this ticket, unless we discover a specific bug.
Leaving open, so we check again after 042-stable.

comment:28 Changed 7 months ago by teor

This change doesn't seem to make much of a difference, still thinking "no backport".

comment:29 Changed 5 months ago by teor

Keywords: consider-backport-if-needed 035-backport-maybe 040-backport-maybe removed
Milestone: Tor: 0.4.0.x-finalTor: 0.4.1.x-final
Resolution: fixed
Status: merge_readyclosed

Similar code has caused some bugs, and 0.4.0 is obsolete, so I'm going to close these tickets without backport.

Note: See TracTickets for help on using tickets.