Opened 6 days ago

Last modified 2 days ago

#33290 merge_ready defect

Add diagnostics for confusing corruption issue #32564 in ewma

Reported by: nickm Owned by: nickm
Priority: Medium Milestone: Tor: 0.4.2.x-final
Component: Core Tor/Tor Version: Tor: 0.4.1.6
Severity: Normal Keywords: consider-backport-after-0434, 043-should 041-backport 042-backport
Cc: Actual Points: .1
Parent ID: #32564 Points:
Reviewer: dgoulet Sponsor:

Description

I haven't been able to figure out why we might be hitting #32564, so the logical solution is to try to make the diagnosis better if it happens.

Child Tickets

Change History (11)

comment:1 Changed 6 days ago by nickm

Status: assignedneeds_review

Branches are ticket33290_{041,042,043,master}. I am recommending that we try this out in 043 first, so here is an 043 PR: https://github.com/torproject/tor/pull/1729 .

Note that there was a merge conflict on the 043 merge.

comment:2 in reply to:  1 ; Changed 6 days ago by cypherpunks

Replying to nickm:

Branches are ticket33290_{041,042,043,master}. I am recommending that we try this out in 043 first, so here is an 043 PR: https://github.com/torproject/tor/pull/1729 .

Is there a way to stop the compiler from optimizing away the dead store to pol->base_.magic = 0xDEAD901C; right before it gets freed?

comment:3 in reply to:  2 Changed 6 days ago by dgoulet

Reviewer: dgoulet
Status: needs_reviewneeds_revision

Replying to cypherpunks:

Replying to nickm:

Branches are ticket33290_{041,042,043,master}. I am recommending that we try this out in 043 first, so here is an 043 PR: https://github.com/torproject/tor/pull/1729 .

Is there a way to stop the compiler from optimizing away the dead store to pol->base_.magic = 0xDEAD901C; right before it gets freed?

So yeah I think that is a very good point. And gcc uses -O2 which will likely optimize it out. I did a quick test case here and it does optimize it out. And also looked at tor disassembled and it is not there in those free function.

Maybe we could set the magic to volatile which in "theory", but not enforced, should leave the variable alive on most compilers.

comment:4 Changed 6 days ago by nickm

whoa -- we've actually found a compiler that optimizes out set-before-free? I know that they're allowed to do that, but previously, they didn't. How do I replicate this test you did?

I guess we could use memwipe here instead.

comment:5 Changed 6 days ago by dgoulet

Quick C test I did. With objdump -d you'll notice it is not there. Compile with gcc -O2:

#include <stdio.h>
#include <stdlib.h>

struct a {
  int val;
};

int main(int argc, char **argv)
{
  struct a *A = malloc(sizeof(struct a));

  A->val = 0xCAFE;
  fprintf(stderr, "a: %d\n", A->val);

  A->val = 0xDEAD;
  free(A);

  return 0;
}

With gcc without anything, you will find CAFE and DEAD but with -O2, it is optimized out.

I've confirmed it by compiling your PR and looking at the disassemble code.

If val is set with volatile, then it reappears.

Last edited 6 days ago by dgoulet (previous) (diff)

comment:6 Changed 6 days ago by nickm

Gosh. I see the same result.

New branches: ticket33290_v2_{master,041,042,043} . PR for 043 at https://github.com/torproject/tor/pull/1736

comment:7 Changed 6 days ago by nickm

Status: needs_revisionneeds_review

comment:8 Changed 6 days ago by dgoulet

Keywords: 041-backport 042-backport added; backport? removed
Status: needs_reviewmerge_ready

lgtm. I confirmed the memwipe is still there :).

Once CI has finished, we can merge this.

comment:9 Changed 6 days ago by nickm

Milestone: Tor: 0.4.3.x-finalTor: 0.4.2.x-final

OK, CI is happy. I'll merge to 0.4.3 and mark for possible backport.

comment:10 Changed 5 days ago by nickm

Coverity has spotted the error in this line:

    memwipe(hashent, 0xef, sizeof(*hashent));

I'm applying the fix to the ticket33290_v2_{041,042} branches and merging it to 043.

Last edited 5 days ago by nickm (previous) (diff)

comment:11 Changed 2 days ago by teor

Keywords: consider-backport-after-0434 added

Looks like a simple enough change. Scheduling for backport after 0.4.3.4 is released, so it gets at least one full release of testing.

Note: See TracTickets for help on using tickets.