Opened 2 years ago

Closed 10 months ago

#23250 closed defect (duplicate)

tor-0.3.0.10: test failure on NetBSD

Reported by: wiz
Owned by: dgoulet
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: bsd, test, 032-backport, 033-triage-20180320, 033-removed-20180320, 031-unreached-backport
Cc:
Actual Points:
Parent ID: #17808
Points:
Reviewer:
Sponsor:

Description

When running the self tests on NetBSD, there is one problem:

===> Testing for tor-0.3.0.10
/usr/bin/make  check-TESTS check-local
PASS: src/test/test
PASS: src/test/test-slow
PASS: src/test/test-memwipe
PASS: src/test/test_workqueue
PASS: src/test/test_keygen.sh
PASS: src/test/test-timers
SKIP: src/test/fuzz_static_testcases.sh
PASS: src/test/test_zero_length_keys.sh
PASS: src/test/test_workqueue_cancel.sh
SKIP: src/test/test_workqueue_efd.sh
SKIP: src/test/test_workqueue_efd2.sh
PASS: src/test/test_workqueue_pipe.sh
PASS: src/test/test_workqueue_pipe2.sh
PASS: src/test/test_workqueue_socketpair.sh
SKIP: src/test/test_switch_id.sh
PASS: src/test/test_ntor.sh
FAIL: src/test/test_bt.sh
============================================================================
Testsuite summary for tor 0.3.0.10
============================================================================
# TOTAL: 17
# PASS:  12
# SKIP:  4
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0
============================================================================
See ./test-suite.log
============================================================================

The test log:

# less ./src/test/test_bt.sh.log
OK
[1]   Abort trap (core dumped) "${builddir:-.}/... |
      Done                    "${PYTHON:-pytho...
BAD

============================================================ T= 1502824395
Tor died: Caught signal 11
0x73c0a4bd <crash_handler+0x73c00041> at ./src/test/test-bt-cl
[1]   Abort trap (core dumped) "${builddir:-.}/... |
      Done(1)                 "${PYTHON:-pytho...
-158318
FAIL src/test/test_bt.sh (exit status: 1)

Change History (22)

comment:1 Changed 2 years ago by cypherpunks

Component: - Select a component → Core Tor/Tor

comment:2 Changed 2 years ago by catalyst

Keywords: bsd test added
Milestone: Tor: unspecified

comment:3 Changed 2 years ago by nickm

Keywords: 032-backport 031-backport 030-backport added
Milestone: Tor: unspecified → Tor: 0.3.3.x-final

comment:4 Changed 19 months ago by nickm

Keywords: 030-backport removed

Remove 030-backport from all open tickets that have it: 0.3.0 is now deprecated.

comment:5 Changed 19 months ago by dgoulet

Does anyone have an idea of what exactly is failing here? ...

comment:6 Changed 19 months ago by nickm

Status: new → needs_information

Something seems to have gone wrong inside the backtrace test. But I think there's no real hope of solving this one unless we can reproduce it. Does this still happen on master with NetBSD?

comment:7 Changed 19 months ago by teor

Status: needs_information → new

This is probably the same backtrace bug that we see on FreeBSD, and macOS gcc x86_64.
The backtrace is simply not where we expect, or not in the format we expect.

See #18204, where we SKIP the backtrace test on FreeBSD.
See also #17808, where we thought about doing the same thing for macOS, but the backtrace works fine with clang, and with gcc i386.

To resolve this issue, we can either SKIP on NetBSD, or find out the compiler and architecture and get a system where we can reproduce it.
Fixing this bug requires a motivated NetBSD developer.

So I think we should SKIP until that happens.
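
For reference, a minimal sketch of how a SKIP decision could be expressed, assuming the Automake test harness convention that an exit status of 77 marks a test as skipped. This is illustrative Python, not the patch discussed in this ticket; `SKIP_PLATFORMS` and `exit_status_for` are hypothetical names.

```python
import platform

# Automake's test harness reports a test that exits with status 77 as SKIP.
AUTOMAKE_SKIP = 77

# Hypothetical list of platforms where the backtrace test is treated as unreliable.
SKIP_PLATFORMS = {"FreeBSD", "NetBSD"}


def exit_status_for(system_name, test_passed):
    """Return the exit status a test wrapper would report to the harness."""
    if system_name in SKIP_PLATFORMS:
        return AUTOMAKE_SKIP
    return 0 if test_passed else 1


if __name__ == "__main__":
    # platform.system() returns the kernel name, e.g. "NetBSD" or "Linux".
    print(exit_status_for(platform.system(), test_passed=True))
```

The trade-off is the one raised in this comment: a platform-wide skip hides the failure everywhere on that platform, including with compilers where the backtrace might work.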

comment:8 Changed 19 months ago by dgoulet

Owner: set to dgoulet
Status: new → accepted

Here is a SKIP patch (can't test it though :S...) so we can move forward. Maybe someday we will have a NetBSD pro to fix this... who knows.

Based on 031 for backport: bug23250_031_01

comment:9 Changed 19 months ago by dgoulet

Status: accepted → needs_review

comment:10 Changed 19 months ago by dgoulet

Resolution: not a bug
Status: needs_review → closed

Thanks to Riastradh on IRC, it seems 0.3.2.9 and master are not failing on NetBSD.

Let's consider this over and *not* merge the SKIP patch above. We can reopen if that comes back.

comment:11 Changed 19 months ago by teor

Just like macOS, it is likely that only some compilers and architectures fail this test on FreeBSD.

comment:12 Changed 18 months ago by leot

Resolution: not a bug → (none)
Status: closed → reopened

Hello!
On NetBSD/amd64 8.99.9 and Tor 0.3.2.9 the problem still exists.

After investigating a bit more, here is what's going on: `test-bt-cl assert'
seems to pass, while `test-bt-cl crash' does not print the entire stack
trace:

% ./src/test/test-bt-cl assert
Feb 15 21:23:47.472 [err] tor_assertion_failed_(): Bug: src/test/test_bt_cl.c:43: crash: Assertion 1 == 0 failed; aborting. (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug: Assertion 1 == 0 failed in crash at src/test/test_bt_cl.c:43. Stack trace: (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x8540a675 <log_backtrace+0x45> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x8541d5b0 <tor_assertion_failed_+0x90> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x8540a427 <crash+0x77> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x8540a451 <oh_what+0x21> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x8540a4a2 <a_tangled_web+0x22> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x8540a4f1 <we_weave+0x21> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Feb 15 21:23:47.473 [err] Bug:     0x854239a2 <main+0xa2> at ./src/test/test-bt-cl (on Tor 0.3.2.9 9e8b762fcecfece6)
Abort (core dumped)
% ./src/test/test-bt-cl crash

============================================================ T= 1518726235
Tor died: Caught signal 11
0x14840a558 <crash_handler+0x148400038> at ./src/test/test-bt-cl
Abort (core dumped)
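
The difference between the two runs can be checked mechanically. The following is a rough sketch, not Tor's actual test checker: it assumes the frame format shown in the output above (`<symbol+0xOFFSET>`) and takes the expected call chain from the `assert` run. `EXPECTED`, `frames_in`, and `has_expected_chain` are hypothetical names.

```python
import re

# Call chain seen in the `assert` run above, innermost frame first.
EXPECTED = ["crash", "oh_what", "a_tangled_web", "we_weave", "main"]

# Matches the "<symbol+0xOFFSET>" part of a backtrace line.
FRAME_RE = re.compile(r"<([A-Za-z_][A-Za-z0-9_]*)\+0x[0-9a-fA-F]+>")


def frames_in(output):
    """Extract symbol names from backtrace lines, in order of appearance."""
    return FRAME_RE.findall(output)


def has_expected_chain(output, expected=EXPECTED):
    """True if every expected name appears in the output, in order."""
    frames = iter(frames_in(output))
    # `name in frames` consumes the iterator, so order is enforced.
    return all(name in frames for name in expected)


# Frame lines copied from the two runs above (log prefixes trimmed).
ASSERT_OUTPUT = """\
0x8540a675 <log_backtrace+0x45> at ./src/test/test-bt-cl
0x8541d5b0 <tor_assertion_failed_+0x90> at ./src/test/test-bt-cl
0x8540a427 <crash+0x77> at ./src/test/test-bt-cl
0x8540a451 <oh_what+0x21> at ./src/test/test-bt-cl
0x8540a4a2 <a_tangled_web+0x22> at ./src/test/test-bt-cl
0x8540a4f1 <we_weave+0x21> at ./src/test/test-bt-cl
0x854239a2 <main+0xa2> at ./src/test/test-bt-cl
"""
CRASH_OUTPUT = "0x14840a558 <crash_handler+0x148400038> at ./src/test/test-bt-cl\n"

print(has_expected_chain(ASSERT_OUTPUT))  # True
print(has_expected_chain(CRASH_OUTPUT))   # False
```

The `crash` run yields only a single `crash_handler` frame, so none of the expected functions are found.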

As a side note, the `.core' file does seem to contain all the information:

% gdb -core test-bt-cl.core src/test/test-bt-cl
Reading symbols from src/test/test-bt-cl...(no debugging symbols found)...done.
[New process 1]
Core was generated by `test-bt-cl'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007904dad29eaa in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0  0x00007904dad29eaa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x00007904dad29ac5 in abort () at /usr/src/lib/libc/stdlib/abort.c:74
#2  0x000000010e00a5ed in crash_handler ()
#3  <signal handler called>
#4  0x000000010e00a3f0 in crash ()
#5  0x000000010e00a451 in oh_what ()
#6  0x000000010e00a4a2 in a_tangled_web ()
#7  0x000000010e00a4f1 in we_weave ()
#8  0x000000010e0239a2 in main ()
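
To compare the core file with the handler's own output, the function names can be pulled out of the `bt` listing with a short script. This is only a sketch: the regular expression is an assumption based on the frame format shown in this comment, and `gdb_frames` is a hypothetical helper name.

```python
import re

# Matches frame lines such as "#4  0x000000010e00a3f0 in crash ()".
# Lines without a function name, e.g. "#3  <signal handler called>", do not match.
GDB_FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-fA-F]+ in (\w+) \(", re.MULTILINE)


def gdb_frames(bt_text):
    """Return the function names from a gdb `bt` listing, innermost first."""
    return GDB_FRAME_RE.findall(bt_text)


# The `bt` output copied from the gdb session above.
GDB_BT = """\
#0  0x00007904dad29eaa in _lwp_kill () from /usr/lib/libc.so.12
#1  0x00007904dad29ac5 in abort () at /usr/src/lib/libc/stdlib/abort.c:74
#2  0x000000010e00a5ed in crash_handler ()
#3  <signal handler called>
#4  0x000000010e00a3f0 in crash ()
#5  0x000000010e00a451 in oh_what ()
#6  0x000000010e00a4a2 in a_tangled_web ()
#7  0x000000010e00a4f1 in we_weave ()
#8  0x000000010e0239a2 in main ()
"""

print(gdb_frames(GDB_BT))
```

Here the frames below the signal handler (`crash` through `main`) are all present, unlike in the output printed by `test-bt-cl crash`.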

Unfortunately, I have no idea how to address this issue further, but
I'm happy to test any patches and to provide
any further information!

Thank you for the attention!

comment:13 in reply to:  11 Changed 18 months ago by teor

Status: reopened → needs_information

Replying to teor:

> Just like macOS, it is likely that only some compilers and architectures fail this test on FreeBSD.

(and NetBSD).

Please tell us the compiler you are using.

In our experience on macOS (#17808), clang works on x86_64 and i386, and gcc works on i386, but not x86_64.

I don't think we want to merge a SKIP patch if some compilers are working.
(We might also want to revert the FreeBSD skip patch in #18204 if some of its compilers work.)

There is probably some compiler flag or coding trick that gets us the backtrace we want.
I tried modifying some of the constants on macOS, but I couldn't get x86_64 gcc to work.
(Tor Browser uses clang for macOS, so it's less of an issue.)

comment:14 Changed 18 months ago by leot

Hello teor,

> [...]
> Please tell us the compiler you are using.
>
> In our experience on macOS (#17808), clang works on x86_64 and i386, and
> gcc works on i386, but not x86_64.
>
> [...]

Whooops, sorry for omitting this information, sure:

% gcc --version
gcc (nb1 20171112) 5.5.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

comment:15 Changed 18 months ago by teor

What happens if you compile with clang?

comment:16 Changed 17 months ago by nickm

Keywords: 033-triage-20180320 added

Marking all tickets reached by current round of 033 triage.

comment:17 Changed 17 months ago by nickm

Keywords: 033-removed-20180320 added

Mark all not-already-included tickets as pending review for removal from 0.3.3 milestone.

comment:18 Changed 17 months ago by nickm

Milestone: Tor: 0.3.3.x-final → Tor: unspecified

comment:19 Changed 14 months ago by teor

Keywords: 031-unreached-backport added; 031-backport removed

0.3.1 is end of life, there are no more backports.
Tagging with 031-unreached-backport instead.

comment:20 Changed 10 months ago by dgoulet

Resolution: wontfix
Status: needs_information → closed

comment:21 Changed 10 months ago by teor

Parent ID: #17808
Resolution: wontfix → (none)
Status: closed → reopened

This test was skipped in #27948; we want to fix the underlying issue in #17808.

comment:22 Changed 10 months ago by teor

Resolution: duplicate
Status: reopened → closed