Opened 3 years ago

Closed 3 years ago

#23077 closed defect (fixed)

Channelpadding tests rely on actual time; can fail

Reported by: nickm Owned by: mikeperry
Priority: High Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: unittest, channelpadding
Cc: mikeperry Actual Points:
Parent ID: Points:
Reviewer: Sponsor:


We're seeing an intermittent failure on Jenkins:

14:46:53 channelpadding/channelpadding_consensus: [forking] 
14:46:53   FAIL ../tor/src/test/test_channelpadding.c:445: assert(decision OP_EQ CHANNELPADDING_PADDING_SCHEDULED): 4 vs 2
14:46:53   [channelpadding_consensus FAILED]

Looking at the code, it seems that the underlying channelpadding code depends on the actual time (from monotime_coarse_abosolute_*()) to make its decisions. But we aren't doing anything to mock those functions from inside the test case, which may be making the outcome of this test dependent on the code running fast enough.

Child Tickets

Change History (7)

comment:1 Changed 3 years ago by nickm

I found an easy way to reproduce this. Run the test in a loop, as in:

while ./src/test/test channelpadding/channelpadding_consensus  --verbose ; do true ; done

Then use Ctrl-Z to suspend the process, wait a second, and use fg to foreground the process. That will usually make the test fail, though sometimes I needed to try it a few times.

I tried to fix this unit test with monotime_enable_test_mocking, but that makes the unit test freeze.

comment:2 Changed 3 years ago by mikeperry

Is there a branch for what you tried?

My guess is that the mocked time is messing with the timerwheel code and that we need to advance time in the timerwheel while waiting for the callback to fire. But this is just a guess..

We could also boost this timeout value (the assignment to chan->next_padding_time_ms on line 399 and possibly elsewhere), but that will of course slow the tests down, since we actually wait for the callbacks...

comment:3 Changed 3 years ago by nickm

No branch; sorry. I just used monotime_enable_test_mocking() at the start of the function, and made sure that the test function was running inside a TT_FORK

Having the tests depend on actual time is really fragile and we should avoid it. Not only does it slow down the tests, but there's _always_ a machine out there more slow than we had imagined.

comment:4 Changed 3 years ago by nickm

Priority: MediumHigh

comment:5 Changed 3 years ago by nickm

Owner: set to mikeperry
Status: newassigned

comment:6 Changed 3 years ago by mikeperry

Status: assignedneeds_review

Ok, with nick's new patch for calling pending callbacks from timers.c, this was easy to do. Thanks, nick!

Patch+changes file in mikeperry/bug23077.

comment:7 Changed 3 years ago by nickm

Resolution: fixed
Status: needs_reviewclosed

No problem, Mike! Merged to 0.3.1 and forward.

Note: See TracTickets for help on using tickets.