Opened 8 months ago

Closed 8 months ago

#29990 closed defect (fixed)

Handle zero monotonic time differences in the circuit padding code

Reported by: teor Owned by: mikeperry
Priority: Medium Milestone: Tor: 0.4.1.x-final
Component: Core Tor/Tor Version: Tor: 0.4.0.1-alpha
Severity: Normal Keywords: wtf-pad, tor-relay, tor-cell, padding
Cc: nickm, ahf Actual Points:
Parent ID: #29500 Points: 2
Reviewer: Sponsor: Sponsor2

Description (last modified by teor)

I can reliably get the following failure on my macOS VM when its wall clock time is out of sync with the host time.

Failures like this also intermittently happen when the underlying API is low-resolution, or not actually monotonic (for example, Windows).

circuitpadding/circuitpadding_circuitsetup_machine: [forking] 
  FAIL ../src/test/test_circuitpadding.c:1900: assert(client_side->padding_info[0]->padding_scheduled_at_usec OP_NE 0): 0 vs 0
  [circuitpadding_circuitsetup_machine FAILED]

I've worked around the issue in #29500 by weakening the test conditions.

But we need to fix the circuitpadding code so it handles zero monotonic time differences correctly.

Child Tickets

Change History (10)

comment:1 Changed 8 months ago by teor

Here's a longer explanation of monotonic time:
https://trac.torproject.org/projects/tor/ticket/29500#comment:20

comment:2 Changed 8 months ago by teor

Sponsor: Sponsor2

Move all WTF-PAD tickets that do not have a sponsor into sponsor2.

comment:3 Changed 8 months ago by teor

Summary: test_circuitpadding_circuitsetup_machine() fails when monotonic time difference is zeroHandle zero monotonic time differences in the circuit padding code

comment:4 Changed 8 months ago by teor

Description: modified (diff)

comment:5 Changed 8 months ago by mikeperry

I do not understand how the host or VM time could have any effect on our unittests, which strictly use mocked monotime values. Is something wrong with our mocking on some platforms? Or are we just not incrementing our mocked values sufficiently in the unittests for the resolution of some platforms?

I do not have these platforms to verify (I assume by mac os VM you mean hackintosh?

comment:6 Changed 8 months ago by mikeperry

Aha, I think I found it! Our mocking forces us to call monotime_init() *before* we set the mocked time value. monotime_init() thus stores the first ratchet value at whatever the platform is at, and then we set fake mocked time to some later value.

If monotime_init() gets a value from the host that is greater than what we choose to mock time at for our unittests, all subsequent monotime_abosolute() calls return zero.

So, the right way to fix all of this mess is to either fix our mocking to allow us to call monotime_init() with a mocked time, or just always set our mocked time to some time after whatever monotime_init() stored.

I am going to attempt the later, to see if it fixes whatever hackintosh VM is involved here (I don't have one of those).

comment:7 in reply to:  6 Changed 8 months ago by teor

Replying to mikeperry:

Aha, I think I found it! Our mocking forces us to call monotime_init() *before* we set the mocked time value. monotime_init() thus stores the first ratchet value at whatever the platform is at, and then we set fake mocked time to some later value.

If monotime_init() gets a value from the host that is greater than what we choose to mock time at for our unittests, all subsequent monotime_abosolute() calls return zero.

So, the right way to fix all of this mess is to either fix our mocking to allow us to call monotime_init() with a mocked time, or just always set our mocked time to some time after whatever monotime_init() stored.

Oh wow. That's a tricky bug.

I am going to attempt the later, to see if it fixes whatever hackintosh VM is involved here (I don't have one of those).

It's just a standard mac, running macOS as host, and macOS inside some virtual machine software. You can probably achieve the same thing using:

  • a script that rapidly changes your hardware or software clock, or
  • VMWare, VirtualBox, or qemu with clock skew between guest and host, and a VM setting that applies the host time to the guest (and an apparent VM bug which causes the time to switch rapidly between host and guest time).

comment:8 Changed 8 months ago by teor

Keywords: 040-backport added
Version: Tor: unspecifiedTor: 0.4.0.1-alpha

Please base your code on maint-0.4.0, or make a pull request to delete the buggy tests from maint-0.4.0.

comment:9 Changed 8 months ago by teor

Keywords: 040-must added
Owner: set to mikeperry
Status: newassigned

comment:10 Changed 8 months ago by teor

Keywords: 041-must 040-must 040-backport removed
Resolution: fixed
Status: assignedclosed

Mike posted a unit test fix to #29500.

I summarised the remaining issues in #30031.

Note: See TracTickets for help on using tickets.