Opened 5 months ago

Closed 5 months ago

#29585 closed defect (not a bug)

Intermittent test failures in dir/dirserv_read_measured_bandwidths

Reported by: teor Owned by:
Priority: High Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: unspecified
Severity: Normal Keywords: tor-ci, tor-test, tor-bwauth
Cc: juga Actual Points:
Parent ID: Points: 1
Reviewer: Sponsor:

Description (last modified by teor)

I observed this failure once in about 10 test runs when testing #29541 on commit:

commit 5614960e94 (HEAD, tor-github/pr/723/merge)
Merge: 69238ca2da 065e7da8e6

Here is the test error:

dir/dirserv_read_measured_bandwidths: [forking] 
  FAIL ../src/test/test_dir.c:1802: assert(0 OP_EQ dirserv_read_measured_bandwidths(fname, NULL, bw_file_headers, NULL)): 0 vs -1
  [dirserv_read_measured_bandwidths FAILED]

It looks like this test was last modified in #26698 in master.

Is it an obvious fix?
If not, let's see if it happens again.

Change History (8)

comment:1 Changed 5 months ago by teor

#29585 and #29586 occurred in the same test run: what do they both do that might have failed?

comment:2 Changed 5 months ago by teor

Description: modified (diff)

Be more precise: it only failed once

comment:3 in reply to:  description ; Changed 5 months ago by juga

Replying to teor:

commit 5614960e94 (HEAD, tor-github/pr/723/merge)
Merge: 69238ca2da 065e7da8e6

This means you've merged 065e7da8e6 on top of 69238ca2da?

Here is the test error:

dir/dirserv_read_measured_bandwidths: [forking] 
  FAIL ../src/test/test_dir.c:1802: assert(0 OP_EQ dirserv_read_measured_bandwidths(fname, NULL, bw_file_headers, NULL)): 0 vs -1
  [dirserv_read_measured_bandwidths FAILED]

It looks like this test was last modified in #26698 in master.

Is it an obvious fix?

Not obvious to me, since they don't touch the same files.

comment:4 in reply to:  3 ; Changed 5 months ago by juga

Replying to juga:

This means you've merged 065e7da8e6 on top of 69238ca2da?

I did that and ran the tests: no errors.

comment:5 in reply to:  4 ; Changed 5 months ago by teor

Replying to juga:

Replying to juga:

This means you've merged 065e7da8e6 on top of 69238ca2da?

I did that and ran the tests: no errors.

I only saw the error once. I don't know how often it happens.

Replying to juga:

Replying to teor:

commit 5614960e94 (HEAD, tor-github/pr/723/merge)
Merge: 69238ca2da 065e7da8e6

This means you've merged 065e7da8e6 on top of 69238ca2da?

This is the merge HEAD of a GitHub pull request:
https://github.com/torproject/tor/pull/723

GitHub automatically merged torproject:master and nmathewson:bug29541.

Here is the test error:

dir/dirserv_read_measured_bandwidths: [forking] 
  FAIL ../src/test/test_dir.c:1802: assert(0 OP_EQ dirserv_read_measured_bandwidths(fname, NULL, bw_file_headers, NULL)): 0 vs -1
  [dirserv_read_measured_bandwidths FAILED]

It looks like this test was last modified in #26698 in master.

Is it an obvious fix?

Not obvious to me, since they don't touch the same files.

Unstable tests can fail, even if there are no changes to any code run by that test.

Is there an obvious error in dirserv_read_measured_bandwidths that caused the failure?
Can we fix that error?

comment:6 in reply to:  5 ; Changed 5 months ago by juga

Replying to teor:

Unstable tests can fail, even if there are no changes to any code run by that test.

What is an unstable test?

Is there an obvious error in dirserv_read_measured_bandwidths that caused the failure?

No, but let me explain what I see, in case it helps you come up with an explanation.

The failing test [0] checks a bandwidth file that contains only the timestamp and a newline.
dirserv_read_measured_bandwidths should parse the timestamp correctly, assign bw_file_headers and return 0.

It could return -1 instead of 0 in any of these cases:

  • It can't open the bandwidth file
  • The bandwidth file is empty
  • The timestamp line doesn't end with a newline
  • The timestamp can't be parsed as an integer
  • The timestamp is old

All of this is initialized in the test and, in theory, correctly. If it were not, the test should fail every time.
The only thing I can think of is write_str_to_file not writing the file because of some temporary problem in the filesystem. Could that be what happened in your case?
Or there is something else I haven't noticed.
[0] https://github.com/nmathewson/tor/blob/bug29541/src/test/test_dir.c#L1802

Edit: typo, syntax

Last edited 5 months ago by juga
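
The failure cases listed in this comment can be sketched in C. This is only an illustration of the checks, not tor's actual parser: the function names and the SKETCH_MAX_AGE_SECONDS limit are hypothetical, and the real age threshold is not stated in this ticket.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical age limit, chosen only for this sketch. */
#define SKETCH_MAX_AGE_SECONDS (3 * 24 * 60 * 60)

/* Check the first line of a bandwidth file held in memory.
 * Returns 0 if it is a fresh integer timestamp ending in '\n',
 * and -1 otherwise. */
int
sketch_check_timestamp_text(const char *text, time_t now)
{
  const char *newline;
  char *end = NULL;
  long long ts;

  if (text[0] == '\0')
    return -1;                          /* the file is empty */

  newline = strchr(text, '\n');
  if (!newline)
    return -1;                          /* no newline after the timestamp */

  errno = 0;
  ts = strtoll(text, &end, 10);
  if (errno != 0 || end == text || end != newline)
    return -1;                          /* not parseable as an integer */

  if ((long long)now - ts > SKETCH_MAX_AGE_SECONDS)
    return -1;                          /* the timestamp is old */

  return 0;
}

/* Read the start of the file at `path` and run the check above.
 * Returns -1 if the file can't be opened. */
int
sketch_check_timestamp_file(const char *path, time_t now)
{
  char buf[64];
  size_t n;
  FILE *fp = fopen(path, "r");

  if (!fp)
    return -1;                          /* can't open the bandwidth file */

  n = fread(buf, 1, sizeof(buf) - 1, fp);
  fclose(fp);
  buf[n] = '\0';
  return sketch_check_timestamp_text(buf, now);
}
```

In this sketch only the last check depends on the clock. The other four depend on the file contents, which the test writes itself.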

comment:7 in reply to:  6 Changed 5 months ago by teor

Replying to juga:

Replying to teor:

Unstable tests can fail, even if there are no changes to any code run by that test.

What is an unstable test?

A test that sometimes passes, and sometimes fails.

Is there an obvious error in dirserv_read_measured_bandwidths that caused the failure?

No, but let me explain what I see, in case it helps you come up with an explanation.

The failing test [0] checks a bandwidth file that contains only the timestamp and a newline.
dirserv_read_measured_bandwidths should parse the timestamp correctly, assign bw_file_headers and return 0.

It could return -1 instead of 0 in any of these cases:

  • It can't open the bandwidth file
  • The bandwidth file is empty

There could have been a temporary file descriptor shortage, but that is unlikely.
Permissions issues usually cause tests to fail all the time.

  • The timestamp line doesn't end with a newline
  • The timestamp can't be parsed as an integer

File corruption is unlikely.

  • The timestamp is old

How old?

All of this is initialized in the test and, in theory, correctly. If it were not, the test should fail every time.

What if the time changed on the local system between tests?
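
A minimal sketch of why a clock change could matter, assuming the test stamps the file with the current time and the parser later compares that stamp against the clock when it reads the file. The names and the SKETCH_MAX_AGE_SECONDS value are hypothetical and are not taken from tor.

```c
#include <time.h>

/* Hypothetical age limit, chosen only for this sketch. */
#define SKETCH_MAX_AGE_SECONDS (3 * 24 * 60 * 60)

/* Returns 1 if a file stamped at `file_time` still counts as fresh
 * when read at `now`, and 0 if it is too old. */
int
sketch_timestamp_is_fresh(time_t file_time, time_t now)
{
  return difftime(now, file_time) <= (double)SKETCH_MAX_AGE_SECONDS;
}

/* Simulate one test run: the file is stamped with the clock value at
 * write time, then checked against the clock value at read time.
 * Returns 0 on success and -1 on failure, like the assertion in the test. */
int
sketch_run_check(time_t clock_at_write, time_t clock_at_read)
{
  time_t file_time = clock_at_write;
  return sketch_timestamp_is_fresh(file_time, clock_at_read) ? 0 : -1;
}
```

With a steady clock the two values differ by at most a few seconds and the check passes. In this sketch it only fails if the clock steps forward by more than the limit between the write and the read. Passing a fixed `now` into the check, instead of reading the system clock, would make such a test independent of clock changes.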

comment:8 Changed 5 months ago by teor

Resolution: not a bug
Status: new → closed

We think this issue happened due to a clock change during the test run. If it happens again, we can reopen this ticket.
