Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#6227 closed defect (fixed)

util/threads testcase and starving workers

Reported by: weasel
Owned by:
Priority: Medium
Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor
Version: Tor: 0.2.3.17-beta
Severity:
Keywords: tor-relay
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

On Debian's Octeon machines (16-core MIPS), the util/threads test case occasionally runs into timeouts:

cf. https://buildd.debian.org/status/fetch.php?pkg=tor&arch=mips&ver=0.2.3.17-beta-2&stamp=1340049144

Test case failures cause the build to fail.

Change History (11)

comment:1 Changed 7 years ago by weasel

util/threads: That took 66 seconds
util/threads: That took 18 seconds
util/threads: That took 33 seconds
util/threads: That took 3 seconds
util/threads: That took 1 seconds
util/threads: That took 22 seconds
util/threads: That took 3 seconds
util/threads: That took 2 seconds
util/threads: That took 2 seconds
util/threads: That took 5 seconds
util/threads: That took 17 seconds
util/threads: That took 37 seconds
util/threads: That took 8 seconds
util/threads: That took 31 seconds

  • By default, the util/threads test only waits for 25 seconds.

comment:2 Changed 7 years ago by weasel

Enabling the select() call the way it is on Windows doesn't help either:
util/threads: That took 22 seconds
util/threads: That took 49 seconds

comment:3 Changed 7 years ago by nickm_mobile

Ug. Fwicr, that test probably isn't fixable to work on systems with non-desktop-like timings. I will have another look to be sure. If so, best solution is probably to partition tests into a default set and an extended set. Will investigate. Either way this is not likely to mean a bug in tor per se.

comment:4 Changed 7 years ago by weasel

Can we, just to make builds more likely to succeed, raise the timeout from 25 to, say, 150 seconds in the next release?

That isn't a fix per se, but it should make the problem show up far less often.
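
For a rough sense of what the higher limit buys, the timings from comment:1 can be checked against both values. This is only a sketch: it assumes a run fails exactly when it takes longer than the limit, and `count_timeouts` is a made-up helper, not anything in the tor tree.

```c
#include <assert.h>
#include <stddef.h>

/* Durations in seconds, copied from the 14 runs listed in comment:1. */
static const int run_seconds[] = {
  66, 18, 33, 3, 1, 22, 3, 2, 2, 5, 17, 37, 8, 31
};
#define N_RUNS (sizeof(run_seconds) / sizeof(run_seconds[0]))

/* Hypothetical helper: count the runs that took longer than limit_sec.
 * Assumption: a run "times out" exactly when its duration exceeds the
 * limit. */
static size_t
count_timeouts(const int *secs, size_t n, int limit_sec)
{
  size_t i, over = 0;
  assert(secs != NULL || n == 0);
  for (i = 0; i < n; ++i) {
    if (secs[i] > limit_sec)
      ++over;
  }
  return over;
}
```

With the numbers above, `count_timeouts(run_seconds, N_RUNS, 25)` gives 4 of 14 runs over the limit (66, 33, 37 and 31 seconds), while a limit of 150 gives 0.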

comment:5 Changed 7 years ago by nickm

Milestone: Tor: 0.2.3.x-final

Seems like a good-enough-for-0.2.3 fix. Did you want this in 0.2.2 as well? I'd imagine "yes" if it's causing build problems, but "absolutely not" given your regular stance on 0.2.2 patches.

comment:6 Changed 7 years ago by nickm

Status: new → needs_review

Branch bug6227 has the obvious tweak for maint-0.2.3.

Incidentally, we've been down this road before: see ce8edc964cd6e05b2 by weasel.

comment:7 Changed 7 years ago by weasel

No, 0.2.3 only is fine. And yes, I checked if we had somehow reverted that change, but we didn't.

Are you sure your change in the win32 block is sane?

comment:8 in reply to:  7 Changed 7 years ago by nickm

Replying to weasel:

> No, 0.2.3 only is fine. And yes, I checked if we had somehow reverted that change, but we didn't.

Okay; will merge.

> Are you sure your change in the win32 block is sane?

I believe so. (That's not a win32 block; that's an #ifndef _WIN32 block, so it's everything *but* win32.) The purpose of that select there is to make it so the main loop polls periodically rather than busy-waiting... but a 10 microsecond delay is insanely low, and might as well be a busy-wait.
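
To illustrate the polling pattern described above (check a condition, give up after a deadline, and sleep between checks with a select() on no file descriptors), here is a minimal POSIX-only sketch. It is not the code from the bug6227 branch: `poll_delay_usec`, `wait_until_done` and `countdown_done` are hypothetical names, and the delay value is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Sleep for roughly usec microseconds by calling select() with no fds. */
static void
poll_delay_usec(long usec)
{
  struct timeval tv;
  tv.tv_sec = usec / 1000000;
  tv.tv_usec = usec % 1000000;
  select(0, NULL, NULL, NULL, &tv);
}

/* Poll is_done(arg) until it returns nonzero or more than timeout_sec
 * seconds have passed, sleeping delay_usec between checks. Returns 1 if
 * the condition became true, 0 on timeout. If n_polls is non-NULL, it
 * receives the number of times the condition was checked. */
static int
wait_until_done(int (*is_done)(void *), void *arg, int timeout_sec,
                long delay_usec, unsigned long *n_polls)
{
  const time_t started = time(NULL);
  unsigned long polls = 0;
  int ok = 0;
  assert(is_done != NULL);
  for (;;) {
    ++polls;
    if (is_done(arg)) {
      ok = 1;
      break;
    }
    if (time(NULL) > started + timeout_sec)
      break;
    /* Yield the CPU so worker threads get a chance to run. */
    poll_delay_usec(delay_usec);
  }
  if (n_polls)
    *n_polls = polls;
  return ok;
}

/* Stand-in condition for the sketch: becomes true once the counter that
 * arg points to has been decremented to zero. */
static int
countdown_done(void *arg)
{
  int *remaining = arg;
  if (*remaining > 0)
    --*remaining;
  return *remaining == 0;
}
```

With a delay of 10 microseconds the loop re-checks the condition almost continuously, which is the near-busy-wait case; a delay in the millisecond range (for example `wait_until_done(countdown_done, &n, 150, 10000, NULL)`) lets the main thread actually sleep between checks.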

comment:9 Changed 7 years ago by nickm

Resolution: fixed
Status: needs_review → closed

Merged into 0.2.3 and beyond.

comment:10 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:11 Changed 7 years ago by nickm

Component: Tor Relay → Tor