Opened 8 months ago

Last modified 5 months ago

#33411 assigned defect

Make DirCache default to 0 on Windows relays, if we can't fix the mmap issues

Reported by: teor Owned by: ahf
Priority: Medium Milestone: Tor: 0.4.4.x-final
Component: Core Tor/Tor Version: Tor: 0.3.1.9
Severity: Normal Keywords: 044-should, cpu, windows, linux, performance, regression, 033-triage-20180326, 033-removed-20180326, 034-deferred-20180602, 035-removed-20180711, 032-unreached-backport, 040-roadmap-proposed, 033-unreached-backport-maybe, network-health, postfreeze-ok
Cc: chelseakomlo, ahf Actual Points:
Parent ID: #24857 Points: 0.2
Reviewer: Sponsor:

Description

In #24857, tor's consensus diff cache implementation causes high CPU load on Windows.

As a workaround, we could make DirCache default to 0 on Windows.

Child Tickets

Change History (8)

comment:1 Changed 8 months ago by gvanem

Having a DirCache 0 can still cause >= 90% CPU for me.
But looking at the call-stacks of these CPU-hog threads, I noticed
tor_cond_wait() was a candidate.

So just adding a SleepEx() in that function, I no longer see any such high CPU usage
(I think after only 5 hours of run-time). Here's my patch:

--- a/lib/thread/compat_winthreads.c 2020-01-24 21:59:07
+++ b/lib/thread/compat_winthreads.c 2020-03-05 14:11:58
@@ -170,6 +170,8 @@
   do {
     DWORD res;
     res = WaitForSingleObject(cond->event, ms);
+
+    SleepEx(1, TRUE);
     EnterCriticalSection(&cond->lock);
     if (cond->n_to_wake &&
         cond->generation != generation_at_start) {

comment:2 Changed 8 months ago by nickm

Hmm. If that helps, then there is probably another bug in the function. Are you saying that this function is busy-looping even when there is nothing to do? If you're looking at this in a debugger, can you tell me the values for ms, cond->n_to_wake, and cond->generation_at_start, when the critical section is entered?

Hm. Now that WinXP is obsolete, maybe we should move this function to use CONDITION_VARIABLE instead...

(I've opened a ticket #33554 for this CPU issue you're talking about here, since this ticket seems to be about changing the dircache default. If you can answer there, that would be great!)

comment:3 Changed 6 months ago by gvanem

If this bug-tracker had been on Github, I would have seen your reply sooner.
But you insist on using this German Trac crap?

Last edited 6 months ago by gvanem (previous) (diff)

comment:4 in reply to:  3 Changed 6 months ago by nickm

Replying to gvanem:

If this bug-tracker had been on Github, I would have seen your reply sooner.
But you insist on using this German Trac crap?

We're hoping to migrate to a gitlab installation some time this year. Sorry that trac isn't working out for you. Do you think you might be able to answer any of the questions on this or the other ticket?

comment:5 Changed 5 months ago by nickm

Add 044-must to all "regression" tickets in 0.4.4

comment:6 Changed 5 months ago by nickm

Summary: Make DirCache default to 0 on Windows relaysMake DirCache default to 0 on Windows relays, if we can't fix the mmap issues

comment:7 Changed 5 months ago by nickm

Keywords: postfreeze-ok added

Mark tickets which are important or safe enough to look at post-freeze for 0.4.4.

comment:8 Changed 5 months ago by nickm

Owner: set to ahf
Status: newassigned
Note: See TracTickets for help on using tickets.