dannenberg reports the following problems with
0.2.22-alpha2 and libevent2:
Mar 07 14:45:19.645 [notice] Circuit build measurement period of 79360ms is more than twice the maximum build time we have ever observed. Capping it to 34274ms.
Mar 08 12:25:51.578 [warn] Warning from libevent: kevent: Bad file descriptor
Mar 08 12:25:51.579 [err] libevent call with kqueue failed: Bad file descriptor [9]
I can't tell from the log line if this is: A) a bug in libevent2, B) a bug in freebsd's kqueue implementation, or C) a resource limitation that can be fixed by better ulimit choices.
Is this really with .22-alpha? And which libevent2 version?
The error implies that kevent() in libevent's kqueue.c is giving an error "EBADF", which implies that we are getting a failure because "the specified file descriptor is invalid." (See kqueue.c, in function kq_dispatch.)
That's quite weird! If a particular file descriptor that we were passing to kevent were bad, we would (in theory) be getting that error passed back in events[i].flags, not set with an error when kevent returns. So this sounds to me as if freebsd were complaining that the kqueue fd (the one that was returned from kqueue()) was bad!
Possibilities that come to mind if that analysis is right:
Bug in freebsd. (not so likely -- the freebsd kqueue code is usually pretty solid)
Something closed the kqueue fd.
Something corrupted the libevent kqop structure, so that it's looking at the wrong fd.
Reasons why that analysis might be wrong:
Maybe I am misunderstanding why kevent() would give EBADF.
Maybe there is a bug in the error-reporting code, and the error with which kevent is failing is not really EBADF.
Hm. We should think about next steps on this. First thing to do is write test code and see if we can make kevent() report EBADF in the cases we think it should (and not in the cases where we think it shouldn't). We should also have a quick look at the kernel in question to see if there are other circumstances where kevent() can return EBADF that appear from reading the source.
Once that's done, next step is to make absolutely sure that libevent is reporting this bug right. We should look at the code, and make sure it's obviously correct. Also we should have a quick gander at which patches, if any, freebsd applies to libevent.
And once that's done, if we want to go on the theory that something is closing the kqueue fd, let's see if we can instrument close() in libevent and tor such that every time it's about to close, it asserts that the fd is not the kqueue fd. That'd be a hack that we would never want to merge, but if the bug does turn out to be a bogus close, we'll get a stack trace to tell us where it is.
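A minimal sketch of that kind of instrumentation, assuming the kqueue fd can be obtained somehow (for example from a locally patched libevent; that part is not shown). The names guard_fd, close_would_hit_guard, checked_close and CHECKED_CLOSE are hypothetical and exist in neither tor nor libevent.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical debugging aid: the one fd that must never be closed
 * (here, the kqueue fd). -1 means "nothing is guarded". */
static int guarded_fd = -1;

/* Record which fd to protect. How the caller learns the kqueue fd is an
 * assumption left outside this sketch. */
void
guard_fd(int fd)
{
  guarded_fd = fd;
}

/* Return 1 if closing fd would close the guarded fd, 0 otherwise. */
int
close_would_hit_guard(int fd)
{
  return fd >= 0 && fd == guarded_fd;
}

/* Replacement for close(): abort with the call site if someone tries to
 * close the guarded fd, so the core dump shows the offending stack. */
int
checked_close(int fd, const char *file, int line)
{
  if (close_would_hit_guard(fd)) {
    fprintf(stderr, "BUG: %s:%d is closing guarded fd %d\n",
            file, line, fd);
    abort();
  }
  return close(fd);
}

/* Call sites would use this instead of close(). */
#define CHECKED_CLOSE(fd) checked_close((fd), __FILE__, __LINE__)
```

In a debugging build, each close() call in the code under suspicion would be swapped for CHECKED_CLOSE(). The abort() is deliberate: the goal is a stack trace at the bogus close, not graceful recovery.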
Having it show up right after the "going dormant" message is mildly suggestive, I guess, but it doesn't seem to have done so in the original report's case. Joehall -- how often does this happen for you?
Oh hey, here's a crazy idea. In compat.c, there's a line "#undef DEBUG_SOCKET_COUNTING". If we turn that into "#define DEBUG_SOCKET_COUNTING", it's supposed to track which sockets we open and close and look for discrepancies. It's not perfect by any means, and it only looks at fds which it thinks represent sockets, but it might turn up something here.
(This is an experiment only for people who are running into the bug and who know how to edit C and build from source.)
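For readers unfamiliar with the idea, here is a rough sketch of what open/close tracking of this sort can look like. It is not the code behind DEBUG_SOCKET_COUNTING in compat.c; the table size and every function name below are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limit on the fd values we track; larger fds are ignored. */
#define TRACK_MAX_FD 1024

static unsigned char fd_is_open[TRACK_MAX_FD]; /* 1 if we believe fd is open */
static int n_discrepancies = 0;                /* mismatches seen so far */

/* Forget everything we have recorded. */
void
track_reset(void)
{
  memset(fd_is_open, 0, sizeof(fd_is_open));
  n_discrepancies = 0;
}

/* Note that fd was just opened. Returns 0 if consistent, 1 if we already
 * believed it was open (a discrepancy), -1 if fd is outside the table. */
int
track_open(int fd)
{
  if (fd < 0 || fd >= TRACK_MAX_FD)
    return -1;
  if (fd_is_open[fd]) {
    ++n_discrepancies;
    return 1;
  }
  fd_is_open[fd] = 1;
  return 0;
}

/* Note that fd is about to be closed. Returns 0 if consistent, 1 if we did
 * not believe it was open (a discrepancy), -1 if fd is outside the table. */
int
track_close(int fd)
{
  if (fd < 0 || fd >= TRACK_MAX_FD)
    return -1;
  if (!fd_is_open[fd]) {
    ++n_discrepancies;
    return 1;
  }
  fd_is_open[fd] = 0;
  return 0;
}

/* Number of discrepancies recorded since the last reset. */
int
track_discrepancies(void)
{
  return n_discrepancies;
}
```

A discrepancy on open means something else closed the fd behind our back and the kernel handed the number out again; a discrepancy on close means we are closing an fd we never recorded opening. Either would be a lead for the bogus-close theory.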
Hi, this happens every day on 0.2.2.24 and 0.2.2.25 alphas, ... going to go back to 23 unless I can be helpful. best, Joe
Ordinarily, if there's a bad fd or something in kevent's input array, kevent reports the error in its output array of events... but if the output array isn't big enough, kevent will return -1.
Libevent's kqueue.c backend needs to be made aware of this. Looks like it'll be time for a libevent 2.0.12-stable sometime in the next week or two.
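A small probe along the lines of the test code proposed earlier in this ticket can check that behaviour on a given kernel. This is a sketch only: the scenario constants, struct probe_result, probe_kevent and probe_print are hypothetical names, and on platforms without kqueue the probe simply reports "unsupported".

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
    defined(__DragonFly__) || defined(__APPLE__)
#define PROBE_HAVE_KQUEUE 1
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <unistd.h>
#endif

#define PROBE_CLOSED_KQ          0 /* kq fd closed, change list is valid */
#define PROBE_BAD_CHANGE_ROOM    1 /* stale fd in change list, 1 output slot */
#define PROBE_BAD_CHANGE_NO_ROOM 2 /* stale fd in change list, 0 output slots */

struct probe_result {
  int supported; /* 0 if this platform has no kqueue */
  int rv;        /* return value of kevent(); -2 if setup failed */
  int err;       /* errno when rv is negative, else 0 */
  int ev_error;  /* 1 if the returned event had EV_ERROR set */
  long ev_data;  /* that event's data field (an errno value) */
};

struct probe_result
probe_kevent(int scenario)
{
  struct probe_result r = {0, 0, 0, 0, 0};
  assert(scenario >= PROBE_CLOSED_KQ && scenario <= PROBE_BAD_CHANGE_NO_ROOM);
  (void)scenario;
#ifdef PROBE_HAVE_KQUEUE
  {
    struct kevent change, out;
    struct timespec zero = {0, 0}; /* poll, never block */
    int pipefd[2], kq, watched, nevents;
    int kq_open = 1, rd_open = 1;

    r.supported = 1;
    if (pipe(pipefd) < 0) { r.rv = -2; r.err = errno; return r; }
    kq = kqueue();
    if (kq < 0) {
      r.rv = -2; r.err = errno;
      close(pipefd[0]); close(pipefd[1]);
      return r;
    }
    watched = pipefd[0];
    if (scenario == PROBE_CLOSED_KQ) {
      close(kq); kq_open = 0;         /* kq itself is now invalid */
    } else {
      close(pipefd[0]); rd_open = 0;  /* watched is now a stale fd */
    }
    nevents = (scenario == PROBE_BAD_CHANGE_NO_ROOM) ? 0 : 1;

    EV_SET(&change, watched, EVFILT_READ, EV_ADD, 0, 0, 0);
    errno = 0;
    r.rv = kevent(kq, &change, 1, &out, nevents, &zero);
    if (r.rv < 0) {
      r.err = errno;
    } else if (r.rv > 0 && (out.flags & EV_ERROR)) {
      r.ev_error = 1;
      r.ev_data = (long)out.data;
    }
    if (kq_open) close(kq);
    if (rd_open) close(pipefd[0]);
    close(pipefd[1]);
  }
#endif
  return r;
}

/* Print one result in a form that is easy to paste into a ticket. */
void
probe_print(const char *label, struct probe_result r)
{
  if (!r.supported)
    printf("%s: kqueue not available on this platform\n", label);
  else
    printf("%s: rv=%d errno=%d EV_ERROR=%d data=%ld\n",
           label, r.rv, r.err, r.ev_error, r.ev_data);
}
```

Calling probe_kevent() once per scenario and passing each result to probe_print() gives three lines to compare. If the explanation above is right, the closed-kq and no-room scenarios should both come back as rv=-1 with errno EBADF, and only the scenario with room should carry the error in the returned event. The probe is meant to confirm that on the affected kernel rather than to assume it.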
Libevent 2.0.12-stable should have a fix for this when it comes out. The relevant patch is 28317a0, which should apply cleanly to 2.0.11-stable and probably 2.0.10-stable too.
Trac: Status: new to closed Resolution: N/A to fixed