Opened 15 years ago

Last modified 7 years ago

#117 closed defect (Fixed)

event poll failed: Invalid argument [22]

Reported by: goodell
Owned by: nickm
Priority: Low
Milestone:
Component: Core Tor/Tor
Version: 0.1.0.1-rc
Severity:
Keywords:
Cc: goodell
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Running 0.1.0.1-rc-cvs:

The Tor server summarily crashes within minutes of starting, with this error:

Mar 31 12:25:25.818 [err] do_main_loop(): event poll failed: Invalid argument [22]

[Automatically added by flyspray2trac: Operating System: Other Linux]

Change History (9)

comment:1 Changed 15 years ago by nickm

Does this happen reliably for you, or once in a while?

What version of Linux?

Can you try doing this with the latest CVS version of libevent, and the latest CVS version of Tor?
(Make sure to re-run Tor's ./configure after installing the new libevent.) It should print some
slightly more helpful debugging output.

comment:2 Changed 15 years ago by arma

See also http://archives.seul.org/or/dev/Feb-2005/msg00002.html
So this is neither a new issue nor a libevent-specific one. It is also apparently quite rare.

comment:3 Changed 15 years ago by nickm

It seems this can only happen with poll(2) on Linux when nfds is "too large" compared to the number of fds we have
allocated space for. But if this is happening, we should also be seeing EBADF sometimes, right? Confusing.
Are we forgetting to connection_remove() things?
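
For anyone who wants to see the EINVAL case in isolation, here is a minimal sketch in C. It assumes a Linux system where poll(2) rejects an nfds value above the soft RLIMIT_NOFILE limit. The helper name poll_errno_with_limit is made up for this example and is not part of Tor or libevent.

```c
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: set the soft RLIMIT_NOFILE to `soft_limit`, then call
 * poll() with `nfds` entries that are all ignored (fd = -1) and a zero
 * timeout. Returns 0 if poll() succeeded, otherwise the errno it reported. */
static int poll_errno_with_limit(rlim_t soft_limit, nfds_t nfds)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return errno;

    rl.rlim_cur = soft_limit;           /* fails if above the hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
        return errno;

    struct pollfd *pfds = calloc(nfds > 0 ? nfds : 1, sizeof(*pfds));
    if (pfds == NULL)
        return ENOMEM;
    for (nfds_t i = 0; i < nfds; i++)
        pfds[i].fd = -1;                /* negative fds are skipped by poll() */

    int rc = poll(pfds, nfds, 0);
    int err = (rc < 0) ? errno : 0;
    free(pfds);
    return err;
}
```

On such a system, calling poll_errno_with_limit(64, 64) should report success and poll_errno_with_limit(64, 65) should report EINVAL, even though no real descriptor is involved. Only the length of the array matters to this check.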

comment:4 Changed 15 years ago by nickm

This is, I think, a libevent bug. To wit:

On Linux, poll() sets EINVAL when you have nfds > the number of
fds you are allowed to have (by RLIMIT_NOFILE or whatever). But
libevent's poll.c adds *two* struct pollfds for every fd
that wants to read _and_ write. This makes the nfds argument that
libevent passes to poll() potentially get much larger than the actual
number of fds, and makes heavily-loaded Tor servers crash.

This probably accounts for bug 117:
http://bugs.noreply.org/flyspray/index.php?id=117&do=details

Do you see an easy way to fix this one? We could require that each fd
be used in at most one event (but that would change the libevent
interface, and is not what you want). We could search for events with
matching fds and compress them to a single struct pollfd, but to do so
naively would be inefficient. Thoughts?
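
One possible way to merge entries without a linear search is to keep a table indexed by fd that remembers which pollfd slot the fd already owns. The sketch below is only an illustration of that idea. The struct pollset type, the MAX_FDS bound, and the function names are hypothetical; this is not libevent's code and not necessarily how libevent addressed the problem.

```c
#include <poll.h>
#include <string.h>

#define MAX_FDS 1024    /* hypothetical upper bound on fd values */

/* One struct pollfd per distinct fd, plus an fd -> slot lookup table. */
struct pollset {
    struct pollfd fds[MAX_FDS];
    int slot_plus1[MAX_FDS];    /* 0 means "this fd has no slot yet" */
    nfds_t nfds;                /* number of slots in use */
};

static void pollset_init(struct pollset *ps)
{
    memset(ps, 0, sizeof(*ps));
}

/* Register interest in `events` (POLLIN, POLLOUT, ...) on `fd`.
 * Repeated calls for the same fd are OR-ed into a single slot, so nfds
 * never exceeds the number of distinct fds. Returns 0 on success and
 * -1 if fd is out of range. Runs in constant time per call. */
static int pollset_add(struct pollset *ps, int fd, short events)
{
    if (fd < 0 || fd >= MAX_FDS)
        return -1;

    int idx = ps->slot_plus1[fd] - 1;
    if (idx < 0) {                      /* first event for this fd */
        idx = (int)ps->nfds++;
        ps->fds[idx].fd = fd;
        ps->fds[idx].events = 0;
        ps->fds[idx].revents = 0;
        ps->slot_plus1[fd] = idx + 1;
    }
    ps->fds[idx].events |= events;
    return 0;
}
```

With this layout, adding POLLIN and then POLLOUT for the same fd leaves nfds at 1, and the array can be handed to poll() as poll(ps.fds, ps.nfds, timeout). Removal is left out of the sketch; it would need to move the last slot into the freed one and update the table.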

comment:5 Changed 15 years ago by nickm

Here are the known workarounds:

  1. Set the environment variable EVENT_NOPOLL.

1'. Use a system with working epoll, kqueue, or /dev/poll.

  1. Use Tor version 0.0.9.x.
  2. Debug libevent.

3'. Tell me a good efficient way to make libevent poll.c do the right thing.
3. Wait for somebody to think of a good efficient way to make libevent poll.c
do the right thing on their own.

Obviously, I'd prefer 3. :)
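
The EVENT_NOPOLL workaround is normally applied by setting the variable in the shell before starting Tor. For a program that embeds libevent, a rough equivalent in C might look like the sketch below. It assumes libevent only checks whether the variable is set (the value "1" is arbitrary) and that the call happens before libevent is initialized. The helper name is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: put EVENT_NOPOLL into this process's environment so
 * that libevent skips its poll(2) backend when it is initialized later.
 * Returns 0 on success and -1 on failure (as setenv() does). */
static int disable_libevent_poll_backend(void)
{
    return setenv("EVENT_NOPOLL", "1", 1);
}
```

This only helps if another backend is usable on the system; otherwise libevent has to fall back to whatever remains, such as select.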

comment:6 Changed 15 years ago by nickm

There is an untested, and very inefficient (quadratic in number of active fds!)
patch to libevent's poll.c at

http://wangafu.net/~nickm/poll.patch

Somebody should fix it.

comment:7 Changed 15 years ago by nickm

This was fixed in libevent-1.0d. The fix had a bug, but libevent-1.0e
should be better. There's a patch to libevent-1.0d at

http://wangafu.net/~nickm/patch.poll.pollerr

comment:8 Changed 15 years ago by nickm

flyspray2trac: bug closed.

comment:9 Changed 7 years ago by nickm

Component: Tor Relay → Tor