Opened 4 years ago

Closed 3 years ago

#16992 closed defect (duplicate)

"Failure from drain_fd: no error" in Tor v0.2.7.2-alpha-dev running on Windows XP

Reported by: TORques Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.2.7.2-alpha
Severity: Normal Keywords: tor-03-unspecified-201612
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

What exactly means this warning? My OS can't handle the multithreading nonblocking reads from a circuit stream? I use "NumCPUs 2" in torrc file. My OS settings are raised to performance (tcpsys changed to 256 connections, 1 Gbit internet, MTU 1492, tcp window raised,...).

I run a non-exit relay from a cmd command line window with Tor v0.2.7.2-alpha (git-36c0ae6f7834af16) running on Windows XP with Libevent 2.0.22-stable, OpenSSL 1.0.2d and Zlib 1.2.8. compiled by myself.

Sep 07 07:59:01.000 [info] evaluate_evp_for_aes(): This version of OpenSSL has a known-good EVP counter-mode implementation. Using it.

Also, my router seems to be very restrictive even though I have opened the required ports in firewall and the port forwarding was done.

Starting Nmap 6.47 ( http://nmap.org ) at 2015-09-07 09:07 GTB Daylight Time
Initiating SYN Stealth Scan at 09:08
Scanning torques.ed25519.edu (109.103.57.102) [65535 ports]
Discovered open port 53/tcp on 109.103.57.102
Discovered open port 80/tcp on 109.103.57.102
Discovered open port 9001/tcp on 109.103.57.102
Discovered open port 7547/tcp on 109.103.57.102 (this port is reserved by router for updates, etc.)
Discovered open port 9030/tcp on 109.103.57.102
Completed SYN Stealth Scan at 09:08, 23.97s elapsed (65535 total ports)

The relay server looks to work as expected only from the router DMZ.

Sep 07 07:59:00.000 [notice] Tor 0.2.7.2-alpha (git-36c0ae6f7834af16) opening log file.
Sep 07 07:59:00.437 [notice] Tor v0.2.7.2-alpha (git-36c0ae6f7834af16) running on Windows XP with Libevent 2.0.22-stable, OpenSSL 1.0.2d and Zlib 1.2.8.
Sep 07 07:59:00.437 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Sep 07 07:59:00.437 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Sep 07 07:59:00.546 [notice] Read configuration file "path".
Sep 07 07:59:00.562 [notice] Based on detected system memory, MaxMemInQueues is set to 2048 MB. You can override this by setting MaxMemInQueues by hand.
Sep 07 07:59:00.750 [notice] Opening Socks listener on 127.0.0.1:9050
Sep 07 07:59:00.750 [notice] Opening DNS listener on 127.0.0.1:53
Sep 07 07:59:00.750 [notice] Opening Control listener on 127.0.0.1:9151
Sep 07 07:59:00.765 [notice] Opening OR listener on 0.0.0.0:9001
Sep 07 07:59:00.796 [notice] Opening Directory listener on 0.0.0.0:9030
Sep 07 07:59:00.000 [warn] Your log may contain sensitive information - you're logging more than "notice". Don't log unless it serves an important reason. Overwrite the log afterwards.
Sep 07 07:59:01.000 [notice] Parsing GEOIP IPv4 file "path".
Sep 07 07:59:01.000 [notice] Parsing GEOIP IPv6 file "path".
Sep 07 07:59:01.000 [notice] Your Tor server's identity key fingerprint is 'TORques CD0EB85333B9A20F6D676605F01CC5143E69F82F'
Sep 07 07:59:01.000 [notice] Bootstrapped 0%: Starting
Sep 07 07:59:14.000 [notice] Bootstrapped 80%: Connecting to the Tor network
Sep 07 07:59:15.000 [notice] Bootstrapped 85%: Finishing handshake with first hop
Sep 07 07:59:15.000 [notice] Bootstrapped 90%: Establishing a Tor circuit
Sep 07 07:59:17.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Sep 07 07:59:17.000 [notice] Bootstrapped 100%: Done
Sep 07 07:59:17.000 [notice] Now checking whether ORPort 109.103.57.102:9001 and DirPort 109.103.57.102:9030 are reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Sep 07 07:59:18.000 [notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.
Sep 07 07:59:18.000 [info] connection_free_(): Freeing linked Directory connection [client finished] with 0 bytes on inbuf, 0 on outbuf.
Sep 07 07:59:18.000 [info] channel_tls_process_versions_cell(): Negotiated version 4 with [scrubbed]:50560; Sending cells: VERSIONS CERTS AUTH_CHALLENGE NETINFO
Sep 07 07:59:18.000 [info] router_pick_published_address(): Success: chose address '109.103.57.102'.
Sep 07 07:59:18.000 [info] router_rebuild_descriptor(): Rebuilding relay descriptor
Sep 07 07:59:18.000 [info] router_pick_published_address(): Success: chose address '109.103.57.102'.
Sep 07 07:59:18.000 [info] channel_tls_process_certs_cell(): Got some good certificates from [scrubbed]:50560: Waiting for AUTHENTICATE.
Sep 07 07:59:18.000 [info] channel_tls_process_authenticate_cell(): Got an AUTHENTICATE cell from [scrubbed]:9001: Looks good.
Sep 07 07:59:18.000 [info] channel_tls_process_netinfo_cell(): Got good NETINFO cell from [scrubbed]:9001; OR connection is now open, using protocol version 4. Its ID digest is 4FBE3430F3A583463BA4E961B8AADF0CD3C71D24. Our address is apparently 109.103.57.102.

Sep 07 07:59:18.000 [warn] Failure from drain_fd

Sep 07 07:59:18.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent.
Sep 07 07:59:22.000 [notice] Performing bandwidth self-test...done.

Sep 07 08:30:51.000 [notice] New control connection opened from 127.0.0.1.
Sep 07 08:30:52.000 [info] handle_control_authenticate(): Authenticated controlconnection (1392)
Sep 07 08:30:52.000 [notice] Heartbeat: It seems like we are not in the cached consensus.
Sep 07 08:30:52.000 [notice] Heartbeat: Tor's uptime is 0:31 hours, with 6 circuits open. I've sent 951 kB and received 4.34 MB.
Sep 07 08:30:52.000 [info] log_heartbeat(): Average packaged cell fullness: 96.311%. TLS write overhead: 3%
Sep 07 08:30:52.000 [notice] Circuit handshake stats since last time: 0/0 TAP, 0/0 NTor.
Sep 07 08:30:52.000 [notice] Since startup, we have initiated 0 v1 connections,0 v2 connections, 0 v3 connections, and 28 v4 connections; and received 0 v1 connections, 0 v2 connections, 0 v3 connections, and 4 v4 connections.

This is my torrc:

Address 109.103.57.102
AvoidDiskWrites 1
Nickname TORques
ContactInfo <torques@gooqle.com>

ORPort 9001
DirPort 9030
SocksPort 9050
ControlPort 9151

DNSPort 53
AutomapHostsOnResolve 1
AutomapHostsSuffixes .onion,.com,.net,.exit

DataDirectory "path"
GeoIPFile "path"
GeoIPv6File "path"
Log info stdout
Log notice file "path"

## tor --hash-password ***********************
HashedControlPassword 16:***********************

NumEntryGuards 8
RelayBandwidthRate 5 MB
RelayBandwidthBurst 7 MB
BandwidthRate 10 MB
BandwidthBurst 20 MB
NumCPUs 2

NewCircuitPeriod 1
CircuitIdleTimeout 2700
LearnCircuitBuildTimeout 1
MaxCircuitDirtiness 600
KeepalivePeriod 600

ExitRelay 0
ExitPolicy reject *:*

FetchUselessDescriptors 1
ProtocolWarnings 1
HeartbeatPeriod 1800
ShutdownWaitLength 60

PublishServerDescriptor 0
DirReqStatistics 0
EntryStatistics 0
CellStatistics 0
ExitPortStatistics 0
ExtraInfoStatistics 0

Child Tickets

Change History (17)

comment:1 Changed 4 years ago by arma

#16741 was a previous report of this issue. We fixed a bug (the fix will be out in 0.2.7.3-alpha). But it is likely that we did not fix *the* bug.

comment:2 in reply to:  description ; Changed 4 years ago by arma

Replying to TORques:

I run a non-exit relay

[...]

PublishServerDescriptor 0

I think this means you run an unpublished relay, that is, no users will ever learn about it? Not really related to the bug at hand, but you might want to know. :)

comment:3 in reply to:  2 Changed 4 years ago by TORques

Replying to arma:

Replying to TORques:

I run a non-exit relay

[...]

PublishServerDescriptor 0

I think this means you run an unpublished relay, that is, no users will ever learn about it? Not really related to the bug at hand, but you might want to know. :)

Thanks, I missed that! ;-) Fixed.

Last edited 4 years ago by TORques (previous) (diff)

comment:4 Changed 4 years ago by TORques

Status: newneeds_review

comment:5 Changed 4 years ago by yawning

Status: needs_reviewnew

So should this be marked as a duplicate?

It's worth noting that Windows is a poor platform to run a relay on, and will remain so until the Tor side bufferevent code (kind of abandoned, known to be buggy) is done. Such issues are orthogonal to this ticket though. AFAIK no one is currently working on improving the state of things, and should work be done with WinXP in mind.

comment:6 in reply to:  5 ; Changed 4 years ago by TORques

Replying to yawning:

So should this be marked as a duplicate?

It's worth noting that Windows is a poor platform to run a relay on, and will remain so until the Tor side bufferevent code (kind of abandoned, known to be buggy) is done. Such issues are orthogonal to this ticket though. AFAIK no one is currently working on improving the state of things, and should work be done with WinXP in mind.>

However, this non-exit relay can easily reach 5 MB/sec or more on my tweaked Windows XP platform. Not bad for the TOR network, isn't it? My Ookla test says 24 MB/s download and 10 MB/s upload and my CPU can process easily a TOR stream of 5 MB/s.

http://www.speedtest.net/result/4643118509.png

comment:7 in reply to:  6 ; Changed 4 years ago by yawning

Replying to TORques:

However, this non-exit relay can easily reach 5 MB/sec or more on my tweaked Windows XP platform. Not bad for the TOR network, isn't it? My Ookla test says 24 MB/s download and 10 MB/s upload and my CPU can process easily a TOR stream of 5 MB/s.

Bulk single connection throughput is not why I brought that up. See #98 among other issues. If your router/relay can't handle a large number of simultaneous connections, it'll end up underutilized and be bad for the network. Past that if it works, great. If it breaks, patches accepted.

comment:8 in reply to:  7 ; Changed 4 years ago by TORques

Replying to yawning:

Replying to TORques:

However, this non-exit relay can easily reach 5 MB/sec or more on my tweaked Windows XP platform. Not bad for the TOR network, isn't it? My Ookla test says 24 MB/s download and 10 MB/s upload and my CPU can process easily a TOR stream of 5 MB/s.

Bulk single connection throughput is not why I brought that up. See #98 among other issues. If your router/relay can't handle a large number of simultaneous connections, it'll end up underutilized and be bad for the network. Past that if it works, great. If it breaks, patches accepted.>

Thanks for your replay. My router/relay can handle a large number of simultaneous connections (tcpsys changed to 256 connections). I just built the latest tor from your github and I see the modified message related to "Failure from drain_fd"

C:\tor-0.2.7.2-alpha\src\test>test_workqueue.exe
OK
Sep 07 12:34:25.000 [notice] Tor 0.2.7.2-alpha-dev opening new log file.
Sep 07 12:34:25.781 [notice] Tor v0.2.7.2-alpha-dev running on Windows XP with Libevent 2.0.22-stable, OpenSSL 1.0.2d and Zlib 1.2.8.
Sep 07 12:34:27.000 [notice] This version of Tor (0.2.7.2-alpha-dev) is newer than any recommended version, according to the directory authorities. Recommended versions are: 0.2.4.23,0.2.4.24,0.2.4.25,0.2.4.26,0.2.4.27,0.2.5.8-rc,0.2.5.9-rc,0.2.5.10,0.2.5.11,0.2.5.12,0.2.6.5-rc,0.2.6.6,0.2.6.7,0.2.6.8,0.2.6.9,0.2.6.10,0.2.7.1-alpha,0.2.7.2-alpha
Sep 07 12:34:42.000 [notice] Now checking whether ORPort 109.103.57.102:9001 and DirPort 109.103.57.102:9030 are reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Sep 07 12:34:43.000 [notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.

Sep 07 12:34:43.000 [warn] Failure from drain_fd: No error

Sep 07 12:34:43.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
Sep 07 12:34:47.000 [notice] Performing bandwidth self-test...done.

Also, I used bench.exe to see the tor performances on my PC but I have no idea if those values are good or bad for a relay. Can you look into and write your opinion about?

Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.

C:\tor-0.2.7.2-alpha\src\test>bench.exe
===== dmap =====
nbits=65536
digestmap_set: 500.68 ns per element
digestmap_get: 488.52 ns per element
digestset_add: 422.00 ns per element
digestset_contains: 416.99 ns per element.
Hits == 32825344
False positive rate on digestset: 0.22%
===== siphash =====
siphash24g(7): 260.42 ns per call
siphash24g(8): 312.50 ns per call
siphash24g(15): 312.50 ns per call
siphash24g(16): 416.67 ns per call
siphash24g(20): 364.58 ns per call
siphash24g(32): 520.83 ns per call
siphash24g(111): 989.58 ns per call
siphash24g(128): 1197.92 ns per call
===== aes =====
1 bytes: 70.78 nsec per byte
2 bytes: 47.50 nsec per byte
4 bytes: 36.32 nsec per byte
8 bytes: 29.80 nsec per byte
16 bytes: 23.28 nsec per byte
32 bytes: 21.42 nsec per byte
64 bytes: 21.42 nsec per byte
128 bytes: 21.42 nsec per byte
256 bytes: 23.28 nsec per byte
512 bytes: 21.42 nsec per byte
1024 bytes: 19.56 nsec per byte
2048 bytes: 20.49 nsec per byte
4096 bytes: 20.49 nsec per byte
8192 bytes: 20.49 nsec per byte
===== onion_TAP =====
Client-side, part 1: 2349.853516 usec.
Server-side, key guessed right: 6164.550781 usec
Server-side, key guessed wrong: 7995.605469 usec.
Client-side, part 2: 1953.125000 usec.
===== onion_ntor =====
Ed25519-based basepoint multiply = disabled.
Client-side, part 1: 1495.361328 usec.
Server-side: 4531.860352 usec
Client-side, part 2: 3021.240234 usec.
Ed25519-based basepoint multiply = enabled.
Client-side, part 1: 442.504883 usec.
Server-side: 3494.262695 usec
Client-side, part 2: 3005.981445 usec.
===== ed25519 =====
Ed25519-donna = disabled.
Generate public key: 476.84 usec
Sign a short message: 484.47 usec
Verify signature: 1602.17 usec
Convert public point from curve25519: 95.37 usec
Blind a public key: 1399.99 usec
Ed25519-donna = enabled.
Generate public key: 442.50 usec
Sign a short message: 457.76 usec
Verify signature: 1514.43 usec
Convert public point from curve25519: 91.55 usec
Blind a public key: 1373.29 usec
===== cell_aes =====
509 bytes, misaligned by 0: 20.61 nsec per byte
509 bytes, misaligned by 1: 20.14 nsec per byte
509 bytes, misaligned by 2: 20.61 nsec per byte
509 bytes, misaligned by 3: 20.14 nsec per byte
509 bytes, misaligned by 4: 20.61 nsec per byte
509 bytes, misaligned by 5: 20.14 nsec per byte
509 bytes, misaligned by 6: 20.14 nsec per byte
509 bytes, misaligned by 7: 20.61 nsec per byte
509 bytes, misaligned by 8: 20.14 nsec per byte
509 bytes, misaligned by 9: 20.61 nsec per byte
509 bytes, misaligned by 10: 20.14 nsec per byte
509 bytes, misaligned by 11: 20.14 nsec per byte
509 bytes, misaligned by 12: 20.61 nsec per byte
509 bytes, misaligned by 13: 20.14 nsec per byte
509 bytes, misaligned by 14: 20.61 nsec per byte
509 bytes, misaligned by 15: 20.14 nsec per byte
===== cell_ops =====
 Inbound cells: 10490.42 ns per cell. (20.61 ns per byte of payload)
Outbound cells: 10490.42 ns per cell. (20.61 ns per byte of payload)
===== dh =====
Complete DH handshakes (1024 bit, public and private ops):
      7.980347 millisec each.
===== ecdh_p256 =====
Complete ECDH P-256 handshakes (2 public and 2 private ops):
      9.048462 millisec each.
===== ecdh_p224 =====
Complete ECDH P-224 handshakes (2 public and 2 private ops):
      6.301880 millisec each.
Last edited 4 years ago by TORques (previous) (diff)

comment:9 in reply to:  8 ; Changed 4 years ago by yawning

This is rapidly getting to be tech support-ish, but I'll write one more reply.

Replying to TORques:

Thanks for your replay. My router/relay can handle a large number of simultaneous connections (tcpsys changed to 256 connections). I just built the latest tor from your github and I see the modified message related to "Failure from drain_fd"

I hope the 256 isn't the upper limit.

Speaking purely in terms of theory, relays that can't handle having 1 connection open to every single other relay in the Tor network simultaneously are bad for the network, since at some point a client will request a circuit that spans that pair of relays. If this connection fails due to resource limitations (router/os/whatever), that has a impact on the client's path selection and hurts anonymity.

In practice, the actual number is somewhat lower, but I do not have a good estimate for it, beyond "I seem to recall hearing a number that was much higher than 256 for a busy relay" (See #16997 for "we should collect stats on this number").

Edit: Asked someone randomly on IRC, got ~3800 open connections.

Sep 07 12:34:25.000 [notice] Tor 0.2.7.2-alpha-dev opening new log file.
Sep 07 12:34:25.781 [notice] Tor v0.2.7.2-alpha-dev running on Windows XP with Libevent 2.0.22-stable, OpenSSL 1.0.2d and Zlib 1.2.8.
Sep 07 12:34:27.000 [notice] This version of Tor (0.2.7.2-alpha-dev) is newer than any recommended version, according to the directory authorities. Recommended versions are: 0.2.4.23,0.2.4.24,0.2.4.25,0.2.4.26,0.2.4.27,0.2.5.8-rc,0.2.5.9-rc,0.2.5.10,0.2.5.11,0.2.5.12,0.2.6.5-rc,0.2.6.6,0.2.6.7,0.2.6.8,0.2.6.9,0.2.6.10,0.2.7.1-alpha,0.2.7.2-alpha
Sep 07 12:34:42.000 [notice] Now checking whether ORPort 109.103.57.102:9001 and DirPort 109.103.57.102:9030 are reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Sep 07 12:34:43.000 [notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.

Sep 07 12:34:43.000 [warn] Failure from drain_fd: No error

Sep 07 12:34:43.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
Sep 07 12:34:47.000 [notice] Performing bandwidth self-test...done.

So the actual bug is still present? nickm? Is this still something that should just be dup-ed?

Also, I used bench.exe to see the tor performances on my PC but I have no idea if those values are good or bad for a relay. Can you look into and write your opinion about?

*shrugs* Doesn't look like you have AES-NI, and the public key crypto operations are an order of magnitude slower than a mobile Haswell i5. People do run relays on slower things, so there's nothing wrong with the fact that all the crypto is slow, but I wouldn't expect amazing performance.

Last edited 4 years ago by yawning (previous) (diff)

comment:10 in reply to:  9 Changed 4 years ago by TORques

Replying to yawning:

I hope the 256 isn't the upper limit.
Edit: Asked someone randomly on IRC, got ~3800 open connections.>

TcpNumConnections=16777214
half-open connections=500 (tcpsys)

I think is good enough for my ancient PC .

comment:11 Changed 4 years ago by arma

Summary: "[warn] Failure from drain_fd" in Tor v0.2.7.2-alpha (git-36c0ae6f7834af16) running on Windows XP"Failure from drain_fd: no error" in Tor v0.2.7.2-alpha-dev running on Windows XP

comment:12 Changed 4 years ago by nickm

Milestone: Tor: 0.2.7.x-finalTor: 0.2.8.x-final

Moving these tickets into 0.2.8. Not expecting to take patches for any into 0.2.7 at this late date. As usual, please say something if you disagree! :)

comment:13 Changed 4 years ago by nickm

Status: newneeds_information

comment:14 Changed 4 years ago by nickm

Milestone: Tor: 0.2.8.x-finalTor: 0.2.???

Move a few tickets out of 0.2.8. I would take a good patch for most of these if somebody writes one. (If you do, please make the ticket needs_review and move it back into maint-0.2.8 milestone. :) )

comment:15 Changed 3 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:16 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:17 Changed 3 years ago by nickm

Resolution: duplicate
Severity: Normal
Status: needs_informationclosed

Closing as duplicate of #21540.

Note: See TracTickets for help on using tickets.