Opened 2 years ago

Last modified 3 weeks ago

#18496 needs_information enhancement

stateful firewalling on relays

Reported by: thomas Owned by: Nusenu
Priority: Medium Milestone:
Component: Community/Relays Version:
Severity: Normal Keywords: tor-doc needs-analysis turn-into-a-wiki-page
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Let me suggest a documentation enhancement to raise awareness about operating Tor relays behind stateful firewalls/packet filters.

Recently, I enabled a stateful packet filter on my relays. Afterwards, consensus weight, guard probability and observed bandwidth started to decrease. Additionally, the average number of open connections decreased. After some debugging I found that the stateful tracking algorithm in the firewall had started to drop packets due to TCP sequence number mismatches. I saw approx. 3 - 7 mismatches per second, with relays serving an average of 8 MByte/s and an average of 13k open connections. When I reconfigured the ruleset to not perform sequence number verification, the drops no longer occurred, and consensus weight and observed bandwidth are now slowly increasing again.

Of course, this can’t be fixed in Tor, as it is a fault in the TCP/IP stacks on the remote end, or might be caused by NAT operations in residential CPEs. It may also happen because residential equipment is usually not designed to carry long-lasting TCP sessions and might have issues with sequence number wraps. Nevertheless, the documentation should cover this and raise awareness:

1.) If possible, TCP sequence number verification should be disabled in the firewall, as a Tor relay must expect to receive packets that might not pass such verification.
2.) Even if the sequence number verification recovers after the remote end retransmits the packets, this might still trigger congestion avoidance in TCP/IP stacks - resulting in performance degradation for the end-user and resource starvation on the relay side.
3.) Firewalls should be configured not to drop invalid packets silently, but to send an RST instead. Even if stateful filtering fails in some way, this prevents TCP sessions from stalling and connections established by end-users from running into timeouts.
4.) With the increasing shortage of IPv4 addresses, many ISPs will start placing their customers behind CGNAT or NAT444. This might cause more issues on Tor nodes behind firewalls with security-cautious stateful filtering.
5.) Some areas of the world might already place their users behind gateways that perform unexpected TCP header modifications during NAT. This might also cause security-cautious stateful firewalling on a relay to fail.

Another issue might be the following: Some stateful firewalling implementations don’t allow either side of the connection to increase/decrease their MSS after the TCP connection is established. Maybe some low-end and residential devices do this, which also results in packets being dropped.

From my point of view, the overall risk is as follows: Due to the shortage of IPv4 addresses, more and more ISPs put their customers behind NAT. Because such residential implementations often don’t implement the RFCs properly, stateful firewalling on the relays results in increasing instability in the Tor network.

It might also be a good idea to have Tor generate a warning message if it sees repeated bursts of TCP connection timeouts.

This happened with Tor 0.2.7.6 on FreeBSD 10.1, using PF as the firewall. The relays have public IP addresses, so no NAT is performed by the packet filter. TCP sequence number verification can be disabled with the “sloppy” option on the relevant rules. Most likely, this can also happen with other firewalls, such as iptables or commercial products (but I haven’t verified this).
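
For reference, a minimal pf.conf sketch of the “sloppy” workaround described above. The interface name em0 and the ORPort 9001 are placeholders, not values from this report; adapt them to the relay’s actual setup.

```
# pf.conf fragment (sketch). em0 and 9001 are assumed placeholders.
ext_if  = "em0"
or_port = "9001"

# Accept inbound relay connections, but track state with the sloppy
# TCP tracker, which does not check sequence numbers.
pass in on $ext_if proto tcp from any to ($ext_if) port $or_port \
    flags S/SA keep state (sloppy)
```

According to pf.conf(5), the sloppy tracker skips sequence number checks entirely, which makes insertion and ICMP teardown attacks easier, and it cannot be combined with modulate state or synproxy state. The state-mismatch counter in the output of pfctl -s info should give an indication of whether the filter is dropping packets for this reason.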

Change History (5)

comment:1 Changed 2 years ago by nickm

Milestone: Tor: unspecified

comment:2 Changed 13 months ago by nickm

Keywords: tor-doc needs-analysis turn-into-a-wiki-page added

comment:3 Changed 3 months ago by teor

Component: Core Tor/Tor → Community/Relays
Milestone: Tor: unspecified
Owner: set to Nusenu
Version: Tor: 0.2.7.6

comment:4 Changed 3 months ago by cypherpunks

  • is this problem still observable?
  • does this also happen on datacenter relays? (not connected via end user CPEs)
  • if TCP sequence numbers are wrong, will these packets even reach the tor daemon (even with no firewall)?

comment:5 Changed 3 weeks ago by cypherpunks

Status: new → needs_information