Tor Relay Security and Best Practices
This document aims to motivate and describe some best practices for Tor Relay security, grounded in a realistic threat model.
This document is written only from the point of view of protecting your Tor Relay from that threat model. It assumes that Tor is the most important process on the machine, and that protecting other things from Tor is therefore out of scope. For general information on securing your machine against attacks through the Tor daemon itself, see the Operational Security page.
For information on running an exit relay, see Tips for Running an Exit Node with Minimal Harassment as well as the set of Tor Abuse Templates.
As of Tor 0.2.7, relays use Ed25519 identities, with an optional feature to generate the Ed25519 master identity secret key and keep it offline. The relay then uses a temporary signing key with a limited lifetime, which must be renewed periodically. This ensures the relay keeps the same identity regardless of what happens to the temporary signing keys. Read this guide to learn more about offline relay identity keys.
Adversary Goals and Threat Model
There is a significant difference between adversaries that can see inside of router-to-router TLS vs those that cannot. I believe this capability distinction governs the adversary goals in terms of compromising relays as opposed to merely externally observing them.
Adversaries that can unwrap router TLS can perform every attack that an actual node can perform, at any location between the user and the node, and/or between the node and other nodes.
In particular, adversaries that can see inside router TLS can perform tagging attacks as well as perform circuit-specific active and passive timing analysis.
These attacks can be quite severe. An adversary that is able to obtain Guard identity keys is free to perform a tagging attack anywhere on the Internet. In other words, an adversary interested in monitoring a particular user need only obtain the identity keys for that user's 3 guard nodes. From that point on, the adversary can transparently monitor everything that user does, using tagging to bias the user's paths so that they connect only to surveilled exit nodes whose identity keys have also been compromised.
Attack Vectors
There are two high-level vectors towards seeing inside node-to-node TLS (which uses ephemeral keys that are rotated daily and authenticated via the node's identity key). Both high-level vectors therefore revolve around node identity key theft.
Attack Vector #1: One-Time Key Theft
The one-time adversary is interested in performing a single grab of keys and then operating transparently upstream afterwards. This adversary will take the form of a coercive request at a datacenter/ISP to extract node identity key material and, from then on, operate externally as a transparent upstream MITM, creating fake ephemeral TLS keys authenticated with the stolen identity key. Tor nodes that encounter this adversary will likely see it in the form of unexplained reboots or mysterious downtime, which are inevitable in the lifespan of any Tor node.
Attack Vector #2: Persistent Key Theft
If one-time methods fail or are beyond reach, the adversary has to resort to persistent machine compromise to retain access to node key material.
The Persistent attacker can use the same vector as #1 or perhaps an external vector such as daemon compromise, but they must then also plant a backdoor that would do something like trawl through the RAM of a machine, sniff out the keys (perhaps even grabbing the ephemeral TLS keys directly), and transmit them offsite for collection.
This is a significantly more expensive position for the adversary to maintain, because the backdoor can be noticed during a thorough forensic investigation of a perhaps unrelated incident, and it may inadvertently trigger firewall warnings or other common least-privilege defense alarms.
Unfortunately, it is also a more expensive attack to defend against, because it requires extensive auditing and assurance mechanisms on the part of the relay operator.
Defenses
The above indicates that, at minimum, relays should protect against one-time key compromise. Some further thought shows that it is possible to make the Persistent adversary's task harder as well, albeit with significantly more effort.
Let's deal with defending against each vector in turn.
Vector #1: Deploy Ephemeral Identity Keys
The simplest way to defend against the adversary who attempts to extract relay keys through a reboot is to take advantage of the fact that even node identity keys can be ephemeral, and do not need to persist long term (certainly not past a reboot). This can be achieved with a boot script that wipes your keys (they live in /var/lib/tor/keys) at startup, or by using a ramdisk.
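A boot-time wipe could be as small as the sketch below. It assumes the key directory mentioned above; the function name `wipe_tor_keys` and the `TOR_KEYS_DIR` variable are illustrative and not part of Tor. Note that plain file deletion does not scrub the underlying disk blocks, so a ramdisk is the stronger option where it is available.

```shell
#!/bin/sh
# Sketch: remove every relay key so Tor generates a fresh identity at startup.
# TOR_KEYS_DIR and wipe_tor_keys are illustrative names, not Tor features.
TOR_KEYS_DIR=${TOR_KEYS_DIR:-/var/lib/tor/keys}

wipe_tor_keys() {
    dir=$1
    # Refuse to act on an empty or missing path.
    [ -n "$dir" ] && [ -d "$dir" ] || return 1
    # Delete the directory's contents but keep the directory itself.
    find "$dir" -mindepth 1 -delete
}

# In a boot script, run this before the tor service starts:
#   wipe_tor_keys "$TOR_KEYS_DIR"
```

The call is left commented out so that pasting the sketch into a shell does not delete anything by accident.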
Periodically (on the order of every 12 to 18 months), you should completely wipe your node identity keys as a best practice and restart fresh, even in the absence of suspicious reboots. The cost to the network of tossing node keys away is small. It only takes 2 weeks for your node to regain the Guard flag, for example.
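One rough way to get a reminder that keys are due for rotation is to look at file modification times. The sketch below assumes that the key files' mtimes reflect when they were generated, which will not hold if the files were restored from a backup; `stale_keys`, `MAX_AGE_DAYS`, and the 365-day default are illustrative choices.

```shell
#!/bin/sh
# Sketch: list key files older than a rotation threshold.
# MAX_AGE_DAYS and stale_keys are illustrative; 365 days is an assumed threshold.
MAX_AGE_DAYS=${MAX_AGE_DAYS:-365}

stale_keys() {
    dir=$1
    days=${2:-$MAX_AGE_DAYS}
    # -mtime +N matches files last modified more than N days ago.
    find "$dir" -type f -mtime +"$days"
}

# Example: stale_keys /var/lib/tor/keys 365
```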
Additionally, ssh server key theft is another one-time vector that can be used to quickly bootstrap into node key theft: an adversary holding the stolen server key can impersonate your server and capture any password you type. For this reason, node admins should always use ssh key auth for tor node administration accounts, since it prevents ssh server key theft from allowing continuous server compromise.
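With OpenSSH, a common way to enforce key-only logins is to set `PasswordAuthentication no` in `sshd_config`. The sketch below is a simple check for that setting. The helper name `sshd_passwords_disabled` is illustrative, and the check is deliberately naive: it ignores `Include` files, `Match` blocks, and other password-like methods such as keyboard-interactive.

```shell
#!/bin/sh
# Sketch: succeed only if the given sshd_config explicitly disables password logins.
# sshd_passwords_disabled is an illustrative helper, not an OpenSSH tool.
sshd_passwords_disabled() {
    conf=${1:-/etc/ssh/sshd_config}
    # sshd uses the first value it sees for a keyword; commented-out lines do not match.
    val=$(awk 'tolower($1) == "passwordauthentication" { print tolower($2); exit }' "$conf")
    [ "$val" = "no" ]
}

# Example: sshd_passwords_disabled /etc/ssh/sshd_config || echo "password logins may be enabled"
```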
Vector #1: Offline Master Keys
Even stronger than ephemeral identity keys are offline master keys that are never exposed to the relay at all. This makes regular reputation resets (due to key resets) unnecessary and protects the identity key better than deleting keys on reboot or every 12 to 18 months.
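As a rough illustration of the renewal workflow, the sketch below wraps `tor --keygen`, run on a separate offline machine, together with the `SigningKeyLifetime` option. The directory, the 30-day lifetime, and the `renew_signing_key` helper are assumptions for the example. The relay itself would set `OfflineMasterKey 1` in its torrc. With `DRY_RUN=1` (the default here) the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: renew the short-lived signing key on an offline machine.
# OFFLINE_DIR, LIFETIME and renew_signing_key are illustrative; DRY_RUN=1 only prints commands.
DRY_RUN=${DRY_RUN:-1}
OFFLINE_DIR=${OFFLINE_DIR:-$HOME/tor-offline-keys}
LIFETIME=${LIFETIME:-30 days}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

renew_signing_key() {
    # Creates the master key on first use; afterwards only a new signing key and cert.
    run tor --DataDirectory "$OFFLINE_DIR" --SigningKeyLifetime "$LIFETIME" --keygen || return 1
    # Only these two files need to reach the relay's keys directory.
    echo "copy to relay: $OFFLINE_DIR/keys/ed25519_signing_secret_key"
    echo "copy to relay: $OFFLINE_DIR/keys/ed25519_signing_cert"
}

renew_signing_key
```

The master secret key never leaves the offline directory; only the signing key and its certificate are transferred, and the run is repeated before the lifetime expires.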
Vector #2: Isolation Hardening and Readonly Runtime
Once one-time key theft has been dealt with, you can begin to consider how to deal with the Persistent threat.
The effort required to defend against this adversary is considerable, and it is not expected that all operators will devote the effort to do so.
To limit scope, we are not going to deal with the daemon compromise vector; for that see your Operating System's least-privilege mechanisms (such as SELinux, AppArmor, Grsec RBAC, Seatbelt, etc). Instead, we will deal with how you can attempt to protect your identity keys once an adversary already has root access.
Disabling the ptrace syscall
If you are serious about defending against this adversary, the first thing you will want to do is disable access to the 'ptrace' system call from userland, which allows easy Tor key theft using debugging tools such as gdb. Note that all currently deployed mechanisms to do this still allow root users to use ptrace on arbitrary processes. In order to disable ptrace for root users, you need to load a kernel module to delete the ptrace call from the syscall table.
Once access to the ptrace system call is removed, you need to disable module loading to prevent it from being restored. On Linux, this is accomplished via 'sysctl kernel.modules_disabled=1'. You should perform this operation as early in the boot process as possible. One technique that works on Red Hat-based systems is to place a shell script in /etc/rc.modules that loads the modules you need for operation, inserts the ptrace module, and then issues the sysctl to disable further module loading. Red Hat derivatives launch /etc/rc.modules first thing at the top of /etc/rc.sysinit.
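An /etc/rc.modules along these lines might look like the following sketch. The module list and the path of the ptrace-removal module are placeholders: no such module ships with the kernel, so you would have to supply and audit your own. With `DRY_RUN=1` (the default here) the script only prints the commands. Bear in mind that once `kernel.modules_disabled` is set to 1 it cannot be set back without a reboot.

```shell
#!/bin/sh
# Sketch of an /etc/rc.modules following the steps described above.
# NEEDED_MODULES and PTRACE_MODULE are placeholders; the ptrace-removal module
# is hypothetical and must be supplied by the operator.
DRY_RUN=${DRY_RUN:-1}
NEEDED_MODULES=${NEEDED_MODULES:-"loop dm_crypt"}
PTRACE_MODULE=${PTRACE_MODULE:-/lib/modules/extra/noptrace.ko}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

lock_down_modules() {
    # 1. Load everything the system will need later.
    for m in $NEEDED_MODULES; do
        run modprobe "$m" || return 1
    done
    # 2. Insert the module that removes ptrace from the syscall table.
    run insmod "$PTRACE_MODULE" || return 1
    # 3. Forbid any further module loading until the next reboot.
    run sysctl -w kernel.modules_disabled=1
}

lock_down_modules
```

The order matters: any module not loaded before the final sysctl stays unavailable until the next boot.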
Ensuring Runtime Integrity
After that comes ensuring runtime integrity. There are several ways to achieve this, but most are easily subverted by an attacker with direct access to the hardware. The most robust approach seems to be to create a small encrypted loopback filesystem that contains all of the libraries required to run the 'tor' process as well as all of the requisite configuration files. This wiki page has several scripts attached to aid in collecting these files.
The root filesystem itself doesn't need to be more than ~25 MB in size, but you will also need an auxiliary var loopback of a hundred megabytes or so (the example below allocates 200 MB). You should only have to authenticate and update the root filesystem, not the var filesystem, but both should be encrypted, since node keys are stored in var.
Here are the commands for creating the root loopback filesystem:
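# Create a 25 MB image filled with random data
dd if=/dev/urandom of=./tor-root.img bs=1k count=25k
# Attach it to a loop device and set up LUKS encryption
losetup /dev/loop1 ./tor-root.img
cryptsetup luksFormat /dev/loop1
cryptsetup luksOpen /dev/loop1 tor-root
# Create the filesystem on the decrypted mapping
mkfs.ext4 /dev/mapper/tor-root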
When you use this loopback, you will mount it readonly, mount the encrypted var filesystem inside of it, and mount a ramdisk for your keys inside of that. For now, you'll leave it readwrite for setup.
# Create and encrypt the 200 MB var image
dd if=/dev/urandom of=./tor-var.img bs=1k count=200k
losetup /dev/loop2 ./tor-var.img
cryptsetup luksFormat /dev/loop2
cryptsetup luksOpen /dev/loop2 tor-var
mkfs.ext4 /dev/mapper/tor-var
# Mount the root image read-write for setup, then var inside it
mkdir /mnt/tor-root
mount /dev/mapper/tor-root /mnt/tor-root
mkdir /mnt/tor-root/var
mount /dev/mapper/tor-var /mnt/tor-root/var
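Once setup is complete, the production layout described above (a readonly root, var inside it, and a ramdisk for keys inside that) could be mounted roughly as follows. This is a sketch under several assumptions: both LUKS volumes have already been opened, the mount points already exist inside the images, and the keys directory lives at var/lib/tor/keys. The 1 MB tmpfs size and the `mount_tor_runtime` helper are illustrative. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: production mount order after setup is finished.
# mount_tor_runtime and the tmpfs size are illustrative; DRY_RUN=1 only prints commands.
DRY_RUN=${DRY_RUN:-1}
ROOT=${ROOT:-/mnt/tor-root}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

mount_tor_runtime() {
    # Root image: read-only, so the runtime cannot be modified in place.
    run mount -o ro /dev/mapper/tor-root "$ROOT" || return 1
    # var image: writable, holds Tor's state.
    run mount /dev/mapper/tor-var "$ROOT/var" || return 1
    # Keys live in RAM only and disappear on reboot.
    run mount -t tmpfs -o size=1m,mode=0700 tmpfs "$ROOT/var/lib/tor/keys"
}

mount_tor_runtime
```

The tmpfs is owned by root after mounting, so its ownership would still need to be changed to whichever account runs tor.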
Volume Setup and Authentication
Once you've got your volumes set up, you can then run the scripts attached to this wiki page to copy your known-good Tor runtime into the volume.
XXX: demo script use
The attached scripts have been tested to work on RHEL/CentOS and Ubuntu systems, and may work on Fedora and Debian systems as well.
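The attached scripts are not reproduced here. As a rough illustration of the underlying idea only, the sketch below copies a binary plus the shared libraries that `ldd` reports into a target tree. The helper names are illustrative, and the approach has known gaps: `ldd` only sees link-time dependencies, so libraries loaded at runtime and the configuration files are not collected.

```shell
#!/bin/sh
# Sketch: copy a binary and the shared libraries ldd reports into a target tree.
# This is NOT one of the attached scripts; list_runtime_libs and copy_runtime are illustrative.
list_runtime_libs() {
    # Print absolute library paths from "name => /path (addr)" lines
    # and from the bare dynamic loader line, without duplicates.
    ldd "$1" 2>/dev/null | awk '
        $2 == "=>" && $3 ~ /^\// { print $3 }
        $1 ~ /^\// { print $1 }
    ' | sort -u
}

copy_runtime() {
    bin=$1
    dest=$2
    for f in "$bin" $(list_runtime_libs "$bin"); do
        mkdir -p "$dest$(dirname "$f")" || return 1
        # -L follows symlinks so the real file lands in the image.
        cp -L "$f" "$dest$f" || return 1
    done
}

# Example: copy_runtime /usr/bin/tor /mnt/tor-root
```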
Don't forget to periodically update the libraries stored on your loopback root using a trusted offsite source, as they won't receive security updates from your distribution. You want to avoid using static tor binaries, as they also suffer from the update problem, and additionally do not receive the benefit of per-library ASLR.
Identity Key Management
Once you start your tor process(es), you will want to copy your identity key offsite, and then remove it. Tor does not need it to remain on disk after startup, and removing it ensures that an attacker must deploy a kernel exploit to obtain it from memory. While you should not re-use the identity key after unexplained reboots, you may want to retain a copy for planned reboots and tor maintenance.
scp /mnt/tor-root/var/lib/tor/keys/secret_id_key offsite_backup:/mnt/usb/tor_key
rm /mnt/tor-root/var/lib/tor/keys/secret_id_key
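Run back to back, the two commands above delete the key even if the copy failed. A slightly more careful variant, sketched below, removes the key only after the copy command reports success. `backup_then_remove` and `COPY_CMD` are illustrative names; `COPY_CMD` defaults to `scp` as in the commands above.

```shell
#!/bin/sh
# Sketch: delete the on-disk identity key only after the copy command succeeds.
# backup_then_remove and COPY_CMD are illustrative names.
COPY_CMD=${COPY_CMD:-scp}

backup_then_remove() {
    key=$1
    dest=$2
    # Stop here, leaving the key in place, if the copy fails.
    "$COPY_CMD" "$key" "$dest" || return 1
    rm -f "$key"
}

# Example:
#   backup_then_remove /mnt/tor-root/var/lib/tor/keys/secret_id_key offsite_backup:/mnt/usb/tor_key
```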
Upon suspicious reboots, you can verify the integrity of your tor root image by calculating its sha1sum (perhaps copying the image offsite first) and comparing it against a value recorded when the image was known to be good. You do not need to do anything special with the var loopback.
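The integrity check can be sketched as a pair of helpers: one records a reference hash while the image is known to be good, the other compares against it later. The sketch uses `sha256sum` where the text mentions sha1sum; the workflow is the same. It assumes the image is never written to after setup, and the function names are illustrative. The reference file should be kept offsite, since an attacker who can alter the image can also alter a hash stored next to it.

```shell
#!/bin/sh
# Sketch: record a reference hash of the root image and check it later.
# Uses sha256sum in place of sha1sum; record_image_hash and verify_image_hash are illustrative.
record_image_hash() {
    # Writes "<hash>  <path>" to the reference file.
    sha256sum "$1" > "$2"
}

verify_image_hash() {
    # Returns 0 only if the image still matches the recorded hash.
    sha256sum --status -c "$1"
}

# Example:
#   record_image_hash /root/tor-root.img /root/tor-root.img.sha256
#   verify_image_hash /root/tor-root.img.sha256 || echo "image changed"
```

The reference file stores the image path as given, so use an absolute path or run both steps from the same directory.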
These steps should prevent even adversaries who compromise the root account on your system (by rebooting it, for example) from obtaining your identity keys directly, forcing them to resort to kernel exploits and memory gymnastics in order to do so.
Auditing the Kernel and Boot Scripts
After suspicious reboots, you should audit your initrd, kernel image, modules, and init scripts as best you can.
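One way to make such an audit tractable is to keep a hash manifest of the boot-critical files, taken while the system was known to be good, and diff against it afterwards. The sketch below assumes that approach. The helper names and the example paths are illustrative and vary by distribution. An audit run from the possibly compromised system itself can be lied to by that system, so running it from trusted rescue media is preferable.

```shell
#!/bin/sh
# Sketch: hash every file under the given paths and diff against a baseline.
# make_manifest, audit_against and the example paths are illustrative.
make_manifest() {
    # One "<hash>  <path>" line per file, sorted by path so two manifests diff cleanly.
    find "$@" -type f -exec sha256sum {} + | sort -k 2
}

audit_against() {
    baseline=$1
    shift
    # Prints added, removed or changed files; exit status 0 means no differences.
    make_manifest "$@" | diff "$baseline" -
}

# Example, with the baseline taken on a known-good system and stored offsite:
#   make_manifest /boot /lib/modules /etc/rc.d > baseline.txt
#   audit_against baseline.txt /boot /lib/modules /etc/rc.d
```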