Why does Tor need to bootstrap?

There are thousands of relays in the Tor network, so it is not reasonable to ship information about every one of them as part of the Tor software. Directory authorities (and relays that act as directory caches) store information about each relay, generally in the form of self-reported descriptors, and a client needs some of this information to connect to the Tor network. The directory authorities also vote to produce a network status consensus document, which contains trusted, observed state information about each relay. Clients use the consensus to decide which relays to use.

The initial bootstrap information consists of the directory authorities and the fallback directory mirrors (FallbackDirectoryMirrors), both of which are hardcoded in the Tor software.

Obtaining directory information

During bootstrap, a client connects to a directory server to obtain a fresh copy of the consensus. Since Tor release 0.2.8, all clients make encrypted directory connections to Tor relays.

After obtaining a copy of the consensus, the client starts to download descriptors for relays. To connect, the client needs information from the descriptors that is not available in the consensus itself:

  • Supported protocols and relay families are used to choose relays.
  • Exit port lists are used to choose exits.
  • IPv6 ORPorts are used by IPv6 clients to choose IPv6 Guards.
  • Onion keys are used to build circuits.
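
As a rough illustration of how these descriptor fields feed into relay selection, the following Python sketch filters a list of relays by onion key, exit ports, and IPv6 ORPort. The RelayInfo class, its field names, and the helper functions are hypothetical simplifications made up for this example. They are not Tor's actual data structures or selection algorithm, which also weighs bandwidth, flags, protocols, and families.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class RelayInfo:
    """Hypothetical, simplified view of descriptor fields a client uses."""
    nickname: str
    onion_key: Optional[str] = None      # needed to build circuits
    exit_ports: Set[int] = field(default_factory=set)  # ports the exit accepts
    ipv6_orport: Optional[str] = None    # e.g. "[2001:db8::1]:9001"


def usable_for_circuits(relay: RelayInfo) -> bool:
    # Without an onion key the client cannot extend a circuit to the relay.
    return relay.onion_key is not None


def candidate_exits(relays: List[RelayInfo], port: int) -> List[RelayInfo]:
    # Keep relays whose (simplified) exit port list accepts the target port.
    return [r for r in relays if usable_for_circuits(r) and port in r.exit_ports]


def candidate_ipv6_guards(relays: List[RelayInfo]) -> List[RelayInfo]:
    # An IPv6 client can only pick guards that advertise an IPv6 ORPort.
    return [r for r in relays
            if usable_for_circuits(r) and r.ipv6_orport is not None]
```

A relay whose descriptor has not been downloaded yet would have no onion key in this model, so it is skipped by both helpers. This mirrors why the client has to fetch descriptors before it can build circuits.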

Connecting to guards

Guards help prevent some traffic analysis attacks. A client must connect to a guard before it can build a circuit for use with actual application traffic.

Building circuits

After a guard connection is established, a client extends circuits through the Tor network, with some of them ending at exit relays.

Bootstrap progress reporting

Currently Tor reports bootstrap progress to its log files and via the control protocol. Progress is generally assumed not to go backwards, but it can under some circumstances (e.g., when consensus information expires).
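
A consumer of these reports should therefore not assume that progress only increases. The Python sketch below parses bootstrap status lines and flags a regression. It assumes lines of the form BOOTSTRAP PROGRESS=<percent> TAG=<tag> SUMMARY="<text>", as found in Tor's log output and control-port status events. The function and class names are hypothetical and exist only for this example.

```python
import re

# Matches the PROGRESS, TAG and SUMMARY fields of a bootstrap status line.
_BOOTSTRAP_RE = re.compile(
    r'BOOTSTRAP PROGRESS=(\d+) TAG=(\S+) SUMMARY="([^"]*)"'
)


def parse_bootstrap_status(line):
    """Return (progress, tag, summary), or None if the line does not match."""
    match = _BOOTSTRAP_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), match.group(3)


class ProgressTracker:
    """Remembers the last reported percentage and classifies each update."""

    def __init__(self):
        self.last = None

    def update(self, line):
        """Return 'forward', 'same', 'backward', or None for unrelated lines."""
        parsed = parse_bootstrap_status(line)
        if parsed is None:
            return None
        progress = parsed[0]
        previous, self.last = self.last, progress
        if previous is None or progress > previous:
            return "forward"
        if progress == previous:
            return "same"
        return "backward"
```

A user interface built on such a tracker could, for instance, reset its progress bar when update() returns "backward" rather than treating the lower value as an error.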

Last modified on May 24, 2017, 1:33:36 AM