Opened 4 years ago

Closed 3 years ago

#19769 closed defect (fixed)

Round down DNS TTL to the nearest DEFAULT_DNS_TTL (30 minutes)

Reported by: teor
Owned by: nickm
Priority: Very High
Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: 029-proposed, dns, 029-backport
Cc: phw, pulls, nicoo
Actual Points: .2
Parent ID:
Points: 0.5
Reviewer:
Sponsor:


In #19025, we fix a bug that prevented exits sending DNS TTLs to clients for IPv4 and IPv6 addresses.

But we don't want to have too many potential values for these TTLs, to avoid tagging attacks.

So I propose

  • Exits round down (truncate) the TTL received from the DNS server, and
  • Clients round down the TTL received from the Exit,

to the nearest of:

  • MIN_DNS_TTL (1 minute), or
  • a multiple of DEFAULT_DNS_TTL (30, 60, 90, 120, 150, or 180 minutes)

MAX_DNS_TTL is 3 hours, so there are only 7 possible values for the TTL.
I chose to round down because that way, Tor DNS TTLs are only ever shorter than the lifetime specified by the DNS server.
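
The rounding proposed above can be sketched roughly as follows. This is only an illustration, not the patch that was merged: the function name round_down_dns_ttl is hypothetical, the constants are restated in seconds from the values in this description, and raising TTLs below one minute up to MIN_DNS_TTL is an assumption (the description only says values are rounded down).

```c
#include <stdint.h>

/* Values restated from the ticket description, in seconds. */
#define MIN_DNS_TTL     (60)        /* 1 minute */
#define DEFAULT_DNS_TTL (30*60)     /* 30 minutes */
#define MAX_DNS_TTL     (3*60*60)   /* 3 hours */

/* Hypothetical helper: map a TTL onto one of the 7 allowed values
 * (1 minute, or 30/60/90/120/150/180 minutes). */
static uint32_t
round_down_dns_ttl(uint32_t ttl)
{
  /* Anything under 30 minutes collapses to MIN_DNS_TTL. Assumption:
   * TTLs under 1 minute are raised to MIN_DNS_TTL rather than kept. */
  if (ttl < DEFAULT_DNS_TTL)
    return MIN_DNS_TTL;
  /* Anything at or above 3 hours is capped. */
  if (ttl >= MAX_DNS_TTL)
    return MAX_DNS_TTL;
  /* Otherwise truncate to a multiple of 30 minutes. */
  return ttl - (ttl % DEFAULT_DNS_TTL);
}
```

Under these assumptions, a 45-minute TTL becomes 30 minutes and a 29-minute TTL becomes 1 minute, so the result only exceeds the DNS server's TTL when that TTL is under one minute.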

I don't think we need to add noise to the TTL received from either the DNS server or Exit. I can't see the value in randomising it, and allowing randomisation could hide a tagging attack.

Change History (27)

comment:1 Changed 4 years ago by teor

This would also require a change to torspec to describe the TTL rounding at:

comment:2 Changed 4 years ago by teor

Keywords: dns TorCoreTeam201607 added

comment:3 Changed 4 years ago by nickm

So, clients don't do DNS caching by default any more, because of risks like this. Do you think it might make more sense to simply remove client-side DNS caching entirely?

comment:4 Changed 4 years ago by pulls

We have ongoing research on DNS-based traffic correlation attacks ( that relates to this. While fixing #19025 will help in mitigating attacks to an extent, the most important change to consider related to DNS is to also significantly increase MIN_DNS_TTL. This is because useful domains for our attacks today have low TTLs: about 50% of Alexa top 1M have a useful domain with TTL <= 60 seconds, and 75% a TTL <= 30 min. Do you think it would be practical to have MIN_DNS_TTL set to, say, 30 min? Would too much break?

If I understand the proposal here in #19769 correctly (rounding TTLs in [0s,30m) to MIN_DNS_TTL, also for exits?), then this will actually benefit an attacker who can observe both entry traffic and DNS requests for about 25% of Alexa top 1M (but for the remaining 25% it's an improvement, together with #19025, over the status quo).

Sorry if this is the wrong place for this, especially since we don't have a paper to share yet.

comment:5 Changed 4 years ago by phw

Cc: phw added

comment:6 Changed 4 years ago by pulls

Cc: pulls added

comment:7 Changed 4 years ago by nickm

Keywords: TorCoreTeam201608 added; TorCoreTeam201607 removed

No further code or documentation will be written in July, due to time itself. (Leaving needs_revision tickets as-is)

comment:8 Changed 4 years ago by nickm

Keywords: TorCoreTeam201609 added; TorCoreTeam201608 removed

Move unassigned items in August to September.

comment:9 Changed 4 years ago by teor

Status: new → needs_information

It would be nice to fix this, but we need to decide what to do first.

comment:10 Changed 4 years ago by nickm

Milestone: Tor: 0.2.??? → Tor: 0.3.0.x-final

comment:12 Changed 4 years ago by nickm

Priority: Medium → Very High

comment:13 Changed 4 years ago by pulls


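#include <stdint.h>

#define MIN_DNS_TTL (5*60)   /* 5 minutes, in seconds */
#define MAX_DNS_TTL (60*60)  /* 60 minutes, in seconds */

/* Clip ttl to one of exactly two values; nothing in between is honored. */
uint32_t
dns_clip_ttl(uint32_t ttl)
{
  if (ttl < MIN_DNS_TTL)
    return MIN_DNS_TTL;
  return MAX_DNS_TTL;
}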
  • Fix #19025 (otherwise ttl above will always be MIN_DNS_TTL).
  • Potentially refactor the DNS caching code to support evictions (while doing this, maybe rip out all old client-side DNS caching code?).
  • Add some form of logging to track cache size, usage, and eviction rate.
  • dns_get_expiry_ttl should be the same as dns_clip_ttl above. Please note that we are not sure these are the only relevant functions.


  • For popular websites, caching at exits is highly likely, and DefecTor attacks are the same as WF attacks.
  • For unpopular websites, caching and TTLs are moot, since the probability of a DNS record being cached is negligible. Caching these records is just an extra burden on the exit and, in a sense, also a risk, because it leaks recent activity at the exit on compromise. DefecTor attacks will be more precise than WF attacks here, and Tor needs WF defenses to mitigate them (another long-term topic).
  • For long TTLs, for what it is worth, we know from #19025 that the real-world impact of ignoring these long TTLs is not a serious issue.
  • For short TTLs, the impact of increasing it is our primary worry since we might break something.
  • We want to prevent fine-grained TTLs to protect against tagging attacks.
  • We do not want TTLs to be too high, so that DNS cache poisoning has a chance to resolve itself auto-magically.
  • The total size of the cache might be a vector for DoS.
  • The client-side DNS cache remains off.

Goals for DefecTor mitigation

  • (Read about DefecTor attacks here:
  • Allow long TTLs to be long(er).
  • For short TTLs, go as high as we are comfortable with, without significantly risking breaking things.


Stage the changes: start by repairing the TTL bug #19025 and change the clipping to 5*60 seconds for MIN_DNS_TTL and 60*60 seconds for MAX_DNS_TTL, honoring no intermediate values (see code above). Wait for feedback on unexpected breakage. If everything is manageable, increase MIN_DNS_TTL to 60*60 in a future patch, effectively always caching for 60 minutes. If DefecTor attacks become a real concern in the short term, encourage concerned site owners to consider longer TTLs so that they hit the MAX_DNS_TTL value. Limit the cache size, and when the cache is full, evict entries uniformly at random. We evict at random to give an attacker less control, since the attacker can presumably cause evictions at will (LIFO is easy for an attacker to manipulate).
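
A size-limited cache with uniformly random eviction, as suggested above, might look roughly like the sketch below. Everything in it is hypothetical: the types, the tiny capacity, and the function name are made up for illustration, and rand() stands in for the cryptographically strong random source that real code would need so that an attacker cannot predict which entry is evicted.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_CAPACITY 4   /* hypothetical; tiny for illustration */
#define NAME_LEN 64

typedef struct {
  char name[NAME_LEN];     /* hostname that was resolved */
  uint32_t expires;        /* absolute expiry time, in seconds */
} cache_entry_t;

typedef struct {
  cache_entry_t entries[CACHE_CAPACITY];
  size_t n_used;
  unsigned long n_evictions;  /* counter that could feed log output */
} dns_cache_t;

/* Store an entry and return the slot it landed in. When the cache is
 * full, overwrite a randomly chosen slot instead of the oldest or
 * newest one, so insertion order gives no control over eviction. */
static size_t
cache_insert(dns_cache_t *cache, const char *name, uint32_t expires)
{
  size_t slot;
  if (cache->n_used < CACHE_CAPACITY) {
    slot = cache->n_used++;
  } else {
    /* rand() is a placeholder; modulo bias is ignored in this sketch. */
    slot = (size_t)rand() % CACHE_CAPACITY;
    cache->n_evictions++;
  }
  strncpy(cache->entries[slot].name, name, NAME_LEN - 1);
  cache->entries[slot].name[NAME_LEN - 1] = '\0';
  cache->entries[slot].expires = expires;
  return slot;
}
```

The n_evictions counter, together with n_used, is the kind of figure that could be reported periodically to give exit operators a view of cache size and eviction rate.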

Getting feedback on real-world cache size, usage, and eviction rate from exit operators would be useful, so perhaps some form of log output is reasonable?

Last edited 4 years ago by pulls

comment:14 Changed 4 years ago by nicoo

Cc: nicoo added

Regarding MIN_DNS_TTL, Microsoft Windows used to have (still does?) a minimum TTL of 15 minutes for its client-side cache, IIRC.
Given how prevalent that platform is, I guess a 1 minute MIN_DNS_TTL is very unlikely to break things.

Are there plans to allow more values than {MIN,MAX}_DNS_TTL?

comment:15 Changed 4 years ago by nicoo

Since pulls asked for feedback from exit operators, here is some based on my experience with Nos oignons.

Our configuration is publicly documented, but in French, so here is a summary:

  • We use Unbound as a local, DNSSEC-validating resolver on the exit nodes.
    • It obviously only listens locally.
    • We use its private-address feature to prevent RFC1918 addresses from figuring in results, to mitigate DNS rebinding attacks.
    • We use hide-{identity,version}, mostly out of general principle: anybody reading our documentation would learn that we run Unbound; however, it's unclear to me whether those could be exploited to tie users to specific exits being used for DNS resolution (and if that's relevant).
    • We use harden-short-bufsize and harden-large-queries to make Unbound return SERVFAIL on edge cases that can be exploited for DoSing the resolver.
    • We forward queries for nos-oignons.{net,org,fr} directly to our authoritative resolver. This is not especially relevant for the exit, but error-log mails and so on will break if the domain fails to resolve.
  • /etc/resolv.conf always specifies search (does little-t tor honor that? that could be awkward) and as the first nameserver. If a fallback resolver is specified, it is either operated by the network hosting the exit node or by a close-by (network-wise) organization we have friendly ties to (typically, a non-profit, associative ISP).

While writing this, I'm realising it might be useful to have “DNS resolution best-practices” for exit operators, since this is mostly something ad hoc that we came up with based on what our sysadmins were doing in other places, not something we systematically researched.

comment:16 Changed 4 years ago by nickm

Actual Points: .2
Status: needs_information → needs_review

My branch bug19769_029 implements this.

phw, pulls: Is this about what you had in mind? I took some liberties and might have messed things up.

comment:17 Changed 4 years ago by nickm

When we merge this, we should also merge the patch from #19025.

comment:18 Changed 4 years ago by nickm

Keywords: 029-backport added

This is potentially backportable to 0.2.9.

Potential enhancement: these values could be consensus parameters rather than hardcoded #defines.
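
As a rough illustration of that enhancement, the sketch below looks a value up in a list of named parameters, falls back to a compiled-in default when the parameter is absent, and clamps the result to a sane range. The param_t type, the get_param function, and the parameter names "MinDNSTTL" and "MaxDNSTTL" are all hypothetical; this is not Tor's consensus-parameter API.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a parsed list of consensus parameters. */
typedef struct {
  const char *name;
  int32_t value;
} param_t;

/* Return the value of `name`, or default_val if it is not listed,
 * clamped to [min_val, max_val] so that a bad or hostile value cannot
 * push the TTL limits outside a safe range. */
static int32_t
get_param(const param_t *params, size_t n_params, const char *name,
          int32_t default_val, int32_t min_val, int32_t max_val)
{
  int32_t value = default_val;
  for (size_t i = 0; i < n_params; ++i) {
    if (strcmp(params[i].name, name) == 0) {
      value = params[i].value;
      break;
    }
  }
  if (value < min_val)
    value = min_val;
  if (value > max_val)
    value = max_val;
  return value;
}
```

With something like this, the hardcoded defaults would remain in the binary, and the limits could be adjusted network-wide without shipping a new release.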

comment:19 Changed 4 years ago by nickm

My bug19769_029 branch has been updated, based on feedback and corrections from pulls. Please review?

comment:20 Changed 4 years ago by nickm

Owner: set to nickm
Status: needs_review → accepted

setting owner

comment:21 Changed 4 years ago by nickm

Status: accepted → needs_review

comment:22 Changed 4 years ago by pulls

Looks good to me, sorry for the delay.

comment:23 Changed 4 years ago by nickm

Status: needs_review → merge_ready

comment:24 Changed 4 years ago by nickm

Keywords: review-group-15 added

comment:25 Changed 4 years ago by nickm

Keywords: review-group-15 removed
Milestone: Tor: 0.3.0.x-final → Tor: 0.2.9.x-final

Squashed, with phw's fix for #19025, as bug19769_19025_029.

Merged bug19769_19025_029 to master; possible 0.2.9 backport.

comment:26 Changed 4 years ago by nickm

Keywords: TorCoreTeam201609 removed

comment:27 Changed 3 years ago by nickm

Resolution: fixed
Status: merge_ready → closed

This has baked long enough without problems; backporting to 0.2.9
