MAX_DNS_TTL is 3 hours, so there are only 7 possible values for the TTL.
I chose to round down because that way, Tor DNS TTLs are only ever shorter than the lifetime specified by the DNS server.
I don't think we need to add noise to the TTL received from either the DNS server or Exit. I can't see the value in randomising it, and allowing randomisation could hide a tagging attack.
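To make the round-down idea concrete, here is a minimal Python sketch (not tor's C code). The 30-minute granularity is purely my assumption: it is one reading under which a 3 hour MAX_DNS_TTL gives exactly 7 possible values. The 60 second MIN_DNS_TTL and the name `round_ttl_down` are also assumptions for illustration.

```python
MAX_DNS_TTL = 3 * 60 * 60   # 3 hours, as stated above
MIN_DNS_TTL = 60            # assumed floor of 1 minute
GRANULARITY = 30 * 60       # hypothetical bucket width (30 minutes)


def round_ttl_down(ttl: int) -> int:
    """Round a TTL (seconds) down to a coarse bucket, clipped to [MIN, MAX]."""
    ttl = min(ttl, MAX_DNS_TTL)
    rounded = (ttl // GRANULARITY) * GRANULARITY
    # TTLs below the floor are the one case where the result can be
    # longer than what the DNS server specified.
    return max(rounded, MIN_DNS_TTL)


if __name__ == "__main__":
    for ttl in (10, 1799, 1800, 5000, 86400):
        print(ttl, "->", round_ttl_down(ttl))
```

With these assumed constants, every TTL at or above the floor maps to a value no longer than the one the server gave, and the only possible outputs are the floor plus the six 30-minute multiples up to 3 hours.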
So, clients don't do DNS caching by default any more, because of risks like this. Do you think it might make more sense to simply remove client-side DNS caching entirely?
We have ongoing research on DNS-based traffic correlation attacks (https://nymity.ch/dns-traffic-correlation/) that relates to this. While fixing #19025 (moved) will help in mitigating attacks to an extent, the most important change to consider related to DNS is to also significantly increase MIN_DNS_TTL. This is because useful domains for our attacks today have low TTLs: about 50% of Alexa top 1M have a useful domain with TTL <= 60 seconds, and 75% a TTL <= 30 min. Do you think it would be practical to have MIN_DNS_TTL set to, say, 30 min? Would too much break?
If I understand the proposal here in #19769 (moved), rounding TTLs between [0s,30m) to MIN_DNS_TTL also for exits (?), then this will actually benefit an attacker who can observe both entry traffic and DNS requests for about 25% of Alexa top 1M (but for the remaining 25% it's an improvement together with #19025 (moved) over the status quo).
Sorry if this is the wrong place for this, especially since we don't have a paper to share yet.
Fix #19025 (moved) (otherwise ttl above will always be MIN_DNS_TTL).
Potentially refactor the DNS caching code to support evictions (while doing this, maybe rip out all old client-side DNS caching code?).
Add some form of logging to track cache size, usage, and eviction rate.
dns_get_expiry_ttl should be the same as dns_clip_ttl above. Please note that we are not sure these are the only relevant functions.
Given
For popular websites, caching at exits is highly likely, and DefecTor attacks are the same as WF attacks.
For unpopular websites, caching and TTLs are moot, since the probability of a DNS record being cached is negligible. Caching these records is just an extra burden on the exit and, in a sense, also a risk, since it leaks recent activity at the exit on compromise. DefecTor attacks will be more precise than WF attacks here, and Tor needs WF defenses to mitigate them (another long-term topic).
For long TTLs, for what it is worth, we know from #19025 (moved) that the real-world impact of ignoring these long TTLs is not a serious issue.
For short TTLs, the impact of increasing it is our primary worry since we might break something.
We want to prevent fine-grained TTLs to protect against tagging attacks.
We do not want TTLs to be too high, so that DNS cache poisoning has a chance to resolve itself auto-magically.
The total size of the cache might be a vector for DoS.
For short TTLs, go as high as we are comfortable with, without significantly risking breaking things.
Proposal
Stage changes: start with repairing the TTL bug #19025 (moved) and change clipping to 5*60 seconds for MIN_DNS_TTL and 60*60 seconds for MAX_DNS_TTL, honoring no intermediate values (see code above). Wait for feedback on unexpected breaks. If all is manageable, increase MIN_DNS_TTL to 60*60 in a future patch, effectively always caching for 60 minutes.
If DefecTor attacks become a real concern in the short term, encourage concerned site owners to consider longer TTLs to hit the MAX_DNS_TTL value.
Make the cache size limited, and evict uniformly at random when it is full. We pick random eviction to give an attacker less control, since it can presumably cause evictions at will (LIFO is easy for an attacker to manipulate).
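As a rough illustration of the first stage, here is a minimal Python sketch of two-value clipping (not tor's C code). It reads the proposed values as 5*60 and 60*60 seconds, in line with the later "60*60 ... 60 minutes". The cutoff between the two outputs is my assumption, since the snippet referred to as "code above" is not reproduced here; the name `clip_dns_ttl` is likewise hypothetical.

```python
MIN_DNS_TTL = 5 * 60    # 5 minutes
MAX_DNS_TTL = 60 * 60   # 60 minutes


def clip_dns_ttl(ttl: int) -> int:
    """Map any TTL (seconds) to exactly one of two values.

    Assumed cutoff: anything below MIN_DNS_TTL becomes MIN_DNS_TTL,
    everything else becomes MAX_DNS_TTL. No intermediate values survive,
    so the TTL cannot carry a fine-grained tag.
    """
    if ttl < MIN_DNS_TTL:
        return MIN_DNS_TTL
    return MAX_DNS_TTL


if __name__ == "__main__":
    for ttl in (0, 30, 299, 300, 1800, 86400):
        print(ttl, "->", clip_dns_ttl(ttl))
```

Raising MIN_DNS_TTL to 60*60 in the later stage would make both branches return the same value, which is the "always caching for 60 minutes" case.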
Getting feedback on real-world cache size, usage, and eviction rate from exit operators would be useful, so perhaps some form of log output is reasonable?
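To show what a size-limited cache with uniformly random eviction and basic counters could look like, here is a minimal Python sketch. It is not based on tor's existing cache code. The class name, method names, and counters are all hypothetical, and a real implementation would log the counters periodically rather than return them.

```python
import random


class RandomEvictionCache:
    """Bounded DNS cache that evicts a uniformly random entry when full."""

    def __init__(self, capacity, rng=None):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # SystemRandom draws from the OS CSPRNG, so an attacker cannot
        # predict which entry is evicted. Tests may inject a seeded RNG.
        self._rng = rng if rng is not None else random.SystemRandom()
        self._entries = {}   # name -> (address, expires_at)
        self._keys = []      # names, kept in a list for O(1) sampling
        self._index = {}     # name -> position in self._keys
        self.hits = self.misses = self.evictions = 0

    def _remove(self, name):
        # Swap-remove: move the last key into the freed slot.
        pos = self._index.pop(name)
        last = self._keys.pop()
        if last != name:
            self._keys[pos] = last
            self._index[last] = pos
        del self._entries[name]

    def put(self, name, address, expires_at):
        if name in self._entries:
            self._entries[name] = (address, expires_at)
            return
        if len(self._keys) >= self.capacity:
            victim = self._keys[self._rng.randrange(len(self._keys))]
            self._remove(victim)
            self.evictions += 1
        self._index[name] = len(self._keys)
        self._keys.append(name)
        self._entries[name] = (address, expires_at)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None or entry[1] <= now:
            if entry is not None:
                self._remove(name)   # expired entry
            self.misses += 1
            return None
        self.hits += 1
        return entry[0]

    def stats(self):
        return {"size": len(self._keys), "hits": self.hits,
                "misses": self.misses, "evictions": self.evictions}
```

Size, hit/miss counts, and eviction count are the figures I would expect exit operators to find most useful to report back.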
Regarding MIN_DNS_TTL, Microsoft Windows used to (still does?) have a minimum TTL of 15 minutes for its client-side cache, IIRC.
Given how prevalent that platform is, I guess a 1 minute MIN_DNS_TTL is very unlikely to break things.
Are there plans to allow more values than {MIN,MAX}_DNS_TTL?
Since pulls asked for feedback from exit operators, here is some based on my experience with Nos oignons.
Our configuration is publicly documented, but in French, so here is a summary:
We use Unbound as a local, DNSSEC-validating resolver on the exit nodes.
It obviously only listens locally.
We use its private-address feature to prevent RFC1918 addresses from figuring in results, to mitigate DNS rebinding attacks.
We use hide-{identity,version}, mostly out of general principle: anybody reading our documentation would learn that we run Unbound; however, it's unclear to me whether those could be exploited to tie users to specific exits being used for DNS resolution (and if that's relevant).
We use harden-short-bufsize and harden-large-queries to make Unbound return SERVFAIL on edge cases that can be exploited for DoSing the resolver.
We forward queries for nos-oignons.{net,org,fr} directly to our authoritative resolver. This is not especially relevant for the exit, but error logs, mails, and so on will break if the domain fails to resolve.
/etc/resolv.conf always specifies search nos-oignons.net (does little-t tor honor that? that could be awkward) and 127.0.0.1 as the first nameserver.
If a fallback resolver is specified, it is either operated by the network hosting the exit node or by a close-by (network-wise) organization we have friendly ties to (typically, a non-profit, associative ISP).
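For readers who do not read French, the settings above would look roughly like the following unbound.conf fragment. This is a sketch reconstructed from the summary, not a copy of our actual file. The trust anchor path varies by distribution, and 192.0.2.53 is a documentation placeholder for the authoritative server's address, not the real one.

```
server:
    # Listen locally only
    interface: 127.0.0.1
    access-control: 127.0.0.0/8 allow

    # DNSSEC validation (path is an assumption, distribution-dependent)
    auto-trust-anchor-file: "/var/lib/unbound/root.key"

    # Strip RFC1918 addresses from answers (DNS rebinding mitigation)
    private-address: 10.0.0.0/8
    private-address: 172.16.0.0/12
    private-address: 192.168.0.0/16

    # Do not answer id.server / version.bind queries
    hide-identity: yes
    hide-version: yes

    # Hardening against edge cases usable for DoS
    harden-short-bufsize: yes
    harden-large-queries: yes

# Forward our own zones straight to the authoritative server.
# 192.0.2.53 is a placeholder address.
forward-zone:
    name: "nos-oignons.net."
    forward-addr: 192.0.2.53
forward-zone:
    name: "nos-oignons.org."
    forward-addr: 192.0.2.53
forward-zone:
    name: "nos-oignons.fr."
    forward-addr: 192.0.2.53
```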
While writing this, I'm realising it might be useful to have “DNS resolution best-practices” for exit operators, since this is mostly something ad hoc we came up with based on what our sysadmins were doing in other places, not something we systematically researched.