Tor 0.2.3.x would generate an ADDRMAP event for all addresses added to the cache, and since all resolved IPv4 addresses were added to the cache, all IPv4 addresses generated an ADDRMAP event. Ticket #7570 (moved) introduced circuit-based separation of the DNS cache and "broke" that mechanism: Resolves generated via the DNSPort or a control connection do not end up in the cache and thus don't generate an ADDRMAP event. Also, when the cache is disabled by adding the SocksPort 9050 NoCacheDNS to the torrc, no ADDRMAP event is generated.
Git-bisect shows commit 7536c40e9641a0724f0c9e6f994306d762d37e4d[1] introduced this problem.
First, we should be clear about when to generate ADDRMAP events. From the control spec:
4.1.7. New Address mapping These events are generated when a new address mapping is entered in Tor's address map cache, or when the answer for a RESOLVE command is found.
This would mean that DNS data retrieved for DNSPort queries or when NoCacheDNS is set would not trigger an event. Do we want this behavior? Or do we want to trigger ADDRMAP events for any mapping that is not already in the cache, even if it is not going to be cached anyway?
Hm. My thought would be that an ADDRMAP event would be appropriate, perhaps with some new flag to indicate that it wasn't getting cached. (Or we could have an expiry set to some time in the past to indicate that it isn't cacheable.)
The description in control-spec seems unsuitable.
I wonder how hard this would be to fix in 0.2.4. If the fix isn't too hard, we could put it in there.
I like the idea of using a flag or setting the expiry time in the past. It really depends on the meaning of the expiry time: Does it mean "The remote DNS server specified this TTL for this A/AAAA/PTR record" or does it mean "Tor will cache this record until that time"? For the first, a flag would be better, for the latter I think that setting the expiry time to the first of January 1970 would be best. Since the cache is now per circuit, it would also be possible to add an CIRC_ID field that indicates for which circuit the hostname was cached.
I hope that this can go into 0.2.4. The fix shouldn't be too difficult once we have decided on how to change the control spec. Having ADDRMAP work in 0.2.3, then not always work in 0.2.4 and then work again in 0.2.5 would be bad.
I like the idea of using a flag or setting the expiry time in the past. It really depends on the meaning of the expiry time: Does it mean "The remote DNS server specified this TTL for this A/AAAA/PTR record" or does it mean "Tor will cache this record until that time"? For the first, a flag would be better, for the latter I think that setting the expiry time to the first of January 1970 would be best. Since the cache is now per circuit, it would also be possible to add an CIRC_ID field that indicates for which circuit the hostname was cached.
I'm inclined to say "let's not lose info", and do the flag.
(The cache is not actually per-circuit. Mostly, we just don't cache at the client side at all. This is approximately equivalent to a per-circuit cache, since the exit node caches too. Is there a lingering documentation bug too?)
I hope that this can go into 0.2.4. The fix shouldn't be too difficult once we have decided on how to change the control spec. Having ADDRMAP work in 0.2.3, then not always work in 0.2.4 and then work again in 0.2.5 would be bad.
I agree, but we should consider the risks too:
If it turns out that Vidalia or controller can't cope with a new flag, it would be bad to have to delay 0.2.4 packages until that controller could catch up.
If the fix is tricky enough, it would be bad to delay 0.2.4 stability for it.
But with any luck, if the fix can come soon, and the relevant controllers don't choke on it, and it's nice and simple, I think it can be 0.2.4 material.
(The cache is not actually per-circuit. Mostly, we just don't cache at the client side at all. This is approximately equivalent to a per-circuit cache, since the exit node caches too. Is there a lingering documentation bug too?)
No, I only skimmed the documentation and simply misunderstood the change.
If it turns out that Vidalia or controller can't cope with a new flag, it would be bad to have to delay 0.2.4 packages until that controller could catch up.
I have not found Vidalia listening for ADDRMAP events. Stem ignores unknown arguments, but changes are required to access the value of the new flag.
If the fix is tricky enough, it would be bad to delay 0.2.4 stability for it.
There is already code for sending an ADDRMAP event as response to RESOLVE requests, because failed resolves do not get added to the cache and would thus not generate such events. Changing this code to also generate an ADDRMAP event on success is all that is needed. The attached patch does this and also adds a new flag CACHE="YES"/"NO" to ADDRMAP events.
This patch looks plausible to me. I'd consider naming the new flag "CACHED" instead of "CACHE" because it indicates that Tor has cached the result, not that the user should cache the result. Once it has an accompanying controller spec patch, it's mergeable. It would also be great to have a changes/ file for this.
I have changed the patch to use CACHED instead of CACHE[1]. You can pull the changes directly from the bug-8596 branch of [2]. The changes to the control spec[3] are in the bug-8596 branch, also on github.[4]
One thing to consider before putting this into 0.2.4: When a mapaddress command is issued or when a host is automapped, it will also include the "CACHED" flag. If you do not want that, you could omit the "CACHED" flag if expiry is "NEVER".
From the point of view of controller sanity only: I suggest EXPIRES=NEVER rather than just the bare word NEVER. In other places in the control protocol, we've suddenly decided to announce that argument order doesn't matter anymore, which breaks stuff that has been written to depend on it. Omitting the EXPIRES keyword for the special case "NEVER" will break more stuff that will inevitably get written to handle a bare "NEVER".
I have split the change into two parts: One commit for genereating ADDRMAP events if they are the answer to a RESOLVE command, and the second commit adding the CACHE=YES/NO to ADDRMAP events. If the spec change breaks something, I suggest to only use the first commit in 0.2.4 and worry about the spec change in 0.2.5.
From the point of view of controller sanity only: I suggest EXPIRES=NEVER rather than just the bare word NEVER.
I agree the NEVER was a mistake, though for a different reason (the "maybe this is a quoted positional argument and maybe it's a bare word" made this event type quite a special snowflake). That said, as Nick mentioned that isn't what this ticket is about.
If the spec change breaks something, I suggest to only use the first commit in 0.2.4 and worry about the spec change in 0.2.5.
It really shouldn't. In terms of the spec this is a very simple addition, and should be perfectly ok for controllers. If it isn't then that's a bug with forward compatibility in the controller.
Sorry to reopen but I need a little clarification about the new CACHED arg. In an example above the YES/NO was quoted - is that how it ended up being implemented?
If so then the spec change should be changed from...
Cached = "YES" / "NO"
... which, contrary to the quotes, mean bare word arguments. I suspect the right specification would be something like...
Cached = DQUOTE "YES" DQUOTE / DQUOTE "NO" DQUOTE
Also, this is more of a stem-specific question but what would be the best behavior for when this flag isn't present? Should the event's cached attribute be defaulted to True or False? Or should the attribute take on the values of True / False / None (for undefined)?
Thanks! -Damian
Trac: Resolution: fixed toN/A Status: closed to reopened
I got confused by the quotes that were already around it, but those are part of the ABNF notation, not of the actual protocol. (I'm not very good with ABNF.) I commited a fix at [1].
Also, this is more of a stem-specific question but what would be the best behavior for when this flag isn't present? Should the event's cached attribute be defaulted to True or False? Or should the attribute take on the values of True / False / None (for undefined)?
It's a bit complicated: If the DNS resolution worked and an addrmap event is generated without a cached flag, then it was added to the cache. If resolution failed (e.g. because the domain doesn't exist) and there is no cached flag, then it was not be cached (this only happens when using the RESOLVE command).
I got confused by the quotes that were already around it, but those are part of the ABNF notation, not of the actual protocol. (I'm not very good with ABNF.) I commited a fix at [1].
Thanks! Looks good to me. :)
It's a bit complicated: If the DNS resolution worked and an addrmap event is generated without a cached flag, then it was added to the cache. If resolution failed (e.g. because the domain doesn't exist) and there is no cached flag, then it was not be cached (this only happens when using the RESOLVE command).
Hmm, sounds like I should stick with True / False / None then. Thanks for the clarification.