Opened 8 years ago

Last modified 2 years ago

#2991 new defect

Confusing log messages when a DA starts using a new key

Reported by: rransom Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-client authority download
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

From IRC:

[04:06:59] <ln5> karsten: seeing this repeatedly in maatuska logs:
[04:07:02] <ln5> Apr 26 13:06:09.964 [notice] We're missing a certificate from authority with signing key B3F2CB9D75F87AE4D4A8ECEB3CAAECDC4B131010: launching request.
[04:07:02] <ln5> Apr 26 13:06:10.318 [warn] Got a certificate for maatuska, but we already have it. Maybe they haven't updated it. Waiting for a while.
[04:08:27] <ln5> do i need to do anything else than dropping new authority_certificate and authority_signing_key files in the keys directory and restart tor?
[04:09:50] <rransom> That ‘We're missing a certificate’ log message means it fetched the previous consensus as a client, and saw a signature with maatuska's old key, and set out to ask maatuska for its certificate for that old key.
[04:10:35] <ln5> a, roles. makes sense. thanks.
[04:10:55] <rransom> The second one you pasted means that when it got maatuska's (new) certificate, the certificate didn't match the directory-signing key used for that consensus.

[04:29:37] <ln5> rransom: what makes you think that? to me it seems like it got a certificate it already had (maatuska's). i suppose that might happen if i fetch it from a DA that hasn't updated maatuskas cert yet.
[04:30:43] <rransom> ln5: It did. It was hoping to get maatuska's old certificate, which would contain the key with which maatuska had signed the then-current consensus.

I suspect that a client bootstrapped between the time that a DA upgraded its signing key and the time that it used that key to sign a new consensus would emit these confusing messages as well.

Child Tickets

Change History (12)

comment:1 Changed 8 years ago by linus

FWIW, tz in logs are CEST (UTC+2 at the time) and the last time that this particular message was logged was 13:59:01, exactly one minute before publishing the new consensus.

comment:2 Changed 8 years ago by rransom

The IRC log timestamps are PDT (Pacific Daylight Time, UTC-07:00) on 2011-04-26.

comment:3 in reply to:  description Changed 8 years ago by rransom

Milestone: Tor: 0.2.2.x-finalTor: 0.2.3.x-final

Replying to rransom:

I suspect that a client bootstrapped between the time that a DA upgraded its signing key and the time that it used that key to sign a new consensus would emit these confusing messages as well.

If a client can emit these log messages, we should fix that someday.

comment:4 Changed 7 years ago by nickm

Milestone: Tor: 0.2.3.x-finalTor: unspecified

comment:5 Changed 7 years ago by rransom

Priority: minornormal

Some possible fixes for this are:

  1. make dirauths keep their old certificates around for a while to be served to clients who request them (assuming the directory protocol permits this -- it may not);
  2. make clients not complain if they get a sufficiently new certificate, even if it's not the one that signed the consensus they have, unless they can't get enough of the certificates they need to validate the consensus;
  3. make clients not try to download a signing certificate if they (a) already have a newer one for that dirauth, and (b) have enough certificates to validate the consensus they have.

This may be related to #5595; if so, option 2 is not actually an option. Bumping the priority because of the possible connection to #5595, too.

comment:6 Changed 7 years ago by nickm

Keywords: tor-client added

comment:7 Changed 7 years ago by nickm

Component: Tor ClientTor

comment:8 Changed 7 years ago by sysrqb

Based on the two messages, at the time maatuska's:

a) newest key was not in the trusted_dir_certs table
b) trusted_dir_certs contained its most-current descriptor (?)

Assumption:
a) maatuska called authority_certs_fetch_missing which resulted it in checking if it had a certificate for each of the signatures on the status. It didn't have a cert for itself (with the correct sig) stored in its digestmap, so it launched the request.

b) when the request came in, already_have_cert was called to ensure it wasn't a duplicate. already_have_cert takes the cert and compares its cache_info.signed_descriptor_digest with the cache_info.signed_descriptor_digest for every cert that's stored in trusted_dir_certs.

So, assuming (a) and (b) are reasonably well founded, is there a reason the two checks are comparing different digests? I understand that (b) is more comprehensive/accurate but the descriptor digest doesn't include the DA's signing key, does it? Because if not then the descriptor digest wouldn't change when the signing key changed, if I understand what's contained in the descriptor correctly.

Also, based on the above and the assumption there is a connection to #5595 (which seems likely), the discrepancy between the two comparisons would lead to the repeated downloading of the certificate until the new cert was actually added to the digestmap.

I'll keep digging and please correct me where I am wrong.

comment:9 Changed 7 years ago by nickm

The field name "signed_descriptor_digest" is a little misleading -- it refers to the signed digest of whatever directory object we're talking about -- in this case, that's a certificate.

I think it's looking at two digests in this case because it's possible for the certificate to change without the signing key changing, but what tells us to download a new cert is encountering a signature using a signing key that we don't recognize.

comment:10 in reply to:  9 ; Changed 7 years ago by sysrqb

Replying to nickm:

The field name "signed_descriptor_digest" is a little misleading -- it refers to the signed digest of whatever directory object we're talking about -- in this case, that's a certificate.

Thanks, that makes more sense.

I think it's looking at two digests in this case because it's possible for the certificate to change without the signing key changing, but what tells us to download a new cert is encountering a signature using a signing key that we don't recognize.

Right...now that I've actually looked at the cert parsing code this makes a lot more sense!

I suspect that a client bootstrapped between the time that a DA upgraded its signing key and the time that it used that key to sign a new consensus would emit these confusing messages as well.

If the client was bootstrapping then this would make sense, but it doesn't appear that it was, and #5595 had been running for just under 5 days.

Per nickm's comment on #5595:

The right solution here will probably involve looking very hard at the code that currently makes certificate downloads get attempted and/or reattempted, and seeing what, if anything, rate-limits attempts to re-download a certificate that has failed, then figuring out why this is happening.

The downloads are rate-limited by download failures only, not by requesting and receiving multiple duplicates. A possible solution to this is to catch this edge-case by only allowing N attempts that result in "duplicate" downloads and then skip subsequent download attempts until after the current consensus' fresh-until time at which point we should hopefully retrieve a new consensus that was signed with the new key.

Or we can modify networkstatus_check_consensus_signature to take into account the case where the consensus' valid-after date/time is before the given cert's published-on date/time and we have reached the threshold of duplicate downloads. In this case, we can ignore the signature on the consensus entirely. There are currently enough active DAs that such an action should not invalidate the consensus. If more than hald of the DAs change their certs within the same hour however, then clients will fail to accept a consensus for at most(?) an hour until the newer consensus is voted on and distributed.

We can modify authority_certs_fetch_missing to ensure it does not attempt to redownload a cert for which it has already received N duplicates, in effect ignoring the sig (or invalidating it).

I'll keep looking to see if there's a better way to handle this. I was thinking maybe request the cert from a mirror because it may still have the old cert, but finding such a mirror would be extremely difficult.

comment:11 in reply to:  10 Changed 7 years ago by sysrqb

Replying to sysrqb:

Or we can modify networkstatus_check_consensus_signature to take into account...

That shouldn't be Or, sorry about the typo.

comment:12 Changed 2 years ago by nickm

Keywords: authority download added
Severity: Normal
Note: See TracTickets for help on using tickets.