Opened 5 weeks ago

Closed 3 weeks ago

#28398 closed enhancement (fixed)

Please provide a method on descriptors for calculating digests

Reported by: irl
Owned by: atagar
Priority: Medium
Milestone:
Component: Core Tor/Stem
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:


Throughout the documents described in dir-spec, SHA1 and SHA256 digests are used to reference other documents. It's easy enough to follow these references, but if you already have a document and want to compute its digest, I can't see a method for doing so.

This would ideally return a lower-case SHA1 hex digest, or a base64 encoded SHA256 digest, depending on which function you call.
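
To make the two requested output formats concrete, here is a minimal sketch using only Python's standard library. The function names are hypothetical and are not part of Stem. Stripping the trailing '=' padding from the base64 form is an assumption about how the digests appear in directory documents, and the bytes passed in must already be the exact range the spec says to hash for that document type.

```python
import base64
import hashlib


def sha1_hex(content: bytes) -> str:
    """Return the lower-case hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(content).hexdigest()


def sha256_base64(content: bytes) -> str:
    """Return the base64 SHA-256 digest of the given bytes.

    Assumption: trailing '=' padding is dropped, as is common for
    base64 digests in directory documents.
    """
    digest = hashlib.sha256(content).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")
```

Both helpers hash whatever they are given, so choosing the right start and end of the document remains the caller's job.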

Attachments (2)

reference-checker.png (64.6 KB) - added by irl 4 weeks ago.
detached-signature.txt (1.2 KB) - added by irl 4 weeks ago.

Change History (15)

comment:1 Changed 5 weeks ago by atagar

Hi irl, would you mind citing the dir-spec types and digests you mean? My understanding is that there isn't a universal notion of a 'digest' that applies to all descriptors. Rather, there's individual 'digest' fields that are specific to certain descriptor types.

Our ServerDescriptor and ExtraInfoDescriptor classes already have digest() methods if that's what you mean.

comment:2 Changed 5 weeks ago by irl

For ServerDescriptor and ExtraInfoDescriptor, the digest() method is exactly what I want for SHA-1, but there is no equivalent for SHA-256.

       "sha256-digest" is a base64-encoded SHA256 digest of the extra-info
       document, computed over the same data.

For votes only SHA-1 seems to be specified but SHA-256 could be implemented in the same way as above:

        A digest of the vote from the authority that contributed to this
        consensus, as signed (that is, not including the signature).
        (Hex, upper-case.)
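
A rough sketch of the "as signed, not including the signature" rule quoted above. It assumes the signed portion runs from the start of the document through the space after the first "directory-signature" keyword; that boundary should be checked against dir-spec before relying on it. The helper names are hypothetical.

```python
import hashlib

# Assumption: the signed portion ends just after this keyword and its space.
SIGNATURE_KEYWORD = b"\ndirectory-signature "


def signed_portion(document: bytes) -> bytes:
    """Return the bytes up to and including the first signature keyword."""
    end = document.find(SIGNATURE_KEYWORD)
    if end == -1:
        raise ValueError("no directory-signature line found")
    return document[: end + len(SIGNATURE_KEYWORD)]


def vote_digest(document: bytes) -> str:
    """Upper-case hex SHA-1 digest over the signed portion of a vote."""
    return hashlib.sha1(signed_portion(document)).hexdigest().upper()
```

A SHA-256 variant would only need hashlib.sha256 swapped in, which is what "implemented in the same way" suggests.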

For flavoured consensuses:

    "additional-digest" SP flavor SP algname SP digest NL

        [Any number.]

        For each supported consensus flavor, every directory authority
        adds one or more "additional-digest" lines.  "flavor" is the name
        of the consensus flavor, "algname" is the name of the hash
        algorithm that is used to generate the digest, and "digest" is the
        hex-encoded digest.

        The hash algorithm for the microdescriptor consensus flavor is
        defined as SHA256 with algname "sha256".
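
The line format quoted above can be split as follows. This is only an illustrative parser, not Stem's implementation; the class and function names are hypothetical.

```python
from typing import NamedTuple


class AdditionalDigest(NamedTuple):
    flavor: str   # name of the consensus flavor
    algname: str  # hash algorithm name, e.g. "sha256"
    digest: str   # hex-encoded digest


def parse_additional_digest(line: str) -> AdditionalDigest:
    """Parse '"additional-digest" SP flavor SP algname SP digest NL'."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4 or parts[0] != "additional-digest":
        raise ValueError(f"not an additional-digest line: {line!r}")
    _, flavor, algname, digest = parts
    return AdditionalDigest(flavor, algname, digest)
```

For example, parse_additional_digest("additional-digest microdesc sha256 ABCDEF\n") yields flavor "microdesc", algname "sha256" and digest "ABCDEF" (the digest value here is a made-up placeholder).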

comment:3 Changed 5 weeks ago by atagar

Thanks irl. Sunk some time into this but unfortunately I'm a bit stuck. I began with sha256 digests of extrainfo descriptors because we can easily check their corresponding server descriptor to see if we got the correct hash or not.

A commit that attempts to do this is available in the 'sha256_digest' branch of my personal repo...

Unfortunately, though the spec seems pretty clear, I'm somehow not matching what server descriptors expect. Would you mind seeing if you can spot what I'm buggering up here?

comment:4 Changed 5 weeks ago by irl

See #28415

comment:5 Changed 5 weeks ago by atagar

Status: new → needs_information

Thanks Iain! Just merged a branch I've been chewing on these last few days. Does this do the trick for ya?

Changed 4 weeks ago by irl

Attachment: reference-checker.png added

Changed 4 weeks ago by irl

Attachment: detached-signature.txt added

comment:6 Changed 4 weeks ago by irl

Status: needs_information → new

This is great for server and extra-info descriptors! I do also need to calculate digests for basically everything though. I've attached a diagram that shows how documents reference each other, and I've attached an example detached signature, as these are currently only available for 5 minutes every hour (during the voting process, for DistSeconds).

There is not yet a parser for detached signatures so the digest for that can be thought about in another ticket.

Can you add the digest() method to NetworkStatusDocumentV3 and Microdescriptor?

comment:7 Changed 4 weeks ago by atagar

    Can you add the digest() method to NetworkStatusDocumentV3 and Microdescriptor?

Happy to. Thank you for the diagram! That helps a lot. :P

comment:8 Changed 4 weeks ago by atagar

Hi irl, just a quick update that microdescriptors now have a digest() method. In implementing this I found and fixed a few past mistakes of mine...

  • Tor has now implemented microdescriptor downloading via DirPorts. As such, I undeprecated our stem.descriptor.remote method for this and improved it a bit (commit).
  • Our newly added from_str() method didn't work for router status entries (commit).
  • Router status entries referenced a hex rather than base64 digest for microdescriptors (commit).
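
The hex versus base64 mix-up in the last item comes down to two encodings of the same raw bytes. A small sketch of converting between them, assuming the base64 form is written without trailing '=' padding (the helper names are hypothetical):

```python
import base64
import binascii


def hex_to_base64(hex_digest: str) -> str:
    """Convert a hex digest to base64 without '=' padding."""
    raw = binascii.unhexlify(hex_digest)
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def base64_to_hex(b64_digest: str) -> str:
    """Convert an unpadded base64 digest to upper-case hex."""
    # Restore the padding that b64decode requires.
    padded = b64_digest + "=" * (-len(b64_digest) % 4)
    raw = base64.b64decode(padded)
    return binascii.hexlify(raw).decode("ascii").upper()
```

Comparing digests only works when both sides use the same encoding, so normalising to one form before comparison avoids this class of mismatch.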

comment:9 Changed 4 weeks ago by irl

It seems that this has surfaced a bug in parsing microdescriptors. Sometimes there is a newline on the end of the _raw_content and sometimes it disappears. I've checked all the authorities and they are all putting newlines on the end (0x0a). This is messing up the digest when the final newline is missing. It is odd that it sometimes works, and sometimes doesn't, and seems to mostly not work when there is a family line.
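
A quick way to see why the missing newline matters: the digest covers every byte, so dropping the final 0x0a produces a completely different value. The content below is made up for illustration and is not a real microdescriptor.

```python
import hashlib

# Made-up stand-in for a microdescriptor body ending in a newline.
with_newline = b"onion-key\nexample-key-data\nfamily $AAAA $BBBB\n"
# Same content with the final newline stripped, as the parser sometimes does.
without_newline = with_newline.rstrip(b"\n")


def sha256_hex(content: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(content).hexdigest()


# The two digests differ even though only one trailing byte changed.
digests_match = sha256_hex(with_newline) == sha256_hex(without_newline)
print(digests_match)
```

Checking raw content with endswith(b"\n") before hashing is one cheap way to detect the affected descriptors.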

comment:11 Changed 3 weeks ago by atagar

Hi irl, gave consensus digesting a shot this morning but having a little difficulty getting the right value. Filed #28664 to ask for a spec clarification.

comment:12 Changed 3 weeks ago by atagar

Status: new → needs_information

Ah ha! Through more experimentation on the bus got it...

Is there anything else you need for this ticket?

comment:13 Changed 3 weeks ago by irl

Resolution: fixed
Status: needs_information → closed

I think the only thing missing now is the bandwidth lists, but they don't really exist at all yet in stem, so this should be all. I'll open new tickets if I find anything else.
