Opened 5 years ago

Last modified 2 years ago

#12799 new defect

fingerprints - descriptor Space removal, case normalization

Reported by: grarpamp
Owned by:
Priority: Very Low
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.4.22
Severity: Normal
Keywords: tor-relay needs-proposal
Cc: karsten, atagar
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

cached-descriptors...
fingerprint 50E9 30FB 6141 E9A7 DAD4 968E 58DE AA1B 06CF 4908
Remove the spaces from the fingerprints.

This isn't OpenPGP, no one goes around reading them off to people. You have to click-hold-carefully-drag to select the whole FP instead of a simple double-click. You have to strip them in postprocessing to make any use of them anywhere, including everywhere else in Tor... torctl, configs, etc. Nowhere else does Tor present/accept any fingerprints with spaces. And they currently waste about 60kB per descriptor set X all the nodes X frequency.
The spaces have no substantive use whatsoever and are very annoying!
Please remove them.

With that, normalize all displayed/coded fingerprints everywhere in Tor to be either upper or lower case... regardless of whether either/mixed case are supported/enforced as input. Lower case is suggested for better readability (e.g., A4B8D0 vs. a4b8d0) and commonality with the output of various hash programs.
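For illustration, the postprocessing step described above can be sketched in a few lines of Python. The function name normalize_fingerprint is hypothetical and not part of Tor or any of its libraries; the input is the fingerprint value from the cached-descriptors example.

```python
def normalize_fingerprint(value: str) -> str:
    """Return the fingerprint with all whitespace removed, in lower case.

    Hypothetical helper for illustration only; not a Tor API.
    """
    return "".join(value.split()).lower()


# The spaced, upper-case value from the cached-descriptors example.
spaced = "50E9 30FB 6141 E9A7 DAD4 968E 58DE AA1B 06CF 4908"
compact = normalize_fingerprint(spaced)
print(compact)
```

Every consumer of the descriptor currently needs some step like this before the value can be pasted into a config or compared with other Tor output.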

Change History (9)

comment:1 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.???

I wouldn't mind removing those spaces, but according to the specification, they need to be there. If we take them out now, software that expects to find them there will break. To remove them, the first step would be to audit basically everything that has ever looked at a Tor descriptor or at a fingerprint file, and make sure that it will work okay if the fingerprint has no spaces.

> And they currently waste about 60kB per descriptor set X all the nodes X frequency.

Fortunately, this is uncompressed disk storage space, not transmission size: after compression, the spaces don't affect the size of the descriptor on the wire.

comment:2 Changed 5 years ago by arma

Cc: karsten added
Summary: fingerprints - descriptor SPace removal, case normalization → fingerprints - descriptor Space removal, case normalization

I would also be a fan of removing spaces from the fingerprint line in the descriptor, since it would let me grep for it more easily, let me paste the line into atlas, etc.

But Nick has a good point too.

I wonder what metrics/etc applications are looking at the fingerprint line in descriptors currently? Karsten, do you have any insight here?

comment:3 in reply to:  2 Changed 5 years ago by karsten

Cc: atagar added

Replying to arma:

> I wonder what metrics/etc applications are looking at the fingerprint line in descriptors currently? Karsten, do you have any insight here?

I guess all applications listed on the CollecTor page would be affected. (Hint: if there are more applications that would be affected and which should maybe be listed there, please let me know.)

But accepting both space-separated, upper-case fingerprints and space-removed, lower-case fingerprints shouldn't be hard. Should be a few changed lines of code in metrics-lib and Stem, and maybe some applications that parse descriptors directly.
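A minimal sketch of what accepting both forms could look like in a parser, written in Python. This is not the metrics-lib or Stem code; parse_fingerprint and the two patterns are hypothetical names, and returning upper case is an arbitrary choice made here only so that both inputs map to one canonical value.

```python
import re

# Ten groups of four hex digits separated by single spaces (the current form).
SPACED = re.compile(r"(?:[0-9A-Fa-f]{4} ){9}[0-9A-Fa-f]{4}")
# Forty hex digits with no spaces (the proposed form).
COMPACT = re.compile(r"[0-9A-Fa-f]{40}")


def parse_fingerprint(value: str) -> str:
    """Accept either form, in any case, and return 40 upper-case hex digits.

    Hypothetical helper for illustration only.
    """
    value = value.strip()
    if SPACED.fullmatch(value) or COMPACT.fullmatch(value):
        return value.replace(" ", "").upper()
    raise ValueError("not a valid fingerprint: %r" % value)


old_style = parse_fingerprint("50E9 30FB 6141 E9A7 DAD4 968E 58DE AA1B 06CF 4908")
new_style = parse_fingerprint("50e930fb6141e9a7dad4968e58deaa1b06cf4908")
print(old_style == new_style)
```

Irregular grouping, such as a stray space in the middle of a group, is still rejected, so the check stays as strict as before for everything except the two accepted layouts.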

Please keep me posted. Cc'ing atagar to comment on the Stem side of things.

comment:4 Changed 5 years ago by atagar

I'm a fan of dropping the spaces - we don't do it in many spots and I agree it's nothing but pesky. But where do we opt for lowercase fingerprints? Everywhere that comes to mind is forty-character uppercase hex. I'd prefer for us to standardize on that.

Presently Stem validates that the spaces are there, but changing this is just a matter of dropping four lines...

https://gitweb.torproject.org/stem.git/blob/HEAD:/stem/descriptor/server_descriptor.py#l467

comment:5 Changed 3 years ago by grarpamp

Severity: Normal

teor mentioned, in an unrelated ticket:
> You might have been thinking of the OnionOO query syntax.

I was thinking of the consensus, the [sys]log files, and the control port for this ticket.

> This would be ok for output, but for input, we need to keep accepting both forms.

Fixing the output and docs will cause input to gravitate that way by example, which is cool, yielding a possible de-acceptance point in the future.

Looking around the net at various usage and rationales, lower case hex seems highly preferred these days... no visual character ambiguity, requires no keyboard or spoken shift key, no case insensitive flags to regexes or extra 'tolower'ing step, matches output of hash tools, etc.
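A small Python sketch of the "matches output of hash tools" point: hashlib's hexdigest() returns lower-case hex, so an upper-case fingerprint needs an extra normalization step before the two can be compared. The key bytes below are a made-up placeholder, not a real identity key, and same_fingerprint is a hypothetical helper.

```python
import hashlib

# Placeholder bytes standing in for an encoded identity key (hypothetical).
key_bytes = b"example identity key bytes"

# hexdigest() yields 40 lower-case hex characters for SHA-1.
digest = hashlib.sha1(key_bytes).hexdigest()

# The same value in upper case, as in the descriptor example in this ticket.
upper_fp = digest.upper()


def same_fingerprint(a: str, b: str) -> bool:
    """Compare two fingerprints while ignoring case (the extra step)."""
    return a.lower() == b.lower()


print(same_fingerprint(digest, upper_fp))
```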

comment:6 Changed 3 years ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:7 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:8 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:9 Changed 2 years ago by nickm

Keywords: tor-relay needs-proposal added
Priority: Medium → Very Low