Opened 2 years ago

Last modified 3 months ago

#19650 assigned enhancement

Keep non-printable characters out of details documents

Reported by: cypherpunks
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Onionoo
Version:
Severity: Normal
Keywords: metrics-2018
Cc:
Actual Points:
Parent ID: #24033
Points:
Reviewer:
Sponsor:

Description (last modified by karsten)

Future versions of Tor will not publish descriptors containing non-ASCII characters, and the directory authorities will reject such descriptors at some point.

Since this is still quite far in the future, what do you think about removing such strings in Onionoo (before that change lands in tor)?
I'm not suggesting ignoring the entire descriptor in such a case, just replacing such characters with "?":
"??B?" -> "??B??"

https://atlas.torproject.org/#details/21E84B294794821E2898E8ED18402E45E4FC351E

Note: I used "non-printable" (vs. non-ASCII) since onionoo data includes printable but non-ASCII chars.
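
To make the proposal concrete, here is a minimal sketch of the replacement in Python. It is an illustration only, not Onionoo's code: the function name `replace_non_printable` and the sample platform string are hypothetical, and "printable" is assumed to mean whatever Python's `str.isprintable()` accepts (which, for instance, treats the ASCII space as printable but a no-break space as non-printable).

```python
def replace_non_printable(value: str, placeholder: str = "?") -> str:
    """Replace each non-printable character with a placeholder.

    Printable non-ASCII characters (for example accented letters) are kept,
    so only control and other non-printable characters are affected.
    """
    return "".join(ch if ch.isprintable() else placeholder for ch in value)


# Hypothetical platform string with a printable non-ASCII character (é)
# followed by a control character (0x07).
print(replace_non_printable("Tor 0.2.8.6 on caf\u00e9\x07"))
```

The descriptor itself is kept; only the offending characters in the displayed string change.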

https://lists.torproject.org/pipermail/tor-relays/2016-July/009667.html

related:
[1] https://trac.torproject.org/projects/tor/ticket/19647
[2] https://trac.torproject.org/projects/tor/ticket/18938

Change History (6)

comment:1 Changed 2 years ago by twim

Might it be useful to print the platform string in hex when an unprintable character occurs?
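
One possible reading of this suggestion, sketched in Python: keep printable characters as they are and show each unprintable one as the hex of its UTF-8 bytes. The function name `escape_non_printable` and the `\xNN` notation are assumptions made for illustration, not something the ticket specifies.

```python
def escape_non_printable(value: str) -> str:
    """Render non-printable characters as \\xNN escapes of their UTF-8 bytes."""
    parts = []
    for ch in value:
        if ch.isprintable():
            parts.append(ch)
        else:
            # "surrogatepass" lets lone surrogates be encoded instead of raising.
            encoded = ch.encode("utf-8", errors="surrogatepass")
            parts.append("".join(f"\\x{byte:02x}" for byte in encoded))
    return "".join(parts)


# Hypothetical platform string containing a bell character (0x07).
print(escape_non_printable("Tor 0.2.8.6\x07 on Linux"))
```

This sketch does not escape literal backslashes, so its output is for display only and cannot be reversed unambiguously.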

comment:2 Changed 15 months ago by karsten

Description: modified (diff)
Summary: changed from "do not include non-printable strings in details documents" to "Keep non-printable characters out of details documents"

Tweak summary.

comment:3 Changed 15 months ago by karsten

Keywords: metrics-2018 added

comment:4 Changed 15 months ago by karsten

Owner: set to metrics-team
Status: changed from new to assigned

comment:5 Changed 3 months ago by teor

For the record, prop285 has tor switching to UTF-8 for all documents that used to be ASCII.

Currently, this affects platform/version and contact lines.

The spec requires ASCII for the other descriptor, vote, and consensus lines, and all other documents. But since tor doesn't parse ignored fields and lines, and doesn't always enforce ASCII, there will continue to be some parts of these documents that allow UTF-8.
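
Because some parts of these documents may still carry bytes that are not valid UTF-8, a consumer can decode defensively. The sketch below is an assumption about how one might do that in Python, not a description of what tor or Onionoo does; the function name `decode_descriptor_field` is hypothetical.

```python
def decode_descriptor_field(raw: bytes) -> str:
    """Decode raw bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


# Valid UTF-8 passes through unchanged; the stray 0xff byte becomes U+FFFD.
print(decode_descriptor_field(b"caf\xc3\xa9"))
print(decode_descriptor_field(b"abc\xff"))
```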

comment:6 Changed 3 months ago by teor

Parent ID: #24033

The master ticket for this change is #24033.
