Opened 13 months ago

Closed 11 months ago

Last modified 11 months ago

#27367 closed defect (implemented)

Authorities should reject non-UTF-8 in relay descriptors

Reported by: teor Owned by:
Priority: Medium Milestone: Tor: 0.4.0.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: rust-wants, prop285, 034-triage-20180328, 034-removed-20180328
Cc: chelseakomlo Actual Points:
Parent ID: #24033 Points:
Reviewer: asn Sponsor:

Child Tickets

TicketStatusOwnerSummaryComponent
#18938closedneelAuthorities should reject non-UTF-8 content in ExtraInfo descriptorsCore Tor/Tor

Change History (26)

comment:1 Changed 13 months ago by teor

Summary: Authorities should reject non-utf8 in relay descriptorsAuthorities should reject non-UTF-8 in relay descriptors

comment:2 Changed 13 months ago by teor

Description: modified (diff)
Summary: Authorities should reject non-UTF-8 in relay descriptorsAuthorities and relays should reject non-UTF-8 in relay descriptors

comment:3 Changed 13 months ago by cyberpunks

Not sure how to test, but made an attempt at this in branch unicode-descriptors1 at https://gitgud.io/onionk/tor.git

comment:4 Changed 13 months ago by teor

Milestone: Tor: unspecifiedTor: 0.3.6.x-final
Status: newneeds_review

I pushed your branch to ​https://github.com/teor2345/tor/branches to get CI.

We usually create valid descriptors, votes, and consensuses using chutney:
https://gitweb.torproject.org/chutney.git

In this case, an unpatched tor will happily copy non-unicode characters from ContactInfo into its descriptor, and sign that descriptor.

Here are some existing test descriptors and votes:
https://gitweb.torproject.org/tor.git/tree/src/test/test_descriptors.inc

And here is the function that parses those descriptors:
https://gitweb.torproject.org/tor.git/tree/src/test/test_helpers.c#n90

If we're going to test for invalid descriptors, then we should probably put the test in test_dir.c.

comment:5 Changed 13 months ago by teor

Hi, the CI failed:

We can't merge this branch as-is, because it doesn't implement the proposal:

comment:6 Changed 13 months ago by cyberpunks

Right, just authorities at first. Undid the other commit and pushed a BOM check.

comment:7 Changed 13 months ago by teor

Status: needs_reviewneeds_revision

Thanks, I pushed it to CI, if it passes, a green tick will show up next to the branch on https://github.com/teor2345/tor/branches

When we check different kinds of directory documents, we'll want the BOM check as part of a descriptor UTF-8 checking function in util_string.c. Would you mind moving it in this branch?

Since the UTF-8 checking function accepts a length, we should probably also check for NUL. If we fail on the first NUL, this check becomes a last-ditch memory safety check, as most bytes in RAM are NUL.

We should probably also log a bug warning if the function encounters a NUL byte:

if (BUG(failure-condition) {
  failure-action;
}

or

if (success-condition) {
  return
}
tor_assert_nonfatal_unreached();

comment:8 Changed 13 months ago by teor

Oh, and I think that the changes file still mentions relays?

comment:9 Changed 13 months ago by teor

Something went wrong with one of the Windows builds in files you didn't change, so I restarted it:
https://ci.appveyor.com/project/teor2345/tor/build/1.0.111

comment:10 Changed 13 months ago by teor

The same thing went wrong with the rebuild:
https://ci.appveyor.com/project/teor2345/tor/build/1.0.111/job/8nf6lhj1oag6nbmf

I wonder if the compiler that we're using has been updated to a broken version.

I pushed the common ancestor of your branch and master to master-unicode-descriptors1, and the current master to master. If they both fail, I'll open a ticket to disable that build:
https://github.com/teor2345/tor/branches

comment:11 Changed 13 months ago by teor

I opened #27389 to allow failures on that target, until the compiler is fixed.

comment:12 in reply to:  8 ; Changed 13 months ago by cyberpunks

Replying to teor:

Oh, and I think that the changes file still mentions relays?

The changes file was reverted actually. Fixed.

But this ticket says 'and relays.' Could edit the summary and have a separate ticket for making relays reject in the future?

comment:13 in reply to:  7 ; Changed 13 months ago by cyberpunks

Replying to teor:

Since the UTF-8 checking function accepts a length, we should probably also check for NUL. If we fail on the first NUL, this check becomes a last-ditch memory safety check, as most bytes in RAM are NUL.

We should probably also log a bug warning if the function encounters a NUL byte:

Receiving a NUL byte isn't a bug though. This function is processing untrusted input from fetch_from_buf_http() that might contain anything including NUL bytes.

comment:14 Changed 13 months ago by atagar

Related a few past tickets where this has bitten us.

Just a quick note that dirauths could use Stem as a tool for rejecting malformed content. It does stricter validation than the tor binary that descriptors are conformant with the spec. I've been performing this check through DocTor since 2013, filing tickets each time more bad data makes it into the consensus...

https://gitweb.torproject.org/doctor.git/tree/descriptor_checker.py

Bad data chokes not only Stem, but metrics-lib and anything else that ingests it.

Clearly in an ideal world the tor binary itself would do better validation but in the absence of that if we took advantage of Stem's validator I wouldn't need to keep filing tickets every few months. Using Stem on dirauths to reject malformed descriptors would prevent these issues upfront, saving Karsten and I hassle.

If that's a no-go I could also redirect the DocTor check I mention above to email other folks (Nick? teor? Maybe the network team list?) so I don't need to file tickets each time this comes up.

comment:15 in reply to:  12 Changed 13 months ago by teor

Summary: Authorities and relays should reject non-UTF-8 in relay descriptorsAuthorities should reject non-UTF-8 in relay descriptors

Replying to cyberpunks:

Replying to teor:

Oh, and I think that the changes file still mentions relays?

The changes file was reverted actually. Fixed.

But this ticket says 'and relays.' Could edit the summary and have a separate ticket for making relays reject in the future?

Sure. I changed the title.
The separate ticket for relays checking their own descriptors is #27414.
(The torrc check *should* be enough, but I'd like to add another check right before the descriptor and extrainfo are uploaded.)
The separate ticket for relays rejecting other relays' descriptors is #27369.

comment:16 in reply to:  13 Changed 13 months ago by teor

Replying to cyberpunks:

Replying to teor:

Since the UTF-8 checking function accepts a length, we should probably also check for NUL. If we fail on the first NUL, this check becomes a last-ditch memory safety check, as most bytes in RAM are NUL.

We should probably also log a bug warning if the function encounters a NUL byte:

Receiving a NUL byte isn't a bug though. This function is processing untrusted input from fetch_from_buf_http() that might contain anything including NUL bytes.

Right, it's not a bug in tor, but it is a protocol violation. (NUL bytes are ok in compressed content. But when it's uncompressed, it should all be UTF-8.)

We probably don't need a separate check then.

comment:17 in reply to:  14 Changed 13 months ago by teor

Replying to atagar:

Related a few past tickets where this has bitten us.

Just a quick note...

I continued this discussion in #27416, because the descriptor validation issues you raised are not actually resolved by checking for UTF-8.

comment:18 Changed 13 months ago by cyberpunks

Created a squashed version of the branch at unicode-descriptors-dirauth1 that's rebased on top of some additions to the unit tests in #27373.

Can this not make it into 0.3.5, it being the next LTS?

comment:19 in reply to:  18 ; Changed 13 months ago by teor

Replying to cyberpunks:

Created a squashed version of the branch at unicode-descriptors-dirauth1 that's rebased on top of some additions to the unit tests in #27373.

Thanks!

Can this not make it into 0.3.5, it being the next LTS?

I'm not sure.

This code is a major change on directory authorities, and it hasn't had much testing at all.
Only a few relay operators are affected, but there hasn't been any warning about the change.

Also, LTS isn't used for directory authorities:

Some features and configurations are currently only maintained in the two most recent stable releases, either because they are less mature than the rest of Tor, because there are relatively few users who need them, or for some other reason:

Running a directory authority.
…

https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases

But I think we should try to get a limited UTF-8 torrc warning into 0.3.5 for relay ContactInfo lines:
https://trac.torproject.org/projects/tor/ticket/27380#comment:4

That way, 0.3.5 relays will continue to work when authorities upgrade to 0.3.6.

comment:20 Changed 13 months ago by teor

Also, there are only 9 working days before the 0.3.5 code freeze, and we've put a halt on non-essential code reviews:
https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases

comment:21 in reply to:  19 Changed 11 months ago by cyberpunks

#27799 (b54a5e70 moved dirserv_add_multiple_descriptors) and #26744 (194acfb5 moved directory_handle_command_post) required a rebase: https://gitgud.io/onionk/tor/compare/master...unicode-descriptors-dirauth2

Last edited 11 months ago by cyberpunks (previous) (diff)

comment:22 Changed 11 months ago by nickm

Status: needs_revisionneeds_review

comment:23 Changed 11 months ago by dgoulet

Reviewer: asn

comment:24 Changed 11 months ago by asn

Status: needs_reviewmerge_ready

LGTM, but was missing tests for the new no_bom function. Other than that, looks sane and simple.

I added some test in my branch unicode-descriptors-dirauth2.
Also check out the PR: https://github.com/torproject/tor/pull/464
(all new patches should come with a github PR).

Marking as merge_ready.

comment:25 Changed 11 months ago by nickm

Resolution: implemented
Status: merge_readyclosed

LGTM; merged. There's a travis failure, but it looks intermittent and unrelated.

comment:26 Changed 11 months ago by nickm

Milestone: Tor: 0.3.6.x-finalTor: 0.4.0.x-final

Tor 0.3.6.x has been renamed to 0.4.0.x.

Note: See TracTickets for help on using tickets.