Opened 11 months ago

Last modified 7 months ago

#32588 merge_ready defect

Setting ORPort [ipv6]:auto mistakenly advertises port 94

Reported by: arma Owned by: teor
Priority: Medium Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor Version: Tor: 0.2.3.9-alpha
Severity: Normal Keywords: consider-backport-after-0435, 043-should, ipv6, 035-backport, 041-backport, 042-backport, 043-backport, network-team-roadmap-2020Q1
Cc: Actual Points: 1.1
Parent ID: #33048 Points: 1
Reviewer: nickm, teor Sponsor: Sponsor55-must

Description (last modified by arma)

Start your Tor with

ORPort 9001
ORPort [2a01:238:43e4:2b00:ae5a:a980:1f63:cc5e]:auto nolisten
DirPort 9030

and let it start up. Then go to http://127.0.0.1:9030/tor/server/authority and check out how it has the line

or-address [2a01:238:43e4:2b00:ae5a:a980:1f63:cc5e]:94

First: How did it pick that number? It's a weird choice for a port.

Second: If this is actually your ipv6 address, you can leave out the nolisten, and that's where things get interesting. Your logs will say something like

Nov 24 01:53:42 v4 tor[10988]: Nov 24 01:53:42.001 [notice] Opening OR listener on [2a01:238:43e4:2b00:ae5a:a980:1f63:cc5e]:0
Nov 24 01:53:42 v4 tor[10988]: Nov 24 01:53:42.002 [notice] Opened OR listener on [2a01:238:43e4:2b00:ae5a:a980:1f63:cc5e]:41535

but then your descriptor will still say :94.

This is happening right now to relay "Testbit": they have set :auto for their ipv6 address in their ORPort, they get the above line about how it's opened on port 41535, and yet their descriptor says it's on port 94.

Child Tickets

Attachments (1)

relay.sh (451 bytes) - added by teor 10 months ago.
Tor test script for IPv6 ORPort auto bug

Download all attachments as: .zip

Change History (48)

comment:1 Changed 11 months ago by cypherpunks

last two bytes of ipv6 address parsed as portnumber

0x5e = 94 in decimal :)

Last edited 11 months ago by cypherpunks (previous) (diff)

comment:2 Changed 11 months ago by nickm

I agree that this is a bug.

But another issue is that "auto" and "nolisten" don't make a lot of sense together. "Auto" is "pick a port for me"; "nolisten" is "don't actually listen on this address, I'll set it up in a firewall rule or something".

comment:3 in reply to:  2 Changed 11 months ago by arma

Replying to nickm:

I agree that this is a bug.

But another issue is that "auto" and "nolisten" don't make a lot of sense together. "Auto" is "pick a port for me"; "nolisten" is "don't actually listen on this address, I'll set it up in a firewall rule or something".

No, the NoListen is just here so you can reproduce the bug. The bug exists when you say :auto on your IPv6 ORPort -- the log tells you what port it picked, and it listens on that port, and then it puts the wrong number in your descriptor.

comment:4 Changed 11 months ago by nickm

Milestone: Tor: 0.4.3.x-final

comment:5 Changed 11 months ago by neel

Cc: neel added
Keywords: ipv6 added
Owner: set to neel
Status: newassigned

comment:6 Changed 11 months ago by teor

Keywords: memory-safety security-low added
Status: assignedneeds_information

The IPv6 descriptor code can never work for auto ports:
https://github.com/torproject/tor/blob/master/src/feature/relay/router.c#L1991

It should be like the IPv4 descriptor port code:
https://github.com/torproject/tor/blob/master/src/feature/relay/router.c#L1978
And call router_get_advertised_or_port_by_af() to get the IPv6 ORPort.

Ideally, we should add a new router_get_advertised_ipv6_or_port() function, which searches for an address like this:
https://github.com/torproject/tor/blob/master/src/feature/relay/router.c#L1991

But searches for a port like this: (if the discovered port is 0)
router_get_advertised_or_port_by_af(… , AF_INET6).

I still can't work out how the port ends up being 94. Maybe we're overwriting some memory somewhere?
I think we should try to find the memory issue, before we call this bug "fixed".

comment:7 Changed 11 months ago by teor

It's also worth noting that we're getting 0x5e = 94 in the output, and not 0xcc5e or 0x5ecc. So it's just a one byte overflow. And it's happening some time after the port is opened, but before the relay descriptor is built.

And it only seems to affect auto IPv6 ORPorts.

Does the issue still happen if the IPv6 address is much shorter than the maximum length?
For example, does "ORPort [::1]:auto" give you a port of 1?
You might need to set some custom directory authorities to be allowed to listen on an internal address.

comment:8 Changed 10 months ago by arma

Description: modified (diff)

comment:9 Changed 10 months ago by arma

Anybody should be able to reproduce the bug with the initial instructions -- set two orports and a dirport, and then you can look at your descriptor on your local dirport.

When I set my ORPort to [::2]:auto it still publishes

or-address [::2]:94

in its descriptor.

Same with [::5] and [::10].

comment:10 Changed 10 months ago by teor

Status: needs_informationneeds_review

Here's a draft PR:

I still need to:

  • check how far to backport
  • make a changes file
  • open a separate ticket for the related authority change

I'll attach the test script I was using.

Changed 10 months ago by teor

Attachment: relay.sh added

Tor test script for IPv6 ORPort auto bug

comment:11 Changed 10 months ago by teor

Actual Points: 0.4
Keywords: 035-backport 040-backport 041-backport 042-backport added
Points: 1
Version: Tor: 0.4.1.6Tor: 0.2.3.9-alpha

See my PR:

I'd like an initial review.

I will try to write some unit tests on Monday.

comment:13 Changed 10 months ago by dgoulet

Status: needs_reviewneeds_revision

Hmmm CI exploded for _one_ specific builds with interesting stacktrace:

https://travis-ci.org/torproject/tor/jobs/627571176

comment:14 Changed 10 months ago by neel

Cc: neel removed

comment:15 Changed 10 months ago by neel

Status: needs_revisionneeds_review

comment:16 Changed 10 months ago by dgoulet

Status: needs_reviewneeds_revision

comment:17 in reply to:  13 Changed 10 months ago by teor

Actual Points: 0.40.6
Status: needs_revisionneeds_review

Replying to dgoulet:

Hmmm CI exploded for _one_ specific builds with interesting stacktrace:

https://travis-ci.org/torproject/tor/jobs/627571176

I forgot to check for a NULL IPv6 address.

I re-did the patch, and split the commits into:

  • bug fix (and code movement)
  • refactoring

I still need to write some tests, but I'd like a review :-)
I also need to fix up #32822, because it's based on this branch.

comment:18 Changed 10 months ago by nickm

Keywords: 043-should added

comment:19 Changed 10 months ago by neel

Owner: neel deleted
Status: needs_reviewassigned

comment:20 Changed 10 months ago by neel

Status: assignedneeds_review

comment:21 Changed 9 months ago by teor

Owner: set to teor
Status: needs_reviewassigned

comment:22 Changed 9 months ago by teor

Status: assignedneeds_review

comment:23 Changed 9 months ago by teor

Keywords: 043-should removed

I made some fixes suggested by dgoulet in #32822, and squashed them in to a new PR:

The fixup commits are in the old PR.

comment:24 Changed 9 months ago by teor

Keywords: 043-should added
Status: needs_reviewneeds_revision

I made some fixes, but I still need to do unit tests, so I'm leaving it in needs_revision.

comment:25 in reply to:  23 Changed 9 months ago by teor

Replying to teor:

I made some fixes suggested by dgoulet in #32822, and squashed them in to a new PR:

I added some tests for auto port parsing. They don't actually trigger this bug, but they do increase the auto port config code coverage:

comment:26 Changed 9 months ago by teor

Actual Points: 0.60.8

comment:27 Changed 9 months ago by teor

Owner: teor deleted
Status: needs_revisionassigned

This fix should go in 0.4.3, someone just needs to write a unit test or a script test.

Or we can decide that it's too hard to test, and the reviewer can just check it manually.

comment:28 Changed 9 months ago by teor

Status: assignedneeds_revision

comment:29 Changed 9 months ago by ahf

Owner: set to ahf
Status: needs_revisionassigned

comment:30 Changed 9 months ago by ahf

Status: assignedneeds_revision

comment:31 Changed 8 months ago by teor

Parent ID: #33048
Sponsor: Sponsor55-must

I'm going to need the new function in this branch for Sponsor 55, Objective 1.1.

ahf, do you think you can get the remaining tests done this week?

If not, I can take this task back.

comment:32 Changed 8 months ago by teor

Owner: changed from ahf to teor
Reviewer: ahf
Status: needs_revisionassigned

comment:33 Changed 8 months ago by teor

Status: assignedneeds_revision

comment:34 Changed 8 months ago by nickm

Owner: changed from teor to nickm
Status: needs_revisionaccepted

I'll try doing the tests here to take some load off teor.

comment:35 Changed 7 months ago by nickm

FWIW, this isn't an "uninitialized RAM" bug: The reason it says 94 is that the "ipv6_port" field in the cfg_port_t is set to CFG_AUTO_PORT, which is 0xc4005e in hex. Once that's cast to a uint16_t, we get 0x5e, which is 94.

comment:36 Changed 7 months ago by nickm

I wrote some tests on top of your branch, in a branch called `bug32588_035_tests. I made a PR at https://github.com/torproject/tor/pull/1811 . There will be merge conflicts later on here, but they look to be straightforward. Please let me know if you want help with those: I'm going under the assumption you might want to edit this branch some more.

I don't know whether this is everything you wanted to test or not.

comment:37 Changed 7 months ago by nickm

Actual Points: 0.81

comment:38 in reply to:  35 Changed 7 months ago by teor

Replying to nickm:

FWIW, this isn't an "uninitialized RAM" bug: The reason it says 94 is that the "ipv6_port" field in the cfg_port_t is set to CFG_AUTO_PORT, which is 0xc4005e in hex. Once that's cast to a uint16_t, we get 0x5e, which is 94.

That makes sense! I misread the hex constant as 0x405e, so I couldn't work out how the cast was doing that, without some memory misalignment.

comment:39 Changed 7 months ago by teor

Owner: changed from nickm to teor
Status: acceptedassigned

Thanks, this is exactly what I wanted to test. Honestly, I was a bit stuck trying to work out how.

I'll do a few tweaks, and then merge.

Thanks so much!

comment:40 Changed 7 months ago by teor

Reviewer: ahfnickm, teor
Status: assignedneeds_revision

comment:41 Changed 7 months ago by teor

I tweaked a few of the tests, and fixed the changes file.

See my PR:

I'll squash and create merge forward branches after the tests pass.

comment:42 Changed 7 months ago by teor

Actual Points: 11.1
Keywords: consider-backport-after-0435 043-backport added; memory-safety security-low 040-backport removed
Milestone: Tor: 0.4.3.x-finalTor: 0.4.4.x-final

comment:43 Changed 7 months ago by teor

Status: needs_revisionmerge_ready

I squashed my changes, nick's changes, my extra tests, and some minor test fixes.

See my PRs:

Test branches are here:

Anyone can merge after CI passes.

comment:44 Changed 7 months ago by teor

0.4.1 and 0.4.2 failed on Travis, but they pass locally.

And Travis is having a database downtime, so I can't check why. See:
https://www.traviscistatus.com/

comment:45 Changed 7 months ago by nickm

Milestone: Tor: 0.4.4.x-finalTor: 0.4.3.x-final

Okay, I'm merging this to master and marking for a 0.4.3 backport.

comment:46 Changed 7 months ago by gaba

Keywords: network-team-roadmap-2020Q1 added

Add all the tickets from sponsor 55 that are done and being worked on to the keyword #network-team-roadmap-2020Q1 so I can look at them in the wiki page...

comment:47 in reply to:  44 Changed 7 months ago by teor

Replying to teor:

0.4.1 and 0.4.2 failed on Travis, but they pass locally.

And Travis is having a database downtime, so I can't check why. See:
https://www.traviscistatus.com/

These errors appear to be Travis macOS hangs, due to #32804.

I have restarted the relevant builds.

Note: See TracTickets for help on using tickets.