Opened 6 months ago

Closed 3 months ago

#22789 closed defect (fixed)

Tor 0.3.1.4-alpha crash on OpenBSD-current

Reported by: fredzupy Owned by: nickm
Priority: High Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version: Tor: 0.3.1.4-alpha
Severity: Major Keywords: tor, crash, inet_pton, c99, openbsd, 024-backport, 025-backport, 026-backport, 027-backport, 028-backport, 029-backport, 030-backport, review-group-20 trove-2017-007
Cc: Actual Points: 3
Parent ID: Points:
Reviewer: Sponsor: SponsorV

Description

Hi,

After a few hours running, my new freshly compiled Tor 0.3.1.4-alpha has crashed with this message in log :

Jul 1 05:06:47 ethnao Tor[94685]: tor_assertion_failed_(): Bug: src/common/compat.c:2597: tor_inet_pton: Assertion next != src failed; aborting. (on Tor 0.3.1.4-alpha fab91a290ded3e74)
Jul 1 05:06:47 ethnao Tor[94685]: Bug: Assertion next != src failed in tor_inet_pton at src/common/compat.c:2597. (Stack trace not available) (on Tor 0.3.1.4-alpha fab91a290ded3e74)

Things I've done before :

$ ./configure
$ make
$ make test
$ doas src/or/tor -f /etc/tor/torrc

Then put some load on the Tor process with some random http request.

Logs with context :

Jul 1 05:06:02 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $0DA9BD201766EDB19F57F49F1A013A8A5432C008~PhantomTrain4 at 65.19.167.131. Retrying on a new circuit.
Jul 1 05:06:03 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $0DA9BD201766EDB19F57F49F1A013A8A5432C008~PhantomTrain4 at 65.19.167.131. Retrying on a new circuit.
Jul 1 05:06:04 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $0DA9BD201766EDB19F57F49F1A013A8A5432C008~PhantomTrain4 at 65.19.167.131. Retrying on a new circuit.
Jul 1 05:06:07 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:11 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:12 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:15 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:17 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:17 ethnao Tor[94685]: Tried for 133 seconds to get a connection to [scrubbed]:80. Giving up.
Jul 1 05:06:18 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:18 ethnao Tor[94685]: Tried for 125 seconds to get a connection to [scrubbed]:80. Giving up.
Jul 1 05:06:19 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $379FB450010D17078B3766C2273303C358C3A442~aurora at 176.126.252.12. Retrying on a new circuit.
Jul 1 05:06:21 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:22 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:22 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:26 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:27 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:30 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:34 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $14F92FF956105932E9DEC5B82A7778A0B1BD9A52~hessel0 at 109.163.234.2. Retrying on a new circuit.
Jul 1 05:06:36 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $E444FE133F55B6FFD338337AA42BA199581E2C4B~spacitospacito2 at 51.15.52.230. Retrying on a new circuit.
Jul 1 05:06:38 ethnao Tor[94685]: We tried for 16 seconds to connect to '[scrubbed]' using exit $E444FE133F55B6FFD338337AA42BA199581E2C4B~spacitospacito2 at 51.15.52.230. Retrying on a new circuit.
Jul 1 05:06:38 ethnao Tor[94685]: We tried for 16 seconds to connect to '[scrubbed]' using exit $E444FE133F55B6FFD338337AA42BA199581E2C4B~spacitospacito2 at 51.15.52.230. Retrying on a new circuit.
Jul 1 05:06:41 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $E444FE133F55B6FFD338337AA42BA199581E2C4B~spacitospacito2 at 51.15.52.230. Retrying on a new circuit.
Jul 1 05:06:41 ethnao Tor[94685]: Tried for 125 seconds to get a connection to [scrubbed]:80. Giving up.
Jul 1 05:06:42 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $E444FE133F55B6FFD338337AA42BA199581E2C4B~spacitospacito2 at 51.15.52.230. Retrying on a new circuit.
Jul 1 05:06:45 ethnao Tor[94685]: We tried for 15 seconds to connect to '[scrubbed]' using exit $E444FE133F55B6FFD338337AA42BA199581E2C4B~spacitospacito2 at 51.15.52.230. Retrying on a new circuit.
Jul 1 05:06:47 ethnao Tor[94685]: tor_assertion_failed_(): Bug: src/common/compat.c:2597: tor_inet_pton: Assertion next != src failed; aborting. (on Tor 0.3.1.4-alpha fab91a290ded3e74)
Jul 1 05:06:47 ethnao Tor[94685]: Bug: Assertion next != src failed in tor_inet_pton at src/common/compat.c:2597. (Stack trace not available) (on Tor 0.3.1.4-alpha fab91a290ded3e74)


Child Tickets

Attachments (1)

0001-Fix-assertion-failure-related-to-openbsd-strtol.patch (2.2 KB) - added by nickm 6 months ago.

Download all attachments as: .zip

Change History (34)

comment:1 Changed 6 months ago by nickm

Keywords: tor crash inet_pton ???-backport added
Milestone: Tor: 0.3.1.x-final
Priority: MediumHigh
Severity: NormalMajor

comment:2 Changed 6 months ago by arma

Does this happen repeatedly? Can you induce it?

It looks like it's an ipv6 address that is upsetting it -- maybe one you're getting from the exit relay as part of the connected cell.

Do you get a core? If so, can you get us a backtrace with gdb?

comment:3 Changed 6 months ago by teor

What architecture are you on?

comment:4 Changed 6 months ago by cypherpunks

random people was here

Version 1, edited 6 months ago by cypherpunks (previous) (next) (diff)

comment:5 Changed 6 months ago by fredzupy

I'm on i386 architecture :
kern.version=OpenBSD 6.1-current (GENERIC.MP) #150: Sat Jun 24 22:22:03 MDT 2017

deraadt@…:/usr/src/sys/arch/i386/compile/GENERIC.MP

It only happens once. I can't induce it, for now.

Relaunch about 12h ago and it did not happen yet.

comment:6 Changed 6 months ago by cypherpunks

Last edited 6 months ago by cypherpunks (previous) (diff)

comment:7 Changed 6 months ago by fredzupy

No_0x told me via IRC to access 0xtrololo via Tor and effectively it crash with the same assert.

Something like :
$ telnet 0xquux 80

ends with the following lines in log :
Jul 2 22:25:37 ethnao Tor[42555]: tor_assertion_failed_(): Bug: src/common/compat.c:2597: tor_inet_pton: Assertion next != src failed; aborting. (on Tor 0.3.1.4-alpha fab91a290ded3e74)
Jul 2 22:25:37 ethnao Tor[42555]: Bug: Assertion next != src failed in tor_inet_pton at src/common/compat.c:2597. (Stack trace not available) (on Tor 0.3.1.4-alpha fab91a290ded3e74)

comment:8 Changed 6 months ago by cypherpunks

        long r = strtol(src, &next, 16);
        if (next == NULL || next == src)
          return 0; /* <+strtol> something went wrong */

comment:9 in reply to:  8 Changed 6 months ago by fredzupy

Replying to cypherpunks:

        long r = strtol(src, &next, 16);
        if (next == NULL || next == src)
          return 0; /* <+strtol> something went wrong */

You mean convert the two 'tor_assert()' by 'if' statement? I did that. It works… At least it doesn't crash the Tor process ! Thanks !

I don't know if it's relevant but on OpenBSD, if the strtol() is too complicated to manage, there is strtonum() alternative which is meant to be simpler :

STANDARDS

strtonum() is an OpenBSD extension. The existing alternatives, such as
atoi(3) and strtol(3), are either impossible or difficult to use safely.

comment:10 Changed 6 months ago by nickm

Good diagnosis! I spent the weekend fuzzing tor_inet_pton(), to no effect-- because this bug will only affect systems where strtol works the same way[*] openbsd's strtol does does.

Let's also audit the other ato* and strto* usages in Tor to see if they're affected by this issue.

[*] To my mind, this comes down to an interpretation of these sentences in section 7.20.1.4 in the C99 standard:

  1. [...] . If the value of base is 16, the characters 0x or 0X may optionally precede the sequence of letters and digits, following the sign if present.
  2. The subject sequence is defined as the longest initial subsequence of the input string, starting with the first non-white-space character, that is of the expected form. [...]

I'm not enough of a standards guru to interpret whether openbsd's behavior here is allowed or not, but it might be a good idea to get a second opinion.

comment:11 Changed 6 months ago by nickm

Keywords: openbsd c99 added

comment:12 Changed 6 months ago by cypherpunks

Keywords: openbsd c99 removed

I don't know if it's relevant but on OpenBSD, if the strtol() is too complicated to manage, there is strtonum() alternative which is meant to be simpler
is an OpenBSD extension

OpenBSD's strtol not so complicated with compare to glibc's variant. Just some edge case you can to filter by simple check. "0x" syntax disallowed generally by tor_inet_pton so no more additional side effects.

This bug affect all OpenBSD users, easy to target exit relays (by any client) and clients (by sites, etc)

comment:13 Changed 6 months ago by cypherpunks

Keywords: c99 added

comment:14 in reply to:  10 Changed 6 months ago by nickm

Keywords: openbsd added

comment:15 Changed 6 months ago by nickm

Added #22802 for looking for other cases.

comment:16 Changed 6 months ago by nickm

Keywords: 024-backport 025-backport 026-backport 027-backport 028-backport 029-backport 030-backport added; ???-backport removed
Status: newneeds_review

I'm calling this TROVE-2017-007. Classifying it as medium since it only affects openbsd, but it's still important to fix.

I've put this in a branch, with regression tests, as bug22789_024. I'll also attach the patch for convenience here.

comment:17 Changed 6 months ago by catalyst

My reading of C99 is that strtol("0xquux", &next, 16) must return zero and next must point to the x. The optionality in paragraph 3 is for the input data, not the implementation.

comment:18 in reply to:  17 ; Changed 6 months ago by fredzupy

Replying to catalyst:

My reading of C99 is that strtol("0xquux", &next, 16) must return zero and next must point to the x. The optionality in paragraph 3 is for the input data, not the implementation.

In order to clarify a little bit, because it was not obvious for me to understand, the following code have different behaviour under different system:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int
main()
{
        long l;
        char *rest;

        l = strtol("0xquux", &rest, 16);
        printf("l:%ld rest:%s\n", l, rest);

        return 0;
}

Under Linux produce:
l:0 rest:xquux

Under OpenBSD produce:
l:0 rest:0xquux

The question is to know if the Tor code is good enough and OpenBSD need to fix something or OpenBSD is sufficiently conformant and the Tor code need to adapt.

comment:19 in reply to:  18 Changed 6 months ago by cypherpunks

Replying to fredzupy:

Replying to catalyst:

My reading of C99 is that strtol("0xquux", &next, 16) must return zero and next must point to the x. The optionality in paragraph 3 is for the input data, not the implementation.

Under Linux produce:
l:0 rest:xquux

Under OpenBSD produce:
l:0 rest:0xquux

The question is to know if the Tor code is good enough and OpenBSD need to fix something or OpenBSD is sufficiently conformant and the Tor code need to adapt.

I believe the OpenBSD result is correct according to the C99 text and contradicts catalyst's statement. Because paragraph 7.20.1.4.7 states

If the subject sequence is empty or does not have the expected form, no conversion is performed; the value of nptr is stored in the object pointed to by endptr, provided that endptr is not a null pointer.

With the example code, nptr does not have the expected form, so no conversion is performed therefore nptr == *endptr. Furthermore, paragraph 7.20.1.4.8 states

The strtol, strtoll, strtoul, and strtoull functions return the converted value, if any. If no conversion could be performed, zero is returned. If the correct value is outside the range of representable values, LONG_MIN, LONG_MAX, LLONG_MIN, LLONG_MAX, ULONG_MAX, or ULLONG_MAX is returned (according to the return type and sign of the value, if any), and the value of the macro ERANGE is stored in errno.

With the example code, nptr does not have the expected form, so no conversion is performed therefore the return value is zero.

Combining these two together means that when the subject sequence is empty or does not have the expected form, no conversion is performed and the return value is zero and nptr == *endptr.

comment:20 in reply to:  18 Changed 6 months ago by cypherpunks

Hmm, cypherpunks write not only truth on Trac...

Replying to fredzupy:
Prefer

OpenBSD need to fix something

Explanation:
From the common sense PoV: 0 is a valid hexadecimal number.

Detailed explanation:
http://www.cplusplus.com/reference/cstdlib/strtol/

comment:21 Changed 5 months ago by catalyst

Nothing in C99 7.20.1.4 explicitly says that a string starting with 0x should result in nptr == *endptr when base == 16.

It might be ambiguous about whether 0x is an expected subject sequence for strtol with base == 16. I think the ambiguity is whether the subject sequence is 0 vs 0x, rather than empty. 7.20.1.4p7 says nptr == *endptr if the subject sequence "is empty or does not have the expected form", but 7.20.1.4p4 defines the subject sequence as "the longest initial subsequence of the input string, starting with the first non-white-space character, that is of the expected form", so "not have the expected form" is redundant because that is impossible for a subject sequence as defined.

It goes on to say that the subject sequence is empty if "the input string is empty or consists entirely of white space, or if the first non-white-space character is other than a sign or a permissible letter or digit". Neither of these is true for a string starting with 0x, because 0 is a permissible digit for base == 16.

This also brings up the question of whether a subject string can consist of only a sign prefix. I think it can be, but the reference to 6.4.4.1 implies that is not true, at least for base == 0, because an integer-constant cannot be empty.

comment:22 Changed 5 months ago by nickm

Keywords: review-group-20 added

Creating review-group-20

comment:23 Changed 5 months ago by nickm

Owner: set to nickm
Status: needs_reviewaccepted

setting owner

comment:24 Changed 5 months ago by nickm

Status: acceptedneeds_review

comment:25 Changed 5 months ago by catalyst

Status: needs_reviewmerge_ready

Patch looks good to me. Also doesn't apply cleanly to master due to test_addr.c divergence.

comment:26 Changed 5 months ago by nickm

Actual Points: 3
Keywords: trove-2017-007 added
Resolution: fixed
Status: merge_readyclosed

Thanks for the review! Merged to 0.2.4 and forward.

If anybody wants to ask the OpenBSD Libc maintainers what they think about the standard here, they should feel free to do so politely. Apparently OpenBSD inherited the code from NetBSD, which may also have the same behavior. FreeBSD appears to have patched itself to have the behavior that Tor expects.

I'll refrain from any arguments about whether the standard permits this behavior; even if it doesn't, the behavior apparently exists in the wild, so Tor has to work around it.

comment:27 Changed 5 months ago by cypherpunks

FWIW the TROVE ID isn't mentioned in the changes file.

comment:28 in reply to:  26 Changed 5 months ago by fredzupy

Replying to nickm:

If anybody wants to ask the OpenBSD Libc maintainers what they think about the standard here, they should feel free to do so politely. Apparently OpenBSD inherited the code from NetBSD, which may also have the same behavior. FreeBSD appears to have patched itself to have the behavior that Tor expects.

FYI fix committed on OpenBSD.

https://marc.info/?l=openbsd-cvs&m=149935819726649

comment:29 Changed 5 months ago by nickm

whoa, neat.

comment:30 in reply to:  27 Changed 5 months ago by nickm

Replying to cypherpunks:

FWIW the TROVE ID isn't mentioned in the changes file.

Good catch; thank you! I've fixed that in b47249e0bb6a8ef0f29f775352d1238b7133b938

comment:31 Changed 5 months ago by catalyst

Sponsor: SponsorV

comment:32 Changed 5 months ago by teor

Resolution: fixed
Status: closedreopened

The changelog entry for this is incorrect, the problematic pattern is 0x[^0-9a-fA-F]. 0xfoo is a valid hexadecimal 15 with trailing oo.

Last edited 5 months ago by teor (previous) (diff)

comment:33 Changed 3 months ago by nickm

Resolution: fixed
Status: reopenedclosed

Retroactively changed the changelog in master, though usually we don't do this. b268dca41175fe438441d3e4137766efee30da35

Note: See TracTickets for help on using tickets.