Opened 3 years ago

Closed 3 years ago

#20871 closed defect (fixed)

Regression in Torsocks 2.2.0 breaks wget, among others

Reported by: cypherpunks Owned by: dgoulet
Priority: High Milestone:
Component: Core Tor/Torsocks Version:
Severity: Major Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Various applications no longer work due to a regression in the latest Torsocks update, such as tlsdate and wget, whereas curl and youtube-dl work. I've tested this by compiling both 2.1.0 and 2.2.0 and exporting the library with LD_PRELOAD, and trying wget with both.

The problem is not with name resolution, as would be suggested in the wget error, or with the TLS handshake, as tlsdate suggests, because the strace output does not even get to socket() and connect(). However, when an IP address is passed directly to wget rather than a hostname, 2.2.0 works. It appears to me like the problem, at least in wget's case, is with the hostname resolution code being hooked incorrectly by Torsocks. That's just a guess though.

With wget and curl:

$ LD_PRELOAD=libtorsocks-2.1.0.so curl https://www.torproject.org/ >/dev/null    
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 15868  100 15868    0     0   6158      0  0:00:02  0:00:02 --:--:--  8369

$ LD_PRELOAD=libtorsocks-2.1.0.so wget --spider https://www.torproject.org/
Spider mode enabled. Check if remote file exists.
--2016-12-03 10:08:57--  https://www.torproject.org/
Resolving www.torproject.org... 38.229.72.16
Connecting to www.torproject.org|38.229.72.16|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15868 (15K) [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

$ LD_PRELOAD=libtorsocks-2.2.0.so curl https://www.torproject.org/ >/dev/null    
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 15868  100 15868    0     0   7227      0  0:00:02  0:00:02 --:--:-- 10494

$ LD_PRELOAD=libtorsocks-2.2.0.so wget --spider https://www.torproject.org/
Spider mode enabled. Check if remote file exists.
--2016-12-03 10:09:20--  https://www.torproject.org/
Resolving www.torproject.org... failed: Unknown error.
wget: unable to resolve host address www.torproject.org

$ LD_PRELOAD=libtorsocks-2.2.0.so wget --spider $(tor-resolve torproject.org)
Spider mode enabled. Check if remote file exists.
--2016-12-03 10:11:12--  http://138.201.14.197/
Connecting to 138.201.14.197:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 496 [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

And with tlsdate:

$ LD_PRELOAD=libtorsocks-2.1.0.so tlsdate -n -V
Sat Dec  3 10:11:57 UTC 2016

$ LD_PRELOAD=libtorsocks-2.2.0.so tlsdate -n -V
SSL connection failed
child process failed in SSL handshake

Child Tickets

Attachments (1)

fix_check_addr.patch (1.3 KB) - added by cypherpunks 3 years ago.

Download all attachments as: .zip

Change History (8)

comment:1 Changed 3 years ago by dgoulet

Status: newneeds_information

This works for me:

$ torsocks --version
Torsocks 2.2.0
$ torsocks wget --spider www.torproject.org
Spider mode enabled. Check if remote file exists.
--2016-12-03 11:52:48--  http://www.torproject.org/
Resolving www.torproject.org (www.torproject.org)... 38.229.72.16
[...]

My bet here is that you got an Exit relay that wasn't cooperating well. If you try with -i multiple time, do you still get the issue? (that option will make you use a different circuit per execution).

If yes, next step would be to provide the output of TORSOCKS_LOG_LEVEL=5 torsocks ... so we can look at the debug logs and see where it went wrong.

comment:2 Changed 3 years ago by cypherpunks

It's not an exit relay issue because the error is instant and occurs no matter what exit is in use. I did find out what line is causing the problem though, and know a fix. I'm very confused why it works for you, because it seems like the logic is fundamentally flawed.

In 2.2.0 in src/lib/gethostbyname.c, there's an if statement which checks if name is a valid IPv4 address. If it is valid IPv4 address, it should return 1, otherwise a negative value. What's weirder still is in src/common/utils.c, the check_addr() function is written in such a way that it cannot possibly return 0, only 1 or -1 (which both evaluate true in C). Later on, utils_is_address_ipv4(), which is just a wrapper for that function, is called. The if statement it's called in will always return true (because it can only possibly evaluate to 1 or -1), never false.

Since utils_is_address_ipv4() cannot return 0, the if statement it is in will always assume that Torsocks has been passed an IP address directly, so it will never attempt to resolve it, and instead will try to convert it to network byte order. This is why it works when it's passed an IP address directly, but not a hostname. The only reason I can think of that it might work for you is if, for some reason, the domain is being resolved early and the IP is passed to tsocks_gethostbyname() (which is where the stuck if statement is located) on your system but not mine, bypassing the buggy logic.

A fix would involve having check_addr() return 0 instead of -1 when it's passed a host which is a valid IPv4/IPv6 instead of a string hostname. Only tests/unit/test_utils.c would have to be changed because it explicitly checks for -1, but everything else would work.

First two code blocks are from src/common/utils.c. The third is from src/lib/gethostbyname.c.

static int check_addr(const char *ip, int af)
{
        int ret = 0;
        char buf[128];

        assert(ip);

        ret = inet_pton(af, ip, buf);
        if (ret != 1) {
                ret = -1;
        }

        return ret;
}
int utils_is_address_ipv4(const char *ip)
{
        return check_addr(ip, AF_INET);
}
/* Man page specifies that it can either be an hostname or IPv4 address.
 * If it's an address, go with it else try to resolve it through Tor. */
if (utils_is_address_ipv4(name)) {
        if (inet_pton(AF_INET, name, &ip) <= 0) {
                goto error;
        }
        /* "ip" now contains the network byte order of the address. */
} else {
        /* We have a hostname so resolve it through Tor. */
        ret = tsocks_tor_resolve(AF_INET, name, &ip);
        if (ret < 0) {
                goto error;
        }
}

tl;dr
utils_is_address_ipv4() is broken, as is the if statement which uses it, because it cannot return 0.

comment:3 in reply to:  2 Changed 3 years ago by sysrqb

Replying to cypherpunks:

In 2.2.0 in src/lib/gethostbyname.c, there's an if statement which checks if name is a valid IPv4 address. If it is valid IPv4 address, it should return 1, otherwise a negative value. What's weirder still is in src/common/utils.c, the check_addr() function is written in such a way that it cannot possibly return 0, only 1 or -1 (which both evaluate true in C).

Nice catch, indeed that is likely broken and should be corrected. However, wget doesn't use gethostbyname(), it uses getaddrinfo(). Can you provide debug level logs that show how the name is being resolved (as dgoulet requested)? Setting something like TORSOCKS_LOG_LEVEL=5 TORSOCKS_LOG_FILE_PATH=torsocks.log should be sufficient. Logging to a file maybe better than stdio.

Debug logging shows:

1480831284 DEBUG torsocks[11468]: Setting up a connection to the Tor network on fd 5 (in setup_tor_connection() at torsocks.c:367)
1480831284 DEBUG torsocks[11468]: Socks5 sending method ver: 5, nmethods 0x01, methods 0x00 (in socks5_send_method() at socks5.c:229)
1480831284 DEBUG torsocks[11468]: Socks5 received method ver: 5, method 0x00 (in socks5_recv_method() at socks5.c:262)
1480831284 DEBUG torsocks[11468]: [socks5] Resolve for www.torproject.org sent successfully (in socks5_send_resolve_request() at socks5.c:639)
1480831284 DEBUG torsocks[11468]: [socks5] Resolve reply received successfully (in socks5_recv_resolve_reply() at socks5.c:716)
1480831284 DEBUG torsocks[11468]: [getaddrinfo] Node www.torproject.org resolved to 82.195.75.101 (in tsocks_getaddrinfo() at getaddrinfo.c:107)

Specifically:

amnesia@amnesia:~/torsocks$ git show --pretty=oneline | head -n 1
e54d80bc9595beeceac637b03e5c5395c07e62f7 Update version to v2.2.0
amnesia@amnesia:~/torsocks$ git describe
v2.2.0
amnesia@amnesia:~/torsocks$ /usr/lib/wget/wget -V | head -n 3
GNU Wget 1.16 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie +psl +ssl/gnutls
amnesia@amnesia:~/torsocks$ TORSOCKS_CONF_FILE=doc/torsocks.conf TORSOCKS_LOG_LEVEL=5 TORSOCKS_LOG_FILE_PATH=torsocks.log LD_PRELOAD=src/lib/.libs/libtorsocks.so.0.0.0 /usr/lib/wget/wget --spider https://www.torproject.org/
Spider mode enabled. Check if remote file exists.
--2016-12-04 06:16:22--  https://www.torproject.org/
Resolving www.torproject.org (www.torproject.org)... 82.195.75.101
Connecting to www.torproject.org (www.torproject.org)|82.195.75.101|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15845 (15K) [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

comment:4 Changed 3 years ago by cypherpunks

Well I found out why this is happening to my wget and not yours. I did more experimenting with compiling it with different flags. I've been compiling wget to try to eliminate variables, and realized that this problem occurs when wget is compiled with IPv6 disabled. IPv4 uses gethostbyname(), but if IPv6 is even enabled, then getaddrinfo() is used for both of them. See code starting on lines 327, 782, and 864 of wget 1.18's src/host.c for an example of this.

Anyway, the debug info:

$ git show --pretty=oneline | head -n 1
e54d80bc9595beeceac637b03e5c5395c07e62f7 Update version to v2.2.0

$ git describe
v2.2.0

$ wget -V | head -n 3
GNU Wget 1.18 built on linux-gnu.

-cares +digest -gpgme +https -ipv6 -iri +large-file -metalink +nls

$ TORSOCKS_CONF_FILE=doc/torsocks.conf TORSOCKS_LOG_LEVEL=5 TORSOCKS_LOG_FILE_PATH=torsocks.log LD_PRELOAD=lib/torsocks/libtorsocks.so wget --spider https://www.torproject.org
Spider mode enabled. Check if remote file exists.
--2016-12-04 07:40:17--  https://www.torproject.org/
Resolving www.torproject.org... failed: Unknown error.
wget: unable to resolve host address ‘www.torproject.org’

$ cat torsocks.log
1480837264 DEBUG torsocks[36553]: Logging subsytem initialized. Level 5, file torsocks.log, time 1 (in init_logging() at torsocks.c:303)
1480837264 DEBUG torsocks[36553]: Config file setting tor address to 127.0.0.1 (in conf_file_set_tor_address() at config-file.c:298)
1480837264 DEBUG torsocks[36553]: Config file setting tor port to 9050 (in conf_file_set_tor_port() at config-file.c:254)
1480837264 DEBUG torsocks[36553]: [config] Onion address range set to 127.42.42.0/24 (in set_onion_info() at config-file.c:108)
1480837264 DEBUG torsocks[36553]: Config file doc/torsocks.conf opened and parsed. (in config_file_read() at config-file.c:572)
1480837264 DEBUG torsocks[36553]: [fclose] Close caught for fd 4 (in tsocks_fclose() at fclose.c:45)
1480837264 DEBUG torsocks[36553]: [onion] Pool init with subnet 127.42.42.0 and mask 24 (in onion_pool_init() at onion.c:104)
1480837264 DEBUG torsocks[36553]: [onion] Pool initialized with base 0, max_pos 255 and size 8 (in onion_pool_init() at onion.c:132)
1480837264 DEBUG torsocks[36553]: [fclose] Close caught for fd 4 (in tsocks_fclose() at fclose.c:45)
1480837264 DEBUG torsocks[36553]: [fclose] Close caught for fd 4 (in tsocks_fclose() at fclose.c:45)
1480837264 DEBUG torsocks[36553]: [close] Close caught for fd 4 (in tsocks_close() at close.c:33)
1480837264 DEBUG torsocks[36553]: [fclose] Close caught for fd 4 (in tsocks_fclose() at fclose.c:45)
1480837264 DEBUG torsocks[36553]: [gethostbyname] Requesting www.torproject.org hostname (in tsocks_gethostbyname() at gethostbyname.c:68)
1480837264 DEBUG torsocks[36553]: [onion] Destroying onion pool containing 0 entry (in onion_pool_destroy() at onion.c:148)
1480837264 DEBUG torsocks[36553]: [fclose] Close caught for fd 3 (in tsocks_fclose() at fclose.c:45)
Last edited 3 years ago by cypherpunks (previous) (diff)

comment:5 Changed 3 years ago by cypherpunks

I forgot to attach the patch. This fixes it for wget with IPv6 disabled, as well as default tlsdate, and other problematic programs like irssi, which is completely broken on my system with Torsocks without this patch. In the entirety of the Torsocks code, check_addr() is only called with the address family being specified as AF_INET or AF_INET6, so only an additional bug would result in triggering assert(ret != -1).

Last edited 3 years ago by cypherpunks (previous) (diff)

Changed 3 years ago by cypherpunks

Attachment: fix_check_addr.patch added

comment:6 Changed 3 years ago by dgoulet

Status: needs_informationneeds_review

comment:7 Changed 3 years ago by dgoulet

Resolution: fixed
Status: needs_reviewclosed

Ok I took the patch and fixed couple of things. I removed the assert() because the amount of system torsocks is compiled for, I can see a world where maybe inet_pton() could return -1 on AF_INET6... maybe... let's not take any risk.

I also fixed a config-file unit test.

Merged and thanks!

Note: See TracTickets for help on using tickets.