Opened 6 weeks ago

Last modified 4 weeks ago

#32314 needs_revision defect

Can't connect to literal IPv6 address containing double colon

Reported by: liberat Owned by:
Priority: Medium Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor Version: Tor: 0.4.1.6
Severity: Normal Keywords: tor-client, tor-exit, ipv6, BugSmashFund, check-backport
Cc: Actual Points: 0.1
Parent ID: #26664 Points: 0.5
Reviewer: Sponsor:

Description

When an application wants to use Tor's SOCKS port to connect to a known IPv6 address, it has a couple of options:

  • It can specify a 16-byte binary address using address type 4.
  • It can specify the address as an ASCII string using address type 3.

If the address is specified as a string, Tor accepts IPv6 addresses either with or without brackets. For example, Tor will accept either "2a01:4f8:fff0:4f:266:37ff:fe2c:5d19" or "[2a01:4f8:fff0:4f:266:37ff:fe2c:5d19]".

However, if the address is abbreviated using double-colon notation, it only works if enclosed in brackets: "[2a00:1450:4001:800::200e]" works, but "2a00:1450:4001:800::200e" does not. On the other hand, the unabbreviated form "2a00:1450:4001:800:0:0:0:200e" does work.

The problem appears to be:

  • The destination is transmitted to the exit relay as a string of the form "<hostname>:<port>".
  • The exit relay tries to parse this string by calling the function tor_addr_port_split.
  • The string "2a00:1450:4001:800::200e:80" is a valid IPv6 literal, so tor_addr_port_split interprets it as an address with no port number.
  • The relay refuses to connect to an address with no port number.

Note that if the application uses the binary form (address type 4), this is internally converted into a string enclosed in brackets. However, it seems to be more common for applications to use the ASCII form, without brackets. For example, if you try to visit http://[2a00:1450:4001:800::200e]/ in Tor Browser, it will fail, whereas http://[2a01:4f8:fff0:4f:266:37ff:fe2c:5d19]/ succeeds.

So there are a few ways this could be fixed:

(a) applications could be changed to use either the binary form or wrap the address in brackets;

(b) the Tor proxy could automatically add brackets around IPv6 addresses;

(c) the exit relay could be smarter about parsing IPv6 addresses.

It seems to me that (b) would be the most sensible option, but it might be reasonable to do (c) as well.

In the long term, I think it'd be wise to deprecate the use of IPv6 addresses without brackets in RELAY_BEGIN, as well as any other places where tor_addr_port_split is used, because it's just confusing.

Child Tickets

TicketStatusOwnerSummaryComponent
#31542assignedCannot connect to IPv6 addresses using Tor SOCKSCore Tor/Tor

Attachments (1)

test-connect.py (784 bytes) - added by liberat 6 weeks ago.
Script demonstrating various SOCKS5 connection requests

Download all attachments as: .zip

Change History (9)

Changed 6 weeks ago by liberat

Attachment: test-connect.py added

Script demonstrating various SOCKS5 connection requests

comment:1 Changed 5 weeks ago by liberat

One straightforward way to fix this would be to parse the address using tor_addr_parse and then convert back to a string using tor_addr_to_str:

--- a/src/core/or/connection_edge.c
+++ b/src/core/or/connection_edge.c
@@ -1631,6 +1631,12 @@ connection_ap_handshake_rewrite(entry_connection_t *conn,
     conn->original_dest_address = tor_strdup(conn->socks_request->address);
   }
 
+  /* If the address is an IPv6 literal, either with or without brackets,
+   * convert it into its canonical form and wrap it in brackets. */
+  if (tor_addr_parse(&addr_tmp, socks->address) >= 0) {
+    tor_addr_to_str(socks->address, &addr_tmp, sizeof(socks->address), 1);
+  }
+
   /* First, apply MapAddress and MAPADDRESS mappings. We need to do
    * these only for non-reverse lookups, since they don't exist for those.
    * We also need to do this before we consider automapping, since we might

This also has the effect of transforming the address into "canonical" form. This seems like a good idea anyway, as it reduces possibilities for application fingerprinting by exit nodes.

However, this also impacts the behavior of "MapAddress". Currently, if your torrc contains:

MapAddress fc00::0001 www.torproject.org

then a client that tries to connect to "fc00::0001" will reach www.torproject.org, but a client that tries to connect to "[fc00::1]" will not. So it would probably be wise to also "canonicalize" addresses used in MapAddress.

comment:2 Changed 5 weeks ago by nickm

Milestone: Tor: 0.4.3.x-final

comment:3 in reply to:  description Changed 4 weeks ago by teor

Replying to liberat:

However, if the address is abbreviated using double-colon notation, it only works if enclosed in brackets: "[2a00:1450:4001:800::200e]" works, but "2a00:1450:4001:800::200e" does not. On the other hand, the unabbreviated form "2a00:1450:4001:800:0:0:0:200e" does work.

The problem appears to be:

  • The destination is transmitted to the exit relay as a string of the form "<hostname>:<port>".
  • The exit relay tries to parse this string by calling the function tor_addr_port_split.
  • The string "2a00:1450:4001:800::200e:80" is a valid IPv6 literal, so tor_addr_port_split interprets it as an address with no port number.
  • The relay refuses to connect to an address with no port number.

This isn't quite right. Addresses and ports in Tor cells are binary. So the string parsing all happens on the client.

Note that if the application uses the binary form (address type 4), this is internally converted into a string enclosed in brackets. However, it seems to be more common for applications to use the ASCII form, without brackets. For example, if you try to visit http://[2a00:1450:4001:800::200e]/ in Tor Browser, it will fail, whereas http://[2a01:4f8:fff0:4f:266:37ff:fe2c:5d19]/ succeeds.

So there are a few ways this could be fixed:

(a) applications could be changed to use either the binary form or wrap the address in brackets;

(b) the Tor proxy could automatically add brackets around IPv6 addresses;

(c) the exit relay could be smarter about parsing IPv6 addresses.

It seems to me that (b) would be the most sensible option, but it might be reasonable to do (c) as well.

So these fixes both have to happen on the client.

comment:4 Changed 4 weeks ago by liberat

This isn't quite right. Addresses and ports in Tor cells are binary. So the string parsing all happens on the client.

I could be wrong. But according to the spec, the payload of the RELAY_BEGIN cell is:

         ADDRPORT [nul-terminated string]
         FLAGS    [4 bytes]

   ADDRPORT is made of ADDRESS | ':' | PORT | [00]

   where  ADDRESS can be a DNS hostname, or an IPv4 address in
   dotted-quad format, or an IPv6 address surrounded by square brackets;
   and where PORT is a decimal integer between 1 and 65535, inclusive.

If the SOCKS client encodes the address in one ASCII format, Tor is able to connect. If the SOCKS client encodes exactly the same address in a slightly different ASCII format, Tor is not able to connect. When it fails, it seems to be that the exit node reports an error. So I am assuming that the ASCII string supplied by the client is indeed what is transmitted in the RELAY_BEGIN cell.

And I assume the fact that this sometimes works when the client supplies an address without brackets is a lucky accident, due to the permissive behavior of tor_addr_port_split.

comment:5 Changed 4 weeks ago by liberat

A less disruptive way to fix the issue would be to do something like this (untested):

--- a/src/core/or/connection_edge.c
+++ b/src/core/or/connection_edge.c
@@ -3063,6 +3063,7 @@
 MOCK_IMPL(int,
 connection_ap_handshake_send_begin,(entry_connection_t *ap_conn))
 {
+  const char *address;
   char payload[CELL_PAYLOAD_SIZE];
   int payload_len;
   int begin_type;
@@ -3092,10 +3093,16 @@
   /* Set up begin cell flags. */
   edge_conn->begincell_flags = connection_ap_get_begincell_flags(ap_conn);
 
-  tor_snprintf(payload,RELAY_PAYLOAD_SIZE, "%s:%d",
-               (circ->base_.purpose == CIRCUIT_PURPOSE_C_GENERAL) ?
-                 ap_conn->socks_request->address : "",
-               ap_conn->socks_request->port);
+  address = ((circ->base_.purpose == CIRCUIT_PURPOSE_C_GENERAL) ?
+             ap_conn->socks_request->address : "");
+  if (strchr(address, ':') && address[0] != '[') {
+    tor_snprintf(payload,RELAY_PAYLOAD_SIZE, "[%s]:%d",
+                 address, ap_conn->socks_request->port);
+  }
+  else {
+    tor_snprintf(payload,RELAY_PAYLOAD_SIZE, "%s:%d",
+                 address, ap_conn->socks_request->port);
+  }
   payload_len = (int)strlen(payload)+1;
   if (payload_len <= RELAY_PAYLOAD_SIZE - 4 && edge_conn->begincell_flags) {
     set_uint32(payload + payload_len, htonl(edge_conn->begincell_flags));

comment:6 in reply to:  4 Changed 4 weeks ago by teor

Replying to liberat:

This isn't quite right. Addresses and ports in Tor cells are binary. So the string parsing all happens on the client.

I could be wrong. But according to the spec, the payload of the RELAY_BEGIN cell is:

         ADDRPORT [nul-terminated string]
         FLAGS    [4 bytes]

   ADDRPORT is made of ADDRESS | ':' | PORT | [00]

   where  ADDRESS can be a DNS hostname, or an IPv4 address in
   dotted-quad format, or an IPv6 address surrounded by square brackets;
   and where PORT is a decimal integer between 1 and 65535, inclusive.

You're right, I was thinking of EXTEND2 cells, not BEGIN cells.

comment:7 in reply to:  1 Changed 4 weeks ago by teor

Actual Points: 0.1
Keywords: tor-client tor-exit ipv6 BugSmashFund check-backport added
Points: 0.5
Status: newneeds_revision

Replying to liberat:

One straightforward way to fix this would be to parse the address using tor_addr_parse and then convert back to a string using tor_addr_to_str:

--- a/src/core/or/connection_edge.c
+++ b/src/core/or/connection_edge.c
@@ -1631,6 +1631,12 @@ connection_ap_handshake_rewrite(entry_connection_t *conn,
     conn->original_dest_address = tor_strdup(conn->socks_request->address);
   }
 
+  /* If the address is an IPv6 literal, either with or without brackets,
+   * convert it into its canonical form and wrap it in brackets. */
+  if (tor_addr_parse(&addr_tmp, socks->address) >= 0) {
+    tor_addr_to_str(socks->address, &addr_tmp, sizeof(socks->address), 1);
+  }
+
   /* First, apply MapAddress and MAPADDRESS mappings. We need to do
    * these only for non-reverse lookups, since they don't exist for those.
    * We also need to do this before we consider automapping, since we might

This also has the effect of transforming the address into "canonical" form. This seems like a good idea anyway, as it reduces possibilities for application fingerprinting by exit nodes.

I prefer this fix, because it canonicalises all addresses.

However, this also impacts the behavior of "MapAddress". Currently, if your torrc contains:

MapAddress fc00::0001 www.torproject.org

then a client that tries to connect to "fc00::0001" will reach www.torproject.org, but a client that tries to connect to "[fc00::1]" will not. So it would probably be wise to also "canonicalize" addresses used in MapAddress.

Yes, we'll also need a fix for MapAddress.

And we'll need tests that use the client code to encode addresses, and the exit code to parse it. Let's have some cases that succeed regardless of the patch. And some other cases that fail without the patch, but succeed with it.

Finally, we'll need to work out when this bug was introduced, so we know whether to backport to 0.2.9, 0.3.5, or 0.4.0 and later.

comment:8 Changed 4 weeks ago by teor

Parent ID: #26664
Note: See TracTickets for help on using tickets.