Opened 8 years ago

Closed 8 years ago

Last modified 7 years ago

#3786 closed task (implemented)

Make clients and bridges use their IPv6 address

Reported by: ln5 Owned by: ln5
Priority: Medium Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay
Cc: nickm Actual Points:
Parent ID: #3563 Points:
Reviewer: Sponsor:

Description

Look at fuzzel's patch and either work on it to make it apply

-or-

use it as a shopping list for what needs to be done.

Pay attention to data structures which could suffer from growing substantially. Some may need refactoring.

Child Tickets

Change History (26)

comment:1 Changed 8 years ago by nickm

Jeroen's sent in a new version of his patch. I've uploaded it to my public repository as branch "jeroen-ipv6".

I haven't had a chance to review it closely yet. First step is going to be splitting it up into logically separate units: a 4.7kline patch on its own is really hard to figure out.

comment:2 Changed 8 years ago by ln5

Summary: Change uint32_t --> tor_addr_t where neededMake clients and bridges use their IPv6 address

Renaming this ticket to reflect what will actually be done in the scope of "bridges on ipv6" (#3563).

This work is based on the implementation of proposal 186 (former 118), #3785, where a second address tor_addr_t ipv6_addr (and corresponding port) was added to routerinfo_t.

comment:3 Changed 8 years ago by ln5

Cc: nickm added
Status: newaccepted

Branch bug3786 in git.tpo/linus/tor.git now contains enough to make a
client successfully use a bridge over IPv6. Client in my tests was on
FreeBSD and the bridge was on Ubuntu.

The least clean part of this is probably the way both clients and
bridges select between the IPv4 and IPv6 addresses of a bridge. This
is written with the assumption that we will have a configuration
option PreferIPv6 which makes both clients and bridges prefer an IPv6
address if it's set.

This is probably not the right way to do this though. I do think that
we want a PreferIPv6 option but it shouldn't be used in all places
bug3786 does right now.

Clients should do what arma suggested in #tor-dev and "scribble on a
descriptor" which address to prefer. This can be based on a config
option or something else, depending on context.

Bridges should not make use of that option at all, I think. In
bug3786 it's used for router_get_advertised_or_port() and
tor_check_port_forwarding(). It's unclear to me how we should deal
with these issues. I'd love to see them addressed in #3785.

comment:4 Changed 8 years ago by ln5

Branch bug3786_hacking contains something that works for several cases but not all, see below.

On the client side, we no longer depend on a config option but instead marks the routerinfo_t of a bridge with an ipv6_preferred flag when we find that an IPv6 address has been configured for the corresponding bridge. Since we turn the flag off when we find that an Ipv4 address has been configured, the state of the flag is dependent on the order of the configuration entries but I think that's good enough.

On the bridge side, we prefer an IPv4 port over an IPv6 port (unless BRIDGE_PREFERS_IPV6 is defined at compile time) but use the IPv6 if there's no IPv4 present.

I'd like nickm to take a look at this with an eye on design flaws and missing pieces. Comments on code quality shortcomings are of course welcome too. I'm unsure how it should be packaged though. This branch is a bit hackish, with later commits changing code from earlier commits. Would it be better with another branch (bug3786 f.ex.) with fewer commits, each with a clear goal?

Successfully tested:

  • v4 client using a v4+v6 bridge
  • v6 client using a v6 bridge
  • v4+v6 client using a v4+v6 bridge
  • v4 client using two v4 bridges
  • v4 client using three v4 bridges
  • v4+v6 client using one v4 bridge and one v6 bridge
  • v4+v6 client using a v4 bridge

Not working:

  • v4+v6 client using a v6 bridge

The last test case fails because the v4 connection fails and the v6
address is never tried. If entry_guard_register_connect_status()
could flip the ipv6_preferred bit in the bridges routerinfo_t, that'd
probably help. I can't see how we could find the routerinfo_t from
there though.

comment:5 Changed 8 years ago by nickm

Status: acceptedneeds_review

Putting this in "needs review", to make sure it doesn't slide off my queue.

comment:6 Changed 8 years ago by nickm

If entry_guard_register_connect_status()
could flip the ipv6_preferred bit in the bridges routerinfo_t, that'd
probably help. I can't see how we could find the routerinfo_t from
there though.

If you've got the identity digest, you can do router_get_mutable_by_id_digest(). {Or node_get_mutable_by_id() if you want a node_t.}

If all you have is the address:port (for a bridge) and you don't know the identity yet, I think the right place to make notes is probably the bridge_info_t.

Reading through the code now to find things to comment on...

comment:7 Changed 8 years ago by nickm

Initial notes on skimming:

I love how you are doing this as a bunch of little commits'''

This is going to be easier to review than I'm used to.

6a92ca076c11b3: Decapitalize 'V' in IPV*Only

Trivially correct; doesn't actually matter for strcasecmp, but that's how we specced it, I believe.

5b302fb679c91:

Hm. The idea of a single "alt" addr feels like it's going to hurt us. Maybe instead we want the idea of a "best IPv6 address for connecting" or "best address for connecting" or something?

Also, the documentation for functions that can return -1 on error should say so, and maybe why they would do that.

Our style for doxygen comments is to have a * at the start of each line of documentation.

node_does_prefer_ipv6 would seem to crash if it ever gets a node without a routerinfo. Also, node_ipv6_is_preferred might be a better name? We're not asking whether the node prefers ipv6; we're asking whether we prefer using ipv6 with that node.

I wonder if it makes sense to have all of these prim/alt/pref functions to return both an ORPort and an addr ? (returning the or_port via a pointer argument, I'd guess.)

Mutable fields belong in node_t more than in routerinfo_t. The ones that are there are legacy; I'd prefer to have prefer_ipv6 there if possible.

Also, I think we need a separate bit for whether we think the node is running at either address. Right now we use node->is_running for whether we think the node is up. I think we might need one bit per address; otherwise, it's hard to have the notion "I can connect at this address but not that one."

20e4621fb6fc976a793f06: Add helper functions

We can do this if we want to be pedantic, and I have no objection to being pedantic, but there are a bunch of places where we already assume that AF_INET==PF_INET, and AF_INET6==PF_INET6. I have never run into an OS where this wasn't true, but if you think it's going to be a problem anywhere you can think of, we should probably open a ticket for it.

d380fce501595c17a0: Use correct address and protocol family

Looks good

1b9c82fd6d1206d9:

Looks good.

As a style thing, we're completely willing in Tor to say "if (foo)" where you say "if (foo != NULL)", but we accept either.

0151ae4bec9bafe435: Make get_primary_or_port() consider IPv6

I don't know how you could use this; when you call get_primary_or_port(), how do you know which port you got unless you know what address it goes with?

And for the current use, I think it's wrong. The function get_primary_or_port() is called only by router_get_advertised_or_port, which uses it to decide which port to use in the "router ..." line of a router descriptor -- and that's the IPv4 port.

[FWIW, there will need to be a fix in router_get_advertised_or_port ; when it does connection_get_by_type(CONN_TYPE_OR_LISTENER), it will need to make sure it gets an IPv4 listener. Right now it doesn't.]
[Please ignore that last comment if you fixed it later.]

34a44559fa3ae1d2d5fb2: Make rewrite_node_address_for_bridge() match IPv6 addresses too.

Looks mostly okay. (Quibble: your brace and indentation style looks a little off. We use K&R, with 2-space indents. I see some one-space indents here {or are they tabs? can't tell}, and an "} \n else {" when "} else {" would be more usual for us.)

I am not sure how I feel about replacing ri->address. (What do we actually use ri->address for anyway?)

0289f47ade0258c: Don't use the IPv4 address if there's an IPv6 address and IPv6 is preferred

This seems mostly plausible but a little problematic. I wonder if prefer_ipv6() is enough for us? It seem that in the medium term, we want not only the idea of "I support IPv4 and IPv6, but prefer (IPv6/IPv4)" but also the idea of "I only support IPv4" and "I only support IPv6". [Oh, never mind! It looks like you removed it later.]

Also, extend_info_from_router() and node_get_addr() are called in too many cases to make this change for all the users of the function. Sometimes, when you're making an extend_info(), it's so you can connect to a node directly. Sometimes, it's so you can tell another node on your circuit to connect there (I think). The same applies to nodes, I believe.

The change to node_get_prim_addr seems wrong, I think. It makes sense to make that change for a node's preferred address, but I thought a a node's primary address was supposed to be IPv4 automatically.

3a812331514e25479: Allow extending to IPv4 and IPv6 addresses

We don't want this patch yet, I think. To extend a circuit is (usually, assuming that we're being terminologically precise) to tell the current last hop of the circuit to add another hop at the end of the circuit. Right now, we can't do that: current Tor nodes don't support EXTEND requests to an IPv6 address.

So right now, I think we only want to explicitly allow IPv6 addresses for the _first_ hop of the circuit. I believe that's the first main branch of circuit_send_next_onion_skin().

ef7a9f327b982dce795a:

This messes more with 0151ae4bec9bafe435 ; the same issues apply.

b11e2ddb01eebfb:

Looks fine.

33d3730cdac6a8c175:

Do you want prim_addr or pref_addr here? I thought pref_addr was the one that we wanted to connect to, and prim_addr was the "primary", "official" ipv4 address.

bc6c988482d77575b582: Remove the need for a "prefer ipv6" option.

Using pref_addr_port is still problematic in extend_info_from_router: see 0289f47ade0258c. We use extend_info_from_router not only for connecting directly to the router, but also for telling other routers whom to connect to when they extend.

b90ff192dae076d02d3fa: Parse IPv6 addresses in descriptors

Looks fine

51226c36e720bad6: Indicate preferred IP family for a bridge based on configured bridges

Maybe better to stick this flag node_t; but not a disaster if you can't move it by the end of the month.

What happens if the client lists both the v4 and the v6 address for a bridge?

b6d741abc547a4bdd:

The whole pack of changes to "get_primary_or_port()" are still problematic; see above.

several documentaion-only commits:

Looks fine.

4cb534e23caecb14: Handle bridges with IPv6-only orport better:

Looks ok.

Wow, that's a pile of comments! Please let me know about any of these you've got questions about, and I'll try to answer.

comment:8 in reply to:  7 Changed 8 years ago by ln5

Replying to nickm:

33d3730cdac6a8c175:

Do you want prim_addr or pref_addr here? I thought pref_addr was the one that we wanted to connect to, and prim_addr was the "primary", "official" ipv4 address.

This is about connecting so I think pref_addr is correct.

comment:9 in reply to:  7 ; Changed 8 years ago by ln5

Replying to nickm:

What happens if the client lists both the v4 and the v6 address for a bridge?

Last handled bridge wins, i.e. it depends on the order of Bridge
lines.

Is this is good enough for now or should we simply drop one of them?

If so, which one? rransom suggested dropping the IPv4 address based
on the situation in China. I think that falling back to old client
behaviour, i.e. IPv4, might be the least surprising thing to do
though.

comment:10 in reply to:  9 Changed 8 years ago by nickm

Replying to ln5:

Replying to nickm:

What happens if the client lists both the v4 and the v6 address for a bridge?

Last handled bridge wins, i.e. it depends on the order of Bridge
lines.

Is this is good enough for now or should we simply drop one of them?

Seems good enough for now to me.

comment:11 Changed 8 years ago by ln5

Please review branch bug3786_2 in my repo. It addresses 15 out of 18
issues from your last review. (Thanks for the informative review btw.)

Issues not addressed:

  • make all the prim/alt/pref addr functions return both addr and port (after bug3786_2 is ok)
  • move the prefer_ipv6 flag to node_t (probably after nov 30)
  • separate bits for node->is_running, one per address, needed for bridges on both v4 and v6 (after nov 30)

comment:12 Changed 8 years ago by nickm

In extend_info_from_* , I think that use_pref_addr is the wrong name for the argument; I think instead it should be "for_direct_connect" or something. That way, the caller only needs to know what they want to do with the extend_info; not what their needs imply about how the extend_info needs to be generated.

The test in circuit_get_open_circ_or_launch looks right to me: in the case of a onehop circuit, the extend_info for the exit node you want should be the one you're using to connect to it.

If node_get_all_orports() is at all near the critical path, we should think about some way to make it faster: allocating and freeing little objects over and over can get expensive.

Have a look at the callers of node_get_{prim,perf}_{addr,orport} : would it make more sense or less sense to make these functions yield a tor_addr_port_t instead?

In get_configured_bridge_by_routerinfo: if there is no bridge configured with the preferred address, should we fall back and check whether there is one configured with the primary address? Or did I misunderstand something?

Otherwise looks okay, I think. If you think I should merge after the next round, please include a changes file, and let me know how tested this is/isn't at that point.

comment:13 in reply to:  12 ; Changed 8 years ago by ln5

Replying to nickm:

In extend_info_from_* , I think that use_pref_addr is the wrong name for the argument; I think instead it should be "for_direct_connect" or something. That way, the caller only needs to know what they want to do with the extend_info; not what their needs imply about how the extend_info needs to be generated.

Good point. Fixed in eed5d661.

The test in circuit_get_open_circ_or_launch looks right to me: in the case of a onehop circuit, the extend_info for the exit node you want should be the one you're using to connect to it.

Great. Comment removed in eadc5594.

If node_get_all_orports() is at all near the critical path, we should think about some way to make it faster: allocating and freeing little objects over and over can get expensive.

Would you suggest node_get_all_orports() taking a smartlist pointer as an argument? And even two (optional) tor_addr_t pointers?

Have a look at the callers of node_get_{prim,perf}_{addr,orport} : would it make more sense or less sense to make these functions yield a tor_addr_port_t instead?

Yes, that's what I intend to do next.

In get_configured_bridge_by_routerinfo: if there is no bridge configured with the preferred address, should we fall back and check whether there is one configured with the primary address? Or did I misunderstand something?

We should check "the other" address. I totally missed that.
More generally, we should check all addresses, preferring "the preferred".

Otherwise looks okay, I think. If you think I should merge after the next round, please include a changes file, and let me know how tested this is/isn't at that point.

I'll squash {node,router}_get_{prim,alt,perf}_addr with their _orport counterparts and then ask for merge.

comment:14 in reply to:  13 ; Changed 8 years ago by ln5

Replying to ln5:

In get_configured_bridge_by_routerinfo: if there is no bridge configured with the preferred address, should we fall back and check whether there is one configured with the primary address? Or did I misunderstand something?

We should check "the other" address. I totally missed that.
More generally, we should check all addresses, preferring "the preferred".

On second thought, is this really necessary? Since a router won't
have ri->ipv6_preferred set unless we've seen an IPv6 address in its
descriptor fetching by preferred address won't fail.

What worries me more is the way
get_configured_bridge_by_addr_port_digest() might return a bridge with
another address than the one we ask for. This might happen when we
don't pass it a digest. How bad would that be?

comment:15 Changed 8 years ago by nickm

I don't think it'd be too bad, though a second opinion from arma might be wise.

The one case where it's really important that addresses match is that we shouldn't connect to a bridge at any address other than the one we configured for it in the bridge line in our torrc. If we've got that property, I think we're okay.

comment:16 in reply to:  13 ; Changed 8 years ago by nickm

Replying to ln5:

If node_get_all_orports() is at all near the critical path, we should think about some way to make it faster: allocating and freeing little objects over and over can get expensive.

Would you suggest node_get_all_orports() taking a smartlist pointer as an argument? And even two (optional) tor_addr_t pointers?

I'd suggest no change unless we find out that it is in fact critical-path, or that it is in fact causing memory fragmentation or some other problem.

comment:17 in reply to:  16 Changed 8 years ago by ln5

Replying to nickm:

If node_get_all_orports() is at all near the critical path, we should think about some way to make it faster: allocating and freeing little objects over and over can get expensive.

Would you suggest node_get_all_orports() taking a smartlist pointer as an argument? And even two (optional) tor_addr_t pointers?

I'd suggest no change unless we find out that it is in fact critical-path, or that it is in fact causing memory fragmentation or some other problem.

Adding a comment and will try to come up with a test case that stresses this then. Thanks.

comment:18 Changed 8 years ago by ln5

I've been testing this branch by connecting clients to one or two
bridges, IPv4 and IPv6, from a couple of different kind of systems
(Ubuntu, Windows, OSX, FreeBSD and NetBSD) with bridges running on
Ubuntu, FreeBSD and NetBSD.

Weird things noticed include clients sticking to their IPv4 connection
even when the IPv6 address of a bridge with both families is
preferred. Another thing is that we fail bootstrap with 404 from the
bridge now and again unless UseMicrodescriptors 0.

Code is still on bug3786_2, now rebased onto master.

Please review eed5d661..HEAD and merge if OK.

comment:19 in reply to:  15 Changed 8 years ago by ln5

Replying to nickm:

The one case where it's really important that addresses match is that we shouldn't connect to a bridge at any address other than the one we configured for it in the bridge line in our torrc. If we've got that property, I think we're okay.

A client will not connect to an IPv6 address not found in a Bridge
line since that address won't evern show up in a bridge_info_t and
rewrite_node_address_for_bridge() is the only place where
ri->ipv6_preferred is set.

A client will _hopefully_ not connect to an IPv6 address not found in
a Bridge line but the closest to a proof that I can present for this
is that I've tested quite a lot with clients using only an IPv6
address to bridges not binding to an IPv4 address.

That's the best I've got right now.

comment:20 in reply to:  18 Changed 8 years ago by nickm

Replying to ln5:

I've been testing this branch by connecting clients to one or two
bridges, IPv4 and IPv6, from a couple of different kind of systems
(Ubuntu, Windows, OSX, FreeBSD and NetBSD) with bridges running on
Ubuntu, FreeBSD and NetBSD.

Weird things noticed include clients sticking to their IPv4 connection
even when the IPv6 address of a bridge with both families is
preferred. Another thing is that we fail bootstrap with 404 from the
bridge now and again unless UseMicrodescriptors 0.

Code is still on bug3786_2, now rebased onto master.

Looking at the new code, I'm not finding any showstoppers. We'll want to make sure the remaining known bugs all have tickets.

The branch is kinda screwed up, though!

92245f35bc593dad and f8687eba80cccc33 are merges from master back into the branch, which shouldn't be there if it's really rebased onto master. Moreover, lots (all?) of the patches seem to have been duplicated (??); like, 1e66d6b57651 and 1898c009e86531117; c7dc65c882c and eed5d661d463d95). Can you figure out what went wrong there and fix that?

Moreover, there is unrelated stuff (1d504759b30a6541815d9134d4f27a01fd5cea56, f98364cec66717aa1ebc32c2f0241e9d45681723) which I'd usually prefer to take separately. Not a big deal; but I don't know why it's still there.

(In the future, you should never give a rebased branch the same name as the old one. It makes it much harder to fix stuff like this.)

comment:21 Changed 8 years ago by ln5

Ugh, git fu 0, obviously. :-(
How does my branch ipv6_priv_bridges look to you?

Ack re tickets, adding shortly.

comment:22 Changed 8 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

Merged!

comment:23 Changed 8 years ago by nickm

(Make sure that you base subsequent work here on master, rather than on your old branch, since I just squashed and rebased that one before merging it.)

comment:24 in reply to:  14 Changed 8 years ago by ln5

Replying to ln5:

What worries me more is the way
get_configured_bridge_by_addr_port_digest() might return a bridge with
another address than the one we ask for. This might happen when we
don't pass it a digest. How bad would that be?

For the record, this is not true. If we pass it an empty digest, we
might get a bridge with the address we're passing. If we pass it a
non-empty digest, we might get a bridge with that identity. None of
these cases differ from the IPv4 only world we used to live in.

comment:25 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:26 Changed 7 years ago by nickm

Component: Tor RelayTor
Note: See TracTickets for help on using tickets.