I run Tor 0.2.4.17-rc as a bridge configured with multiple IPv4 addresses and one ORPort line for each, like so:
ORPort :443
ORPort :443
...
ORPort :443
When starting tor, the log will show that only is self-tested and a bridge descriptor is published:
[notice] Now checking whether ORPort :443 is reachable... (this may take up to 20 minutes -- look for log messages indicating success)
[notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
Tor should publish bridge descriptors for all the available addresses and the bridge authorities should probably put them into pools of different confidentiality levels.
For clarity, the documentation should explain Tor's bandwidth allocation as configured with Relay(Bandwidth|Burst)Rate (per address or total).
Trac: Username: sqrt2
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
Trac: Username: sqrt2 Summary: Make bridges publish a bridge descriptor per ORPort address to Make bridges publish additional ORPort addresses in their descriptor
sysrqb has confirmed that there is only a descriptor for known to BridgeDB, and there are no additional "a" lines in the descriptor.
There is only supposed to be one "a" line in a microdescriptor.
The other addresses don't appear in the log even at debug log level.
Out of curiousity, the extra <n2>, <n3, etc. addresses you're adding: are they IPv4 addresses?
Because there was this strange thing that I just noticed yesterday where the "or-address" lines in the @type-bridge-server-descriptors are only IPv6 addresses, never IPv4. The same thing goes for the "a" lines; they are always IPv6 addresses for bridge descriptors.
The extra ones in those 122 addresses could just be newer bridges, because two of my test bridges are among them. Also doing
bridgedb@ponticum:~$ cat /srv/bridges.torproject.org/from-authority/networkstatus-bridges | grep -Pzo 'a .*\na .*'
comes up with two descriptors which have two "a" lines, but on closer inspection my grep was matching an "a" on the end of the nickname string for two different bridges, not the beginning of the line.
So there are no @type bridge-networkstatus nor @type bridge-server-descriptor descriptors with multiple "a" or "or-address" lines. When either of these lines do occur, they are always IPv6 addresses.
Yes, these are all IPv4 addresses (though I could also add reachable IPv6 addresses if that helps with testing).
The fact that in reality there are only IPv6 addresses appears to be a due to missing code that would make anything else happen; quoting struct routerinfo_t from or.h:
/** A router's IPv6 address, if it has one. */ /* XXXXX187 Actually these should probably be part of a list of addresses, * not just a special case. Use abstractions to access these; don't do it * directly. */ tor_addr_t ipv6_addr; uint16_t ipv6_orport;
"a" SP address ":" port NL [Any number] The "or-address" element as specified in 2.1.
would seem to imply that we can have as many as we want (at least it won't affect descriptor parsing), and that they can be IPv4 or IPv6 addresses.
though in tor.git/src/or/router.c in router_rebuild_descriptor(), the routerinfo_t *ri only has one extra orport, which is always used for IPv6:
/* For now, at most one IPv6 or-address is being advertised. */ { const port_cfg_t *ipv6_orport = NULL; SMARTLIST_FOREACH_BEGIN(get_configured_ports(), const port_cfg_t *, p) { if (p->type == CONN_TYPE_OR_LISTENER && ! p->no_advertise && ! p->bind_ipv4_only && tor_addr_family(&p->addr) == AF_INET6) { if (! tor_addr_is_internal(&p->addr, 0)) { ipv6_orport = p; break;[...] if (ipv6_orport) { tor_addr_copy(&ri->ipv6_addr, &ipv6_orport->addr); ri->ipv6_orport = ipv6_orport->port; }
So this behaviour is not a bug, though for bridges it is certainly desirable to have more than one address per bridge.
If you make bridges send multiple descriptors to the BridgeAuthority, there could be some problems:
The same bridge (in this case, by "bridge" I mean "box") would have multiple entries in BridgeDB's hash rings, meaning an increased likelihood of being handed out. Similarly, in the "social bridge disributor" design for BridgeDB, because this one box has more descriptors in the database, it's likelihood of being handed out is correspondingly higher.
This is probably fine as long as each of these descriptors is a different IP address, and there is not an easy way, given one of these IPs/descriptors, to discover the other ones (i.e. via a reverse DNS query and then a regular DNS query to get the other addresses for that host).
You might have to do a bit of juggling to keep track of which descriptor was updated and resent to the BridgeAuthority when, though my guess is that in practice routers with multiple IP routes configured are not going to be shifting IP addresses too much.
You probably should not allow multiple descriptors to contain the same address. I.e. something like:
{{{
Bridge
|- desc1
| |- ORPort 1.1.1.1:443
| |- ORPort 12.12.12.12:4443
| |_ ServerTransportListenAddr obfs3 1.1.1.1:3333
|
|_ desc2
|- ORPort 2.2.2.2:2222
|_ ORPort 12.12.12.12:6666
}}}
would be bad for the following reasons:
Double the chance of handing out 12.12.12.12, so double the chance it gets blocked.
Since these separate descriptors would essentially be separate bridges, for the social bridge distribution scheme (the last of the redesign is being hammered out now), if desc1 got blocked (i.e. the 1.1.1.1 IP address was blackholed), then desc2 would also be blocked via the 12.12.12.12 address. In this case, the clients of 2.2.2.2 would stop earning coins for the desc2 bridge, since it's been blocked. It might be possible to further revise the scheme to deal with cases like this, but you can see how it would start to get hairy.
My preference would be to have one descriptor per IP address (that way PTs can run on that same address under alternate ports). Perhaps nickm and others have different concerns, however.
Trac: Status: new to needs_information Version: Tor: 0.2.4.17-rc to Tor: 0.2.5.1-alpha Cc: N/Atoisis@torproject.org
Contrary to what I wrote initially when I said Tor should publish "descriptors for each address", I think that the OR should not be trusted with deciding how exactly its addresses are going to be published, as not to make it possible to corrupt BridgeDB in the sense that statistics would show many bridges, but many of them are easily blocked as a group due to "treacherous" descriptors. (An odd type of attack, admittedly.) I agree that it is probably best if BridgeDB then treats each IP address as a separate bridge when it comes to handing out bridge addresses.
Therefore, staying within the scope of this ticket, I think that Tor should publish just one descriptor with additional "a" lines containing both alternative IPv4 and IPv6 addresses. (I suppose for the BridgeDB side it would then make sense to open a separate ticket against the BridgeDB component?)
For BridgeDB, there is not a specific ticket, but the problem of how to treat multiple addresses assigned to one "bridge" (and how to define exactly what one "bridge" is) is dealt with in multiple tickets as this is currently an ongoing issue. See #8614 (moved), #9652 (moved), and #9626 (moved).
Also, there was proposal 186-multiple-orports, the status of which is "partially implemented" due to, it seems, the ADDRLIST which you previously mentioned not being implemented yet. Since the proposal was previously accepted, I'd recommend reading it and searching around for the tickets related to it, and talk to whoever worked on it to determine the best way forward.
I think that it's fine to publish as many "or-address" lines per descriptor as we want. What would be harder is getting Tor to handle addresses beyond the first IPv4 address and the first IPv6 address.
The partial state of 186 is because it was much easier to implement the case with exactly one of each address type than it was to get clients to use additional addresses of each type. It also seemed hard to get authorities to do the right thing with testing large numbers of addresses, in the timeframe we had.
I've been looking through the code, and (considering that I can use my alternate bridge addresses as Bridge and that works flawlessly) my impression is that as an OR, reachability self-testing is the main area that currently depends on addr/or_port being simple uints.
It is also my impression, considering that IPv6 self-testing is currently not being done at all, that this ticket could be addressed with another hack. We coul add another field for additional IPv4 addresses/ports to routerinfo_t, and modify router_dump_router_to_string() and router_parse_entry_from_string() appropriately. In other words, we would be ignoring reachability testing for the additional ORPorts completely.
(One could do the same for additional IPv6 addresses, but because blocking entire netranges is much more dangerous for IPv4 than it is for IPv6, additional IPv4 bridge addresses are probably a lot more useful to the Tor network than is the case for IPv6 addresses.)
Am I missing something or would that work, and should we do it?
I think that if it's bridges-only, and we have some mechanism to make sure that the test for each happens at some point, it's not a huge issue where the test occurs in the first revision; whether it's a self-test or a test performed by the bridge authority or by bridgedb, or something else entirely.
So long as testing is happening somewhere, adding an additional-addresses field to routerinfo wouldn't be a horrible thing.
That said, we shouldn't give users bridge addresses without something testing them.
Reachability self-checking for multiple addresses requires that we can EXTEND a circuit to a specific IP address (and not just a specific OR) and that we can detect to which address a CREATE cell was sent.
Directory authorities, however, just do a TLS handshake with a specific address/port, and it shouldn't be too hard to adapt dirserv_single_reachability_test() to do this for every address in the descriptor.
I'll try to write a patch implementing these features in the coming days.
ok. best to make sure this gets rate-limited, so that a dirserver doesn't go crazy if it sees a node with 100 ports. This should also be bridges-only for now.
Eventaully, we will want the self-test to happen, so that nodes don't accidentally make themselves get rejected if they have one non-working address out of some large number.
OK, here it goes. What this is basically supposed to do is
Make address detection handle interfaces that have multiple addresses of the same address family better. get_interface_address{,6} returns a list, get_stable_interface_address still returns a single address that only changes when it must.
If we are a bridge, add additional listener addresses to our descriptor. (This is routerinfo_t.more_or_listeners.)
Make additional or-address lines in descriptors parseable.
Give directory authorities the ability to test the reachability of additional addresses in the descriptor. (Limited to 7, for now.)
I've tested this some, but I'd appreciate it of course if someone else gave it a spin as well.
Trac: Username: sqrt2 Status: needs_information to needs_revision
I have updated the attached patch to a version that I believe to be working properly. I've also written an explanation of the bigger changes in this patch to hopefully make understanding what it does easier:
First, if we consider tor to possibly have multiple OR listener addresses, when detecting addresses on an interface, we must be able to return multiple addresses. Therefore, we modify get_interface_address6() to return a smartlist_t. Now, during configuration, we must take into account that not any routable IP address on an interface is in fact a good address to use. For this, we add find_good_addr_from_list(), that will pick one of the addresses returned by get_interface_address6(), preferring addresses that the user has explicitly configured for our purpose. (resolve_my_address() has gained a parameter "listener_type" that specifies this purpose: OR, directory, etc.)
We must now take care of the fact that get_interface_address6() is also used by client_check_address_changed() to find if a client needs to rekey because its address has changed. Because depending on the operating system to always return interface addresses in the same order seems like asking for trouble, we need a mechanism to return an interface address that only changes if we can't use the previous one anymore (the interface isn't configured with this address anymore). This mechanism is get_stable_interface_address6().
We also modify test_addr.c to make it compile with the new get_interface_address6().
The address found by resolve_my_address() ends up as our main IPv4 address. To maintain compatibility with existing IPv6 code, we also need to find a main IPv6 address to put into routerinfo_t.ipv6_addr. This code is now in router_get_main_ipv6_listener_address(). Together, these can be queried in router_get_main_listener_address_by_af().
Sometimes, we need to find our OR listener port and don't have a routerinfo_t to look it up. Previously, we would just iterate through open connections in router_get_active_listener_port_by_type_af(). However in the presence of multiple listeners per address family, we need to also specify the exact listener address to do this in a stable manner. As the old function only gets called in situations where we also know the listener address, this function is now router_get_active_listener_port_by_addr_type_af() and associated functions have been changed accordingly.
Finally, we add a smartlist_t *more_or_listeners to routerinfo_t and, if we are a bridge, populate it with all the additional addresses in router_rebuild_descriptor(). We modify router_dump_router_to_string() and router_parse_entry_from_string() to include these additional addresses. We also adapt various other functions concerned with querying a router for addresses or ports.
We want the bridge authorities to test the reachability of these additional addresses. For this purpose, we modify dirserv_single_reachability_test() to test at most 7 additional addresses (in order not to overload the bridge authorities).
Because we're checking multiple addresses for reachability now, we need to keep track of reachability for each address separately. For this, we add a struct addr_reachability_t that contains a tor_addr_port_t and the time we could reach this address/port last. The last_reachble and last_reachble6 fields of node_t are replaced by a smartlist_t *last_reachable. To manage this new field, we add node_set_last_reachability() and node_get_af_last_reachability(), and node_af_reachable_since() with helper functions addr_replied(), all_listeners_replied(). Finally, we modify various functions in dirserv.c to make use of this new API.
Lacking the brains for a high-level review, I'm going for a partial
look-at-details review for now, in the hope that it might help shake
out some issues.
Also, note that most of router*.[ch] and test/test_addr.c have not
been reviewed yet.
are we leaking memory in get_interface_address6() in "smart way"
when return_addrs is empty after the loop (returning NULL)?
the construct with r = return_addrs in get_interface_address6() is a
bit odd IMO; the test !r will never be true f.ex.
are we leaking memory (the list in r) in get_interface_address()?
slight code duplication in dirserv_orconn_tls_done() when calling
node_set_last_reachability()
is the check for the ipv6 address in
dirserv_single_reachability_test() backwards? i.e. should check for
tor_addr_compare() == 0
did router_get_active_listener_port_by_addr_type_af() get the
tor_addr_compare() test backwards too?
the call to tor_free() in router_get_advertised_or_port_by_af() (at
router.c:1508) is never reached and we're leaking that memory
needs a ChangeLog entry (i.e. a file in changes/)
documentation for last_discovered_ipv*_address refers to a function
that does not exist (change to get_stable_interface_address6?)
Formatting and style.
remove trailing whitespace
no need to break a line like
"port_cfg_t * router_get_main_ipv6_listener_address(const smartlist_t *ports);"
i'd add curly braces to constructs like
else
/* We got here from options_validate(), and we're just checking if we're
* globally reachable and can be a directory authority. This function will
* be called again when ports have been configured and we can pick the right
* address then. */
log_fn(notice_severity, LD_CONFIG,
"We have at least one usable address, that's enough right now.");
indentation is wrong in a couple places (like address.c:1479)
long lines here and there (see 'Wide' when doing "make check-spaces")
no need for '' to continue a line (like dirserv.c:2966)
i'd break lines like this in two:
if (chan) command_setup_channel(chan);
So sorry for the long delay. I'll try to finish review by end of year.
In the meantime, the most useful thing to add here would be unit tests for all the new code, to whatever extent is possible. (The new mocking code in 0.2.5 should make it possible to test stuff that might not have been very testable before.
Can you describe in detail how you've tested this ? (Apologies if you've already said; just link if so.)
QUick notes:
In get_interface_address6(), we used to never return internal addresses, but it looks like that code got removed?
In get_interface_address6(), are there still any leaks of addr? For example, in the case where we return NULL instead of goto err?
In get_stable_interface_address6(), do we leak 'list' ?
All of the functions that allocate a new thing for their return value should document "returns a newly allocated copy of". Example: get_first_address_by_af. Or perhaps it should take a const smartlist_t * and return a reference to the appropriate element.
get_interface_address -- is it still necessary?
Why is find_good_addr_from_list IPv4-only? IT seems that having it take a tor_addr_t would make it more robust.
I will try to find more stuff later. I need to remember that just because a patch is long doesn't mean that I can't usefully review part of it. :/
In the meantime, the most useful thing to add here would be unit tests for all the new code, to whatever extent is possible. (The new mocking code in 0.2.5 should make it possible to test stuff that might not have been very testable before.
I assume you're talking about the MOCK* macros from test/testsupport.h. I will take a look at this.
Can you describe in detail how you've tested this ? (Apologies if you've already said; just link if so.)
I haven't done any rigorous testing. What I've done is (making use of a number of globally routable addresses I have) built a small network with a bridge directory server and a bridge with AlternateBridgeAuthority, adding addresses to the the local interface and configuring some of them/all of them/none of them as ORPorts, then observing that the bridge detects its proper addresses and that the directory authority learns of the bridge and its addresses.
So far, I've tested this with IPv4 only.
In get_interface_address6(), we used to never return internal addresses, but it looks like that code got removed?
In resolve_my_address(), we want to allow explicitly configured internal addresses (except if we are a DirAuth, step three). get_interface_address6() now also returns internal addresses so find_good_addr_from_list(allow_internal=1) can match these to any explicitly configured addresses.
In get_interface_address6(), are there still any leaks of addr? For example, in the case where we return NULL instead of goto err?
Yes, fixed.
In get_stable_interface_address6(), do we leak 'list' ?
Fixed.
All of the functions that allocate a new thing for their return value should document "returns a newly allocated copy of". Example: get_first_address_by_af. Or perhaps it should take a const smartlist_t * and return a reference to the appropriate element.
Fixed.
get_interface_address -- is it still necessary?
I don't think so. I've removed it now.
Why is find_good_addr_from_list IPv4-only? IT seems that having it take a tor_addr_t would make it more robust.
Fixed.
I have also added some code to handle the case where we guess a hostname, but the resolved address is not actually one of the configured ORPorts.
I've finally got time to add some tests. I've so far written some for all the bits that changed in common/address.c.
Additionally, some small "bugfixes" have been added to common/address.c, mostly using smartlist_free to free all memory allocated in smartlist_new(), not just the pointer to the struct.
@sqrt2: 0001-Make-bridges-publish-additional-addresses-in-their-d.patch doesn't apply to any of tor-0.2.4.17-rc (or -16 or -18) or master@{2013-11-17}. What is the baseline?
The code in get_stable_interface_address6 seems to assume that 'family' is AF_INET or AF_INET6. Maybe it should check that early on.
It would probably be good idea to have a function to malloc and copy a single address; that pattern happens a lot.
get_stable_interface_address6 looks like it could benefit from the "goto end" pattern of having a single point of cleanup.
In config.c:
We use tor_assert, not assert().
XXX come back and re-review resolve_my_address. It's too complicated. :(
In dirserv.c:
Wow, that conditional in dirserv_set_router_is_running is pretty complicated. Can we turn it into a function?
Local variables named "l" are begging for confusion with numberes named "1".
In nodelist.c:
In node_set_last_reachability, what is the point of setting ar->addrport.port and ar->addrport.addr immediately after checking whether those fields are equal to the values you're about to set them to? And why are we copying ap->addr into *addr?
The other new functions here need documentation.
addr_replied should have a name that makes clear that it's a predicate. Perhaps "addr_has_replied"? Similarly with all_listenres_have_Replied. Also, its arguments should be const.
all_listeners_have_replied looks a little expensive. We should make sure it isn't in a critical path.
Some of these functions should probably be static. Or STATIC, if they are getting tested directly.
test_addr.c:
Maybe some of these tests should use the test mocking functionality, so that they don't depend so heavily on the actual IP addresses of the running host?
Throughout:
Probably most of the loops should be declaring the type as "const tor_addr_t *", not "tor_addr_t *".
Try building this branch after running configure with "--enable-gcc-warnings". That will turn on a bunch of warnings that aren't on by default, but which we use to write cleaner code.
Most tor_addr_compare usage could be more cleanly written as tor_addr_eq.
Try to write if-else statements with braces on both the "if" and "else" clauses. Leaving braces out makes the code a little harder to review, since I need to make sure that the indentation is right.
This could still use more testing. Taking a look at the results of running the test coverage tools (see doc/HACKING for more info), I see that only about 1/3 of the new or modified lines have any tests reaching them.