Opened 11 years ago

Closed 9 years ago

Last modified 7 years ago

#652 closed defect (fixed)

Tor doesn't detect when the world cannot reach it anymore

Reported by: Sebastian
Owned by:
Priority: Low
Milestone: post 0.2.1.x
Component: Core Tor/Tor
Version: 0.2.0.23-rc
Severity:
Keywords: tor-relay
Cc: Sebastian, nickm, arma
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by nickm)

This is Tor r14297. When my internet connection dies (as it does every 24 hours)
and comes back up with a different IP address, Tor doesn't seem to notice that
the world can no longer connect to it. The pre-RC versions did notice; I haven't
checked the previous RCs for this behaviour. Even after 6 hours,
Tor has nothing in the Notice-level log.

[Automatically added by flyspray2trac: Operating System: All]

Change History (14)

comment:1 Changed 11 years ago by arma

(it does occur to me that since we use dir fetches right now to notice when
things have switched, then a relay with an open dirport will notice a lot
faster.)

actually. a relay with an *advertised* dirport will notice a lot faster.
even if your dirport is defined, there are many reasons why your tor chooses
not to advertise it.

so your tor used to detect ip address changes correctly,
but then you turned your dirport off, and now it doesn't?

yes, that's what I think now at least

try turning it back on again :)
see if it resumes noticing

comment:2 Changed 11 years ago by Sebastian

It notices the change in IP address within 6 minutes with the dirport enabled.

comment:3 Changed 11 years ago by arma

Ah ha.

Option #1 is to make relays fetch dir stuff just like normal dir mirrors
even if their dirport is off or not being advertised. That would be an easy
fix. but it would involve adding a lot of useless overhead, which is why we
made them stop in the first place (it was a feature).

Option #2 is to have non-dir relays fetch "server/authority" from a dir
mirror every 5 or 10 minutes even though they don't need it. That seems like
kind of a waste, but if our dynamic-IP relays fall off the net otherwise, it's
not a horrible tradeoff.

Option #3 is something a bit smarter: Sebastian suggested making relays
notice when their traffic drops off, and decide to check then if they've got
a new IP address. We would want to make sure that our heuristics don't make this
check trigger too often, but it's a good idea in theory. This would be something
we work on in 0.2.1.x and consider backporting; so option #2 is still appealing
as a stopgap measure.

I used to think that once we started reading NETINFO cells we'd be in better
shape here, since that's a way that even non-dir-mirrors would get told guesses
of their IP address. But if we're unreachable, we're not going to hear any incoming
connections, and we're not going to hear any requests to extend to other places.
So option #4: we would need to initiate connections on our own periodically.
Perhaps to our guards, if we don't get a keepalive cell from them on time? Or maybe
this will all work naturally if TCP notices the connection is broken and we
automatically retry a new circuit? But I don't trust TCP to fail a connection within
a given timeframe.

Any other ideas?

comment:4 Changed 11 years ago by nickm

Option 2 is easiest for 0.2.0.x, I think. We can be way less aggressive than every 10 minutes; if a
host is off the net for an hour whenever its IP changes, this is okay unless the IP changes so often as to
make it useless.

Option 4 seems best for 0.2.1.x, perhaps combined with #3 so that "launch connections on our own" is
replaced with "launch connections (test circuits?) on our own if we haven't seen many incoming connections lately."

comment:5 Changed 11 years ago by Sebastian

Nick: how often do you expect an IP to change?
Sebastian: once every 24 hours or more often
Nick: Please realize that clients won't find out till they get a consensus mentioning the new descriptor, which will take at least an hour anyway
Sebastian: yes, I know that
Nick: hm. could be configurable, with default 2 or 3 hours and minimum 20 minutes.
Sebastian: But that means of the 24 hour timeframe, 1 hour drops out because they need a new consensus
Sebastian: I wonder why there couldn't be something like "at first, ask after 5 minutes, if that fails, ask again after 10, then 20"
Nick: (also, when we're done, you should copy this conversation to the bug report, so we don't forget about it before the bug gets fixed.)
Sebastian: I will, I just wanted to talk about it first :)
Nick: there can. It's just easier to paint the bikeshed mauve than plaid. ;)
Sebastian: ok, fair point :)
Sebastian: hm, I just think the design currently doesn't make good enough use of nodes with short uptime... But on the other hand, I guess decreasing that hour to 30 minutes won't make such a huge difference, as the design is unchanged
Nick: Agreed on the making use of nodes with short uptime.
Nick: It would be nice to fix
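
The escalating schedule Sebastian proposes above (ask after 5 minutes, then 10, then 20) amounts to exponential backoff. A minimal sketch in C, not taken from the Tor source: the function name, the constant names, and the 2-hour cap (borrowed from the default Nick floats) are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical constants: first retry after 5 minutes, never wait
 * longer than 2 hours between attempts. */
#define PROBE_INITIAL_DELAY (5*60)
#define PROBE_MAX_DELAY (2*60*60)

/* Return the delay, in seconds, to wait before the next address probe
 * after <b>n_failures</b> consecutive failed probes. The delay doubles
 * with each failure and is clamped to PROBE_MAX_DELAY. */
static int
probe_delay_after_failures(int n_failures)
{
  int delay = PROBE_INITIAL_DELAY;
  assert(n_failures >= 0);
  /* Stop doubling once we reach the cap, so the value cannot overflow. */
  while (n_failures-- > 0 && delay < PROBE_MAX_DELAY)
    delay *= 2;
  return delay < PROBE_MAX_DELAY ? delay : PROBE_MAX_DELAY;
}
```

With these assumed values the delays run 5, 10, 20, 40 and 80 minutes and then stay at the 2-hour cap. As Nick points out, a fixed interval is simpler to ship; this only shows that the escalating variant is not much more code.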

comment:6 Changed 11 years ago by arma

My best thought for a fix is to have a time_t in 0.2.1.x that gets updated
whenever we receive bytes from a non-local connection and whenever we receive
a new non-local connection.

Then in main.c we see if that time_t is sufficiently far in the past, and if so,
every time the elapsed time hits a multiple of, say, 300 seconds, we launch a
tor/server/authority.z request.

We would want to make sure to catch the various edge cases where we don't need
to. For example, if we're hibernating. Or maybe if !has_completed_circuit.

The 0.2.0.x solution for now would be to not add the time_t, and just do that
launch periodically. (We could then backport the time_t deal once we trusted it.)
But it would seem that we would still need to figure out the edge cases. Hm.
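
The timestamp idea in this comment could look roughly like the following. This is a sketch under stated assumptions, not code from Tor: the struct, the function names, and the threshold constant are hypothetical, and it assumes the check is called once per second so that the "multiple of 300 seconds" test fires exactly once per interval.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical values: start probing after 300 seconds of silence,
 * then probe again every 300 seconds while the silence lasts. */
#define SILENCE_THRESHOLD 300
#define PROBE_INTERVAL 300

/* Hypothetical state: when we last saw bytes or a new connection
 * from a non-local peer. */
typedef struct {
  time_t last_nonlocal_activity;
} reach_state_t;

/* Record non-local activity (bytes received or a new connection). */
static void
note_nonlocal_activity(reach_state_t *st, time_t now)
{
  assert(st);
  st->last_nonlocal_activity = now;
}

/* Return 1 if we should launch an address-discovery request at
 * <b>now</b>, else 0. Assumed to be called once per second. */
static int
should_launch_probe(const reach_state_t *st, time_t now,
                    int hibernating, int has_completed_circuit)
{
  time_t silent_for;
  assert(st);
  /* Edge cases named in the comment: don't probe while hibernating or
   * before we have built any circuit. */
  if (hibernating || !has_completed_circuit)
    return 0;
  /* If the clock jumped backwards, wait for it to catch up. */
  if (now < st->last_nonlocal_activity)
    return 0;
  silent_for = now - st->last_nonlocal_activity;
  if (silent_for < SILENCE_THRESHOLD)
    return 0;
  return (silent_for % PROBE_INTERVAL) == 0;
}
```

A relay that keeps receiving non-local traffic never reaches the threshold, so it launches no extra requests; only a silent relay pays for the probe.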

comment:7 Changed 11 years ago by nickm

Possible patch for the dumb version: what do people think of it?

src/or/routerlist.c
==================================================================
--- src/or/routerlist.c (revision 15283)
+++ src/or/routerlist.c (local)
@@ -4013,17 +4013,34 @@
   smartlist_free(no_longer_old);
 }
 
+/** How often should we launch a server/authority request to be sure of getting
+ * a guess for our IP? */
+/*XXXX021 this info should come from netinfo cells or something, or we should
+ * do this only when we aren't seeing incoming data. see bug 652. */
+#define DUMMY_DOWNLOAD_INTERVAL (20*60)
+
 /** Launch downloads for router status as needed. */
 void
 update_router_descriptor_downloads(time_t now)
 {
   or_options_t *options = get_options();
+  static time_t last_dummy_download = 0;
   if (should_delay_dir_fetches(options))
     return;
   if (directory_fetches_dir_info_early(options)) {
     update_router_descriptor_cache_downloads_v2(now);
   }
   update_consensus_router_descriptor_downloads(now);
+
+  if (server_mode(options) &&
+      last_routerdesc_download_attempted + DUMMY_DOWNLOAD_INTERVAL < now &&
+      last_dummy_download + DUMMY_DOWNLOAD_INTERVAL < now) {
+    /* If we haven't tried to get any routerdescs in a long time, try a dummy
+       fetch now. */
+    last_dummy_download = now;
+    directory_get_from_dirserver(DIR_PURPOSE_FETCH_SERVERDESC,
+                                 ROUTER_PURPOSE_GENERAL, "authority.z", 1);
+  }
 }
 
 /** Launch extrainfo downloads as needed. */

comment:8 Changed 11 years ago by nickm

The easy fix is now in svn for 0.2.0 and 0.2.1.

comment:9 Changed 11 years ago by jasemandude

This seems related to the client unable to create circuits behavior in bugs 648 and 675.

comment:10 Changed 11 years ago by nickm

Since the easy solution is in svn, I'm marking this as an 0.2.1 issue.

comment:11 Changed 10 years ago by arma

It's doing fine in 0.2.1.x still, so I'm going to mark this post-0.2.1.x.

I think that every relay launching a simple dir request every 20 minutes isn't
so bad, all things considered.

I think we could get quicker notification if we did the trick described above,
of noticing when everything has gone silent. We will want to do that one day,
to make even better use of dynamic-ip relays.

comment:12 Changed 9 years ago by nickm

Description: modified (diff)
Resolution: None → fixed
Status: new → closed

So the probe that happens here now is in update_router_descriptor_downloads() if we're a server, and it happens every 20 minutes, if 20 minutes have passed since we last tried fetching any routerdescs.

We could also make it happen less often if we somehow know that our existing published address is okay: for example, if we are getting new OR connections and DIR connections happily, or if we learn our IP from some other source like a NETWORKSTATUS cell. But in practice, that doesn't seem to matter. I'm going to open a new bug report for this since the current solution is inelegant: see #2178. The original bug here was fixed back in 2008.
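
The gating condition described in this comment can be modelled in isolation to see when the probe fires. The sketch below is a simplified stand-in, not the Tor code: the struct and function names are hypothetical, the actual directory request is replaced by a return value, and it assumes the dummy fetch does not itself update the "last routerdesc download attempted" timestamp.

```c
#include <assert.h>
#include <time.h>

/* Same interval as the patch on this ticket: 20 minutes. */
#define DUMMY_DOWNLOAD_INTERVAL (20*60)

/* Hypothetical container for the two timestamps the check relies on. */
typedef struct {
  time_t last_routerdesc_download_attempted;
  time_t last_dummy_download;
} probe_clock_t;

/* Return 1 (and record the probe time) if a dummy fetch is due at
 * <b>now</b>; return 0 otherwise. A fetch is due only for servers, and
 * only when neither a real routerdesc download attempt nor a previous
 * dummy fetch happened within the last DUMMY_DOWNLOAD_INTERVAL seconds. */
static int
maybe_launch_dummy_fetch(probe_clock_t *pc, int is_server, time_t now)
{
  assert(pc);
  if (!is_server)
    return 0;
  if (pc->last_routerdesc_download_attempted + DUMMY_DOWNLOAD_INTERVAL >= now)
    return 0;
  if (pc->last_dummy_download + DUMMY_DOWNLOAD_INTERVAL >= now)
    return 0;
  pc->last_dummy_download = now;
  return 1;
}
```

Under these assumptions a relay that is already fetching descriptors regularly never sends the dummy request, and an otherwise idle relay sends at most one every 20 minutes.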

comment:13 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:14 Changed 7 years ago by nickm

Component: Tor Relay → Tor