Opened 9 years ago

Last modified 17 months ago

#1136 assigned defect (None)

When Tor is offline, it doesn't quite run out of relays, so doesn't realize it's offline

Reported by: arma Owned by: mikeperry
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-client, shutdown, bootstrap, sponsor8-maybe
Cc: arma, nickm, Sebastian, mikeperry, teor Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by mikeperry)

If your Tor client goes offline, it will keep trying to establish circuits,
and as each TLS connection fails, it will mark that relay down.

In update_router_have_minimum_dir_info() Tor checks whether (num_present < 2)
but we never actually mark down the last few relays, either because we don't
have enough left to make a circuit so we don't ever try another TLS connection,
or because none of the remaining relays are suitable exit nodes so we can't
pick a path that would be a useful circuit so we don't try.

I think we need to catch the case where we failed to pick a path because we
don't have enough circuits, and if it case occurs and many of our relays are
marked down, we should mark them up.

That will cause us to attempt circuits for a lot longer than currently, but on
the other hand Tor will actually work when you come back to the network and
try to make a new application request.

[Automatically added by flyspray2trac: Operating System: All]

Child Tickets

Change History (20)

comment:1 Changed 9 years ago by mikeperry

I feel like this means that we should create a different flag variable for clients. The variable is_running in
routerinfo_t for example is marked as local only, yet we still appear to updating the routerstatus_t version when
we fail to reach a relay as a client. The problem seems to be that router_set_status() sets both versions
of the flag.

If we differentiate, we can then safely re-copy all of the consensus versions of is_running over our local versions,
as opposed to marking *every* router as now up.

comment:2 Changed 9 years ago by nickm

That sounds great. Let's talk about how best to do this. Immediately, just adding a new flag might do. But
in the meantime, I'm planning a related change for microdescriptors, where we stop using routerinfo_t as a
repository of local per-router knowledge. What we make a new shiny router_t that tracks the local Tor's
knowledge and opinion of each router, and turn all the routerstatus_t in a consensus, and every routerinfo_t,
into a nice immutable thing?

comment:3 Changed 9 years ago by mikeperry

Ok. This bug is blocking some of my circuit build times and bandwidth weight tests. Can I start out by just
moving the mutable version of this flag into routerinfo_t and implementing this reset by copy method in the
meantime? I can try to look for any other instances of modification of either routerstatus_t or routerinfo_t
and mark them with comments in the branch I do this in.

comment:4 Changed 9 years ago by nickm

There are a _lot_ of things that touch flags in routerstatus_t and routerinfo_t. If you're going to do this,
we should think closely about what the new status is, and how you're going to make sure that everything that
currently sets or examines the old flag is changed as needed. Then we should write it in or-dev so that we
know what design we're going for.

comment:5 Changed 9 years ago by mikeperry

Ok, I meant to simply stop the modification of is_running in routerstatus_t and move those modifications to
routerinfo_t. As for when to read which version, I think that Tor should primarily use the routerinfo_t version,
except when it detects that it runs out of routers, in which case it should retry every so often by resetting
all of the is_running flags to their routerstatus_t consensus versions. I can describe this in more depth on
or-dev when I start looking through the code again.

As far as commenting, I was planning on only marking weird modifications of routerstatus_t that I happen to
see while making this change. I feel like routerstatus_t is the first thing we should try to make readonly.
I don't intend on being exhaustive here, just a 'while I'm at it' kind of thing.

comment:6 Changed 8 years ago by mikeperry

Description: modified (diff)
Owner: set to mikeperry
Status: newassigned

comment:7 Changed 8 years ago by nickm

Milestone: Tor: 0.2.2.x-final

comment:8 Changed 8 years ago by nickm

Priority: majornormal

Downgrading to "normal". This didn't make 0.2.1 unusable, so it isn't a showstopper for 0.2.2.x

comment:9 Changed 8 years ago by arma

Triaging: this is an old bug. it requires some finesse to solve right. it's a bug on 0.2.1.x so we don't hurt anything by switching stable to 0.2.2.x. It might be that Nick's refactoring work for microdescriptors (#1757) will make #1136 easier to solve.

I think this bug shouldn't block 0.2.2.x stable.

comment:10 Changed 8 years ago by arma

See #675 for a very related bug. Maybe even the same bug.

comment:11 Changed 8 years ago by mikeperry

Milestone: Tor: 0.2.2.x-final

comment:12 Changed 8 years ago by mikeperry

Milestone: Tor: 0.2.3.x-final

comment:13 Changed 7 years ago by nickm

Milestone: Tor: 0.2.3.x-finalTor: unspecified

comment:14 Changed 7 years ago by Sebastian

Milestone: Tor: unspecifiedTor: 0.2.3.x-final

#675 is in 0.2.3.x-final, moving this one there too

comment:15 Changed 6 years ago by nickm

Milestone: Tor: 0.2.3.x-finalTor: unspecified

comment:16 Changed 6 years ago by nickm

Keywords: tor-client added

comment:17 Changed 6 years ago by nickm

Component: Tor ClientTor

comment:18 Changed 3 years ago by teor

Cc: teor added
Severity: Normal

comment:19 Changed 17 months ago by nickm

Keywords: shutdown bootstrap added

comment:20 Changed 17 months ago by nickm

Keywords: sponsor8-maybe added
Note: See TracTickets for help on using tickets.