Opened 9 years ago

Closed 5 years ago

#2454 closed enhancement (fixed)

We should check our IP immediately when cbt notes the network is live again

Reported by: Sebastian Owned by:
Priority: Medium Milestone: Tor: 0.2.5.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay 025-triaged
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Running a bridge that had a disappearing network which then reappeared, giving me a different external IP address:

Jan 29 04:55:43.574 [Notice] Tor now sees network activity. Restoring circuit build timeout recording. Network was down for 293 seconds during 24 circuit attempts.
Jan 29 05:18:38.281 [Notice] Our IP Address has changed from [scrubbed] to [scrubbed]; rebuilding descriptor (source: 87.106.26.29).

We could cut down that time while the bridge is unreachable by triggering an immediate fetch of directory things afaik, and it seems worthwhile to do just that

Child Tickets

Attachments (2)

Change History (19)

comment:1 Changed 8 years ago by nickm

Milestone: Tor: 0.2.3.x-final

comment:2 Changed 7 years ago by nickm

Milestone: Tor: 0.2.3.x-finalTor: 0.2.4.x-final

comment:3 Changed 7 years ago by nickm

Type: defectenhancement

comment:4 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:5 Changed 7 years ago by nickm

Component: Tor RelayTor

comment:6 Changed 7 years ago by sysrqb

This patch is based on the assumption that short "blips" (less than 3 minutes) won't invalidate the client's lease and result in it receiving a new IP address. If this is wrong, then I'll remove the grace-period check or shorten it if a shorter time is more appropriate.

Basically, once we successfully build a circuit after timing out for a certain amount of time, we set a flag that is checked in run_scheduled_events when we check if the descriptor should be regenerated.

Thoughts?

Patch below and branch bug2454 on git://gitweb.evolvesoftware.cc/tor.git

comment:7 Changed 7 years ago by nickm

Status: newneeds_revision

nonlive_timeouts is a count of how many times we triggered the "not live" case in circuit_build_times_network_close(). It's not a time value itself, and it doesn't happen once per second IIUC.

comment:8 Changed 7 years ago by sysrqb

Yeah, that'd make no sense. I typed nonlive_timeouts but I meant network_last_live.

I've updated the patch below and is on branch bug2454-1.

comment:9 Changed 7 years ago by sysrqb

Status: needs_revisionneeds_review

Now that I can do this.

comment:10 Changed 7 years ago by nickm

Milestone: Tor: 0.2.4.x-finalTor: 0.2.5.x-final

Bumping to 0.2.5, along with other remaining noncritical enhancements.

comment:11 Changed 5 years ago by andrea

Triage: keep for 0.2.5, this has sat around way too long.

comment:12 Changed 5 years ago by andrea

Keywords: 025-triaged added

comment:13 Changed 5 years ago by andrea

Status: needs_reviewneeds_revision

This patch looks just fine to me, but I'm putting it in needs_revision because it's 18 months old. I think it should be rebased against current master and tested again to make sure it hasn't gone stale before merging.

comment:14 Changed 5 years ago by nickm

The patch didn't apply to master, but it was pretty small, so I recreated the changes by hand in branch "bug2454_025" in my public repository. Note that master no longer has time_to_check_ipaddress.

Changes/checks I'd suggest:

1) No global variables.

2) In fact, we don't need a new variable at all; as the code stands now, we just need a function that resets time_to_check_descriptor() in main.c. (But see 5 below.)

3) I'd like to grep though all the calls to circuit_build_times_network_is_live to make sure this really -should- work.

4) Can this fire more than once every 3 minutes?

5) Is there anything in that big block in main.c that we *don't* want to do in this case?

comment:15 in reply to:  14 ; Changed 5 years ago by andrea

Replying to nickm:

The patch didn't apply to master, but it was pretty small, so I recreated the changes by hand in branch "bug2454_025" in my public repository. Note that master no longer has time_to_check_ipaddress.

Changes/checks I'd suggest:

1) No global variables.

2) In fact, we don't need a new variable at all; as the code stands now, we just need a function that resets time_to_check_descriptor() in main.c. (But see 5 below.)

Yeah, that'd be cleaner.

3) I'd like to grep though all the calls to circuit_build_times_network_is_live to make sure this really -should- work.

Yeah.

4) Can this fire more than once every 3 minutes?

Do you mean you think three minutes is *too long* and we should recheck sooner, or you're worried it might run too often and if it did something it runs would either be too expensive or not idempotent?

5) Is there anything in that big block in main.c that we *don't* want to do in this case?

It looks like most of that stuff has checks to stop it running too often, but check_descriptor_ipaddress_changed() starts DNS queries in some cases and routerlist_remove_old_routers() has a for loop over routerlist->routers - might be worth limiting how often that runs.

comment:16 in reply to:  15 Changed 5 years ago by nickm

Replying to andrea:

4) Can this fire more than once every 3 minutes?

Do you mean you think three minutes is *too long* and we should recheck sooner, or you're worried it might run too often and if it did something it runs would either be too expensive or not idempotent?

The latter.

5) Is there anything in that big block in main.c that we *don't* want to do in this case?

It looks like most of that stuff has checks to stop it running too often, but check_descriptor_ipaddress_changed() starts DNS queries in some cases and routerlist_remove_old_routers() has a for loop over routerlist->routers - might be worth limiting how often that runs.

I think burning some CPU if this gets out of hand isn't too bad -- I think the bad thing would be if we were spamming descriptors to the directory servers or something.

comment:17 Changed 5 years ago by nickm

Resolution: fixed
Status: needs_revisionclosed

Okay, I removed the global and did the appropriate auditing. Based on discussion here and on IRC, I think this is okay now.

Branch "bug2454_025" in my public repo removes the global variable usage. I squashed it as "bug2454_025_squashed" and merged.

Note: See TracTickets for help on using tickets.