Opened 10 months ago

Last modified 9 months ago

#25429 new defect

Need something better than client's `checkForStaleness`

Reported by: arlolra
Owned by:
Priority: Medium
Milestone:
Component: Obfuscation/Snowflake
Version:
Severity: Normal
Keywords:
Cc: dcf, arlolra
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

If no message has been received on the datachannel on the client for SnowflakeTimeout (30 seconds), checkForStaleness closes the connection. The comment says this is to,

Prevent long-lived broken remotes.

but there's no heartbeat at this level of abstraction, so the connection is reset any time the user pauses their activity for longer than the timeout (for example, to read a webpage).

This greatly exacerbates #21312.

Change History (4)

comment:1 Changed 10 months ago by dcf

I wonder if the repeated disconnections after 30 seconds is also the cause of "Your Guard is failing an extremely large amount of circuits" in #23780.

comment:2 Changed 10 months ago by arlolra

I wonder if the repeated disconnections after 30 seconds is also the cause of ...

I doubt it. Commenting out // go c.checkForStaleness() doesn't have any effect on that log line. However, changing the value of -max from 1 to 3 reduces it to a single instance. And I've only ever seen it at startup, which leads me to believe it has something to do with buffering when the initial connections are made.

comment:3 in reply to:  2 Changed 9 months ago by dcf

Replying to arlolra:

I wonder if the repeated disconnections after 30 seconds is also the cause of ...

I doubt it. Commenting out // go c.checkForStaleness() doesn't have any effect on that log line.

That test may not work without deleting the state file in between--I believe this message comes from parsing the pb_ parameters in a Guard line in the state file:

Guard in=bridges rsa_id=2B280B23E1107BB62ABFC40DDCC8824814F80A72 bridge_addr=0.0.3.0:1 sampled_on=2018-03-01T19:18:39 sampled_by=0.3.3.2-alpha listed=1 confirmed_on=2018-02-26T22:51:38 confirmed_idx=0 pb_use_attempts=70.011719 pb_use_successes=46.431641 pb_circ_attempts=207.411194 pb_circ_successes=199.674622 pb_successful_circuits_closed=76.945984 pb_collapsed_circuits=94.614807 pb_unusable_circuits=28.113830 pb_timeouts=1.649048

Since the state file persists between runs, you wouldn't see the message go away until you had had enough successful connections to push the average down below some threshold, or something like that. And it seems that tor will only emit the message once, keeping track of whether it has done so in a path_bias_use_extreme variable, so that could explain why it is only seen at startup:

https://gitweb.torproject.org/tor.git/tree/src/or/entrynodes.h?h=tor-0.3.2.10#n46
https://gitweb.torproject.org/tor.git/tree/src/or/circpathbias.c?h=tor-0.3.2.10#n1284
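
To see what the persisted counters look like numerically, here is a small sketch that splits a Guard line into key=value fields and computes success ratios from the pb_ parameters. It is an illustration only, not tor's parser or its path-bias logic: the helper names `parseGuardFields` and `ratio` are made up, the Guard line is abbreviated to the relevant fields, and tor's actual warning thresholds are not reproduced here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGuardFields splits a state-file Guard line into key=value pairs.
// Tokens without "=" (such as the leading "Guard") are skipped.
func parseGuardFields(line string) map[string]string {
	fields := make(map[string]string)
	for _, tok := range strings.Fields(line) {
		kv := strings.SplitN(tok, "=", 2)
		if len(kv) == 2 {
			fields[kv[0]] = kv[1]
		}
	}
	return fields
}

// ratio returns fields[numKey] / fields[denKey] as a float, or an error if
// either field is missing, malformed, or the denominator is zero.
func ratio(fields map[string]string, numKey, denKey string) (float64, error) {
	num, err := strconv.ParseFloat(fields[numKey], 64)
	if err != nil {
		return 0, fmt.Errorf("%s: %v", numKey, err)
	}
	den, err := strconv.ParseFloat(fields[denKey], 64)
	if err != nil {
		return 0, fmt.Errorf("%s: %v", denKey, err)
	}
	if den == 0 {
		return 0, fmt.Errorf("%s is zero", denKey)
	}
	return num / den, nil
}

func main() {
	// Abbreviated copy of the Guard line quoted above (pb_ counters only).
	line := "Guard in=bridges pb_use_attempts=70.011719 pb_use_successes=46.431641 " +
		"pb_circ_attempts=207.411194 pb_circ_successes=199.674622"
	fields := parseGuardFields(line)

	for _, kind := range []string{"use", "circ"} {
		r, err := ratio(fields, "pb_"+kind+"_successes", "pb_"+kind+"_attempts")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("pb_%s success ratio: %.2f\n", kind, r)
	}
}
```

For the line quoted above, the use-success ratio works out to roughly two thirds. Since these counters are read back from the state file, a test run would start from that history rather than from zero.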

As for what to do with checkForStaleness: I don't understand its purpose either, but if we can't figure it out, we could just bump the timeout up to a high value, like 5 hours or so.

comment:4 Changed 9 months ago by arlolra

I don't understand its purpose either

It was added in this commit, which doesn't provide much info:
https://gitweb.torproject.org/pluggable-transports/snowflake.git/commit/?id=ac9d49b8727b953c12a76e3645fe71a9ec3aab75

It might be related to this commit, where the client could be sitting around for several minutes waiting for the datachannel to close because the proxy disappeared:
https://gitweb.torproject.org/pluggable-transports/snowflake.git/commit/?id=cf1b0a49f13f2550cad1b32ef4e4820b4c26bcf1

Or it might guard against a denial of service where the proxy keeps the connection open but doesn't send any data down the channel.
