Opened 3 years ago

Last modified 8 months ago

#16659 assigned defect

Linux TCP Initial Sequence Numbers may aid correlation

Reported by: source
Owned by: metrics-team
Priority: Medium
Milestone:
Component: Metrics/Analysis
Version:
Severity: Normal
Keywords: nickm-cares, research-ideas
Cc: nickm@…, adrelanos@…, mcs
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

TCP Sequence Numbers seem to be one more way to leak the host clock on GNU/Linux systems. It's the last major vector in the literature that's not addressed yet.[1] The kernel embeds the system time in microseconds in TCP connections. Some opinions say the TCP ISNs are salted hashes and can't be abused, but my impression from Steve Murdoch's papers is that it's feasible and was already carried out in his tests. [2][3]

There is no sysctl option to disable it and it must be patched upstream. [4][5]

Nick has done exceptional work to get OpenSSL upstream to throw out mandatory timestamping in the protocol. TAILS and Whonix disable TCP Timestamps in the kernel sysctl. TCP Timestamps are a different vector from the TCP ISNs discussed here - it would be great if the upstream kernel disabled this as well so all distros have it.

[1]https://www.cl.cam.ac.uk/~sjm217/papers/ccs06hotornot.pdf
[2]http://caia.swin.edu.au/talks/CAIA-TALK-080728A.pdf
[3]http://www.cl.cam.ac.uk/~sjm217/papers/ih05coverttcp.pdf
[4]https://stackoverflow.com/a/12232126
[5]http://lxr.free-electrons.com/source/net/core/secure_seq.c?v=3.16

Child Tickets

Change History (25)

comment:1 Changed 3 years ago by proper

Cc: adrelanos@… added

comment:2 Changed 3 years ago by mcs

Cc: mcs added

comment:3 Changed 3 years ago by nickm_mobile

Hmm. So, this issue wouldn't work the same way as the ssl clienthello issue would work. With clienthello, the timestamp was sent both locally in non-anonymized tls and remotely in anonymized tls. Here, the timestamp is sent locally, but not remotely, since tor doesn't relay tls headers.

Now, there could still be an issue: if tcp (or some other protocol) leaks the client's view of the current time to the local network, and some other protocol leaks the client's view of the time remotely.

Generally, the answer we've mostly gone with in cases like that is to attend mostly to the anonymized protocol. There's generally more work to do there anyway. But if there's an easy fix to better hide more time info in tcp, I'd be in favor.

comment:4 Changed 3 years ago by arma

Right -- see https://www.torproject.org/docs/faq#RemotePhysicalDeviceFingerprinting

OS's like Tails might want something for their traffic on the local network, but the Tor process itself does not worry about this issue: "Tor transports TCP streams, not IP packets, so we end up automatically scrubbing a lot of the potential information leaks."

comment:5 Changed 3 years ago by source

OK, so if I understand correctly, Tor's protocol inside the OpenSSL encryption layer never sends TCP ISNs or any other timestamps past the guard node.

Wouldn't Tor (and any application) that operates on top of a Linux host's TCP Layer 3 still leak this information through no fault of their own? The clock information would be embedded in the lower layer (encapsulating TCP packets) observable in the local (non-anonymized) connection from client to Guard node.

Diagram:

Tor TCP protocol sanitizes time
====================
OpenSSL timestamps eliminated
====================
Linux TCP Layer 3 contains ISNs

Last edited 3 years ago by source

comment:6 Changed 3 years ago by yawning

So patch your kernel? I'm not seeing why this is a Tor issue, beyond "if you switch to using a UDP based transport, this will be a non-issue".

Your Guard, or anyone that sits between you and your Guard knows who you are. Leaking the delta of a timer that is on a 274s period with 64 ns resolution doesn't seem like a big deal. Real time in ns is shifted, truncated, then added to a salted hash to derive the ISN, so it's not like it's possible to work backwards to the real time (or for that matter the original timer value) in any way, the best you can do is obtain load information via clock skew.

I'm tempted to NAB this unless someone tells me otherwise.

comment:7 Changed 3 years ago by mikeperry

I think there is some confusion due to a recent tor-talk post that was stitched together based on a bunch of partial/incorrect/unrelated information and then ignored by reasonable people because tor-talk is ruled by trolls.

If you could actually recover the current time from the ISN, that would be a cause for concern, since it could make correlation attacks much easier given an additional application layer timestamp at the exit or hidden service. But I agree, it doesn't seem like that is actually the case.

Last edited 3 years ago by mikeperry

comment:8 Changed 3 years ago by source

So patch your kernel? I'm not seeing why this is a Tor issue, beyond "if you switch to using a UDP based transport, this will be a non-issue".

No one is saying it is one. I believe the goals of the TAILS, Whonix and Tor projects are aligned when researching and designing systems resistant to attack. Tor developer Jacob Appelbaum brought up the problem of TCP Timestamps on the TAILS mailing list, which led to them disabling this feature. It's not a bug ticket but more of a research question. If it is a serious problem it could have far-reaching consequences. Simply patching my kernel would make me stand out and not protect virtually every Linux system out there.

I think there is some confusion due to a recent tor-talk post that was stitched together based on a bunch of partial/incorrect/unrelated information and then ignored by reasonable people because tor-talk is ruled by trolls.

I wasn't aware of this but I have nothing to do with it and I'm looking for answers from reputable and competent people aka you the Tor Project team.

Your Guard, or anyone that sits between you and your Guard knows who you are. Leaking the delta of a timer that is on a 274s period with 64 ns resolution doesn't seem like a big deal. Real time in ns is shifted, truncated, then added to a salted hash to derive the ISN, so it's not like it's possible to work backwards to the real time (or for that matter the original timer value) in any way, the best you can do is obtain load information via clock skew.

But pages 10-12 in http://www.cl.cam.ac.uk/~sjm217/papers/ih05coverttcp.pdf seem to describe how to work backwards and get the original clock.

If I'm not mistaken, the TCP ISN code here: http://lxr.free-electrons.com/source/net/core/secure_seq.c?v=3.16 suggests the time is added after the source/destination port and IP are hashed together with a secret.

There is also the question of whether a 32bit salt is enough, if indeed the time is part of the hashed information - but it doesn't seem so.

the best you can do is obtain load information via clock skew.

Wouldn't it be better to completely close down this attack vector?

Last edited 3 years ago by source

comment:9 in reply to:  7 Changed 3 years ago by yawning

Resolution: not a bug
Status: changed from new to closed

Replying to mikeperry:

If you could actually recover the current time from the ISN, that would be a cause for concern, since it could make correlation attacks much easier given an additional application layer timestamp at the exit or hidden service. But I agree, it doesn't seem like that is actually the case.

The information's only propagated as far as the Guard anyway, and if you suspect you're a given HS's guard, confirming it doesn't require TCP sequence number trickery.

But pages 10-12 in http://www.cl.cam.ac.uk/~sjm217/papers/ih05coverttcp.pdf seem to describe how to work backwards and get the original clock.

For Linux 2.2, 2.4, and 2.6. I don't care enough to check when they changed the algorithm.

If you actually bothered to read the code in question, you would see that:

  1. net_secret is initialized once and exactly once, and no longer periodically like described in the paper.
  2. The MD5 (not MD4 as described in the paper) hashed value, is added to the shifted and truncated time in nanoseconds seq + (ktime_to_ns(ktime_get_real()) >> 6). This transform is destructive, and the part that's added is (as I said in my comment) a cyclical timer with a 274 s period and 64 ns resolution.

Anything vaguely resembling the full host's time is totally destroyed by the shift + truncate step.
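To make the shift + truncate step concrete, here is a minimal Python sketch of the arithmetic described above. It is a toy model, not the kernel code: `toy_isn` and its use of `hashlib.md5` over a concatenated tuple are illustrative stand-ins for the keyed MD5 transform in secure_seq.c, and only `seq_scale` follows the quoted expression directly.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF
TICK_NS = 64  # a right shift by 6 divides nanoseconds by 64

def seq_scale(seq: int, real_time_ns: int) -> int:
    """32-bit version of `seq + (ktime_to_ns(ktime_get_real()) >> 6)`."""
    return (seq + (real_time_ns >> 6)) & MASK32

def toy_isn(secret: bytes, saddr: bytes, daddr: bytes,
            sport: int, dport: int, real_time_ns: int) -> int:
    # Hypothetical stand-in for the kernel's keyed hash of the connection tuple.
    material = secret + saddr + daddr + struct.pack("!HH", sport, dport)
    tuple_hash = struct.unpack("!I", hashlib.md5(material).digest()[:4])[0]
    # The time term is added after hashing, then everything wraps at 32 bits.
    return seq_scale(tuple_hash, real_time_ns)

# 2**32 ticks of 64 ns each: the timer wraps roughly every 274.88 seconds.
PERIOD_S = (2**32 * TICK_NS) / 1e9
```

Two clock values that differ by a whole number of periods produce the same ISN for the same tuple, which is why the absolute host time cannot be read back from a single ISN under this model.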

NABing. Complain to the Linux kernel developers if you think this is a big deal.

comment:10 Changed 3 years ago by mikeperry

FTR, I think it is worth complaining to the kernel developers for the simple reason that adding the 64ns timer post-hash probably *does* leak side channels about CPU activity, and that may prove very dangerous for long-running cryptographic operations (along the lines of the hot-or-not issue). Unfortunately, someone probably needs to produce more research papers before they will listen.

As far as this ticket goes, though, I agree with the NAB right now, because the ISN does not appear to leak the host clock due to the 32bit truncation of 64ns ticks.

comment:11 Changed 3 years ago by mikeperry

An extra question here is whether it is possible to reconstruct even these 32 bits of time value that remain from the ISN, which would potentially assist correlation even without the full clock.

I don't think that is possible either, because net_secret is 128 bits and the connection tuple should make replays rare, but the use of MD5 is concerning here. If this were possible, it also seems like that should reduce to ISN prediction, as well, though.

comment:12 Changed 3 years ago by mikeperry

Summary: changed from "TCP Initial Sequence Numbers Leak Host Clock" to "Linux TCP Initial Sequence Numbers may aid correlation"

Ok - ignoring MD5, here is the scenario where this might matter: In about 1 in every sqrt(65536)==256 connections to your guard node, your tuples will replay, and the adversary will see this as well. In that case, they know the argument to seq_scale in http://lxr.free-electrons.com/source/net/core/secure_seq.c?v=3.16#L26 has replayed, and can then use this replay to extract 32 bits of your clock.
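One way to read the sqrt figure is as a birthday-style bound on source port reuse. The sketch below assumes source ports are drawn uniformly and independently from the full 16-bit space, which is a simplification (real stacks use a narrower ephemeral range and their own selection logic); `collision_probability` is a hypothetical helper for illustration only.

```python
import math

def collision_probability(connections: int, space: int = 65536) -> float:
    """Chance that at least two of `connections` uniform draws from `space` coincide."""
    p_all_unique = 1.0
    for i in range(connections):
        p_all_unique *= 1.0 - i / space
    return 1.0 - p_all_unique

# The square root of the port space is the scale at which repeats become likely.
birthday_scale = math.isqrt(65536)
```

Under these assumptions a repeated tuple becomes reasonably likely once the number of connections to the same guard reaches a few hundred.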

Then, on the application layer (at the exit or hidden service) a clock leak can be used to aid correlation.

Since Tor latency is on the order of a second, this probably gives about 8 bits of information to aid correlation, though in practice it is probably less than that because most people tend to be using NTP and other network time syncs. Those that aren't however, will be in really bad shape against this attack.
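The "about 8 bits" estimate can be reproduced with a rough calculation, assuming the recovered value is a position within the 274 s timer period and that roughly one second of Tor latency blurs it. The function name and the one-second figure are assumptions taken from the comment above, not measurements.

```python
import math

PERIOD_S = 2**32 * 64e-9  # 32-bit counter of 64 ns ticks, about 274.88 s

def correlation_bits(uncertainty_s: float) -> float:
    """Bits gained by locating a clock within PERIOD_S to a given uncertainty."""
    return math.log2(PERIOD_S / uncertainty_s)
```

With `uncertainty_s=1.0` this comes out slightly above 8 bits, and it shrinks further for clients whose clocks are tightly synchronized.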

Concerning. Maybe it is a bug after all. But again, we're not exactly on strong footing to land a kernel patch here. We'd have better luck making an ISN prediction argument, I bet.

comment:13 Changed 3 years ago by proper

NTP, as per the NTP RFC, does leak the local clock.

Origin Timestamp (org): Time at the client when the request departed
for the server, in NTP timestamp format.

Destination Timestamp (dst): Time at the client when the reply
arrived from the server, in NTP timestamp format.

So using it doesn't make things better, but worse. (Also, NTP in its default configuration is unencrypted and unauthenticated, and therefore accessible to observation and modification by any ISP-level adversary.)
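As a rough illustration of that leak, the sketch below builds a 48-byte NTP client request and shows how an on-path observer could read the client's clock from it. It assumes a client that writes its real local time into the transmit timestamp field (which the server then echoes back as the origin timestamp); `build_client_request` and `observed_client_time` are hypothetical helpers, not part of any NTP library.

```python
import struct

NTP_UNIX_EPOCH_DELTA = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01

def build_client_request(unix_time: float) -> bytes:
    """Minimal NTP request: LI=0, VN=4, Mode=3, only the transmit timestamp set."""
    seconds = (int(unix_time) + NTP_UNIX_EPOCH_DELTA) & 0xFFFFFFFF
    fraction = int((unix_time % 1) * 2**32) & 0xFFFFFFFF
    header = struct.pack("!BBBb", 0x23, 0, 0, 0)
    unused = bytes(36)  # root delay/dispersion, reference ID, three timestamps
    return header + unused + struct.pack("!II", seconds, fraction)

def observed_client_time(packet: bytes) -> float:
    """What a passive observer recovers from bytes 40..47 of the request."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32
```

Since the request travels in cleartext, anyone on the path can run the equivalent of `observed_client_time` on it.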

comment:14 Changed 3 years ago by nickm

Resolution: not a bug
Status: changed from closed to reopened

I'm going to reopen this. I still say that the best point at which to try to resist stuff of this kind is at the application level, but resisting it better locally too can't be a bad idea.

comment:15 in reply to:  14 Changed 3 years ago by yawning

Replying to nickm:

I'm going to reopen this. I still say that the best point at which to try to resist stuff of this kind is at the application level, but resisting it better locally too can't be a bad idea.

Well, there is one thing that we can do, though it'll be a lot of code (that I won't write). Since part of the hash input is the TCP source port, we can use our cryptographic random number generator and explicitly randomize the source port on Linux (probably optionally). This should mostly mitigate the "attack" in question.
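A minimal sketch of the source port idea, in Python rather than Tor's C and not taken from any Tor code: bind the socket to a port chosen with a cryptographic RNG before connecting. The helper name, the port range, and the retry count are all assumptions for illustration.

```python
import secrets
import socket

def bind_random_source_port(sock: socket.socket, low: int = 1024, high: int = 65535,
                            attempts: int = 32, address: str = "") -> int:
    """Bind `sock` to a CSPRNG-chosen local port and return the port used."""
    for _ in range(attempts):
        port = low + secrets.randbelow(high - low + 1)
        try:
            sock.bind((address, port))
            return port
        except OSError:
            continue  # port busy or not permitted; draw another one
    raise OSError("no free source port found")
```

The caller would invoke this on a fresh TCP socket and then call `connect()` as usual, so the kernel hashes a source port the application picked rather than one it assigned itself.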

I still think this is extremely hard to exploit (bordering on "there are better things you can do if you are in a position to do so") and a kernel issue rather than a Tor issue.

comment:16 Changed 3 years ago by source

At the moment I'm brushing up documentation about Time/Clock based attacks and I wanted to confirm some things about the mitigation advice I'm giving for those in high risk situations like running an Onion Service: https://www.whonix.org/wiki/Time_Attacks

If I understand correctly, when running Tor, a passive network adversary looking at the Tor connection from outside cannot abuse this vector unless they are running your guard node. So the advice goes that Torifying all connections from a machine will limit potential attackers to a colluding guard node (until defenses are introduced).

Is this right?

I am basing these conclusions on advice from Robert Ransom on defending against Clock skew attacks:
http://archives.seul.org/or/talk/Sep-2011/msg00060.html

They can only use that to locate your server if they can either
connect to it directly (not through Tor) or accept a non-Torified
connection from it, and determine what your server thinks is the
current time based on information it receives on that connection.

The obvious ways that your server could leak its current time include
running a web server and sending e-mail messages. The less obvious
ways include opening an outbound TLS connection and running a cron job
with externally observable effects (e.g. an automatic update
downloader).

and on information about how the measurer confirms their victim in the Hot or Not paper:

Measurer:
Connects directly to the Hidden Server’s public
IP address, requesting TCP timestamps, ICMP timestamps
and TCP sequence numbers

comment:17 Changed 3 years ago by proper

https://lists.torproject.org/pipermail/tor-talk/2015-August/038697.html
Murdoch, Steven:

On 25 Jul 2015, at 17:49, Patrick Schleizer <patrick-mailinglists@…> wrote:

On the other hand, I've read the claim "The kernel embeds the system
time in microseconds in TCP connections.", but I haven't found the code
in question to confirm, that this is so. Any idea?

The code is here:

http://lxr.free-electrons.com/source/net/core/secure_seq.c

In particular the seq_scale(u32 seq) function introduces the timestamp.

So if you see two initial sequence numbers for TCP streams between the same source/destination port/IP then you can work out the time difference (in units of 64 ns) according to the clock of the other end point.

Best wishes,
Steven


FYI, made a list of local clock leaks. (w)

comment:18 Changed 3 years ago by sjmurdoch

There seems to be some confusion here. What the Linux ISN leaks is the difference between two timestamps, not the timestamp itself. A difference lets you work out drift and skew, which can help someone fingerprint the computer hardware, its environment and load. Of course that only works if you can probe a computer, and maintain the same source/destination port and IP address.
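The difference described here can be sketched as follows, assuming both ISNs come from the same source/destination port and IP tuple (so the hashed term cancels) and that less than one 274 s timer period elapsed between them. The function names are hypothetical and the skew formula is a simple two-point estimate, not the method used in the papers.

```python
MASK32 = 0xFFFFFFFF
TICK_NS = 64

def isn_time_delta_ns(isn_first: int, isn_second: int) -> int:
    """Elapsed time on the remote clock between two ISNs for the same tuple."""
    return ((isn_second - isn_first) & MASK32) * TICK_NS

def skew_ppm(observer_elapsed_ns: int, isn_first: int, isn_second: int) -> float:
    """Remote clock rate relative to the observer's clock, in parts per million."""
    remote_elapsed_ns = isn_time_delta_ns(isn_first, isn_second)
    return (remote_elapsed_ns - observer_elapsed_ns) / observer_elapsed_ns * 1e6
```

Only the elapsed time and the derived skew come out of this; neither function recovers the absolute timestamp.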

comment:19 Changed 3 years ago by proper

Which trac component is this?

comment:20 Changed 3 years ago by rl1987

Component: changed from "- Select a component" to Analysis

comment:21 Changed 2 years ago by cypherpunks

Severity: Normal

comment:23 Changed 15 months ago by nickm

Keywords: nickm-cares added

comment:24 Changed 15 months ago by karsten

Owner: set to metrics-team
Status: changed from reopened to assigned

comment:25 Changed 8 months ago by irl

Keywords: research-ideas added