Opened 5 years ago

Last modified 13 months ago

#10969 reopened defect

Set of guard nodes can act as a linkability fingerprint

Reported by: asn
Owned by: mikeperry
Priority: High
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-client, tor-guard, XKEYSCORE, prop259, SponsorU-deferred, QUICKANT
Cc: hopper@…, nikita@…, isis
Actual Points:
Parent ID:
Points: large
Reviewer:
Sponsor:

Description

It's well understood that your set of guard nodes can act as a fingerprint. Some calculations can be found in comment:3:ticket:9273, but it's pretty clear that each 3-subset of guards is rare enough that very likely no other client has exactly the same set.

There are a few proposed ideas on how to reduce the linkability of guard nodes sets. For example, reducing the number of guard nodes to 1 will help against this. Still, as an example, in a city with only 500 Tor users, even if each person has a single guard, there are only going to be a few people with the same guard node (and some of them might always be in the same physical location, so the one who roams is probably the same person).
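The uniqueness claim above can be checked with a quick back-of-the-envelope calculation (the relay and user counts below are illustrative assumptions, not measurements):

```python
from math import comb

# Illustrative assumptions, not measurements: ~2000 guard-flagged relays
# in the consensus, 500 Tor users in one city.
num_guards, city_users = 2000, 500

# With 3 guards per client there are over a billion possible 3-subsets,
# so a given client's set is almost certainly unique among 500 users.
print(comb(num_guards, 3))  # 1331334000

# Even with a single uniformly chosen guard (real selection is
# bandwidth-weighted, which makes popular guards less distinguishing),
# the expected number of *other* local users sharing your guard is
# well below one:
print((city_users - 1) / num_guards)  # 0.2495
```

With bandwidth weighting some guards are shared by many local users, but the long tail of low-weight guards makes the fingerprinting problem worse, not better.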

To further improve on the above, maybe it makes sense to pick N guards but only use 1 of them at a time -- and cycle through the N guards every now and then. Maybe we should cycle every time we change network (see https://github.com/leewoboo/tordyguards), but how does little-t-tor know when we changed network? There is some more discussion on this topic here:
https://lists.torproject.org/pipermail/tor-dev/2013-September/005424.html
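A minimal sketch of the "N guards, one at a time" idea (all names here are hypothetical; tor has no such mechanism, and detecting the network change is exactly the open question above, so the network identifier is simply passed in):

```python
import hashlib

class RotatingGuards:
    # Hypothetical helper, not a real tor feature.
    def __init__(self, guards, n=3):
        self.pool = guards[:n]  # the fixed N-subset, chosen once

    def current(self, network_id):
        # Deterministically map the current network (gateway MAC, SSID, ...)
        # to one pool member: each location sees a single guard, and
        # revisiting a location reuses the same one.
        idx = int.from_bytes(hashlib.sha256(network_id.encode()).digest()[:4], "big")
        return self.pool[idx % len(self.pool)]

rg = RotatingGuards(["guardA", "guardB", "guardC"])
home = rg.current("home-ssid")
cafe = rg.current("cafe-ssid")
assert rg.current("home-ssid") == home  # stable for a given network
```

An observer at any single location then sees only one guard, but the set of N guards is still linkable to anyone who can observe the client from several locations.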

Child Tickets

Change History (49)

comment:1 Changed 5 years ago by cypherpunks

There has been a note about this issue in path-spec.txt for 7.5 years:

https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=path-spec.txt

comment:2 Changed 5 years ago by NickHopper

Leif and I talked about this a bit at the Dev meeting -- it's a pretty tough problem to solve.  From an anonymity standpoint, probably the most attractive solution is something like:

  1. At startup the user (or OS) provides a pass phrase, PP.
  1. For each SSID the user connects to, the tor client has a separate state file. 
  1. The state file is encrypted using Hash(PP|SSID) as the key. 
  1. tor can determine which state file to use by trial decryption - if none decrypt correctly, the client uses a new state file, picking fresh guard(s) for this SSID.
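The four steps above can be sketched as follows (a toy illustration, not tor code: the XOR "cipher" stands in for a real AEAD, and a real implementation would use a slow KDF such as scrypt rather than plain SHA-256):

```python
import hashlib, hmac

def state_key(passphrase, ssid):
    # Step 3: Hash(PP | SSID) as the per-SSID state-file key.
    return hashlib.sha256(passphrase.encode() + b"|" + ssid.encode()).digest()

def _keystream(key, length):
    return (hashlib.sha256(key + b"stream").digest() * (length // 32 + 1))[:length]

def seal(key, plaintext):
    # Toy authenticated "encryption": XOR keystream plus an HMAC tag.
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))
    return hmac.new(key, ct, hashlib.sha256).digest() + ct

def open_sealed(key, blob):
    tag, ct = blob[:32], blob[32:]
    if not hmac.compare_digest(tag, hmac.new(key, ct, hashlib.sha256).digest()):
        return None  # wrong key: trial decryption fails cleanly
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, len(ct))))

def load_state(passphrase, ssid, state_files):
    # Step 4: trial-decrypt each state file; if none opens, start fresh.
    key = state_key(passphrase, ssid)
    for blob in state_files:
        state = open_sealed(key, blob)
        if state is not None:
            return state
    return None  # caller creates a new state file with fresh guards

files = [seal(state_key("hunter2", "HomeWifi"), b"guard 1.2.3.4")]
assert load_state("hunter2", "HomeWifi", files) == b"guard 1.2.3.4"
assert load_state("hunter2", "CafeWifi", files) is None  # new SSID, fresh guards
```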

What's unpleasant about this approach is that trial passphrases can be tested offline, and the SSID doesn't add a lot of entropy to the task (none if the adversary knows at least one SSID the client connected through), so a user's state files essentially become a record of all the SSIDs she has connected to. One option would be to convert this into an online brute-force attack, by incorporating some unpredictable element that is retrieved through tor into the key calculation. The process would become something like:

  1. At startup the user (or OS) provides a pass phrase, PP.
  1. For each SSID the user connects to, the tor client has a separate state file.

3'. To compute the key, the client connects through tor to a "blind signing hidden service," which signs exactly one value per connection; the client gets a signature Sig on Hash(PP|SSID) and then uses Hash2(Sig|PP|SSID) as the state file encryption key.

  1. as above.

The idea here is that each pass phrase test requires an online connection attempt, so we can naturally rate limit the brute forcing speed.

The connection to the hidden service need not use a guard node, because the fact that a new session connects to the blind signing service isn't private; and since the signatures are blinded the service can't accumulate information about a user.
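Step 3' can be illustrated with textbook RSA blinding (toy parameters, purely to show the flow; a real deployment would use a vetted blind-signature construction, and Hash/Hash2 are both taken to be SHA-256 here by assumption):

```python
import hashlib, secrets
from math import gcd

# Toy RSA key for the "blind signing hidden service" (far too small to be secure).
p, q = 1000003, 1000033
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def h(data: str) -> int:
    return int.from_bytes(hashlib.sha256(data.encode()).digest(), "big") % n

# Client side: blind Hash(PP|SSID) so the service never learns it.
pp, ssid = "correct horse", "CafeWifi"
m = h(f"{pp}|{ssid}")
while True:
    r = secrets.randbelow(n - 2) + 2
    if gcd(r, n) == 1:
        break
blinded = (m * pow(r, e, n)) % n

# Service side: signs exactly one blinded value per connection -- this is
# what turns offline passphrase guessing into rate-limited online guessing.
sig_blinded = pow(blinded, d, n)

# Client side: unblind, verify, then derive the state-file key.
sig = (sig_blinded * pow(r, -1, n)) % n
assert pow(sig, e, n) == m  # valid signature on Hash(PP|SSID)
state_file_key = hashlib.sha256(f"{sig}|{pp}|{ssid}".encode()).digest()
```

Because the service only ever sees the blinded value, it cannot accumulate a record of (PP, SSID) hashes even if it is malicious.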

comment:3 Changed 5 years ago by NickHopper

Cc: hopper@… added

comment:4 Changed 5 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.6.x-final

comment:5 Changed 5 years ago by nikita

Cc: nikita@… added

comment:6 Changed 4 years ago by nickm

Keywords: 026 added
Parent ID: #11480

comment:7 Changed 4 years ago by nickm

Keywords: 026-triaged-1 added; 026 removed

comment:8 Changed 4 years ago by cypherpunks

here is a second implementation of the tordyguards idea (different set of guards for each network):

https://bitbucket.org/mckinney_subgraph/torshiftchange

comment:9 Changed 4 years ago by cypherpunks

Not using guards (setting UseEntryGuards 0) actually means you'll pick a new entry node for every circuit. This will greatly reduce the amount of time it takes for you to eventually pick a malicious relay as an entry node. If you're concerned about this issue but aren't using one of the tools linked above, a better option (than UseEntryGuards 0) is to leave UseEntryGuards enabled (the default) but stop tor and nuke your state file to get a new set of guards whenever you physically relocate.

Is there any progress on fixing this properly? Or are you waiting to see some XKEYSCORE fingerprints showing that this linkable information is actually being linked?

comment:10 Changed 4 years ago by cypherpunks

Tails appears to be about to add persistent guard support without addressing the location linkability issue at all: https://mailman.boum.org/pipermail/tails-dev/2013-May/003113.html :(

See also "[tor-talk] Location-aware persistent guards" thread from 2012: https://lists.torproject.org/pipermail/tor-talk/2012-October/025975.html

comment:11 Changed 4 years ago by isis

Cc: isis added

comment:12 Changed 4 years ago by nickm

Milestone: Tor: 0.2.6.x-final → Tor: 0.2.???

We've made this better by moving to one traffic guard. More analysis may yield more designs.

comment:13 Changed 4 years ago by cypherpunks

Keywords: XKEYSCORE added

comment:14 Changed 3 years ago by mikeperry

Actual Points: large
Milestone: Tor: 0.2.??? → Tor: 0.2.8.x-final

comment:15 Changed 3 years ago by mikeperry

Owner: set to mikeperry
Status: new → assigned

comment:16 Changed 3 years ago by mikeperry

This depends on #12538.

comment:17 Changed 3 years ago by nickm

Actual Points: large
Points: large

comment:18 Changed 3 years ago by bugzilla

Keywords: tor-guard added; tor-guards removed
Severity: Normal
Type: task → defect

comment:19 Changed 3 years ago by cypherpunks

Is this really a worthwhile problem to solve? If there is any non-Tor network traffic, it likely contains some unique identifier anyway (cookies, session IDs, unique IDs in update checks; the set of update checks is likely unique; standard browsers likely have a unique fingerprint that is probed by ad networks).

Users who are not able to set guard nodes manually will more than likely run into one of the things mentioned above.

comment:20 in reply to:  19 ; Changed 3 years ago by isis

Replying to cypherpunks:

Is this really a worthwhile problem to solve? If there is any non-Tor network traffic, it likely contains some unique identifier anyway (cookies, session IDs, unique IDs in update checks; the set of update checks is likely unique; standard browsers likely have a unique fingerprint that is probed by ad networks).

Users who are not able to set guard nodes manually will more than likely run into one of the things mentioned above.


Yes. It is very important.

It's important because one of the major use cases for Tor is for people who have been (or are) in abusive relationships, or people who have stalkers. All such an abuser/stalker would need to do to find the person's physical location is wardrive around while running wireshark — any relatively unskilled person could do this.

A bigger problem is that this allows adversaries with more observational capabilities (e.g. NSA, GCHQ, BND, perhaps some ISPs) to track Tor users' movements and store that data forever.

comment:21 in reply to:  20 Changed 3 years ago by cypherpunks

Replying to isis:

It's important because one of the major use cases for Tor is for people who have been (or are) in abusive relationships, or people who have stalkers. All such an abuser/stalker would need to do to find the person's physical location is wardrive around while running wireshark — any relatively unskilled person could do this.

That assumes that the Wi-Fi network is not properly secured. The Wi-Fi network already broadcasts the access point MAC and the potentially unique SSID in clear. Those are better identifiers than a set of guard nodes. Currently only one guard will be active; forcing usage of the others would require an active attack.

A bigger problem is that this allows adversaries with more observational capabilities (e.g. NSA, GCHQ, BND, perhaps some ISPs) to track Tor users' movements and store that data forever.

How many users route all traffic through Tor? For those users that do and don't have non-Tor traffic leaks, Tor most likely won't be able to detect the network.

For those users that do have non-Tor network traffic, that traffic most likely contains better unique identifiers than the set of guard nodes. Normally only one guard node will be used anyway, if there is no active attack.

comment:22 Changed 3 years ago by nickm

Milestone: Tor: 0.2.8.x-final → Tor: 0.2.9.x-final

These seem like features, or like other stuff unlikely to be possible this month. Bumping them to 0.2.9

comment:23 Changed 3 years ago by isabela

Sponsor: SponsorU-can

comment:24 Changed 3 years ago by nickm

Priority: Medium → High

comment:25 Changed 3 years ago by nickm

Keywords: prop259 added

These are all prop259-related.

comment:26 Changed 3 years ago by isabela

Milestone: Tor: 0.2.9.x-final → Tor: 0.2.???

tickets marked to be removed from milestone 029

comment:27 Changed 2 years ago by nickm

Keywords: SponsorU-deferred added
Sponsor: SponsorU-can

Remove the SponsorU status from these items, which we already decided to defer from 0.2.9. Add the SponsorU-deferred tag instead in case we ever want to remember which ones these were.

comment:28 Changed 2 years ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:29 Changed 23 months ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:30 Changed 20 months ago by teor

Has the new guard design in 0.3.0 fixed this issue?
Has switching to one entry guard fixed this issue?

comment:31 in reply to:  30 Changed 20 months ago by asn

Replying to teor:

Has the new guard design in 0.3.0 fixed this issue?
Has switching to one entry guard fixed this issue?

Hey teor,

switching to one entry guard slightly improved the situation, but did not fix the issue. The new guard design did not fix the issue either.

An adversary who monitors your connection enough to be able to derive the first few elements of your guard list can use that info to track you down. This is even easier since we currently use multiple directory guards, which means that we easily leak the first 3 positions in our guard list (see DFLT_N_PRIMARY_DIR_GUARDS_TO_USE and #21006).

Even with 1 directory guard, an adversary could get glimpses into your guard list when your guards are down, or when he kills your connections to them. To completely solve this issue we would need to use guard sets or some other wacky solution (http://www.homepages.ucl.ac.uk/~ucabaye/papers/guard_sets_for_onion_routing.pdf). We are pretty far away from this happening.

comment:32 Changed 20 months ago by nickm

Parent ID: #11480

comment:33 Changed 17 months ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:34 Changed 17 months ago by nickm

Keywords: 026-triaged-1 removed

comment:35 Changed 17 months ago by nickm

Resolution: fixed
Status: assigned → closed

Prop271 and its predecessors have made progress here.

comment:36 Changed 14 months ago by cypherpunks

Resolution: fixed
Status: closed → reopened

It's nice that tor sometimes uses only one guard now, but this problem still persists (pun intended) on computers that leave a tor process running when the network is down, such as laptops and cell phones (i.e., probably most tor users). Look at the tor state file on your laptop, and how many guards you have - it's a lot! Look at your network connections when you unsleep your laptop after some time, and don't immediately connect to the internet. When you do connect, tor will often connect to several guards.

For many users, this could actually be more of a fingerprint than simply using a popular VPN provider. Please fix this!

Btw, did anything actually change between asn commenting "switching to one entry guard slightly improved the situation, but did not fix the issue. The new guard design did not fix the issue either." and nickm closing the issue with "Prop271 and its predecessors have made progress here."?

comment:37 Changed 14 months ago by cypherpunks

I hope that this is fixed somehow before Tails implements guard persistence! https://labs.riseup.net/code/issues/5462

comment:38 Changed 14 months ago by cypherpunks

Keywords: QUICKANT added

comment:39 in reply to:  36 ; Changed 14 months ago by isis

Resolution: fixed
Status: reopened → closed

Replying to cypherpunks:

Btw, did anything actually change between asn commenting "switching to one entry guard slightly improved the situation, but did not fix the issue. The new guard design did not fix the issue either." and nickm closing the issue with "Prop271 and its predecessors have made progress here."?


Yeah, there were several iterations of the new guard algorithm. Nick and I simulated several of the designs, and the simulations show a substantial improvement towards limiting the number of guards used. If you need higher protection against a global passive adversary tracking your physical location at this time, consider using something which rotates your state file depending on which network you connect to, or using a single bridge relay. Please also keep in mind that your computer likely has numerous other fingerprints which a global passive adversary may use to track you, e.g. idiosyncrasies in your networking stack, kernel, times that networked cronjobs are executed, etc.

comment:40 Changed 13 months ago by cypherpunks

Resolution: fixed
Status: closed → reopened

Hi isis, I'm again reopening this ticket because the fundamental problem in the title and description ("set of guard nodes can act as a linkability fingerprint") remains unfixed.

I just checked a friend's laptop (Debian stable, tor 0.2.9.11-1~deb9u1) and when it got online it immediately connected to four guards. I don't know why, but I suspect it's because (like most laptops) it is sometimes not connected to the internet. (Some time later, it remained connected to two of them.)

I'm well aware of other possible infoleaks and fingerprinting vectors, and I am even beta-testing a DHCP client that implements RFC7844. But even for casual users without a randomized MAC address and restrictive firewall, it seems obvious that tracking a person as they change locations is much simpler when there's a unique set of IPs which they (and only they) connect to, doesn't it?

Btw, your fork of tordyguards which you linked above is currently missing three commits from upstream from 2014, so I think I'll stick with the upstream version for now.

comment:41 in reply to:  40 ; Changed 13 months ago by cypherpunks

Resolution: fixed
Status: reopened → closed

Replying to cypherpunks:

Hi isis, I'm again reopening this ticket because the fundamental problem in the title and description ("set of guard nodes can act as a linkability fingerprint") remains unfixed.

I just checked a friend's laptop (Debian stable, tor 0.2.9.11-1~deb9u1) and when it got online it immediately connected to four guards. I don't know why, but I suspect it's because (like most laptops) it is sometimes not connected to the internet. (Some time later, it remained connected to two of them.)

Sorry, but that doesn't disprove what Isis said: prop271 was implemented in Tor 0.3.0.x, not the 0.2.9.x which your friend's laptop had.

Also, if you want more unlinkability, maybe look at the snowflake pluggable transport; its temporary-IPs design may be convenient for your needs (assuming there are enough snowflake bridges currently, but you can help! You can do some heavy-duty international lobbying so that more people run them). More information about the Snowflake PT is available here: https://trac.torproject.org/projects/tor/wiki/doc/Snowflake

Last edited 13 months ago by cypherpunks

comment:42 in reply to:  41 ; Changed 13 months ago by isis

Replying to cypherpunks:

Replying to cypherpunks:

Hi isis, I'm again reopening this ticket because the fundamental problem in the title and description ("set of guard nodes can act as a linkability fingerprint") remains unfixed.


This is actually fair. The new guard algorithm, as you're probably aware, has a concept of SAMPLED_GUARDS which is a subset of all the available guards in the consensus, and tor (>=0.3.x) will only connect to those. At any given point, a client will only have one guard. However, there's probably some super weird edge-cases, like "if your guard has stopped answering EXTEND cells but it's still behaving properly for circuits you've already created" then maybe tor could be tricked into keeping the circuits that still work while using a new, different guard for the other circuits. (Although, this more-fingerprintable state—if it's possible—wouldn't exist for long.)

Anyway, I guess what I mean to say is that this specific issue with guard linkability shouldn't be an issue anymore, but there's definitely still improvements which could be made to the guard algorithm.
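The sampled-guards behaviour isis describes can be sketched like this (constants and names are illustrative, not tor's actual prop271 code):

```python
import random

SAMPLE_SIZE, N_PRIMARY = 20, 3  # illustrative, not tor's real parameters

class GuardSelector:
    def __init__(self, consensus_guards, seed=7):
        # Sample once and persist: the client never connects outside this
        # subset, so an observer sees at most SAMPLE_SIZE distinct first
        # hops, and normally only the first working primary.
        self.sampled = random.Random(seed).sample(consensus_guards, SAMPLE_SIZE)
        self.primaries = self.sampled[:N_PRIMARY]

    def pick(self, is_reachable):
        # First reachable primary wins; fall back through the sample, but
        # never widen it just because guards are down.
        for g in self.primaries + self.sampled[N_PRIMARY:]:
            if is_reachable(g):
                return g
        return None  # everything sampled is down: wait rather than resample

guards = [f"guard{i:04d}" for i in range(2000)]
sel = GuardSelector(guards)
assert sel.pick(lambda g: True) == sel.primaries[0]   # steady state: one guard
down = {sel.primaries[0]}
assert sel.pick(lambda g: g not in down) == sel.primaries[1]
```

The last two lines show both the steady state (a single guard) and the fallback that an active adversary could trigger to learn more of the sample.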

I just checked a friend's laptop (Debian stable, tor 0.2.9.11-1~deb9u1) and when it got online it immediately connected to four guards. I don't know why, but I suspect it's because (like most laptops) it is sometimes not connected to the internet. (Some time later, it remained connected to two of them.)

Sorry, but that doesn't disprove what Isis said: prop271 was implemented in Tor 0.3.0.x, not the 0.2.9.x which your friend's laptop had.


nickm and asn and I talked about this ticket a bit in IRC, and I think the general consensus was to make a new tracking ticket for guard algorithm issues/improvements, with specific child tickets that only discuss post-prop271 tor behaviours, since this ticket was originally about a part of the code that has been nearly entirely replaced.

I have a slight preference for a new ticket, but I'll happily defer to whatever other people think is the logical thing to do.

comment:43 in reply to:  42 Changed 13 months ago by teor

Replying to isis:

Replying to cypherpunks:

Replying to cypherpunks:

Hi isis, I'm again reopening this ticket because the fundamental problem in the title and description ("set of guard nodes can act as a linkability fingerprint") remains unfixed.

...

I just checked a friend's laptop (Debian stable, tor 0.2.9.11-1~deb9u1) and when it got online it immediately connected to four guards. I don't know why, but I suspect it's because (like most laptops) it is sometimes not connected to the internet. (Some time later, it remained connected to two of them.)

Sorry, but that doesn't disprove what Isis said: prop271 was implemented in Tor 0.3.0.x, not the 0.2.9.x which your friend's laptop had.

It is likely that at least 2 of the 4 "guards" are:

  • directory authorities
  • fallback directory mirrors
  • directory guards

That's why your client disconnected from 2 of them after downloading the consensus, certificates, and descriptors.

comment:44 Changed 13 months ago by cypherpunks

Resolution: fixed
Status: closed → reopened

Directory guards make this problem worse, don't they? (Each user is even more unique.)

Having just re-read prop271, which I must say is a pretty vague and confusing text, I don't see how it fixes this problem.

In ideal network conditions, prop271 might cause only one guard (plus a directory guard?) to be used, but as is well explained in this ticket's original description, that is not a sufficient solution, because the size of the set of tor users in a given city who have selected a given guard is likely to be small, if not one. The set of users with the same guard(s) in the same city is effectively the anonymity set for the very real user-story/threat-model of "I want location anonymity against a passive observer at the local ISP while I travel around my city".

I'm not even talking about FVEY here, I'm talking about adversaries like a stalker with a friend at the local phone company. But, of course, more powerful adversaries can locate people this way too.

Does prop271 prevent connecting to several guards after being offline a little while? I actually doubt it even does that well. It defines "probably offline" as 10 minutes, and doesn't say anything about detecting "no route to host" (an obvious indicator of offlineness in my tor log file today). In any case, it certainly doesn't say anything about maintaining separate guards for different physical locations (gateway MAC addresses). I admit I haven't tried 0.3.0 yet, but if its supposed mitigation of these problems is what is described in prop271, I believe this problem must still exist.

So, I am once again re-opening this ticket.

comment:45 in reply to:  44 Changed 13 months ago by cypherpunks

Replying to cypherpunks:

You didn't answer my previous question with regards to Snowflake, so does it suit your needs?

comment:46 Changed 13 months ago by cypherpunks

The suggestion that *I* should use Snowflake entirely misses the point here: I'm trying to get Tor to make *all* users more uniform, so that we all have an anonymity set to blend in to! How many Snowflake users do you think there are in my city? I would guess that there are not a lot. Also, from reading #21312, it doesn't sound like Snowflake is quite production-ready anyway.

comment:47 in reply to:  39 Changed 13 months ago by cypherpunks

Replying to isis:

Replying to cypherpunks:

Btw, did anything actually change between asn commenting "switching to one entry guard slightly improved the situation, but did not fix the issue. The new guard design did not fix the issue either." and nickm closing the issue with "Prop271 and its predecessors have made progress here."?


Yeah, there were several iterations of the new guard algorithm. Nick and I simulated several of the designs

[...]

BTW, those simulations actually occurred a year before asn's astute comment, not after it.

comment:48 in reply to:  46 ; Changed 13 months ago by cypherpunks

Replying to cypherpunks:

The suggestion that *I* should use Snowflake entirely misses the point here: I'm trying to get Tor to make *all* users more uniform, so that we all have an anonymity set to blend in to!

There were two problems you seemed to be interested in: 1) finding something that suits your needs (you're using Tor, right?), and 2) finding out whether the proposed fix for this ticket was sufficient. My earlier comment was related to the first issue only.

How many Snowflake users do you think there are in my city? I would guess that there are not a lot. Also, from reading #21312, it doesn't sound like Snowflake is quite production-ready anyway.

You're misunderstanding how Snowflake operates: from a local network observer's frame of reference, you first connect to some domain front, then you connect to one of the many short-lived Snowflake bridges, and its fingerprint looks like WebRTC. What may distinguish Snowflake for your situation is that the IPs you'll connect to will change a lot.

Again read the documentation to see whether it would suit your needs: https://trac.torproject.org/projects/tor/wiki/doc/PluggableTransports/SnowFlakeEvaluation

comment:49 in reply to:  48 Changed 13 months ago by cypherpunks

Replying to cypherpunks:

There were two problems you seemed to be interested in, 1) finding something that suits your needs (you're using Tor, right?), 2) finding out whether the proposed fix for this ticket was sufficient. My earlier comment was related to the first issue only.

Thanks for trying to help, but my comments written here in the first person were not a request for help with my personal needs but rather an attempt to illustrate a general shortcoming of Tor that I think should be improved for all users.

How many Snowflake users do you think there are in my city? I would guess that there are not a lot. Also, from reading #21312, it doesn't sound like Snowflake is quite production-ready anyway.

You're misunderstanding how Snowflake operates: from a local network observer's frame of reference, you first connect to some domain front, then you connect to one of the many short-lived Snowflake bridges, and its fingerprint looks like WebRTC. What may distinguish Snowflake for your situation is that the IPs you'll connect to will change a lot.

Again read the documentation to see whether it would suit your needs: https://trac.torproject.org/projects/tor/wiki/doc/PluggableTransports/SnowFlakeEvaluation

I don't think I'm misunderstanding how Snowflake works. Snowflake does change the threat model for this issue a bit. Against a local-city-country passive adversary who sees only coarse connection data (4-tuples and timestamps, say) and who wants to track your location, yes, I think it might be better than using a guard even if there aren't other Snowflake users (assuming that the domain fronting and STUN hosts are commonly used in that city). If the adversary has netflow data, though, or if your adversary is at the domain fronting host, then they can presumably fingerprint you as a snowflake user (aka "that" snowflake user, when you're the only snowflake user in town). And unlike with guards, with the snowflake domain fronting host you don't have a bunch to choose from or the ability to rotate them.

But, more importantly, I think moreso than with any other transport, Snowflake exposes you to attacks by active adversaries who would like to become your first hop (unless I'm mistaken, it's way worse than UseEntryGuards 0, no?).

Anyway, unless there is a proposal to make Snowflake the default Tor transport (which I would strongly oppose for the reason stated in the previous sentence!), this ticket isn't about Snowflake. If you want to further discuss any of the Snowflake issues I've just mentioned here, you could link from here to other appropriate new or existing tickets and I might weigh in.

Last edited 13 months ago by cypherpunks