Opened 10 months ago

Last modified 9 months ago

#26240 new defect

Check Maxmind GeoIPLocation Database before distributing

Reported by: jvsg
Owned by:
Priority: Medium
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: GeoIP, Geoipdb, needs-proposal, metrics-geoip
Cc: dmr
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Currently we consume the GeoIPLocation Database from Maxmind (a company registered in the U.S.) in Tor. Not only does this go against the modern privacy principle of not relying on any single organisation or product, it also comes with some serious threats. A powerful adversary can impose its control over Maxmind's database. This can be used to attack Tor in a variety of ways:

  1. The Tor network is constantly monitored for suspicious spikes in the number of nodes, as these may indicate an upcoming or ongoing Sybil attack. A powerful adversary can coerce Maxmind into mapping specific IP address blocks to random countries. The people and scripts monitoring the network may then see nothing suspicious, and the adversary stays under the radar.
  2. A large percentage of users don't want their circuits to exit in certain countries where communication is under surveillance. The powerful adversary knows this as well. Users generally add a line to their config that keeps Tor from building circuits through nodes in those locations. To overcome this, the adversary can coerce Maxmind into altering its database so that particular IPs map to locations which the user thinks are havens of free speech.

Solution

I propose a system where, instead of directly distributing Maxmind's DB to users, we first check it for anomalies.

This is how it works:

  1. The Dir Authorities fetch GeoIPLocation DBs from multiple companies (including Maxmind) located in distinct countries.
  2. Each Tor node's location (from Maxmind) is checked against the other DBs as well. The location that appears in a majority of the DBs is considered authentic.
  3. All the Dir Authorities perform the above two steps periodically and independently of each other, and try to reach a consensus.
  4. The Maxmind DB is then distributed to users along with any modifications from step 2.
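
The majority check in step 2 could be sketched roughly as follows. This is only an illustration in Python, not existing Tor code: the function names `majority_country` and `reconcile`, the database labels, and the choice to require a strict majority of all consulted DBs (so a DB with no answer counts against every country) are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Mapping, Optional


def majority_country(votes: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the country code given by a strict majority of DBs, else None.

    `votes` maps a (hypothetical) DB label to the country code it reports
    for one IP, or None if that DB has no entry for the IP.
    """
    answers = [cc.upper() for cc in votes.values() if cc]
    if not answers:
        return None
    country, count = Counter(answers).most_common(1)[0]
    # Strict majority of all consulted DBs, not just of those that answered.
    return country if count * 2 > len(votes) else None


def reconcile(
    maxmind: Mapping[str, str],
    others: Mapping[str, Mapping[str, str]],
) -> Dict[str, str]:
    """Return {ip: country} overrides where the majority disagrees with Maxmind."""
    overrides: Dict[str, str] = {}
    for ip, maxmind_cc in maxmind.items():
        votes: Dict[str, Optional[str]] = {"maxmind": maxmind_cc}
        for label, db in others.items():
            votes[label] = db.get(ip)
        winner = majority_country(votes)
        if winner is not None and winner != maxmind_cc.upper():
            overrides[ip] = winner
    return overrides
```

Under these assumptions, `reconcile` yields only the entries that would be changed before distribution (the "modifications" of the last step); IPs with no majority are left for a separate policy decision.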

What if locations differ in all/most of the DBs?

A case might arise where the locations for an IP differ in all or most of the DBs, because these locations are just guesses and hence can be erroneous. However, IMO:

  1. Most nodes are run from large datacentres, which in all cases have the right GeoLocation mapped to their IP address range.
  2. Even if nodes are run from home on a static IP, the whois records are usually well kept, which helps companies such as Maxmind gather data for their DBs.

So false positives would be very few. Even if there are some, we can ban the IP address from participating in the network until the issue is resolved. Or we can be a little more liberal and allow it to participate, provided there hasn't been a recent spike in the number of nodes.
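
As a rough illustration of the two options for an IP with no majority location, the decision could look like the Python sketch below. Everything here is hypothetical: the `Decision` names, the `decide` function, and especially the 10% growth threshold, which is a placeholder and not a value proposed in this ticket.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"                      # a majority location exists
    ALLOW_UNVERIFIED = "allow_unverified"  # no majority, network looks calm
    EXCLUDE = "exclude"                    # no majority during a node spike


def decide(
    majority: Optional[str],
    recent_growth: float,
    spike_threshold: float = 0.10,  # placeholder value, an assumption
) -> Decision:
    """Pick a policy for one relay IP.

    `majority` is the agreed country code, or None if the DBs disagree.
    `recent_growth` is the fractional increase in node count over some
    recent window (how that window is chosen is left open here).
    """
    if majority is not None:
        return Decision.ACCEPT
    if recent_growth > spike_threshold:
        return Decision.EXCLUDE
    return Decision.ALLOW_UNVERIFIED
```

The stricter variant described above (always ban until resolved) corresponds to calling `decide` with a negative `spike_threshold`, so that any non-negative growth leads to exclusion.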

What about DB licenses?

Only the Dir Auths have to pay to get DBs in addition to the freely available Maxmind DB. The DB that we distribute to users would just be Maxmind's (with some possible modifications).

Change History (7)

comment:1 Changed 10 months ago by irl

#25542 may be relevant.

comment:2 Changed 9 months ago by dgoulet

Component: changed from Core Tor to Core Tor/Tor
Milestone: Tor: unspecified

comment:3 Changed 9 months ago by dmr

Cc: dmr added

comment:4 Changed 9 months ago by teor

Keywords: needs-proposal added

This idea needs a proposal. Here is our proposals process:
https://gitweb.torproject.org/torspec.git/tree/proposals/001-process.txt

Here is my feedback on your idea:

I don't believe we can make an unreliable database into a reliable database, using other unreliable databases. The definition of "location" is ambiguous: it can mean the location of any company in the chain of companies owning the data center, or the physical location of the data center. Until providers fix the definition, the data will never be accurate.

Also, providers don't care about server locations, because they're not used for advertising to consumers.

Some providers will want you to pay for any use of their data, even if you only replace one maxmind location. So you should get a lawyer to read their licensing terms before you write your proposal.

I have an alternative proposal:

  • stop relying on GeoIP for security-sensitive activities:
    • remove support for country codes in torrc options, or document them as unreliable
    • stop relying on countries in Sybil scanning
  • document all other uses (for example, in statistics and relay search) as informational only

comment:5 in reply to:  4 Changed 9 months ago by jvsg

Replying to teor:

Here is a proposal from me: https://pad.riseup.net/p/fl7xBSSNRtw7-keep

comment:6 Changed 9 months ago by teor

Thanks!

Please email your proposal to tor-dev@….
We don't discuss proposals on tickets.

comment:7 Changed 9 months ago by irl

Keywords: metrics-geoip added