Opened 5 years ago

Closed 5 years ago

#17786 closed defect (fixed)

"France, Metropolitan" is unused?

Reported by: arma Owned by: karsten
Priority: Low Milestone:
Component: Metrics/Website Version:
Severity: Normal Keywords: easy
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

In the "direct user stats" graph, in the pull-down menu for countries, one of the options is "France, Metropolitan". I don't know what that is, but I think Tor doesn't either, since metrics can't graph it.

(Are there any other entries in that list that don't correspond to cctlds?)

Change History (2)

comment:1 Changed 5 years ago by karsten

Owner: set to karsten
Status: new → assigned

tl;dr: I'm going to remove country code FX (France, Metropolitan) and add country codes BQ (Bonaire, Sint Eustatius and Saba), CW (Curaçao), SX (Sint Maarten), and XK (Kosovo) to Metrics by January 25, 2016.

Great question. Sorry for taking so long to respond, but it turns out the answer was even more difficult than I anticipated. I had to compare six different sources of country codes to answer this question, in particular the part about other country codes. Here are the six different country code lists I looked at:

  1. Metrics: The R file used by Metrics to include country names in graphs;
  2. clients: The user number estimates file produced by Metrics from looking at extra-info descriptors written by Tor relays;
  3. geoip: The latest geoip and geoip6 files shipped with little-t-tor;
  4. MaxMind: MaxMind's list of ISO 3166 Country Codes used in GeoIP legacy databases that we used in little-t-tor until February 2014;
  5. MaxMind2: MaxMind's country codes used in GeoIP2 databases used by little-t-tor from February 2014 on; and
  6. Wikipedia: Wikipedia's ISO 3166-1 alpha-2 page, in particular the decoding table.
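A comparison like this can be sketched with plain set operations. The snippet below is only an illustration, not the script actually used for this ticket: the `sources` excerpts are tiny hand-picked samples (the real lists hold a few hundred codes each), and the function name `codes_missing_somewhere` is made up for this example.

```python
from collections import defaultdict

# Hypothetical excerpts of the six lists, keyed by the list numbers used above.
# These are NOT the full lists, just enough codes to show the idea.
sources = {
    1: {"FR", "FX", "AN", "HM"},        # Metrics
    2: {"FR", "AN", "BQ", "XK", "A1"},  # clients
    3: {"FR", "BQ", "XK"},              # geoip
    4: {"FR", "HM", "BQ", "A1"},        # MaxMind
    5: {"FR", "BQ", "XK"},              # MaxMind2
    6: {"FR", "FX", "AN", "HM", "BQ"},  # Wikipedia
}


def codes_missing_somewhere(sources):
    """Return {code: [list numbers containing it]} for every code that is
    absent from at least one list."""
    membership = defaultdict(list)
    for number in sorted(sources):
        for code in sources[number]:
            membership[code].append(number)
    total = len(sources)
    return {code: nums for code, nums in membership.items() if len(nums) < total}


report = codes_missing_somewhere(sources)
for code in sorted(report):
    # Prints one line per code, e.g. "FX (1, 6)"
    print(f"{code} ({', '.join(str(n) for n in report[code])})")
```

Codes present in all six lists (FR in this sample) drop out of the report, which leaves only the codes that need a manual decision.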

In the following I'm going through all country codes that are missing in at least one of these lists, skipping most of the user-assigned country codes listed in Wikipedia. The numbers in parentheses after each country code say which of the six lists above contain it. I'm starting with country codes contained in Metrics, which include the France, Metropolitan entry mentioned above, and from those I'm starting with the ones I think we can drop, which turns out to be just one:

  • FX (1, 6; France, Metropolitan): Wikipedia says this country code was reserved on request of France, but it's neither used by MaxMind nor by little-t-tor. We can safely drop this country code from Metrics.

Here are more country codes in Metrics that are not contained in all other lists, but I think we should keep them all:

  • AN (1, 2, 6; Netherlands Antilles): Wikipedia says this country code was assigned until 2012, and there are still relays reporting users from this country code. I'd say we should keep this, because we're not only graphing current user numbers but also user numbers from a few years ago.
  • BV (1, 2, 4, 6; Bouvet Island): This country code is still assigned, and there have been Tor users coming from it in the past; there are just no IP address ranges for it in the current geoip file. We should keep this.
  • EH (1, 2, 4, 6; Western Sahara): Same as BV, keep it.
  • HM (1, 4, 6; Heard Island and McDonald Islands): There are currently no IP address ranges using this country code, but there might be in the future. We should keep this.

And here are country codes that are not contained in Metrics, starting with the ones we can safely ignore in the future:

  • ?? (2; Unknown): Tor uses this country code whenever it cannot resolve an IP address. I think there's no need to draw a graph with users coming from unknown countries, because there may be plenty of reasons for that, and the graph won't reveal what they are.
  • A1 (2, 4; Anonymous Proxy): MaxMind uses this country code for anonymous proxies, which is not a specific country. We filter out these IP address ranges before putting their database into little-t-tor, so these reported users come from relays using their own database file. We can safely ignore these, for the same reason as ignoring unknown countries mentioned before.
  • A2 (2, 4; Satellite Provider): Same as A1, keep ignoring.
  • AA (2; User-assigned): Wikipedia says this country code is free for assignment at the disposal of users, so we can safely ignore it.
  • AP (2, 4, 6; Asia/Pacific Region): MaxMind uses this code for Asia/Pacific Region when a specific country code has not been designated. We're ignoring this country code in Metrics and little-t-tor, which I think makes sense.
  • CS (2, 6; Serbia and Montenegro): Wikipedia says this country code was assigned to Serbia and Montenegro, which have been distinct countries since 2006. This predates the user number estimates on Metrics, so I'd say we'd better stay away from this political minefield by leaving this country code out of Metrics.
  • EU (2, 4, 6; Europe): See AP, ignore.
  • O1 (4; Other Country): MaxMind used this country code in the past for other countries, but there have not been any Tor users coming from that country. We can ignore this.
  • RI (2, 6; Indonesia): Wikipedia lists this country code under indeterminate reservations, and it's neither used by Metrics nor by little-t-tor. Let's ignore this.

And finally, here are country codes that are not yet in Metrics but which should be there:

  • BQ (2, 3, 4, 5, 6; Bonaire, Sint Eustatius and Saba): This country code was assigned in 2010, and Metrics is missing it. We should add it.
  • CW (2, 3, 4, 5, 6; Curaçao): See BQ, add it.
  • SX (2, 3, 4, 5, 6; Sint Maarten): See BQ, add it.
  • XK (2, 3, 5; Kosovo): Wikipedia says that XK is a user-assigned code that is being used by the European Commission, Switzerland, the Deutsche Bundesbank, SWIFT, and other organizations as a temporary country code for Kosovo. MaxMind adds this country code in its GeoIP2 format, and there are actual Tor users from that country code. We should add it to Metrics, too.
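In code, the planned change amounts to one removal and four additions on the code-to-name mapping. The following is a rough sketch only: Metrics keeps this list in an R file, so the Python dictionary and the function name `apply_planned_changes` are stand-ins invented for illustration.

```python
# Country names to add, taken from the list above.
ADDITIONS = {
    "BQ": "Bonaire, Sint Eustatius and Saba",
    "CW": "Curaçao",
    "SX": "Sint Maarten",
    "XK": "Kosovo",
}


def apply_planned_changes(names):
    """Return a copy of the code-to-name mapping with FX dropped and the
    four missing codes added; the input mapping is left untouched."""
    updated = dict(names)
    updated.pop("FX", None)  # no error if FX is already absent
    updated.update(ADDITIONS)
    return updated


# Hypothetical excerpt of the current Metrics mapping.
current = {"FR": "France", "FX": "France, Metropolitan"}
updated = apply_planned_changes(current)
```

Codes such as AN, BV, EH, and HM are deliberately left alone, matching the "keep" decisions above.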

I'll leave this ticket in assigned state for a week and then make the changes stated above.

comment:2 Changed 5 years ago by karsten

Resolution: fixed
Status: assigned → closed

Merged, deployed, resolving. Thanks for the report!
