Since around 13 June 2016, Onionoo has failed to resolve ASNs and AS names for a growing number of relays [1].
It doesn't seem to be connected with new relays, because some long-running relays are also affected (they had ASNs until recently). Also, many new relays are not affected.
At the same time, Onionoo successfully resolves country codes, so it may not be related to #19154 (moved).
Maybe something is wrong with the GeoIP database itself?
Right, I noticed the same issue today while migrating Compass to a new host.
Turns out MaxMind's latest ASN database contains many entries with an AS number but without an AS name. Two sample lines:
16844800,16845055,"AS4134 Chinanet" # AS number and name
16846848,16847871,AS49597 # AS number only
In absolute numbers and compared to the January and February 2016 databases (I don't have databases from March to May anymore):
|| Database || AS number and name || AS number only ||
|| January 4, 2016 || 235,323 || 514 ||
|| February 1, 2016 || 236,474 || 544 ||
|| June 6, 2016 || 187,415 || 57,773 ||
|| June 13, 2016 || 188,139 || 57,179 ||
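The two columns above can be reproduced with a short script. This is a minimal sketch, assuming the database is available as CSV lines in the format of the two sample lines (start, end, AS field) and that everything after the first space in the AS field is the AS name; the function name `count_asn_entries` is made up for illustration.

```python
import csv

def count_asn_entries(lines):
    """Return (with_name, number_only) for CSV lines of the form
    start,end,"AS<number> <name>" (format assumed from the samples)."""
    with_name = number_only = 0
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        # Assumption: the AS name is whatever follows the first space.
        _, _, name = row[2].strip().partition(" ")
        if name.strip():
            with_name += 1
        else:
            number_only += 1
    return with_name, number_only

sample = [
    '16844800,16845055,"AS4134 Chinanet"',
    '16846848,16847871,AS49597',
]
print(count_asn_entries(sample))
```

To run it on a real file, pass an open file object instead of the `sample` list.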
The June 6 database is currently deployed and the June 13 database is the latest I could download from MaxMind. It looks like they broke something, and it seems unlikely that they'll fix that in the next database unless somebody tells them. Would you want to reach out to them and see if they can fix this?
I guess the short-term fix is that we include AS numbers in Onionoo results even if they come without AS names. We currently don't do that.
Another quick fix would be to downgrade to the February 1 database. Would that be better?
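The short-term fix of keeping AS numbers that come without names could look roughly like the sketch below. It is not the actual Onionoo patch (Onionoo is written in Java); the helper names and the `as_number`/`as_name` keys are assumptions chosen for illustration.

```python
import re

# Assumed shape of the AS field: "AS<digits>" optionally followed by a name.
AS_FIELD = re.compile(r"^AS(\d+)(?:\s+(.*))?$")

def split_as_field(as_field):
    """Split 'AS4134 Chinanet' into ('AS4134', 'Chinanet');
    return ('AS49597', None) when the name is missing."""
    match = AS_FIELD.match(as_field.strip())
    if match is None:
        return None, None
    number = "AS" + match.group(1)
    name = (match.group(2) or "").strip() or None
    return number, name

def as_details(as_field):
    """Build a result that keeps the AS number even without an AS name."""
    number, name = split_as_field(as_field)
    details = {}
    if number is not None:
        details["as_number"] = number  # hypothetical key
    if name is not None:
        details["as_name"] = name      # hypothetical key
    return details

print(as_details("AS4134 Chinanet"))
print(as_details("AS49597"))
```

The point of the design is that the number and the name are written independently, so a missing name no longer suppresses the number.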
Trac:
Owner: N/A to karsten
Status: new to accepted
To me it looks like we definitely should include everything we have, i.e. ASNs even without the nice-to-have AS names. A downgrade will just bring other errors.
IMHO, in the long term we should avoid using GeoIP databases (like MaxMind's) and do vanilla IP-to-ASN mapping, and thus avoid the mysterious latitude/longitude/radius/city fields. Probably something like this [1]. We need a more precise and recent AS mapping (especially for little-t-tor).
Anyway, tracking MaxMind's layout changes is a road to nowhere.
FYI have a look at your battle with GeoIP:
$ git shortlog --grep "MaxMind\|GeoIP"
Karsten Loesing (14):
      Use recent GeoIP database without A1 entries.
      Simplify GeoIP cleanup code, update to May files.
      Extract GeoIP lookup code and test it.
      MaxMind's GeoIP files use ISO-8859-1, not UTF-8.
      Merge two writer classes to speed up rDNS lookups.
      Switch to using MaxMind's GeoLite2 city database.
      Add unit tests for new GeoIP2 code, and fix a bug.
      Move front-end parts of NodeStatus to SummaryDocument.
      Handle UTF-8 characters in GeoIP lookup results.
      Adapt to MaxMind's new GeoLite2 City format.
      Fix character encoding when reading GeoIP files.
      Add more tests for UTF-8 characters in GeoIP files.
      Fix character encoding of ASN database file.
      Support additional columns in GeoLite2 files.
To me it looks like we definitely should include everything we have, i.e. ASNs even without the nice-to-have AS names. A downgrade will just bring other errors.
Agreed. If I don't hear any other concerns, I'll merge the patch and deploy it tomorrow (Friday).
IMHO, in the long term we should avoid using GeoIP databases (like MaxMind's) and do vanilla IP-to-ASN mapping, and thus avoid the mysterious latitude/longitude/radius/city fields. Probably something like this [1]. We need a more precise and recent AS mapping (especially for little-t-tor).
We're currently unable to produce our own database, so we'll have to rely on some third party to do that for us. Researching alternatives is certainly an option, but it's not a priority right now. I might change my mind about that if MaxMind continues to screw up with the ASN database or screws up with the country/city databases in the near future.
There's also #19118 (moved) if you're curious. Also not a priority right now, sadly.
Anyway, tracking MaxMind's layout changes is a road to nowhere.
In this case I don't think they changed their data format but rather that they broke something that still produces the same data format as before.
FYI have a look at your battle with GeoIP:
$ git shortlog --grep "MaxMind\|GeoIP"
{{{
Karsten Loesing (14):
Use recent GeoIP database without A1 entries.
Simplify GeoIP cleanup code, update to May files.
Extract GeoIP lookup code and test it.
MaxMind's GeoIP files use ISO-8859-1, not UTF-8.
Merge two writer classes to speed up rDNS lookups.
Switch to using MaxMind's GeoLite2 city database.
Add unit tests for new GeoIP2 code, and fix a bug.
Move front-end parts of NodeStatus to SummaryDocument.
Handle UTF-8 characters in GeoIP lookup results.
Adapt to MaxMind's new GeoLite2 City format.
Fix character encoding when reading GeoIP files.
Add more tests for UTF-8 characters in GeoIP files.
Fix character encoding of ASN database file.
Support additional columns in GeoLite2 files.
}}}
Haha, nice. Now you know why I'm not too keen to repeat that battle with another GeoIP/ASN data provider. And honestly, I expect that to be a similar battle.
We're currently unable to produce our own database, so we'll have to rely on some third party to do that for us. Researching alternatives is certainly an option, but it's not a priority right now.
Now you know why I'm not too keen to repeat that battle with another GeoIP/ASN data provider. And honestly, I expect that to be a similar battle.
Absolutely agree. I just meant that we need a more reliable and universal way to deal with ASN data, and also to use multiple sources for BGP data (maybe Linus Nordberg has some insight on this issue?).
Anyway, I think there should be some reassessment of relay clustering. And #19118 (moved) seems to be really relevant, thanks!
I find the approach mentioned by twim (the link https://quaxio.com/bgp/) feasible, also for Java.
But, of course, it needs to be implemented and maintained.
In the meantime, maybe we could combine our existing code that uses GeoIP data into a test that checks new data before we use it in Onionoo, thus avoiding errors or missing data elsewhere.
Needs implementation, but might be helpful until there is a better data source.
In the meantime, maybe we could combine our existing code that uses GeoIP data into a test that checks new data before we use it in Onionoo, thus avoiding errors or missing data elsewhere.
Nice idea! If something strange or broken appears in a new database, fall back to the previous version and log a warning instead of breaking everything. But some trickery (math) would be involved, so I don't see a straightforward solution.
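One possible form of such a check, as a rough sketch only: compare the share of number-only entries in the candidate database against the currently deployed one and keep the old database if the share jumps. The threshold of one percentage point and all names below are assumptions, not something proposed in this ticket; the counts are taken from the table earlier in the thread.

```python
import logging

def number_only_share(with_name, number_only):
    """Fraction of entries that have an AS number but no AS name."""
    total = with_name + number_only
    return number_only / total if total else 0.0

def choose_database(current, candidate, max_share_increase=0.01):
    """Return the database to deploy. Both arguments are dicts with the
    keys 'label', 'with_name' and 'number_only' (hypothetical layout).
    The 0.01 threshold is an arbitrary assumption."""
    cur = number_only_share(current["with_name"], current["number_only"])
    new = number_only_share(candidate["with_name"], candidate["number_only"])
    if new - cur > max_share_increase:
        logging.warning(
            "Keeping %s: number-only share would rise from %.4f to %.4f in %s",
            current["label"], cur, new, candidate["label"])
        return current
    return candidate

jan = {"label": "January 4, 2016", "with_name": 235323, "number_only": 514}
feb = {"label": "February 1, 2016", "with_name": 236474, "number_only": 544}
jun = {"label": "June 6, 2016", "with_name": 187415, "number_only": 57773}

print(choose_database(jan, feb)["label"])  # small change, candidate accepted
print(choose_database(feb, jun)["label"])  # large jump, fallback with warning
```

A share-based comparison sidesteps absolute counts, which grow over time anyway, but it would not catch names that are present yet wrong.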
Thanks for reviewing! Merged to master and deployed.
Regarding the suggestion to check new data before using it, I guess that's also an oversight on my part. When I update tor's geoip and geoip6 files I always skim over the diff before submitting a patch, but for some reason I didn't do that for Onionoo updates. Also, those are not in Git but only on the server. I'll make a note to be more careful there. What we could also do is add those files to Git. Thoughts?
Regarding switching our source of Geoip/ASN information, can I ask either of you to create a new ticket for that? I should repeat here that I'm not overly enthusiastic to work on that, so it may not happen soon if I have to do the bulk of the work there.
Leaving this ticket open until we have either extracted all remaining action items or decided not to do them.
Also, those are not in Git but only on the server. I'll make a note to be more careful there. What we could also do is add those files to Git. Thoughts?
Yeah, sounds reasonable.
How about importing raw data from all the sources into an 'unstable' branch and then stabilizing it manually into 'stable'? Then use data from 'stable' in production. Stable data should have a constant format.
Also, one can scrape bits of data from sites like bgp.he.net and feed them into the repo. Then a stable 'consensus' can be made.
Regarding switching our source of Geoip/ASN information, can I ask either of you to create a new ticket for that?
Sure, created #19437 (moved) for this. Considering moving this discussion there.
The June 6 database is currently deployed and the June 13 database is the latest I could download from MaxMind. It looks like they broke something, and it seems unlikely that they'll fix that in the next database unless somebody tells them. Would you want to reach out to them and see if they can fix this?
Whenever "as-name" is "UNSPECIFIED" they apparently use "descr".
The best way to see whether it is a good idea to use a new version is to use the current set of relay IP addresses and check how many of them will have no AS name after the (simulated) upgrade.
The best way to see whether it is a good idea to use a new version is to use the current set of relay IP addresses and check how many of them will have no AS name after the (simulated) upgrade.
Ok, I had the time to do that now, and the results say that the number of relays without an AS name would increase from 111 to 353. One of the major ASes apparently lost its name (AS12876).
So I'd recommend to NOT upgrade.
used file for these tests:
sha1 fp:
3683070a9285b99e06fd41503d72f2c07fe371e9 GeoIPASNum.dat
Another note:
The number of relays for which MaxMind's DB was unable to provide any AS data (no number and no name) improved from 8 to 0 relays.
(This is all based on Onionoo details records from 2016-07-22 13:00:00.)
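The simulated upgrade described above can be sketched as follows. The actual test used the binary GeoIPASNum.dat file; this sketch instead assumes the database is available as integer IPv4 ranges like the CSV sample lines earlier in the thread. The relay addresses, the "Example Name" placeholder and all function names are invented for illustration.

```python
import bisect
import ipaddress

def build_table(rows):
    """rows: iterable of (start, end, as_field) with integer range bounds."""
    table = sorted(rows)
    starts = [start for start, _, _ in table]
    return starts, table

def lookup(table, ip):
    """Return the AS field for an IPv4 address, or None if no range matches."""
    starts, rows = table
    value = int(ipaddress.IPv4Address(ip))
    index = bisect.bisect_right(starts, value) - 1
    if index >= 0:
        start, end, as_field = rows[index]
        if start <= value <= end:
            return as_field
    return None

def count_missing(table, relay_ips):
    """Return (number_only, no_data) counts for the given relay addresses."""
    number_only = no_data = 0
    for ip in relay_ips:
        as_field = lookup(table, ip)
        if as_field is None:
            no_data += 1
        elif " " not in as_field.strip():
            number_only += 1
    return number_only, no_data

# Made-up "old" database in which the second range still has a name.
old = build_table([(16844800, 16845055, "AS4134 Chinanet"),
                   (16846848, 16847871, "AS49597 Example Name")])
# "New" database matching the two sample lines from this ticket.
new = build_table([(16844800, 16845055, "AS4134 Chinanet"),
                   (16846848, 16847871, "AS49597")])
relays = ["1.1.8.10", "1.1.17.1", "8.8.8.8"]  # hypothetical relay addresses

print(count_missing(old, relays))
print(count_missing(new, relays))
```

Comparing the two tuples for the current relay set gives the before/after numbers used to decide whether to upgrade.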
Alright, I just looked at the latest database file from yesterday (GeoIPASNum2.zip, shasum 997932353f5824eeb760459e0ad5f8ff2226c01c), and I believe we should update to this one rather than keeping the May 23, 2016 file. Some numbers:
|| Database || AS number and name || AS number only ||
|| February 24, 2015 || 213,789 || 318 ||
|| January 4, 2016 || 235,323 || 514 ||
|| February 1, 2016 || 236,474 || 544 ||
|| May 23, 2016 || 242,462 || 2,043 ||
|| June 6, 2016 || 187,415 || 57,773 ||
|| June 13, 2016 || 188,139 || 57,179 ||
|| July 18, 2016 || 245,367 || 1,362 ||
|| October 9, 2016 || 250,573 || 1,372 ||
|| December 3, 2016 || 250,447 || 1,010 ||
I also compared AS numbers/names to a much older database from February 24, 2015 that I found somewhere else on my hard disk. My idea was that organization names don't change that often. I attached a graph showing how newer databases compare to that old one. It looks as if we're roughly back to normal again.
And this new file apparently doesn't have the issues stated above with AS8620 or AS12876.
So, let me ask: are there important reasons not to switch to the December 3, 2016 database? If I don't hear major concerns by Thursday, I'll bring this up at the next metrics team meeting and ideally decide to switch.
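The comparison against the February 24, 2015 database mentioned above might be computed along these lines. This is only a guess at the metric behind the attached graph: the share of AS numbers present in both databases whose names are identical. The mappings and the "Example Name" placeholder are invented.

```python
def name_agreement(reference, newer):
    """Share of AS numbers found in both mappings (AS number -> AS name
    or None) whose names are identical in both."""
    common = reference.keys() & newer.keys()
    if not common:
        return 0.0
    same = sum(1 for asn in common if reference[asn] == newer[asn])
    return same / len(common)

# Invented example data; a missing name is represented as None.
reference = {"AS4134": "Chinanet", "AS49597": "Example Name"}
newer = {"AS4134": "Chinanet", "AS49597": None}

print(name_agreement(reference, newer))
```

A database with many number-only entries scores low on this measure, which is how a broken release would stand out against an old reference.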