Opened 10 years ago

Closed 9 years ago

Last modified 7 years ago

#1094 closed enhancement (fixed)

No way to specify a GeoIP country code miss in ExcludeNodes etc.

Reported by: flyposted Owned by: nickm
Priority: Low Milestone: Tor: 0.2.2.x-final
Component: Core Tor/Tor Version: 0.2.1.19
Severity: Keywords:
Cc: flyposted, nickm, Sebastian Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by nickm)

Tor 0.2.1.19 currently lacks a way to specify a country that doesn't match
a GeoIP, even though the src/or/geoip.c module provides a country code
'??' that could be used to match a GeoIP miss. This GeoIP miss code
would let Tor administrators block traffic to exit nodes that lack proper
GeoIP matches if in doubt about the true exit country. This seems more
like an implementation oversight than a feature, so I'm filing this as
a bug report.

I was able to produce a test case for this with:

-- The latest ip-to-country table from ip-to-country.webhosting.info

converted to Tor GeoIP format

-- Running Tor from the US
-- ExcludeNodes set to "{US},{??}"
-- An ExcludeNodes GeoIP miss on 66.230.230.230 (node AmericanProgress)

which shows up as in the US on MaxMind but is a total miss on the table

-- WebHosting.info: "IP Address 66.230.230.230 belongs to Some Place We Dont Know About."

(just to confirm that the table conversion worked OK)

After looking through the source code it looks like there's not enough
logging going on, even at the log_debug level, to see what is happening
exactly to the '??' match that pops out of the function
geoip_get_country_name().

The routerset_refresh_countries() function appears to kick the '??' country
out of the list if there's no match. I dummied up a GeoIP entry set like
this based on the last current entry of the GeoIP file:

4278190080,4294967294,US
4294967295,4294967295,??

That appears to be enough to satisfy the routerset functions in
src/or/routerlist.c so there's no complaint there, but this kludge
is useless for trying to get a match on a GeoIP miss. I suppose I
could write a script to patch gaps in the GeoIP file to kludge around
it yet another way and I could use one of the XX psuedo-countries
to work through this (which is what I'll probably do for now), but
it seems like there should be a well-known way to get this to work
without having to do that.

Any ideas?

[Automatically added by flyspray2trac: Operating System: Fedora Core Linux]

Child Tickets

Attachments (2)

reproc-geoip.pl.txt (755 bytes) - added by flyposted 10 years ago.
Perl hack to reprocess GeoIP file to avoid bug
0001-Make-Nowhere-explicitly-listable-in-torrc.patch (2.1 KB) - added by nickm 10 years ago.

Download all attachments as: .zip

Change History (19)

Changed 10 years ago by flyposted

Attachment: reproc-geoip.pl.txt added

Perl hack to reprocess GeoIP file to avoid bug

comment:1 Changed 10 years ago by flyposted

I am able to avoid the ExcludeNodes country code GeoIP miss bug by
populating the gaps in the GeoIP file with entries that map to a fake
country "XX". This results in GeoIP file entries like these:

0,33996343,XX
33996344,33996351,GB
33996352,94585423,XX
94585424,94585439,SE

Then adding XX to ExcludeNodes like this works:

ExcludeNodes {XX},...

routerset_refresh_countries() appears to populate the bitmap with
the XX entries and they appear to match so far.

Clearly this is a hack to get around a shortcoming in the code
though so it would be good to get this fixed. In the meantime,
I've hacked my way around the bug .....

comment:2 Changed 10 years ago by nickm

Hm. For 0.2.1.x, it would require backporting a fair bit of new code. Does the following patch against the
0.2.2.x series (generated against git master) work for you?

[Reclassifying as feature request.]

comment:3 Changed 10 years ago by nickm

Sebastian reports that the patch doesn't do what I hoped it would for him. More thought is needed.

comment:4 Changed 10 years ago by Sebastian

To debug, it would help to get a way to find a node that is definitely in
country ?? according to the current geoip db we're using. Is there such a way?

I tested with 66.230.230.230 (node desync, not AmericanProgress like indicated
above) - if the node was in a real country in my DB, I guess my simple test
wasn't worth anything.

comment:5 Changed 10 years ago by nickm

I'm holding off on this for a bit until arma's bug 1090 fix is done: it will make this easier to debug.

comment:6 Changed 9 years ago by nickm

Milestone: Tor: 0.2.2.x-final

Marking as 0.2.2.x-final, since bug 1090 is also marked for 0.2.2.x-final. It's not absolutely critical for that release, but we _have_ the code, and we should spend at least a few minutes trying to debug it.

comment:7 Changed 9 years ago by nickm

Owner: set to nickm
Status: newassigned

comment:8 Changed 9 years ago by nickm

Description: modified (diff)

velope from IRC rightly points out that this behavior is counterintuitive, and users rarely want to name '??' explicitly:

03:04 < velope> if the user wants to exclude a country badly enough to use the torrc option, the last thing he wants is for tor to use a node in that country just because geoip is incomplete or defective.
03:57 < nickm> Hm. So you think that nodes in Nowhere should count as belonging to every country for Exclude*Nodes, but count as belonging to no country for ExitNodes ?
03:57 < nickm> Seems plausible to me

To expand: This interpretation seems plausible to me because if you want to exclude *any* country, there's no way to tell whether an unknown-country router is there or not... and if you want to include only certain countries, there's no way to tell whether an unknown-country belongs to any of them.

This wouldn't be so hard to implement, in fact. We would just add a flag to each routerstatus_t when we created it to indicate whether unknown-country nodes should be considered members of *every* country (as for ExcludeNodes) or whether they should be considered members of *no* country (as for ExitNodes). This flag could be passed in as part of routerset_new(), and it would only need to be interpreted, I think, in routerset_contains().

This might even be a good newbie project; the code changes are small and isolated to:

  • struct routerset_t
  • routerset_new()
  • everywhere that calls routerset_new()
  • routerset_contains()

comment:9 Changed 9 years ago by Sebastian

So what happens when you use EntryNodes together with ExcludeExitNodes? I can imagine some people might want to do
EntryNodes {de} ExcludeExitNodes {de}

comment:10 Changed 9 years ago by nickm

Is that a problem? For the EntryNodes set, unknown-country IPs would not be included. For the ExcludeExitNodes set, unknown-county IPs _would_ be included. So in our earlier notation, it would be as if you said "EntryNodes {de}", "ExcludeExitNodes {de},{??}"

comment:11 Changed 9 years ago by yetonetime

This is not right to exclude any country by default because geoip db may not be complete only.

Let's all unset regions geoip db may contain the address of which wants to exlude by user, but at the same time is a unknown country may not be a country which wants to exclude.

If the user wants to exclude only {de} it should result in a only those relays for which it is well known that it is DE. If the user wants to exclude the unknown country, he must do it explicitly.

And yes, 0001-Make-Nowhere-explicitly-listable-in-torrc.patch works just fine. You need just more reliable tests for it than it was described.

comment:12 Changed 9 years ago by Sebastian

If we made this change, there'd be no way to get the current behaviour of ExcludeExitNodes {de}. I think that's a little sad. Maybe the right thing to do is to just fix the examples to explicitly point out that country ?? can be listed and some users probably want to do that?

comment:13 Changed 9 years ago by Sebastian

To make it more clear why I think that automatically including ?? with a country here, consider the case where a country with only very few relays is listed in ExcludeNodes. If the set of unknown relays is large, which has happened in the past, the user gets a big reduction in entropy for node choices. Forcing this on the user is probably not what we want at all.

comment:14 Changed 9 years ago by nickm

Status: assignedneeds_review

Okay, so it's back to the ??==Nowhereland idea. I've got a rebased version of my original patch with minor bulletproofing improvements and unit testing fixes at bug1094_v2 in my public repository.

comment:15 Changed 9 years ago by Sebastian

In my current testing, this seems to work. My test setup is like this:

torrc:
ExitNodes {??}
StrictNodes 1

There currently is only one node listed as belonging to country ??, nickname "evil". Fortunately (for my testing) it is an exit node, and all my circuits that actually push data to the net are using it. However, a lot more three-hop circuits are built (presumably for build-time measurements) which don't follow these rules; and using .exit notation to get a different exit still works. I suppose that is more related to bug 1090 though and shouldn't concern us here. I think it is a fine idea to merge this patch.

comment:16 Changed 9 years ago by nickm

Resolution: Nonefixed
Status: needs_reviewclosed

Thanks for the testing and review. Merging and closing.

comment:17 Changed 7 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.