Opened 8 months ago

Closed 5 weeks ago

#20509 closed defect (implemented)

Directory authorities should reject relays with #20499 bug

Reported by: arma Owned by:
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version: Tor: 0.2.8.9
Severity: Normal Keywords: 028-backport, easy, TorCoreTeam201612, review-group-15
Cc: cb@… Actual Points:
Parent ID: Points:
Reviewer: teor, arma Sponsor:

Description

Once we resolve #20499, or more precisely, once we figure out which versions it affects, we should teach the directory authorities to withhold the Guard flag from relays that are running those versions.

Right now you can't bootstrap from e.g. sofia, which is a huge guard running 0.2.9.3-alpha.

Then later -- or maybe this will turn out to be the easiest solution for everything -- we can teach the directory authorities to simply reject descriptors from relays running these buggy versions.

[I'm not sure which milestone to put this in, since it will need to get into a majority of directory authorities for it to matter. :/ ]

Child Tickets

Change History (45)

comment:1 follow-up: Changed 8 months ago by arma

My current theory is that the bug went in at git commit 09a0f2d0b24, which went into Tor 0.2.9.2-alpha. So relays running 0.2.9.1-alpha-dev through 0.2.9.4-alpha-dev, and also 0.3.0.0-alpha-dev, are affected.

Option 1 is that we get a new 0.2.9.x out pretty soon, and then contact the big relay operators that are running affected versions and get them to update. Then we AuthDirReject the ones that have the Guard flag and don't update, and periodically we check the network for broken relays (via #20501) and contact them / reject them too.

Option 2 is that we change the directory authority code to withhold a Guard vote for all relays running the affected versions. And then get enough authorities to update that we can affect Guard assignment. This option seems better in theory, I don't have a good handle on what versions the dir auths like to run, so I don't know how tricky this one will be in practice.

Option 3 is that we do both -- option 1 at first while trying to do option 2. That's more work, which is usually stupid, but maybe if option 2 is a long way out, we'll need it.

Did I miss any options? :)

Oh, I'll also notice that "0.3.0.0-alpha-dev is affected" is a sad phrase, since it means we can't distinguish people running newer versions until we've made an 0.3.0 release. Unless I'm wrong?

comment:2 Changed 8 months ago by Sebastian

I think a reduced work option 3 is best here. That is, get a new alpha out, tell people to upgrade once, exclude relays that run these versions without upgrading, then don't do anything manually anymore. At the same time, include the code to reject those broken versions in the next regular releases, without rushing dirauths to upgrade. I find it rather unlikely that any significant number of relay operators will actually install a non-latest alpha version after we've made a new one, and the few stragglers who do will get excluded once the dirauths are ready.

comment:3 Changed 8 months ago by arma

Ok. Plausible! But: the next regular releases of what? I think it will be pretty easy to get this patch into 0.2.9. Do we aim for getting something into 0.2.8? I think no? In which case... I think that means moria1 is the only one that will get the patch until a while after 0.2.9 goes stable? :)

It seems a bit silly to try to get the patch into an 0.2.8 release. But it also seems a bit silly not to. Darn it.

comment:4 Changed 8 months ago by Sebastian

The patch should be trivial. I don't see how this isn't an obvious backport as it's related to actually keeping the network running

comment:5 Changed 8 months ago by arma

For those who enjoy writing trivial patches, the

    rs->is_possible_guard = (wfu >= guard_wfu && tk >= guard_tk) ? 1 : 0;

line in set_routerstatus_from_routerinfo() is a good place to start.

comment:6 Changed 8 months ago by nickm

  • Keywords 028-backport added
  • Milestone set to Tor: 0.2.9.x-final

comment:7 in reply to: ↑ 1 ; follow-up: Changed 8 months ago by teor

Replying to arma:

...
Option 2 is that we change the directory authority code to withhold a Guard vote for all relays running the affected versions. And then get enough authorities to update that we can affect Guard assignment. This option seems better in theory, I don't have a good handle on what versions the dir auths like to run, so I don't know how tricky this one will be in practice.
...

Hang on, didn't we have a bug where clients would pick relays as their guard regardless of whether they had the Guard flag? How many versions ago was that?

And do clients automatically get new guards when their existing guards lose the Guard flag? Or do they keep their old guards until they rotate?

comment:8 Changed 8 months ago by teor

  • Keywords easy added

Also, tor_version_as_new_as is a very useful function for this patch.

comment:9 in reply to: ↑ 7 Changed 8 months ago by arma

Replying to teor:

Hang on, didn't we have a bug where clients would pick relays as their guard regardless of whether they had the Guard flag? How many versions ago was that?

That was bug #17772 and its fix went into 0.2.7.6, which came out before 0.2.8.1-alpha. Whereas directory guards went into 0.2.4.8-alpha. So clients running 0.2.4.x and 0.2.5.x and 0.2.6.x will have the #17772 bug. But I think they have bigger problems.

And do clients automatically get new guards when their existing guards lose the Guard flag? Or do they keep their old guards until they rotate?

Great question. #17773 is the complicated mess you want to read to answer it. I *think* the summary of that ticket is that clients do in fact go find another Guard if their current one loses its flag. And we were thinking of changing that behavior, because it introduces other problems, but we haven't yet. Lucky us here.

comment:10 Changed 8 months ago by rubiate

Will something like this work?

diff --git a/src/or/dirserv.c b/src/or/dirserv.c
index 6e25323..7ff5535 100644
--- a/src/or/dirserv.c
+++ b/src/or/dirserv.c
@@ -2122,6 +2122,29 @@ routers_make_ed_keys_unique(smartlist_t *routers)
   } SMARTLIST_FOREACH_END(ri);
 }
 
+/** Check if the Tor version provided in the platform string <b>platform</b> is
+ * known to be broken in a way that means it should not be used as a Guard.
+ *
+ * Return 0 if it should be good, or 1 if it is known to be broken.
+ */
+STATIC int
+is_broken_guard_version(const char *platform)
+{
+  /* assume it's good if we don't know the platform/version */
+  if (platform == NULL)
+    return 0;
+
+  /* bug #20499 affects versions from 0.2.9.1-alpha-dev to 0.2.9.4-alpha
+   * and version 0.3.0.0-alpha-dev
+   */
+  if (!tor_version_as_new_as(platform, "Tor 0.2.9.1-alpha-dev") ||
+      (tor_version_as_new_as(platform, "Tor 0.2.9.5-alpha") &&
+      strcmpstart(platform, "Tor 0.3.0.0-alpha-dev") != 0))
+    return 0;
+
+  return 1;
+}
+
 /** Extract status information from <b>ri</b> and from other authority
  * functions and store it in <b>rs</b>>.
  *
@@ -2154,6 +2177,7 @@ set_routerstatus_from_routerinfo(routerstatus_t *rs,
   rs->is_valid = node->is_valid;
 
   if (node->is_fast && node->is_stable &&
+      !is_broken_guard_version(ri->platform) &&
       ((options->AuthDirGuardBWGuarantee &&
         routerbw_kb >= options->AuthDirGuardBWGuarantee/1000) ||
        routerbw_kb >= MIN(guard_bandwidth_including_exits_kb,
diff --git a/src/or/dirserv.h b/src/or/dirserv.h
index 1e4f27e..6f645bc 100644
--- a/src/or/dirserv.h
+++ b/src/or/dirserv.h
@@ -132,6 +132,7 @@ STATIC int dirserv_has_measured_bw(const char *node_id);
 STATIC int
 dirserv_read_guardfraction_file_from_str(const char *guardfraction_file_str,
                                       smartlist_t *vote_routerstatuses);
+STATIC int is_broken_guard_version(const char *platform);
 #endif
 
 int dirserv_read_measured_bandwidths(const char *from_file,

Tests and changes file:

diff --git a/changes/bug20509 b/changes/bug20509
new file mode 100644
index 0000000..aaa0467
--- /dev/null
+++ b/changes/bug20509
@@ -0,0 +1,7 @@
+  o Directory authorities:
+     - Directory authorities will now withhold the Guard flag from
+       relays which are running Tor versions 0.2.9.1-alpha-dev to
+       0.2.9.4-alpha, and 0.3.0.0-alpha-dev. Bug 20499 causes these
+       versions to not update the consensus they will serve, which
+       could prevent clients that use these relays as Guards from
+       being able to connect to the network. See ticket 20509.
diff --git a/src/test/test_dir.c b/src/test/test_dir.c
index cf0b94c..8aad34d 100644
--- a/src/test/test_dir.c
+++ b/src/test/test_dir.c
@@ -3183,6 +3183,29 @@ reset_routerstatus(routerstatus_t *rs,
   rs->addr = ipv4_addr;
 }
 
+static void
+test_dir_broken_guard_versions(void *arg)
+{
+  (void)arg;
+
+#define TEST_GUARD_VERSION(v, r) \
+  tt_int_op(is_broken_guard_version("Tor " v " on Linux"), OP_EQ, r);
+
+  TEST_GUARD_VERSION("0.2.8.9", 0);
+  TEST_GUARD_VERSION("0.2.9.1-alpha", 0);
+  TEST_GUARD_VERSION("0.2.9.1-alpha-dev", 1);
+  TEST_GUARD_VERSION("0.2.9.2-alpha", 1);
+  TEST_GUARD_VERSION("0.2.9.3-alpha", 1);
+  TEST_GUARD_VERSION("0.2.9.4-alpha", 1);
+  TEST_GUARD_VERSION("0.2.9.5-alpha", 0);
+  TEST_GUARD_VERSION("0.3.0.0-alpha-dev", 1);
+  TEST_GUARD_VERSION("0.3.0.0-alpha", 0);
+  TEST_GUARD_VERSION("0.3.0.0", 0);
+  tt_int_op(is_broken_guard_version(NULL), OP_EQ, 0);
+
+ done: ;
+}
+
 #define ROUTER_A_ID_STR    "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
 #define ROUTER_A_IPV4      0xAA008801
 #define ROUTER_B_ID_STR    "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
@@ -5771,6 +5794,7 @@ struct testcase_t dir_tests[] = {
   DIR_LEGACY(clip_unmeasured_bw_kb),
   DIR_LEGACY(clip_unmeasured_bw_kb_alt),
   DIR(fmt_control_ns, 0),
+  DIR(broken_guard_versions, 0),
   DIR(dirserv_set_routerstatus_testing, 0),
   DIR(http_handling, 0),
   DIR(purpose_needs_anonymity_returns_true_for_bridges, 0),

comment:11 Changed 8 months ago by teor

  • Keywords TorCoreTeam201611 added
  • Status changed from new to needs_revision
  • Version set to Tor: 0.2.9.1-alpha

For consistency, if you are going to ban all 0.2.9.1-alpha-dev because some have this issue, please ban all 0.2.9.4-alpha-dev because some have this issue.

And in any case, please add 0.2.9.1-alpha-dev and 0.3.0.1-alpha to the unit tests.

Also, please use tor_version_compare rather than strcmpstart.

And it might be worth mentioning that 0.3.0.0-alpha and 0.3.0.0 are not Tor versions that will ever exist. The next version will be 0.3.0.1-alpha.

comment:12 Changed 8 months ago by rubiate

0.2.9.4-alpha-dev would already trip this although the comments don't say that.

New patch:

  • comments changed to reflect that 0.2.9.4-alpha-dev is also affected
  • using tor_version_compare() for the 0.3.0.0-alpha-dev check
  • tests added for 0.2.9.4-alpha-dev and 0.3.0.1-alpha, tests for non-existent versions removed, and added a test for if tor_version_parse() fails
diff --git a/changes/bug20509 b/changes/bug20509
new file mode 100644
index 0000000..496726d
--- /dev/null
+++ b/changes/bug20509
@@ -0,0 +1,7 @@
+  o Directory authorities:
+     - Directory authorities will now withhold the Guard flag from
+       relays which are running Tor versions 0.2.9.1-alpha-dev to
+       0.2.9.4-alpha-dev, and 0.3.0.0-alpha-dev. Bug 20499 causes
+       these versions to not update the consensus they will serve,
+       which could prevent clients that use these relays as Guards
+       from being able to connect to the network. See ticket 20509.
diff --git a/src/or/dirserv.c b/src/or/dirserv.c
index 6e25323..8f9ca0b 100644
--- a/src/or/dirserv.c
+++ b/src/or/dirserv.c
@@ -2122,6 +2122,36 @@ routers_make_ed_keys_unique(smartlist_t *routers)
   } SMARTLIST_FOREACH_END(ri);
 }
 
+/** Check if the Tor version provided in the platform string <b>platform</b> is
+ * known to be broken in a way that means it should not be used as a Guard.
+ *
+ * Return 0 if it should be good, or 1 if it is known to be broken.
+ */
+STATIC int
+is_broken_guard_version(const char *platform)
+{
+  tor_version_t cur, bad;
+
+  /* assume it's good if we don't know the platform/version */
+  if (platform == NULL)
+    return 0;
+
+  /* assume it's good if we can't parse the version */
+  if (tor_version_parse(platform, &cur) == -1 ||
+      tor_version_parse("Tor 0.3.0.0-alpha-dev", &bad) == -1)
+    return 0;
+
+  /* bug #20499 affects versions from 0.2.9.1-alpha-dev
+   * to 0.2.9.4-alpha-dev and version 0.3.0.0-alpha-dev
+   */
+  if (!tor_version_as_new_as(platform, "Tor 0.2.9.1-alpha-dev") ||
+      (tor_version_as_new_as(platform, "Tor 0.2.9.5-alpha") &&
+      tor_version_compare(&cur, &bad) != 0))
+    return 0;
+
+  return 1;
+}
+
 /** Extract status information from <b>ri</b> and from other authority
  * functions and store it in <b>rs</b>>.
  *
@@ -2154,6 +2184,7 @@ set_routerstatus_from_routerinfo(routerstatus_t *rs,
   rs->is_valid = node->is_valid;
 
   if (node->is_fast && node->is_stable &&
+      !is_broken_guard_version(ri->platform) &&
       ((options->AuthDirGuardBWGuarantee &&
         routerbw_kb >= options->AuthDirGuardBWGuarantee/1000) ||
        routerbw_kb >= MIN(guard_bandwidth_including_exits_kb,
diff --git a/src/or/dirserv.h b/src/or/dirserv.h
index 1e4f27e..6f645bc 100644
--- a/src/or/dirserv.h
+++ b/src/or/dirserv.h
@@ -132,6 +132,7 @@ STATIC int dirserv_has_measured_bw(const char *node_id);
 STATIC int
 dirserv_read_guardfraction_file_from_str(const char *guardfraction_file_str,
                                       smartlist_t *vote_routerstatuses);
+STATIC int is_broken_guard_version(const char *platform);
 #endif
 
 int dirserv_read_measured_bandwidths(const char *from_file,
diff --git a/src/test/test_dir.c b/src/test/test_dir.c
index cf0b94c..0bc3060 100644
--- a/src/test/test_dir.c
+++ b/src/test/test_dir.c
@@ -3183,6 +3183,30 @@ reset_routerstatus(routerstatus_t *rs,
   rs->addr = ipv4_addr;
 }
 
+static void
+test_dir_broken_guard_versions(void *arg)
+{
+  (void)arg;
+
+#define TEST_GUARD_VERSION(v, r) \
+  tt_int_op(is_broken_guard_version("Tor " v " on Linux"), OP_EQ, r);
+
+  TEST_GUARD_VERSION("0.2.8.9", 0);
+  TEST_GUARD_VERSION("0.2.9.1-alpha", 0);
+  TEST_GUARD_VERSION("0.2.9.1-alpha-dev", 1);
+  TEST_GUARD_VERSION("0.2.9.2-alpha", 1);
+  TEST_GUARD_VERSION("0.2.9.3-alpha", 1);
+  TEST_GUARD_VERSION("0.2.9.4-alpha", 1);
+  TEST_GUARD_VERSION("0.2.9.4-alpha-dev", 1);
+  TEST_GUARD_VERSION("0.2.9.5-alpha", 0);
+  TEST_GUARD_VERSION("0.3.0.0-alpha-dev", 1);
+  TEST_GUARD_VERSION("0.3.0.1-alpha", 0);
+  tt_int_op(is_broken_guard_version("Tor unparseable version"), OP_EQ, 0);
+  tt_int_op(is_broken_guard_version(NULL), OP_EQ, 0);
+
+ done: ;
+}
+
 #define ROUTER_A_ID_STR    "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
 #define ROUTER_A_IPV4      0xAA008801
 #define ROUTER_B_ID_STR    "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB"
@@ -5771,6 +5795,7 @@ struct testcase_t dir_tests[] = {
   DIR_LEGACY(clip_unmeasured_bw_kb),
   DIR_LEGACY(clip_unmeasured_bw_kb_alt),
   DIR(fmt_control_ns, 0),
+  DIR(broken_guard_versions, 0),
   DIR(dirserv_set_routerstatus_testing, 0),
   DIR(http_handling, 0),
   DIR(purpose_needs_anonymity_returns_true_for_bridges, 0),

comment:13 Changed 8 months ago by teor

Thanks!
This looks almost ready to go.

But please make:

tor_version_parse("Tor 0.3.0.0-alpha-dev", &bad)

a separate statement with an appropriate comment. It's confusing to put it with the other check. Also, if "Tor 0.3.0.0-alpha-dev" fails to parse, that's a coding bug. So you could do something like:

/* We should always be able to parse this version string */
if (BUG(tor_version_parse("Tor 0.3.0.0-alpha-dev", &bad) == -1)) {
  return 0;
}

Nitpick:

I was confused what cur and bad meant. We could use longer names for them. I'd like to try to make it clearer that cur is the parsed form of platform; and that bad isn't the only bad version, and it is the parsed form of "Tor 0.3.0.0-alpha-dev".

comment:14 Changed 8 months ago by rubiate

Alright, how's this?

diff --git a/src/or/dirserv.c b/src/or/dirserv.c
index 6e25323..09370d0 100644
--- a/src/or/dirserv.c
+++ b/src/or/dirserv.c
@@ -2122,6 +2122,41 @@ routers_make_ed_keys_unique(smartlist_t *routers)
   } SMARTLIST_FOREACH_END(ri);
 }
 
+/** Check if the Tor version provided in the platform string <b>platform</b> is
+ * known to be broken in a way that means it should not be used as a Guard.
+ *
+ * Return 0 if it should be good, or 1 if it is known to be broken.
+ */
+STATIC int
+is_broken_guard_version(const char *platform)
+{
+  tor_version_t parsed_platform, parsed_0300_alpha_dev;
+
+  /* assume it's good if we don't know the platform/version */
+  if (platform == NULL)
+    return 0;
+
+  /* assume it's good if we can't parse the version */
+  if (tor_version_parse(platform, &parsed_platform) == -1)
+    return 0;
+
+  /* this version string should always be able to be parsed */
+  if (BUG(tor_version_parse("Tor 0.3.0.0-alpha-dev",
+                            &parsed_0300_alpha_dev) == -1)) {
+    return 0;
+  }
+
+  /* bug #20499 affects versions from 0.2.9.1-alpha-dev
+   * to 0.2.9.4-alpha-dev and version 0.3.0.0-alpha-dev
+   */
+  if (!tor_version_as_new_as(platform, "Tor 0.2.9.1-alpha-dev") ||
+      (tor_version_as_new_as(platform, "Tor 0.2.9.5-alpha") &&
+      tor_version_compare(&parsed_platform, &parsed_0300_alpha_dev) != 0))
+    return 0;
+
+  return 1;
+}
+
 /** Extract status information from <b>ri</b> and from other authority
  * functions and store it in <b>rs</b>>.
  *
@@ -2154,6 +2189,7 @@ set_routerstatus_from_routerinfo(routerstatus_t *rs,
   rs->is_valid = node->is_valid;
 
   if (node->is_fast && node->is_stable &&
+      !is_broken_guard_version(ri->platform) &&
       ((options->AuthDirGuardBWGuarantee &&
         routerbw_kb >= options->AuthDirGuardBWGuarantee/1000) ||
        routerbw_kb >= MIN(guard_bandwidth_including_exits_kb,

comment:15 follow-up: Changed 8 months ago by arma

(For those who are curious, check out git commit cd678ae790 from #13152 for the last time we had code like this.)

Hm -- tor_version_compare ends in comparing the git_tag_len entry for each tor_version_t. Does that mean that in reality, relays running 0.3.0.0-alpha-dev won't match the version string from 0.3.0.0-alpha-dev, because they will have a git tag? If so, maybe we want some hack like "not as new as 0.3.0.1-alpha, but in the same series as it"? After more digging, I think the answer is "no, this current code is fine", since relay descriptors don't have the git tag in their platform string. But I figured I would leave this paragraph here for the next person who wonders it.

Here's a tiny detail that could be improved:

+  if (!tor_version_as_new_as(platform, "Tor 0.2.9.1-alpha-dev") ||
+      (tor_version_as_new_as(platform, "Tor 0.2.9.5-alpha") &&
+      tor_version_compare(&parsed_platform, &parsed_0300_alpha_dev) != 0))

That third line should be indented one column over -- I read it the first several times as being "if x or y, and z", and then I saw the open-paren on line two, and had to puzzle through that actually it means "if x, or y and z". Having the z line up with the y will help with readability. Assuming I am in fact reading it correctly now?

Overall, I think it looks good! Is the plan that we merge it into 0.2.9.x, and merge forward to 0.3.0.0, and moria1 tries it out, and maybe also somebody tries it in a chutney network and/or in teor and dgoulet's separate Tor network, and hopefully we find and fix bugs in it by the time we want to backport it to a new 0.2.8 release?

comment:16 follow-up: Changed 8 months ago by arma

Also, rubiate, have you gotten to the point where you should make a Tor git repo somewhere and point us to those branches? :)

comment:17 in reply to: ↑ 15 Changed 8 months ago by teor

Replying to arma:

(For those who are curious, check out git commit cd678ae790 from #13152 for the last time we had code like this.)

Hm -- tor_version_compare ends in comparing the git_tag_len entry for each tor_version_t. Does that mean that in reality, relays running 0.3.0.0-alpha-dev won't match the version string from 0.3.0.0-alpha-dev, because they will have a git tag? If so, maybe we want some hack like "not as new as 0.3.0.1-alpha, but in the same series as it"? After more digging, I think the answer is "no, this current code is fine", since relay descriptors don't have the git tag in their platform string. But I figured I would leave this paragraph here for the next person who wonders it.

The unit tests also verify your logic here.

Here's a tiny detail that could be improved:

+  if (!tor_version_as_new_as(platform, "Tor 0.2.9.1-alpha-dev") ||
+      (tor_version_as_new_as(platform, "Tor 0.2.9.5-alpha") &&
+      tor_version_compare(&parsed_platform, &parsed_0300_alpha_dev) != 0))

That third line should be indented one column over -- I read it the first several times as being "if x or y, and z", and then I saw the open-paren on line two, and had to puzzle through that actually it means "if x, or y and z". Having the z line up with the y will help with readability. Assuming I am in fact reading it correctly now?

I agree this compound statement is confusing, with the combination of !, &&, ||, (, ).
How about we split it into multiple statements:

if (exactly 0.3.0-alpha-dev) {
  return 1;
}

if (older than 0.2.9.1-alpha-dev) {
  return 0;
}

if (as new as 0.2.9.5-alpha-dev) {
  return 0;
}

return 1;

Overall, I think it looks good! Is the plan that we merge it into 0.2.9.x, and merge forward to 0.3.0.0,

Are we sure 0.2.9.5-alpha and 0.3.0.1-alpha fix the stale consensus issue?
I guess we can always change the version range in a later commit.
(Maybe we should do it like cd678ae790, and use variables for all the versions?)

and moria1 tries it out,

That's one way to make sure it works

and maybe also somebody tries it in a chutney network

This should happen before merge, at least to the point we know the code runs and doesn't blow up.

The mixed target would be ideal for testing the actual flag assignments - make tor-stable a bad version, and it should fail. Make it a good version, and it should succeed. But perhaps the failures would only show up if we turned off TestingDirAuthVoteGuard *.

and/or in teor and dgoulet's separate Tor network

We can do this, but I think we'd end up with two guards in the network, because it's almost all on master. I guess that's when we tell people to (up)grade to 0.2.9.5-alpha to keep the guard flag. (Since there's no 0.3.0.1-alpha yet.)

and hopefully we find and fix bugs in it by the time we want to backport it to a new 0.2.8 release?

We want to do this backport to make sure the authorities get the patch before 0.2.9 goes stable?
I think that's important enough for a backport.

comment:18 in reply to: ↑ 16 Changed 8 months ago by rubiate

Here's the updated function:

+/** Check if the Tor version provided in the platform string <b>platform</b> is
+ * known to be broken in a way that means it should not be used as a Guard.
+ *
+ * Return 0 if it should be good, or 1 if it is known to be broken.
+ */
+STATIC int
+is_broken_guard_version(const char *platform)
+{
+  tor_version_t parsed_platform, parsed_0300_alpha_dev;
+
+  /* assume it's good if we don't know the platform/version */
+  if (platform == NULL)
+    return 0;
+
+  /* assume it's good if we can't parse the version */
+  if (tor_version_parse(platform, &parsed_platform) == -1)
+    return 0;
+
+  /* this version string should always be able to be parsed */
+  if (BUG(tor_version_parse("Tor 0.3.0.0-alpha-dev",
+                            &parsed_0300_alpha_dev) == -1)) {
+    return 0;
+  }
+
+  /* bug #20499 affects versions from 0.2.9.1-alpha-dev
+   * to 0.2.9.4-alpha-dev and version 0.3.0.0-alpha-dev
+   */
+  if (tor_version_compare(&parsed_platform, &parsed_0300_alpha_dev) == 0)
+    return 1;
+
+  if (!tor_version_as_new_as(platform, "Tor 0.2.9.1-alpha-dev"))
+    return 0;
+
+  if (tor_version_as_new_as(platform, "Tor 0.2.9.5-alpha"))
+    return 0;
+
+  return 1;
+}

Replying to arma:

Also, rubiate, have you gotten to the point where you should make a Tor git repo somewhere and point us to those branches? :)

I guess so, since you asked!

I've set up a repo at https://viennan.net/git/tor.git and there's a branch called ticket20509 with the complete patch.

comment:19 follow-up: Changed 8 months ago by teor

  • Status changed from needs_revision to needs_review

Ok, next steps are:

  • run the unit tests in a few different environments,
  • run this code in a local test network using chutney,
  • run this code on a test directory authority and make sure it takes away the guard flag.

Around about the last step, we can consider merging it to 0.2.9 and master.

rubiate, have you used chutney before?
https://git.torproject.org/chutney.git

If you check out chutney and tor side-by-side, and then run tor's make test-network-all, it will run a series of integration tests using chutney.

comment:20 in reply to: ↑ 19 Changed 8 months ago by teor

  • Reviewer set to teor, arma
  • Status changed from needs_review to merge_ready
  • Version changed from Tor: 0.2.9.1-alpha to Tor: 0.2.8.9

Replying to teor:

Ok, next steps are:

  • run the unit tests in a few different environments,

The code works on macOS 10.12.1 x86_64 and i386, and Ubuntu 16.0.4 x86_64.

  • run this code in a local test network using chutney,

Chutney runs the code successfully, but ignores the actual Guard assignments because of TestingDirAuthVoteGuard.

  • run this code on a test directory authority and make sure it takes away the guard flag.

Turns out there need to be two separate patches for this:

  • rubiate's ticket20509 works for master and 0.2.9 (but will need to be cherry-picked to 0.2.9, it's based on master)
  • my ticket20509_028 works for 0.2.8
    • authorities require the stable flag for the guard flag in 0.2.9 (merge issue)
    • we added BUG in 0.2.9 (replaced with log_warn(LD_BUG, ...))

I can test the voting for the 0.2.8 patch on the test network, but not the consensus, because the test network uses consensus method 26, and 0.2.8 only has 22. But I looked at the vote, and the only versions getting Guard are 0.2.8.9 and 0.2.9.5-alpha.

This is the same for the patch to master (I only have 2 test authorities, so no 0.2.9 test yet), it is only voting Guard for 0.2.8.9 and 0.2.9.5-alpha.

(In fact, my two authorities disagree on whether a third 0.2.9.5-alpha relay should get Guard, which demonstrates that the existing requirements are still being imposed.)

I also checked the authority logs and they're fine.

Around about the last step, we can consider merging it to 0.2.9 and master.
...

I would be happy to see this merged, at least to 0.2.9 and master, with a backport to 0.2.8 after it has received a bit more testing (for example, arma testing it on moria1 on 0.2.8.9?).

Last edited 8 months ago by teor (previous) (diff)

comment:21 Changed 8 months ago by teor

Ok, I made a branch ticket20509_029 that works for both 029 and master. It's on https://github.com/teor2345/tor.git

comment:22 follow-up: Changed 8 months ago by arma

The ticket20509_029 branch seems to work as expected on moria1.

comment:23 in reply to: ↑ 22 ; follow-up: Changed 8 months ago by teor

Replying to arma:

The ticket20509_029 branch seems to work as expected on moria1.

I've tested ticket20509_028, ticket20509_029, and ticket20509 on the test network and they all behave as expected.

Let's get this merged, then:

  • ticket20509_028 to 0.2.8
  • ticket20509_029 to 0.2.9 and then merge 0.2.9 to master

comment:24 in reply to: ↑ 23 ; follow-up: Changed 8 months ago by arma

Replying to teor:

Let's get this merged, then:

One last question -- are we sure that 0.2.9.5-alpha has correct behavior? :)

comment:25 in reply to: ↑ 24 Changed 8 months ago by teor

Replying to arma:

Replying to teor:

Let's get this merged, then:

One last question -- are we sure that 0.2.9.5-alpha has correct behavior? :)

We tried to give it better behaviour - how can we test that #20499 is correct?
And we should probably avoid releasing 0.3.0.1-alpha until we're sure it's correct, for the same reason.
(If we're sure it works, but then we find out it doesn't, we can add more versions later, but that's not ideal.)

comment:26 follow-up: Changed 8 months ago by nickm

I am +1 on waiting a couple of days till we are really sure that the bug is gone now. Doing this once is annoying enough; doing it twice would be way yuckier.

comment:27 Changed 7 months ago by nickm

  • Keywords review-group-13 added

comment:28 Changed 7 months ago by teor

Should we also take away the V2Dir flag from relays with these versions?

comment:29 in reply to: ↑ 26 Changed 7 months ago by teor

  • Keywords TorCoreTeam201612 added; TorCoreTeam201611 removed

Replying to nickm:

I am +1 on waiting a couple of days till we are really sure that the bug is gone now. Doing this once is annoying enough; doing it twice would be way yuckier.

I have scanned 386 relays on the fallback whitelist for this bug as part of the new fallback list in #18828. None of them had a stale consensus.

Here is the methodology:

  • use the fallback whitelist, including recent operator opt-ins in

https://github.com/teor2345/tor/blob/fallbacks-201612/scripts/maint/fallback.whitelist

  • exclude known versions affected by #20499,
  • exclude versions not recommended by the directory authorities, then check if a microdesc consensus is expired. (My script fixes the issue in #20501 where the local time was being compared to a UTC time.)
  • download a microdesc consensus, and check if the expiry time is after the current time.

I think we should merge this patch to 0.2.8 and later:

  • ticket20509_028 to 0.2.8
  • ticket20509_029 to 0.2.9 and then merge 0.2.9 to master

comment:30 follow-up: Changed 7 months ago by teor

  • Status changed from merge_ready to needs_revision

This patch should take away both the Guard and V2Dir flag. It needs an update.

comment:31 Changed 7 months ago by teor

Correction: I found one relay on 0.2.9.5-alpha that still has this bug, see #20909.

comment:32 Changed 7 months ago by nickm

I'm a little uncomfortable with the 0.3.0.0-alpha-dev part, though I guess it's not going to hurt anything if we put 0.3.0.1-alpha out before too long.

I'm fairly well worried about #20909 though.

comment:33 Changed 6 months ago by nickm

  • Keywords review-group-14 added; review-group-13 removed

And that's all for review-group-13.

comment:34 Changed 6 months ago by teor

  • Status changed from needs_revision to merge_ready

Now we have deferred #20909, and put out 0.3.0.1-alpha, I think we can merge this to master, and backport to 0.2.8 and 0.2.9 so that the authorities pick it up sooner rather than later.

The branches are on my github:

  • ticket20509_028 to 0.2.8
  • ticket20509_029 to 0.2.9 and then merge 0.2.9 to master

comment:35 in reply to: ↑ 30 Changed 6 months ago by teor

  • Status changed from merge_ready to needs_revision

Replying to teor:

This patch should take away both the Guard and V2Dir flag. It needs an update.

Oops, it still needs this change.

comment:36 Changed 6 months ago by rubiate

  • Cc cb@… added

Does this work for the V2Dir flag?

Diff against last patch:

diff --git a/src/or/dirserv.c b/src/or/dirserv.c
index 400f04918..5e244e1c4 100644
--- a/src/or/dirserv.c
+++ b/src/or/dirserv.c
@@ -2176,6 +2176,7 @@ set_routerstatus_from_routerinfo(routerstatus_t *rs,
 {
   const or_options_t *options = get_options();
   uint32_t routerbw_kb = dirserv_get_credible_bandwidth_kb(ri);
+  int broken_guard = is_broken_guard_version(ri->platform);

   memset(rs, 0, sizeof(routerstatus_t));

@@ -2192,8 +2193,7 @@ set_routerstatus_from_routerinfo(routerstatus_t *rs,

   rs->is_valid = node->is_valid;

-  if (node->is_fast && node->is_stable &&
-      !is_broken_guard_version(ri->platform) &&
+  if (node->is_fast && node->is_stable && !broken_guard &&
       ((options->AuthDirGuardBWGuarantee &&
         routerbw_kb >= options->AuthDirGuardBWGuarantee/1000) ||
        routerbw_kb >= MIN(guard_bandwidth_including_exits_kb,
@@ -2221,7 +2221,7 @@ set_routerstatus_from_routerinfo(routerstatus_t *rs,
   strlcpy(rs->nickname, ri->nickname, sizeof(rs->nickname));
   rs->or_port = ri->or_port;
   rs->dir_port = ri->dir_port;
-  rs->is_v2_dir = ri->supports_tunnelled_dir_requests;
+  rs->is_v2_dir = ri->supports_tunnelled_dir_requests && !broken_guard;
   if (options->AuthDirHasIPv6Connectivity == 1 &&
       !tor_addr_is_null(&ri->ipv6_addr) &&
       node->last_reachable6 >= now - REACHABLE_TIMEOUT) {

Commits are in branch ticket20509 at https://viennan.net/git/tor.git

Web view of commits:

  1. https://viennan.net/gitweb/?p=tor.git;a=commitdiff;h=270bfbbba7048189739cc37e6a9fa87e470b0c3c
  2. https://viennan.net/gitweb/?p=tor.git;a=commitdiff;h=1b49cd2408c84e450d2699781eab1e2cdee8cad2
Last edited 5 months ago by rubiate (previous) (diff)

comment:37 Changed 5 months ago by nickm

  • Status changed from needs_revision to needs_review

(I think this should be needs_review?)

comment:38 Changed 5 months ago by nickm

  • Keywords review-group-15 added; review-group-14 removed

comment:39 Changed 4 months ago by teor

  • Status changed from needs_review to needs_revision

This issue is affected by #21449, so we actually need to exclude versions:

0.3.0.0-alpha-dev <= version < 0.3.0.1-alpha-dev 

which is equivalent to checking for this version regardless of git tag:

0.3.0.0-alpha-dev <= version <= 0.3.0.1-alpha-dev (git-ffffffffffffffffffffffffffffffffffffffff)

The 0.2.9 check is correct - it includes all of 0.2.9.4-alpha-dev regardless of git tags.

comment:40 Changed 4 months ago by teor

The best way of doing this is to use tor_version_same_series().

comment:41 Changed 4 months ago by rubiate

Didn't 15 conclude that the git tag isn't a problem?

comment:42 Changed 5 weeks ago by arma

Ok, we fucked this one up by ignoring it and then waiting for the patch to be perfect.

In the mean time, time has moved on, and now there remain only 7 relays running the broken versions:

r esbek1 3hNPyOXMTsil3maTTnCsnXAmcZc rECneYjmZntviDcuO8km13PIUt0 2017-05-20 06:2
6:49 195.191.233.221 443 80
s Fast HSDir Running Stable V2Dir Valid
v Tor 0.2.9.2-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=755
p reject 1-65535

r Biverse twYpwpoDKXnlrQDMUdIzCEWq/Wo o6IX2QTzXcxCV+SICEBi23Z96IY 2017-05-20 07:
26:41 178.63.97.34 9001 80
s Fast Guard HSDir Running Stable V2Dir Valid
v Tor 0.2.9.4-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=20000
p accept 43,53,79,88,443,465,563,587,706,873,993,995,1194,1533,2947,3386,3690,4321,5031,5222-5223,8008,8333,8443,9418,9420-9422,11371

r larry cQ8N7QnRNNVWSelq0N8XpLJ95Vw qdce5an15mLR8LsaYPONk1MTz14 2017-05-20 13:40
:18 216.12.198.82 443 8080
s Fast HSDir Running Stable V2Dir Valid
v Tor 0.2.9.5-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=252
p reject 1-65535

r curly tT9d5f1VyDa+1r3b/B3JBrRUOCE vFp7HtS121Vz/qp4eBc9gveD3WY 2017-05-20 16:30
:41 216.12.198.84 443 8080
s Fast HSDir Running Stable V2Dir Valid
v Tor 0.2.9.5-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=296
p reject 1-65535

r RToRpi xpH4Q7HofSS/A0HsfqA72wF66OY lIrkqhXI68YrcTdnooHmShGh3WU 2017-05-20 15:2
7:41 213.152.161.40 11752 58488
s Fast HSDir Running Stable V2Dir Valid
v Tor 0.2.9.5-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=512
p reject 1-65535

r moe +ekmb58Zf9XTuaqeByvxk6YYg24 SqRFx6+xMavuDLcbFJfftaSTQAc 2017-05-20 17:56:3
3 216.12.198.83 443 8080
s Fast HSDir Running Stable V2Dir Valid
v Tor 0.2.9.5-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=146
p reject 1-65535

r StoilePlayground /yGGQjtqjqwHl4K4G0y3QWUNxzI VMsmNTEvZM4dpiNZP3DN4UznLhA 2017-
05-20 11:26:58 193.7.177.223 9001 9030
s Running Stable V2Dir Valid
v Tor 0.2.9.5-alpha
pr Cons=1-2 Desc=1-2 DirCache=1 HSDir=1 HSIntro=3 HSRend=1-2 Link=1-4 LinkAuth=1 Microdesc=1-2 Relay=1-2
w Bandwidth=20
p reject 1-65535

I've reached out to 6 of the 7 of them, to get them to upgrade. Alas, the only one with no ContactInfo is also the only the one with the Guard flag. But I think our next step here is either to change the directory authorities to reject relays running the buggy versions, or to close this ticket and pretend it doesn't matter anymore.

comment:43 Changed 5 weeks ago by arma

Oh hey, 0.2.9.5-alpha is ok, right? (Modulo #20909)

So that leaves only 2 remaining affected relays, one of which is tiny and I've contacted, and the other of which is big and has the Guard flag and has no contactinfo.

comment:44 Changed 5 weeks ago by arma

  • Milestone changed from Tor: 0.2.9.x-final to Tor: 0.3.1.x-final
  • Status changed from needs_revision to needs_review
  • Summary changed from Directory authorities should take away Guard flag from relays with #20499 bug to Directory authorities should reject relays with #20499 bug

My bug20509 branch does the right thing.

I'm running it on moria1 now, and I tested it by running a relay on 0.2.9.4-alpha and confirming that it gets rejected with the right complaint.

I moved the ticket to milestone 0.3.1 so it doesn't get lost, but we should consider backports or there won't be much point.

comment:45 Changed 5 weeks ago by nickm

  • Resolution set to implemented
  • Status changed from needs_review to closed

cherry-picked to 0.2.9 and merged forward.

Note: See TracTickets for help on using tickets.