Relays should store HS descriptor without the complicated "am I the right one" logic

changed milestone to %Tor: 0.2.8.x-final

added component::core tor/tor milestone::Tor: 0.2.8.x-final must-fix-before-028-rc points::small priority::medium resolution::implemented severity::normal sponsor::R-must status::closed tor-hs type::enhancement labels

(I am hijacking David's ticket, with permission.)

Trac:
Description: This maybe sounds crazy but the idea here is that service and HSDir can have different view of the network so it's possible that a service thinks some relay is an HSDir but not the relay itself resulting in a failure to upload the descriptor (good thing we have 6 hsdirs!). Also, it would be useful in our figth against malicious HSDir enumerating .onion, we could find them before they actually become an HSDir.

As long as the relay sees that it's responsible for the descriptor ID, it should store it with or without the HSDir flag. Being responsible for the descID is important here else we can end up lowering the bar for anyone to upload arbitrary data enclosed in a descriptor. Altough this is possible right now, let's not make it possible for all relays at all time for any ID.

As for DoS consideration that is someone uploading lots and lots of descriptors in the first 96 hours before becoming an HSDir, then oops the relay is out of memory for legitimate descriptors. First, we currently have this "problem" and second we do purge our cache if memory usage goes to high (part of our oom).

We should NOT cache it when supports_tunnelled_dir_requests is unset. It's a requirement to become an HSDir that if we don't have we shouldn't do it. (DirCache 0 or ClientOnly or DirPort set, ...)

Thoughts?

to

Right now relays try to guess if they have the HSDir flag, and try to guess if they are the right HSDir for the descriptor to be uploaded, and they refuse the descriptor if they think they weren't supposed to receive it.

This behavior is ripe for consistency bugs, since maybe clients correctly have a different view of the network than the hsdir does, so it refuses a descriptor when it isn't supposed to refuse it.

It also generally adds complexity to hsdir behavior, which doesn't need to be there.

Now, there is one argument for refusing hidserv descriptors when you don't think you'll be asked for them by clients -- it makes it a little bit harder to use relays to store arbitrary data blobs (since you need to arrange for the data blobs to have content such that they are supposed to go to that hsdir for that day). But I don't think this barrier is any real barrier to somebody trying to stuff a relay with descriptors. I think defending against that DoS or misuse attack is worth exploring, but I don't think the current behavior represents a good tradeoff.
Summary: Relay should store HS descriptor even when they don't have the HSDir flag to Relays should store HS descriptor without the complicated "am I the right one" logic

As a side effect of this patch, once we implement it, we will be able to probe relays for whether they are snooping on hidden service descriptors, before they get the HSDir flag.

To me this is a cleaner design -- the relays offer the functionality that they offer, independent of whether the directory authorities have decided they are suitable for clients to use yet.

My ticket18332 branch implements this proposed change.

I'm moving the ticket to 0.2.8 since it has a patch for review, and this simplification should help with prop224 too.

As a bonus, if you like the ticket18332 branch, take a look at the commit on the ticket18332b branch and see if you like that too.

Trac:
Type: defect to enhancement
Status: new to needs_review
Milestone: Tor: 0.2.9.x-final to Tor: 0.2.8.x-final

Ok, commit b1917a0614 clobbered this one, so I have made a new branch, "ticket18332-try2", and pushed it.

Trac:
Reviewer: N/A to N/A

This is also going to need a patch to torspec, around where rend-spec says:

                "A hidden service directory node parses every received 
                 descriptor and only stores it when it thinks that it is 
                 responsible for storing that descriptor based on its own 
                 routing table."

(noted by dgoulet)

It appears that e167910fce2b83d9de3a252cdf02cabdafced14b implements the change, and the other patches on your branch just do cleanups and deadcode removals. What would you say if I took e167910fce in 0.2.8, and the rest in 0.2.9?

Other notes: e167910fce2b83d9de3a252cdf02cabdafced14b rip out hid_serv_responsible_for_desc_id()

NM.1 Does this make bridges start storing hs descriptors?

53902963383e1babfccb8a4ffc7ed4e8accf2214 rip out hid_serv_acting_as_directory()

NM.2 I wonder if this should call public_server_mode() instead of calling nothing; I wouldn't want bridges or clients or hidden services to start storing descriptors.

The other commits look okay to me.

Replying to nickm:

This is also going to need a patch to torspec, around where rend-spec says: {{{ "A hidden service directory node parses every received descriptor and only stores it when it thinks that it is responsible for storing that descriptor based on its own routing table." }}} (noted by dgoulet)

Yes. Good catch. I believe the fix is to remove that paragraph.

Replying to nickm:

It appears that e167910fce2b83d9de3a252cdf02cabdafced14b implements the change, and the other patches on your branch just do cleanups and deadcode removals. What would you say if I took e167910fce in 0.2.8, and the rest in 0.2.9?

Fine by me. The main reason to do it all now is so both branches are simpler, in case that becomes useful later (e.g. we have to fix a bug). But that's always the reason to do all the changes now now now, and this isn't any more compelling a case than the others.

Other notes: e167910fce2b83d9de3a252cdf02cabdafced14b rip out hid_serv_responsible_for_desc_id()

NM.1 Does this make bridges start storing hs descriptors?

I believe it does not change the situation. Bridges used to be willing to store hs descriptors (if you give them one whose identifier is sufficiently near the bridge's identity key), and they are after this change too. Maybe somebody should open a ticket. :)

The commit almost changes things for clients, since it does indeed no longer do the if (!server_mode(get_options())) check inside router_get_my_routerinfo(). They would need to have an open dirport, and you would need to be doing the rendezvous2 post via begin_dir, but begin_dir requires an ORPort, so it's not a client.

53902963383e1babfccb8a4ffc7ed4e8accf2214 rip out hid_serv_acting_as_directory()

NM.2 I wonder if this should call public_server_mode() instead of calling nothing; I wouldn't want bridges or clients or hidden services to start storing descriptors.

I think this is a fine plan. I'd want to do the check in directory_handle_command_post(). And then do the check again in rend_cache_store_v2_desc_as_dir() if we like redundancy.

I think the above notes mean I should change my changes file. I've pushed another commit which does this change.

Replying to arma:

Yes. Good catch. I believe the fix is to remove that paragraph.

This is now done in branch 'ticket18332' in my torspec repo.

Replying to arma:

I think this is a fine plan. I'd want to do the check in directory_handle_command_post(). And then do the check again in rend_cache_store_v2_desc_as_dir() if we like redundancy.

My ticket18332-try2 branch now includes a commit to prevent bridges from processing rendezvous2 attempts.

Marking these as must-fix-before-028-rc.

Actually, some of them may not need to get 'fixed' before the rc, but I believe that they should either get fixed, or we should have a good explanation of why they don't need to get fixed.

Trac:
Keywords: N/A deleted, must-fix-before-028-rc added

The addition of this check in rend_cache_store_v2_desc_as_dir() broke some unit tests:

+  if (!public_server_mode(options)) {
+    log_warn(LD_BUG, "rend_cache_store_v2_desc_as_dir() called but we're "
+             "not a public relay. Refusing.");
+    return -1;
   }

My branch ticket18332_028_01 fixes that in a "questionable ugly way". We could remove that check actually since it's made in the directory_handle_command_post prior to ever calling the store function and it's the only code path that goes into it.

It also broke that test and I don't know why:

routerlist/pick_directory_server_impl: [forking] 
  FAIL src/test/test_routerlist.c:328: assert(tor_memeq(rs->identity_digest, router3_id, DIGEST_LEN))
  [pick_directory_server_impl FAILED]

Trac:
Status: needs_review to needs_revision

Trac:
Sponsor: SponsorR to SponsorR-must

Ok, I just uploaded a ticket18332-try3 branch on my side.

It simply removes the call to public_server_mode from inside rend_cache_store_v2_desc_as_dir, since as you say that call draws in all sorts of other stuff that the unit tests aren't prepared for.

There's a tension here between simplicity of code, and the belt-and-suspenders approach we might take to protect ourselves from future mistakes. I think David gives a good example of how we'd need to hack the unit tests to survive the belt-and-suspenders approach here. But even in his hack, I worry that he has one place where he maybe should have set options_mutable->ORPort back to 0 but didn't, and also I wonder if there are new things that we malloc but don't free at the end of the unit tests. So in the argument for simplicity, I think not making more of a mess of the unit tests is a good balance.

Trac:
Status: needs_revision to needs_review

lgtm!

(Tested in chutney successfully and make test passes.)

merged to master. Thanks!

Trac:
Status: needs_review to closed
Resolution: N/A to implemented

closed

moved to tpo/core/tor#18332 (closed)

Relays should store HS descriptor without the complicated "am I the right one" logic

Child items 0

Activity