Opened 4 years ago

Closed 4 years ago

#14846 closed enhancement (implemented)

Controller: retrieve an HS descriptor of a service run by a user

Reported by: dgoulet Owned by: donncha
Priority: Medium Milestone: Tor: 0.2.8.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Keywords: tor-hs
Cc: asn Actual Points:
Parent ID: #3521 Points: small
Reviewer: Sponsor: SponsorR

Description (last modified by arma)

Second part of #3521, this is the ability for the controller to retrieve an HS descriptor of a service run by a user (a local hidden service basically).

To make things perfectly clear, this is NOT a command to fetch a descriptor but a way to retrieve the content of an hidden service descriptor running on the same Tor as the control port.

Child Tickets

TicketTypeStatusOwnerSummary
#16291enhancementclosedEmit control event when a HS descriptor is generated

Change History (34)

comment:1 Changed 4 years ago by arma

Description: modified (diff)

comment:2 Changed 4 years ago by arma

Sounds like it could be useful. I think it might be tricky to implement though, since Tor doesn't keep a copy of the descriptor it uploaded, does it?

(rend_encode_v2_descriptors() makes some descriptors and then uploads them and then discards them.)

That said, keeping a copy of the last descriptor you encoded doesn't sound too hard, and would make this ticket pretty easy to implement.

Makes me wonder if we also want an event for publishing hsdescs (else, how can you know that the answer to this getinfo has changed).

comment:3 in reply to:  2 ; Changed 4 years ago by dgoulet

Replying to arma:

Sounds like it could be useful. I think it might be tricky to implement though, since Tor doesn't keep a copy of the descriptor it uploaded, does it?

(rend_encode_v2_descriptors() makes some descriptors and then uploads them and then discards them.)

Yes, upload_service_descriptor() discard them at the end.

That said, keeping a copy of the last descriptor you encoded doesn't sound too hard, and would make this ticket pretty easy to implement.

Either that or we recreate (without uploading) and dump it?

Makes me wonder if we also want an event for publishing hsdescs (else, how can you know that the answer to this getinfo has changed).

Maybe adding an Action UPLOADED to the HS_DESC event?

comment:4 in reply to:  3 ; Changed 4 years ago by arma

Replying to dgoulet:

That said, keeping a copy of the last descriptor you encoded doesn't sound too hard, and would make this ticket pretty easy to implement.

Either that or we recreate (without uploading) and dump it?

Careful! Recreating it will sometimes make a different descriptor than the one you published. For example,

  time_period = get_time_period(now, period, service_id);

so you could even end up with a different descriptor_id than the one you published.

Makes me wonder if we also want an event for publishing hsdescs (else, how can you know that the answer to this getinfo has changed).

Maybe adding an Action UPLOADED to the HS_DESC event?

Interesting idea. So far all the HS_DESC events are about clients visiting a hidden service, right? If so, would we want to jumble in the events from operating a hidden service too, or does that merit a new event type?

comment:5 in reply to:  4 Changed 4 years ago by dgoulet

Replying to arma:

Replying to dgoulet:

That said, keeping a copy of the last descriptor you encoded doesn't sound too hard, and would make this ticket pretty easy to implement.

Either that or we recreate (without uploading) and dump it?

Careful! Recreating it will sometimes make a different descriptor than the one you published. For example,

  time_period = get_time_period(now, period, service_id);

so you could even end up with a different descriptor_id than the one you published.

Yes it can but that's kind of the point of getting the latest possible descriptor when you query Tor with a configured HS though, as you mention it, might not be the one that is currently published (but soon to be). I'm not sure here which one is the more useful, the latest or the one published (or both)?...

Makes me wonder if we also want an event for publishing hsdescs (else, how can you know that the answer to this getinfo has changed).

Maybe adding an Action UPLOADED to the HS_DESC event?

Interesting idea. So far all the HS_DESC events are about clients visiting a hidden service, right? If so, would we want to jumble in the events from operating a hidden service too, or does that merit a new event type?

It's only client yes but since tor is a "all in one daemon", an HS service can also be a client for a user. I don't see how a client at some point can upload an HS desc thus this action would be quite specific to the HS service subsystem.

I guess it comes down to how we want to "perceive" the HS_DESC event as a possible event regardless of the subsystem at hand or should it be specific to a subsystem, here the client HS part. (The spec does not really define the "owner" of an HS_DESC event, sounds we could actually clarify that.)

comment:6 Changed 4 years ago by nickm

Owner: dgoulet deleted
Status: newassigned

comment:7 Changed 4 years ago by nickm

Keywords: 027-triaged-1-in added

Marking more tickets as triaged-in for 0.2.7

comment:8 Changed 4 years ago by isabela

Keywords: SponsorS added
Points: small
Priority: normaltrivial
Version: Tor: 0.2.7

comment:9 Changed 4 years ago by asn

Cc: asn added

comment:10 Changed 4 years ago by donncha

Owner: set to donncha

I've created a child ticket (#16291) to discuss the descriptor created event.

There seems to be two reasonable places where this control command could be placed. In #14845 the command to request a descriptor from the client-side cache was specified as GETINFO hs/client/desc/id/<ADDR>. Similarly the service-side cache command could be placed under GETINFO hs/desc/address/<ADDR>.

Alternatively the command could be added as an option under HSFETCH such as HSFETCH LOCAL=1 <addr>. HSFETCH is currently focused on making network requests to the HSDir system. I think that a GETINFO type command is more consistent when retrieving information from the local Tor descriptor cache.

It's important to keep the descriptors for the service-side cache and the client-side cache separate. Some refactoring will be needed to avoid lots of duplication in rend_cache_store_v2_desc_as_dir() and a rend_cache_store_v2_desc_as_service().

upload_service_descriptor() currently both generates and uploads the descriptor. A descriptor is only generated if PublishHidServDescriptors == 1. This function should be split to allow descriptor generation and descriptor upload to occur separately. It should be possible to generate a descriptor and store it in the service descriptor cache when PublishHidServDescriptors == 0.

comment:11 Changed 4 years ago by donncha

I've pushed a branch with an initial implementing of this control port command on my Github fork at https://github.com/DonnchaC/tor/pull/1 and https://github.com/DonnchaC/torspec/pull/1.

rendcache.c is starting to look a little messy with lots of duplication between rend_cache_lookup_entry(), rend_cache_lookup_v2_desc_as_service() and rend_cache_lookup_v2_desc_as_dir(). I'm open to suggestion on the best way to tidy it up.

With this branch the service caches it's generated descriptor in rend_encode_v2_descriptors() when the descriptor is normally generated. The descriptor will be generated and stored even if RendPostPeriod == 0.

comment:12 Changed 4 years ago by asn

Here is an initial review:

  • This could be made easier to review. A few later commits (like 997d596) change code that was introduced by previous commits. I'd suggest rebasing the branch so that related changes happen in one commit if it can be done.


Also, commit a454cd93b is a bit tricky to review. That's because it moves lots of code around, and also edits code in the same commit. Let me suggest, if it's possible, you make one commit that moves or re-indents code first, and then a second commit that alters code. Then it's easier to review the changes.

  • I agree that rend_cache_lookup_v2_desc_as_service() looks very similar to rend_cache_lookup_entry() (although you said rend_cache_lookup_v2_desc_as_dir() in your comment). Maybe you can merge your thing in rend_cache_lookup_entry() and also add an extra argument to distinguish between the client and the service-side cache? Would that make it look nice you think?
  • You do
    for (i = 0; i < smartlist_len(descs); i++)
            rend_encoded_v2_service_descriptor_free(smartlist_get(descs, i));
    
    we usually use SMARTLIST_FOREACH in such cases. Check it out.
  • You say:
        } else {
          log_info(LD_REND, "Successfully stored created v2 rend descriptors!");
        }
    
    is this true even after 997d596?
  • Please document the definition of rend_cache_service.


comment:13 Changed 4 years ago by asn

Status: assignedneeds_revision

comment:14 Changed 4 years ago by dgoulet

Keywords: TorCoreTeam201508 added

comment:15 Changed 4 years ago by donncha

Status: needs_revisionneeds_review

Thanks for the initial review:

  • Apologies those commits where a bit all over the place. I've no made git rebase my friend and will try to push tidier sets of commits in future :). I've split the identation and code changes from a454cd93b into separate commits.
  • I've added an argument to rend_cache_lookup_entry() so that it can also be used to query the service-side descriptor cache.
  • The code:
    for (i = 0; i < smartlist_len(descs); i++)
            rend_encoded_v2_service_descriptor_free(smartlist_get(descs, i));
    
    was already in upload_service_descriptor(). I didn't want to change that while adding more code, should I add a commit to change that to the SMARTLIST_FOREACH style?
  • Yes, that stored log message is still output after 997d596.
  • I've add a short description of rend_service_cache where is is defined in the rendcache.c.

I've pushed a new rebased branch which contains the suggestions you made at https://github.com/DonnchaC/tor/pull/2/commits. FYI GitHub has a ?w=1 option which hides whitespace changes in the displayed diff:
https://github.com/DonnchaC/tor/pull/2/files?w=1

comment:16 Changed 4 years ago by dgoulet

Status: needs_reviewneeds_revision

Here is a review per commit based on branch feature14846_2:

59b6d85a6f6a4abdad22d9efa6e5eccff6c1dbfe

  • In rendcache.c: rend_cache_service should be cleaned up in rend_cache_free_all().
  • In rendcache.c, fct rend_cache_store_v2_desc_as_service(), this statement should be move just before the if (!e) { so we avoid calling tor_free() twice on intro_contents if timestamp is newer. Good thing tor_free sets the pointer to NULL! but if it was not the case, we have a double free issue.
+  /* We don't care about the introduction points. */
+  tor_free(intro_content);
  • For the int service extra argument of rend_cache_lookup_entry, I propose an enum value here representing the "type of the cache" and the reason is that this argument now makes the function choose which cache to lookup in, so adding that enum would tell us which cache is being looked at when reading the code. Not only that, but we could see in the future having only one function to do lookup in rend caches which shares common code and could call more specific functions for different type of cache. Something like:
typedef enum {
  REND_CACHE_TYPE_CLIENT  = 1,
  REND_CACHE_TYPE_SERVICE = 2,
} rend_cache_type_t

and only one would be allowed (no | shanigans). At some point REND_CACHE_TYPE_DIR would be nice :).

fe43c0fb259cb331c1d91cc27968c2a230b44ca0

  • lgtm;

adca2261c4be56e769c35a5042e34e1e94023462

  • git show -b has nothing so lgtm :)

96969252a54f09555998c7e05ed43abe84324f87
347f0b116c2a02350519fbea4047b96424f4b1e0
56195224314e7637d5617e550660f47593ef53e0

  • lgtm;

comment:17 Changed 4 years ago by donncha

Status: needs_revisionneeds_review

Thank you for the helpful review. I've pushed two commits (hopefully) resolving those issues to https://github.com/DonnchaC/tor/pull/2.

Let me know if you would like me to rebase these fix's onto a new feature branch.

comment:18 Changed 4 years ago by nickm

Note: this branch is much easier to review if you consider it as a single patch, with whitespace changes suppressed. For me, that's:

git diff -b master...donnchac/feature14846_2

Overall notes:

  • It makes me a little bit nervous to have a single function to handle server-side and client-side lookups.
  • I think we should name the "service cache" to make it clear that it's our own descriptors we're asking about.
  • rend_cache_store_v2_desc_as_service() appears to be mostly duplicate code from rend_desc_cache_store_v2_desc_as_client(). Duplicate code usually indicates an opportunity for a function.
  • I wonder if this will conflict with any of dgoulet's patches for #16389.
  • Does anything ever expire entries from rend_cache_service, or do they stick around forever? (Currently this is handled for the other cache in rend_cache_clean.)

comment:19 Changed 4 years ago by dgoulet

Status: needs_reviewneeds_revision

comment:20 in reply to:  18 ; Changed 4 years ago by donncha

Replying to nickm:>

Thank you for the feedback

Overall notes:

  • It makes me a little bit nervous to have a single function to handle server-side and client-side lookups.

Should I revert to having separate rend_cache_lookup_v2_desc_as_service() and rend_cache_lookup_entry() functions? Alternatively I could add an assert for the cache type and require explicit specification of the cache rather than defaulting to the client cache. This should reduce the risk of information leaks between the client and service-side cache.

  • I think we should name the "service cache" to make it clear that it's our own descriptors we're asking about.

I'll rename it to rend_cache_local_service.

  • rend_cache_store_v2_desc_as_service() appears to be mostly duplicate code from rend_desc_cache_store_v2_desc_as_client(). Duplicate code usually indicates an opportunity for a function.

I'm don't think sharing code between rend_cache_store_v2_desc_as_service() and rend_desc_cache_store_v2_desc_as_client() would result in cleaner code. The client function needs to do validation based on the client's rend query. It also parses the descriptor's introduction points unlike rend_cache_store_v2_desc_as_service().

  • I wonder if this will conflict with any of dgoulet's patches for #16389.

I'll rebase on #16389 which is now merged in master.

  • Does anything ever expire entries from rend_cache_service, or do they stick around forever? (Currently this is handled for the other cache in rend_cache_clean.)

I'll add a rend_cache_type argument to rend_cache_clean() and then clean rend_cache_service when the client cache is cleaned.

comment:21 in reply to:  20 ; Changed 4 years ago by nickm

Replying to donncha:

Replying to nickm:>

Thank you for the feedback

Overall notes:

  • It makes me a little bit nervous to have a single function to handle server-side and client-side lookups.

Should I revert to having separate rend_cache_lookup_v2_desc_as_service() and rend_cache_lookup_entry() functions? Alternatively I could add an assert for the cache type and require explicit specification of the cache rather than defaulting to the client cache. This should reduce the risk of information leaks between the client and service-side cache.

That might be good. It's okay if they share code, but what I'm really concerned about is accidental API misuse.

  • I think we should name the "service cache" to make it clear that it's our own descriptors we're asking about.

I'll rename it to rend_cache_local_service.

Sounds fine!

  • rend_cache_store_v2_desc_as_service() appears to be mostly duplicate code from rend_desc_cache_store_v2_desc_as_client(). Duplicate code usually indicates an opportunity for a function.

I'm don't think sharing code between rend_cache_store_v2_desc_as_service() and rend_desc_cache_store_v2_desc_as_client() would result in cleaner code. The client function needs to do validation based on the client's rend query. It also parses the descriptor's introduction points unlike rend_cache_store_v2_desc_as_service().

Hmm. Possibly? Let's just open a new ticket once this is merged to return to the question.

  • I wonder if this will conflict with any of dgoulet's patches for #16389.

I'll rebase on #16389 which is now merged in master.

Thanks!

  • Does anything ever expire entries from rend_cache_service, or do they stick around forever? (Currently this is handled for the other cache in rend_cache_clean.)

I'll add a rend_cache_type argument to rend_cache_clean() and then clean rend_cache_service when the client cache is cleaned.

Sounds good!

comment:22 Changed 4 years ago by nickm

Keywords: PostFreeze027 added

comment:23 Changed 4 years ago by nickm

Keywords: TorCoreTeam201509 added; TorCoreTeam201508 removed

comment:24 Changed 4 years ago by nickm

Keywords: PostFreeze027 removed
Milestone: Tor: 0.2.7.x-finalTor: 0.2.8.x-final

comment:25 in reply to:  21 Changed 4 years ago by donncha

Status: needs_revisionneeds_review

Sorry for the delay on this, I've pushed a new branch feature14846_3 which is rebased on master (including #16389).

I've reverted to having separate functions to query the client-side and service-side descriptor caches. I also added an argument to rend_cache_clean() to clean rend_cache_local_service when the client rend cache is cleaned.

Please let me know if there is any further issues and I'll fix them up ASAP

comment:26 Changed 4 years ago by special

I think the GETINFO should be consistent with the existing hs/client/desc/id/<ADDR> option. How about hs/service/desc/id/<ADDR>?

Trivial typo in the changelog (for -> from).

Which replica of the descriptor you get is an implementation detail of the code, and the controller can't get both. Does that matter? I'm guessing no.

From a read through of the code, this looks good to me.

comment:27 in reply to:  26 ; Changed 4 years ago by dgoulet

Status: needs_reviewneeds_revision

Replying to special:

I think the GETINFO should be consistent with the existing hs/client/desc/id/<ADDR> option. How about hs/service/desc/id/<ADDR>?

Agree with special here.

Trivial typo in the changelog (for -> from).

Which replica of the descriptor you get is an implementation detail of the code, and the controller can't get both. Does that matter? I'm guessing no.

Code suggest that you'll get both actually. See control_event_hs_descriptor_created(), the replica is outputed in the control message.

From a read through of the code, this looks good to me.

Any chance we can modify control-spec.txt also as we merge this feature? I see a branch for this ticket in your github but pointing it here explicitely with the latest branch ready for merge would be good once we make the change above.

In rend_cache_clean(), I see this being added with static for which I don't think it's desired here, you just want a temporary "cache pointer" for the lifetime of the function.

+  static strmap_t *cache = NULL;

The side effect here would be that it's initialize to NULL *once* and then the if/else after sets it to whatever cache we need. The next call to that function, cache will point to the latest assignement and *NOT* be initialized to NULL thus always passing the assert() test after even if the cache type is wrong.

The rest lgtm!

comment:28 in reply to:  27 Changed 4 years ago by donncha

Replying to dgoulet:

Replying to special:

I think the GETINFO should be consistent with the existing hs/client/desc/id/<ADDR> option. How about hs/service/desc/id/<ADDR>?

Agree with special here.

Changed.

Trivial typo in the changelog (for -> from).

Fixed.

Which replica of the descriptor you get is an implementation detail of the code, and the controller can't get both. Does that matter? I'm guessing no.

Code suggest that you'll get both actually. See control_event_hs_descriptor_created(), the replica is outputed in the control message.

Your correct special. Tor will attempt to generate and store the two replica descriptors. The local service descriptor cache is indexed by service ID so the second replica descriptor will replace the first descriptor. The controller will only be able to retrieve the second replica which was added to the cache last.

The HS_DESC CREATED event alerts the controller that each descriptor (for each replica) has been created, the code doesn't currently allow both replicas to be retrieved.

From a read through of the code, this looks good to me.

Any chance we can modify control-spec.txt also as we merge this feature? I see a branch for this ticket in your github but pointing it here explicitely with the latest branch ready for merge would be good once we make the change above.

In rend_cache_clean(), I see this being added with static for which I don't think it's desired here, you just want a temporary "cache pointer" for the lifetime of the function.

+  static strmap_t *cache = NULL;

The side effect here would be that it's initialize to NULL *once* and then the if/else after sets it to whatever cache we need. The next call to that function, cache will point to the latest assignement and *NOT* be initialized to NULL thus always passing the assert() test after even if the cache type is wrong.

Thank you for the explanation, I have fixed this now.

I've pushed updated branches for the tor and torspec patches. Thank you special and dgoulet for your feedback on this patch.

comment:29 Changed 4 years ago by dgoulet

Status: needs_revisionneeds_review

tor patch lgtm!

torspec has a bike shedding issue, seems like one line is not wrapped correclty (line: 852). The rest lgtm!

comment:30 Changed 4 years ago by donncha

Thanks for the quick review, I've fixed the line wrapping in torspec pushed it to torspec/feature14846_2.

comment:31 Changed 4 years ago by nickm

Keywords: 028-triaged added

comment:32 Changed 4 years ago by nickm

Keywords: SponsorR removed
Sponsor: SponsorR

Bulk-replace SponsorR keyword with SponsorR sponsor field in Tor component.

comment:33 Changed 4 years ago by dgoulet

Keywords: controller 027-triaged-1-in SponsorS TorCoreTeam201509 028-triaged removed
Priority: trivialnormal

comment:34 Changed 4 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

Looks okay. Merged to master!

Note: See TracTickets for help on using tickets.