Last part of #3521 (moved), and this is a bigger one: the ability to ask tor to fetch an HS descriptor either from the default HSdir set/replicas or from specific ones given by the user.
This particular command will trigger external connections to fetch those descriptors.
Furthermore, we should think about whether we want the results cached by the client, which can have bad side effects depending on the intention. Considering #3523 (moved), if tor is able to do that at some point, there could be a use for this command to fetch and store a desc. from a specific HSdir.
I think maybe it should; and probably it should be nonblocking. (Current designs in control world are that commands which might not finish immediately, and which cause Tor to hit the network, need to not block for the thing to finish.)
This is not intended to block since HSDir fetch is asynchronous anyway. That probably means that "250-" replies should be replaced by "650-" then?
haven't looked at the branch yet, but, i think a controller command to initiate the fetch, and if you want to see the answer you should listen for hsdesc events, is a totally reasonable design.
That means no descriptor content dump, but maybe that's fine considering that you can ask for the fetch results to be cached and then use #14845 (moved) to get the content. Sounds much simpler!
Also, let's use all caps for 'SERVER=' and 'CACHE='.
If one or more Server are given, they are used instead.
What happens when you specify multiple? Does it pick among them randomly?
If Cache is specified, the value "yes" means that the result will be cached on the client.
Cached for how long? Permanently? Or do HS descriptors have a valid-until date? Can the cache be cleared?
The HS_DESC event should be used to get the results of the fetches.
How long does it take to retrieve a hidden service descriptor in practice? This is a lot clunkier for controllers. How about a 'BLOCKING=n' call for "I'm willing to wait up to n ms to get this descriptor"?
Ok, with all the comments above, here is a much simpler version.
See branch ticket14847_03.
What's the reasoning for the cache=yes or cache=no part? That is, why not just let rend_cache_store_v2_desc_as_client() look at the hsdesc you get back and decide whether to keep it based on whatever rules it uses now ("I don't have a newer one, etc")?
I think to do the async thing here we'll want to extend the HSDESC event to just tell you the descriptor right then. Otherwise there's a three step process ("initiate the launch", "notice the event", "getinfo the response"), and if you initiate a bunch of launches, and then get a bunch of events, you'll only be able to getinfo one of them, and you might not even know which one it was, etc.
(Whether you extend the HSDESC event always, or add a separate HSDESC_AND_DUMP event, or what, is a matter of taste that I will leave to you and Nick if you like this approach.)
Now that I think about it, there may also be some adventure here with all of the implicit "oh a failure just happened, that means I should launch this other action" logic in hsdesc fetches. Maybe this is a good time to clean up some of that logic, or maybe it will turn out to be easier than we think to work with it. Or I guess option three is that this will just be no fun. :)
{{{
If one or more Server are given, they are used instead.
}}}
What happens when you specify multiple? Does it pick among them randomly?
I kind of thought that it would cause Tor to initiate multiple fetches, one from each.
But, good question. I wonder if that feature is valuable enough for the complexity, compared to just making the controller send you one HSFETCH per fetch you want it to launch.
Or maybe David did indeed mean to choose just one.
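To make the two options concrete, here is a minimal sketch of what a controller would send in each case. The command syntax is an assumption taken from the draft discussed in this ticket (uppercase `SERVER=`, one per HSDir), and `build_hsfetch`, the address, and the fingerprints are hypothetical placeholders, not anything from the branch.

```python
def build_hsfetch(address, servers=()):
    """Build one HSFETCH command line (draft syntax assumed from this ticket)."""
    parts = ["HSFETCH", address]
    # One SERVER= argument per HSDir; with none, tor would use its default set.
    parts.extend("SERVER=" + fingerprint for fingerprint in servers)
    return " ".join(parts)


def build_hsfetch_per_server(address, servers):
    """Alternative: one HSFETCH per HSDir, so each launch is a separate command."""
    return [build_hsfetch(address, [fingerprint]) for fingerprint in servers]


# Placeholder values for illustration only.
print(build_hsfetch("exampleaddress", ["FP_A", "FP_B"]))
for line in build_hsfetch_per_server("exampleaddress", ["FP_A", "FP_B"]):
    print(line)
```

If multiple `SERVER=` arguments mean "fetch from each", both forms should launch the same fetches; the per-server form just keeps the parsing in tor simpler.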
{{{
The HS_DESC event should be used to get
the results of the fetches.
}}}
How long does it take to retrieve a hidden service descriptor in practice? This is a lot clunkier for controllers. How about a 'BLOCKING=n' call for "I'm willing to wait up to n ms to get this descriptor"?
That's exactly what we've been heading away from with the async approach. That said, David, it would indeed be nice to give the controller writers some guess about how long they might need to wait until they see their HSDESC received or failed. I think the answer is "it's like fetching a thing over the Tor network -- typically pretty fast, but sometimes 5 to even 60 seconds."
That's exactly what we've been heading away from with the async approach.
Present world for descriptor fetching is...
Controllers can make a simple, synchronous request to read cached descriptors.
Scripts can contact a dirauth's DirPort to actively fetch the fresh thing.
This is nice. It means scripts can piggyback on a cache or download what they need, and in either case it's simple and synchronous.
If we go with an asynchronous approach the first method people will want is a simple blocking 'I want a hidden service descriptor, give it to me' method. If it doesn't live in tor it'll be in stem and that's fine. We already do something similar with creating circuits - tor doesn't provide a blocking method so stem adds a listener and waits for the event indicating that it's done...
Just makes for a more interesting dance on my end.
Please be very, very careful though that a HSDESC is always emitted, 1:1, with a call of this method. If there's any use case where the controller doesn't get either a success or failure message it'll be left hanging indefinitely.
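For illustration, the listener-and-wait dance described above could look roughly like this. It is a sketch, not stem's actual code: `launch` is a hypothetical hook that sends the fetch command and arranges for `callback` to be called with each HS_DESC action, and the 60 second default timeout is only the rough upper bound guessed earlier in this thread.

```python
import threading


class PendingFetch:
    """Tracks one outstanding fetch until a terminal event arrives."""

    def __init__(self):
        self._done = threading.Event()
        self.action = None

    def on_event(self, action):
        # Only RECEIVED or FAILED end the wait; anything else is progress.
        if action in ("RECEIVED", "FAILED"):
            self.action = action
            self._done.set()

    def wait(self, timeout):
        # If tor never emits a terminal event, the timeout is the only way out.
        if not self._done.wait(timeout):
            return "TIMEOUT"
        return self.action


def blocking_fetch(launch, address, timeout=60.0):
    """Launch a fetch through the hypothetical hook and block for the result."""
    pending = PendingFetch()
    launch(address, pending.on_event)
    return pending.wait(timeout)
```

The timeout branch is exactly the case the 1:1 guarantee is meant to rule out: without it, a missing success or failure event leaves the caller blocked.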
Ok, with all the comments above, here is a much simpler version.
See branch ticket14847_03.
What's the reasoning for the cache=yes or cache=no part? That is, why not just let rend_cache_store_v2_desc_as_client() look at the hsdesc you get back and decide whether to keep it based on whatever rules it uses now ("I don't have a newer one, etc")?
The original idea was to give the user a choice to keep the fetched descriptor or not. However, if we go with a new HSDESC_* event to dump the content when it arrives, the "cache=" part could be removed, and tor would keep the latest descriptor by default.
I think to do the async thing here we'll want to extend the HSDESC event to just tell you the descriptor right then. Otherwise there's a three step process ("initiate the launch", "notice the event", "getinfo the response"), and if you initiate a bunch of launches, and then get a bunch of events, you'll only be able to getinfo one of them, and you might not even know which one it was, etc.
(Whether you extend the HSDESC event always, or add a separate HSDESC_AND_DUMP event, or what, is a matter of taste that I will leave to you and Nick if you like this approach.)
I'm not too familiar with the best practices here, but could we do something like this with the EXTENDED events feature?
Now that I think about it, there may also be some adventure here with all of the implicit "oh a failure just happened, that means I should launch this other action" logic in hsdesc fetches. Maybe this is a good time to clean up some of that logic, or maybe it will turn out to be easier than we think to work with it. Or I guess option three is that this will just be no fun. :)
I know... this is why I want this command accepted asap so I can start working on it. The HS fetch code is very "monolithic", in that it's one big block that does a lot of different things. It would need to be much more modular so we can cherry-pick the actions we need for this command instead of going through the normal descriptor fetch process as it exists right now.
{{{
If one or more Server are given, they are used instead.
}}}
What happens when you specify multiple? Does it pick among them randomly?
I kind of thought that it would cause Tor to initiate multiple fetches, one from each.
But, good question. I wonder if that feature is valuable enough for the complexity, compared to just making the controller send you one HSFETCH per fetch you want it to launch.
Or maybe David did indeed mean to choose just one.
By "They are used instead", I meant that if more than one Server is specified, they are all used and a fetch is triggered on each of them.
{{{
The HS_DESC event should be used to get
the results of the fetches.
}}}
How long does it take to retrieve a hidden service descriptor in practice? This is a lot clunkier for controllers. How about a 'BLOCKING=n' call for "I'm willing to wait up to n ms to get this descriptor"?
That's exactly what we've been heading away from with the async approach. That said, David, it would indeed be nice to give the controller writers some guess about how long they might need to wait until they see their HSDESC received or failed. I think the answer is "it's like fetching a thing over the Tor network -- typically pretty fast, but sometimes 5 to even 60 seconds."
Yup, "few seconds" up to a dir request timeout of ? (I don't know the value here). Should be added to the spec!
Ok I took a stab at it so we can go forward. Pretty sure this is not the "silver bullet" we are looking for but I think it's a good start considering the previous discussion.
I've basically added a new event called HS_DESC_CONTENT and removed the cache= part of the HSFETCH command. Also, I fixed the issues raised by atagar in comment:10.
This new version adds the DescID, Replica and TimePeriod options to the HSFETCH command. After a discussion on IRC with arma, it turns out it would be very useful to have the ability to control these.