Here are some initial comments based on reading the patch and the email, based on the assumption that the email is right, and that the patch does what the email says it should do.
Should next_sampled_idx persist across runs or be recalculated when we load guards? Right now it seems to start at 0 every time Tor starts. Maybe we should also re-calculate the sampled_idx values to be a dense array when we save/load the state, so that we aren't leaking more information than we intend to.
The patch will also need:
a short proposal. It can mostly be based on Florentin's email.
a patch to guard-spec.txt.
tests for the changed behavior.
a "changes" file (see doc/HACKING/)
editing so that "make check-spaces" passes.
Updates to the documentation and names of all the functions whose behavior has changed. (For example, sample_reachable_filtered_entry_guards` no longer does that the function's name implies or the documentation says.)
How much of that do the original authors want to do, and how much should we pick up?
Trac: Status: needs_review to needs_revision Priority: Medium to High
woups, yes currently it would start at 0 every time, which is bad. Recalculation based on the last sampled_idx from the state file + 1 should be good.
I would recommend the short proposal to include the required change over the confirmed list (i.e., this list needs sorting by sampled_idx instead of confirmed_idx). So this patch would need to handle that as well.
I am happy to help on writing a small proposal from my email, and I can pick up whatever is left from your bullet list starting from January (I am bit busy for these next two months :/).
Could the proposal includes a kind of "further work" suggestion? Describing the issue that Roger wrote in the email thread.
I've looked at entry_guard_parse_from_state(), and I can't work out where we create a dense set of indexes on load. (But I can see where we do it on save.)
Creating a dense set of indexes on load is important, because qsort() is not stable. Therefore, the first time a legacy state is loaded, the guards will be sorted in arbitrary order. And the order may change, every time the sort is performed.
I also wonder if we need to sort the guard list every time we select a guard. Instead, can we keep the list sorted when we add or delete guards?
Appending guards to the end of the list is just smartlist_add(). We have the smartlist_del_keeporder() to remove guards from the list.
Trac: Milestone: Tor: 0.4.3.x-final to Tor: 0.4.4.x-final Keywords: tor-spec prop271 deleted, tor-spec prop271 prop308 added Priority: High to Medium Status: closed to reopened Summary: Proposal 271 - straightforward improvements to Proposal 308 - choose guards in sampled order Resolution: implemented toN/A
I've looked at entry_guard_parse_from_state(), and I can't work out where we create a dense set of indexes on load. (But I can see where we do it on save.)
Creating a dense set of indexes on load is important, because qsort() is not stable. Therefore, the first time a legacy state is loaded, the guards will be sorted in arbitrary order. And the order may change, every time the sort is performed.
I also wonder if we need to sort the guard list every time we select a guard. Instead, can we keep the list sorted when we add or delete guards?
Appending guards to the end of the list is just smartlist_add(). We have the smartlist_del_keeporder() to remove guards from the list.
Yes, so, indeed I forgot to account for existing clients that would load states without any sampled_idx the first time they would use that patch. I need to workout a "fake" sampling order for those clients, or qsort() would not be stable as you mention (I guess taking the confirmed_idx ordering would be a good enough heuristic).
I am going to look closer to your recommendation on ordering. Thanks for your thoughtful review!
On loading, Tor sets the sampled_idx to the confirmed_idx. That should keep older clients to behave the same (and not reordering primary guards). On the next state save, the sampled_idx should be made dense.
Also, the patch applies now ordering when it seems necessary (a couple of redundant orderings have been removed).
Also, I was concerned by the fact that Tor assumes integrity of the state when loading it. If some application has write access to this file, making the client rotate guards until a chosen one is found shouldn't be too much of a hard task. Is that kind of threat relevant?
On loading, Tor sets the sampled_idx to the confirmed_idx. That should keep older clients to behave the same (and not reordering primary guards). On the next state save, the sampled_idx should be made dense.
Also, the patch applies now ordering when it seems necessary (a couple of redundant orderings have been removed).
Thanks!
Also, I was concerned by the fact that Tor assumes integrity of the state when loading it. If some application has write access to this file, making the client rotate guards until a chosen one is found shouldn't be too much of a hard task. Is that kind of threat relevant?
An attacker who can modify files on the local system could do many worse things. So those attacks are not really part of tor's threat model. To defend against those kinds of attacks, people should use an amnesiac system like TAILS.
File corruption is a risk, though. And tor could detect file corruption earlier with checksums. But that's a different ticket :-)