Google Summer of Code finished over a month ago, and during this time I've been tidying up my code a bit and reading it for the merge. You will find it on github:
This ticket is for the sole purpose of following the merge process and its progress. But as always I'm on IRC and mail if you want to contact me directly.
I just rebased against master this morning. Nick and Sebastian have been reviewing my code over the summer, but of course more sets of eyes are needed.
The test coverage for the diff generation and application is fine (see test_consdiff.c), but there aren't any tests for the stuff I wrote to wire it into serving and fetching consensus diffs. Not really sure how to go about that, can't really promise I'd have the time to dive into it.
And regarding commit messages and changelog entries, I pretty much went with my instinct. Chances are they can be improved - the commit messages for future reference and the changelog entries for future release changelogs - so criticism is welcome.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items ...
Show closed items
Linked items 0
Link issues together to show that they're related.
Learn more.
First of all, the test don't compile. Multiple warnings (too many to paste here but here is an idea):
src/test/test_consdiff.c:25:3: warning: implicit declaration of function ‘test_eq_ptr’ [-Wimplicit-function-declaration] test_eq_ptr(sl, sls->list);src/test/test_consdiff.c:52:3: warning: implicit declaration of function ‘test_eq’ [-Wimplicit-function-declaration] test_eq(3, smartlist_slice_string_pos(sls, "a")); ^src/test/test_consdiff.c:81:3: warning: implicit declaration of function ‘test_memeq’ [-Wimplicit-function-declaration] test_memeq(e_lengths1, lengths1, sizeof(int) * 6);src/test/test_consdiff.c:161:3: warning: implicit declaration of function ‘test_assert’ [-Wimplicit-function-declaration] test_assert(!bitarray_is_set(changed1, 0));src/test/test_consdiff.c:920:12: error: ‘legacy_test_helper’ undeclared here (not in a function) { #name, legacy_test_helper, 0, &legacy_setup, test_consdiff_ ## name }
Ok that's a big diff chunk of code. There are a lot of commits that could be squashed together here to make things easier. For instance, I'm thinking of from the older one up to when consdiff.c is created and consdiff_client/_server.c are deleted.
So I see in proposal 140 that multiple consensus hash can be fetched by the client but this implementation only uses one single hash. Is it the spec that is out dated or there is no use to fetch multiple consensus now?
I'll skip all coding style issue here for now since you told me you might change it. Here some minor thing I've found while going over the code. I'ven't yet finish a full review
src/or/consdiff.h
Clarify what "list" contains.
"len" maybe should be renamed to something more meaninful because it does NOT represent the len of the list but the range that should be used in that list (of what I understand)?
src/or/consdiff.c
You should set "network-status-diff-version" in a static const variable at the begining of the file. You are using it twice and since it's part of a public ABI, good practice to have that declared somewhere instead of in the middle of the code. Same goes for "hash" in consdiff_get_digests().
Same goes for "X-Or-Diff-From-Consensus:" in directory.c
src/or/networkstatus.c
I would go for switch/case when you check consensus_flavor_t since it's an enum and the compiler warns you when you forget one. Just a cool extra useful protection. In networkstatus_get_latest_consensus_by_flavor() and networkstatus_get_latest_consensus_mmap_by_flavor()
src/or/dirserv.c
connection_dirserv_add_cons_diff_bytes_to_outbuf(): this one looks really like connection_dirserv_add_networkstatus_bytes_to_outbuf except the lookup if done somewhere else. I feel like this one could get refactor to either take an enum type of what we want or "lookup fct pointer" instead of duplication.
Ok I'll stop for now but I have to say that it's good work :). Will do another run at it after your comments/changes.
The compilation interface is not mvdan's fault: we deprecated the old unit test interface out from under him. Fortunately, we have an automated set of tools to fix it.
I've run them, and the result is branch consdiff-5-fixed in my public repository.
Thanks for pointing out that leak, tom. Will fix it.
And great to see all those comments, dgoulet! :)
Nick is right, I did notice I was using the deprecated method of testing but didn't look into it since it worked. I must have rebased against master some time later without noticing that the tests weren't compiling anymore. Will merge nickm's automated fix, thanks!
Regarding the rest, I'll try having a set of commits out by tomorrow.
One question though:
There are a lot of commits that could be squashed together here to make things easier. For instance, I'm thinking of from the older one up to when consdiff.c is created and consdiff_client/_server.c are deleted.
I did think about this when doing rebases during the summer - of course the initial number of commits was much larger - but I was often unsure how to rewrite the history properly.
For example, my reasoning behind leaving the old files there is that you can use git log/blame/etc with --follow to see where all the lines come from, without having a moderate chunk of code appearing from nowhere in consdiff.c. Or perhaps that is the best option?
I will try to rebase stuff like the test fixes into each of the commits that added tests, of course. That'll probably be the sixth branch.
Maybe we should make sure that the rm_rf function won't follow symlinks. Just in case.
consdiff.c
I need to review this later
consdiff.h
Does anything outside of consdiff.c use the structure here?
directory.c:
{get,set}_has_sent_bad_diff: I would be more comfortable if these assertions were just log_ratelim(LOG_WARN, LD_BUG) calls instead.
resolve_fetch_consensus:
It's stupid, but I'd prefer to see a check vs INT_MAX before casting body_len to int.
You need to pass an mmap to tor_munmap, not tor_free
connection_dir_client_reached_eof: We should probably log when we get a bad diff, and who sent it to us.
directory_handle_command_get:
tor_memdup() is better than malloc-plus-copy.
dir_fps might need a better name now.
dirserv.c
dirserv_lookup_cached_consensus_diff_by_digest should probably be renamed to ...by_hexdigest256.
The documentation for old_cached_consensus_by_digest should mention the types of the keys and the values.
new_cached_dir_comp looks to me like it's going to cause a memory leak or a double-free somehow.
dirserv_refresh_stored_consensuses should probably do something to check that the digest format is right, the date format is right, and that the consensuses are not corrupted.
Use char buf[constant]; tor_snprintf(buf, sizeof(buf)...), not char buf[constant]; tor_snprintf(buf, constant...)
Use the %ld printf format, not %li. (Almost nobody uses %i)
I think we have an alternative to atol that checks for errors better. Is it tor_parse_ulong?
In dirserv_remove_old_consensuses, you can get the current index into the smartlist iteration by looking at c_sl_idx. You don't need to have a separate i.
dirserv_update_consensus_diffs: I need to review this more later.
connection_dirserv_add_cons_diff_bytes_to_outbuf: does this decref the cached object correctly? Is this duplicate code?
networkstatus.c
networkstatus_get_old_consensuses_to_keep -- this can just use int, instead of int32.
current_{ns,md}_consensus_mmap needs to get passed to tor_munmap if it's set before we mmap a new one.
configuration
Instead of a boolean, maybe options->SaveConsensuses should be a number to save, or a maximum number of bytes to use when saving them?
Later:
I'd like to see fewer copies of strings done here. There's an easy way to do that, I think.
How expensive is dirserv_update_consensus_diffs? It seems kind of pricey. Maybe it needs to happen in the background?
I've started a new rebased branch, consdiff-6. What I've done so far:
Rebase against master and fix numerous conflicts
Replace usage of deprecated test_* functions
Replace usage of deprecated legacy_test_helper
Of course I still have all the fixes suggested above left to do. But those were the first steps to get my consdiff branch back up compiling and passing the tests :)
Okay, I just realised that nickm already did the test fixing, and in a slightly different way. So I cherry picked his fixes into consdiff-6 and then applied some check-spaces fixes on top.
I have now compiled a TODO list from all of your comments. I'll work through them, not sure how long it will take me but it will be a few days at least.
Quick question for dgoulet:
I think the usage of 'len' in smartlist_slice is correct. It means the length of the slice. It's similar to how slices in Go are defined (in fact I based my struct definition on that). What if I just improved the comments a bit?
Another question for dgoulet, regarding the declaration of "X-Or-Diff-From-Consensus" as a static const char* in directory.c - makes sense, but others like "X-Desc-Gen-Reason" don't do it. So I'm guessing that this should eventually be fixed for all of those static strings?
So in tor_rmdir, you say that we should check against S_IFLINK (that it is false) when checking for S_IFDIR. I thought that stat ran on the symlink, not on its target? In other words, we couldn't have both be true at the same time. If it's a symlink and it's not a folder, we delete it. In the opposite case, we enter the dir since we're not following a symlink.
smartlist_slice_t in consdiff.h instead of consdiff.c - this is mainly because back when I added it I wasn't sure whether I'd need it in other places. But also because I was thinking that the structure and its functions might be useful to other parts of Tor. After all, slicing a list is a very generic tool, and it sounds simple and cheap to me. Of course it would need polishing to be made public, so I think I'll just move it to consdiff.c for now.
Regarding what both you and dgoulet have said about the add_bytes function - GSoC was nearing its end and I was getting a bit confused by all the "serve bytes/files" code in dirserv.c, so what I did was take a function that kind of did what I wanted and tweaked it for my purposes. This has two obvious problems, the first that it duplicates code, but the second that I'm not exactly sure whether what I did is correct or not.
In other words, I would appreciate it if you could help me here. I remember nickm mentioning that this code has barely been touched in years, and that it's a tad confusing since it's gotten so complex. So I'd rather not do it alone :)
Oh, and I'll definitely be attending the winter dev meeting in Valencia in March. Will any of you guys be there?
{get,set}_has_sent_bad_diff: I would be more comfortable if these assertions were just log_ratelim(LOG_WARN, LD_BUG) calls instead.
I think I understand what you mean there, but I could find no mention of log_ratelim. I do see logs with LD_BUG in other places, but not sure what to do here.
So in tor_rmdir, you say that we should check against S_IFLINK (that it is false) when checking for S_IFDIR. I thought that stat ran on the symlink, not on its target? In other words, we couldn't have both be true at the same time. If it's a symlink and it's not a folder, we delete it. In the opposite case, we enter the dir since we're not following a symlink.
Maybe we should lstat it then? I worry that this could be used along with a symlink to rm_rf something we didn't intend to. Probably nothing to worry about given this use case, but a bit alarming.
smartlist_slice_t in consdiff.h instead of consdiff.c - this is mainly because back when I added it I wasn't sure whether I'd need it in other places. But also because I was thinking that the structure and its functions might be useful to other parts of Tor. After all, slicing a list is a very generic tool, and it sounds simple and cheap to me. Of course it would need polishing to be made public, so I think I'll just move it to consdiff.c for now.
Okay, unless you think that exposing it for testing could help test individual functions.
Regarding what both you and dgoulet have said about the add_bytes function - GSoC was nearing its end and I was getting a bit confused by all the "serve bytes/files" code in dirserv.c, so what I did was take a function that kind of did what I wanted and tweaked it for my purposes. This has two obvious problems, the first that it duplicates code, but the second that I'm not exactly sure whether what I did is correct or not.
In other words, I would appreciate it if you could help me here. I remember nickm mentioning that this code has barely been touched in years, and that it's a tad confusing since it's gotten so complex. So I'd rather not do it alone :)
Okay; I'll try to have a look.
Oh, and I'll definitely be attending the winter dev meeting in Valencia in March. Will any of you guys be there?
{get,set}_has_sent_bad_diff: I would be more comfortable if these assertions were just log_ratelim(LOG_WARN, LD_BUG) calls instead.
I think I understand what you mean there, but I could find no mention of log_ratelim. I do see logs with LD_BUG in other places, but not sure what to do here.
Whoops; the function is actually named "log_fn_ratelim".
current_{ns,md}_consensus_mmap needs to get passed to tor_munmap if it's set before we mmap a new one.
Am I not running tor_munmap already?
dir_fps might need a better name now.
This name was already introduced when I started my gsoc, so I'm not sure where it originates from.
I'll keep on working on the suggestions that I can get done on my own, about two thirds are already done. It would be great if I could have code reviews for consdiff.c as well as pointers for the dirserv.c add_cons_diff_bytes_to_outbuf as agreed :)
current_{ns,md}_consensus_mmap needs to get passed to tor_munmap if it's set before we mmap a new one.
Am I not running tor_munmap already?
I think that you're calling tor_munmap on shutdown, but you might not be calling it on reload.
dir_fps might need a better name now.
This name was already introduced when I started my gsoc, so I'm not sure where it originates from.
Right; this isn't something you need to fix yourself.
I'll keep on working on the suggestions that I can get done on my own, about two thirds are already done. It would be great if I could have code reviews for consdiff.c as well as pointers for the dirserv.c add_cons_diff_bytes_to_outbuf as agreed :)
After a quick chat with Nick, what is left to do is clear to me:
Clients will never keep more than one consensus, clarify that in prop140. Make servers give more than one base consensus in the headers when requesting a consensus diff.
Figure out the best use of the SaveConsensuses config option. Options (by my personal order of preference, will probably go with the first one):
Maximum storage to use on disk (oldest removed first)
Maximum number to keep on disk (oldest removed first)
Maximum amount of time to keep them for
Figure out the write_bytes_to_outbuf stuff - both for code duplication and for the proper functioning of my code.
Some other directory.c and dirserv.c improvements suggested by nickm
Apart from those, I only need more eyes on consdiff.c. The ed diff generation stuff is not simple when you first dive into it, so feel free to ping me with any questions you have. Do let me know if any chunks of code are missing comments too.
And one last thing, there are two items on my TODO list for which I need further info:
I'd like to see fewer copies of strings done here. There's an easy way to do that, I think.
nickm, what do you mean by 'here'?
How expensive is dirserv_update_consensus_diffs? It seems kind of pricey. Maybe it needs to happen in the background?
If we do that, we need some kind of mechanism to not serve any consensus diffs until they are all updated on disk and mmapped correctly. We need a thread-safe way to lock that out until it's complete. It could well get pricey, especially if the user sets a large SaveConsensuses value.
After a quick chat with Nick, what is left to do is clear to me:
[...]
I'd like to see fewer copies of strings done here. There's an easy way to do that, I think.
nickm, what do you mean by 'here'?
Let's forget about it for now and see if it matters in practice.
How expensive is dirserv_update_consensus_diffs? It seems kind of pricey. Maybe it needs to happen in the background?
If we do that, we need some kind of mechanism to not serve any consensus diffs until they are all updated on disk and mmapped correctly. We need a thread-safe way to lock that out until it's complete. It could well get pricey, especially if the user sets a large SaveConsensuses value.
Hm. This is not a prerequisite for merging this patch then, but it will tell us how urgent it is to do another patch on top of it to background this computation.
Also, I rebased on top of master again. The new branch is called "consdiff" instead of "consdiff-9" because I hope this will be the last rebase before the merge :)
The last commit adds four TODOs, which I think are the only things left to deal with.
// TODO: change bool with max size to take up in disk
I'll do this one.
Great.
// TODO: check that the stored consensus is intact
Not sure how to do this one yet.
Hm. Maybe we should store it along with a digest, or try to parse it before we compute the diff?
// TODO: duplicate code from add_networkstatus_bytes_to_outbuf, works but is
// probably buggy.
All of this code is confusing to me, so help from someone who initially wrote it would be great.
// TODO: might want to do this in the background and have a lock to not
// serve consensus diffs until they are updated
What do we want to do about this one after all?
I can do these last two.
The final one will depend on the runtime of the code. How long does it take to recompute all the diffs? If it's a few milliseconds, that's fine. If it's a few hundred milliseconds or worse, we probably need to stick it in another thread.
The last commit adds four TODOs, which I think are the only things left to deal with.
// TODO: change bool with max size to take up in disk
I'll do this one.
Great.
I'll do it on top of my current consdiff branch. Will you work on a branch of yours? Feel free to cherry pick my future commits.
// TODO: check that the stored consensus is intact
Not sure how to do this one yet.
Hm. Maybe we should store it along with a digest, or try to parse it before we compute the diff?
What if the digest is corrupt? :)
Trying to parse it sounds better. Although that just checks whether it is valid, not whether it is correct.
I can't think of a simple solution to all of this right now. Surely there must be other files that may be corrupt in the filesystem when Tor starts up. What does it do for the rest? Sounds to me like we would be better off with a generic mechanism to check whether the tor data/config dir hasn't been tinkered with.
// TODO: duplicate code from add_networkstatus_bytes_to_outbuf, works but is
// probably buggy.
All of this code is confusing to me, so help from someone who initially wrote it would be great.
// TODO: might want to do this in the background and have a lock to not
// serve consensus diffs until they are updated
What do we want to do about this one after all?
I can do these last two.
The final one will depend on the runtime of the code. How long does it take to recompute all the diffs? If it's a few milliseconds, that's fine. If it's a few hundred milliseconds or worse, we probably need to stick it in another thread.
Back when I ran tests on my algorithm code, I did compute times for single test cases. I never tested the speed of the whole diff generation process at startup.
I noticed the consensus diffs proposal had been sitting around unimplemented since 2008 and, after some cursory searches didn't turn up anything related, I decided to take a crack at it. I didn't find this ticket until I had mostly finished my implementation. Is the seven months since this was last meaningfully touched a long enough delay that I should submit my code for review, or would I be stepping on someones toes?
Hi, Yawning! I'm deferring these tickets assigned to you from 0.2.9 to 0.2.???, since you're going to be out for September. But if you wind up wanting to do any of them for 0.2.9 anyway, please feel free to move them back.
Bulk-adding "sponsor4" keyword to items that would appear to reduce low-bw clients' directory bandwidth usage. But we shouldn't build these without measurement/proposals: see #21205 (moved) and #21209 (moved).