Trac: Summary: Use mmap to hold our cached consensus, when we even need in RAM at all. to Use mmap to hold our cached consensus, when we even need it stored in RAM at all.
Turns out not to be redundant with #27247 (moved). Clients no longer need to keep a cached_dir_t, and that ticket stops them -- but when they load it, they still load it with read_file_to_string().
There's a slight difficulty in moving to mmap, however. When we mmap() a file, it is not a NUL-terminated string, but rather a non-terminated pile of bytes with an associated size_t.
There are three cases when a client needs a consensus text:
It's parsing it from disk
It's applying a consensus diff.
It's writing it to the controller
Case 3 is easy enough, since the client doesn't need to examine the bytes at all.
But for case 1 and case 2, our networkstatus parsing code and our consensus diff code both assume that the mapped object is NUL-terminated.
So before we can do this, we'll need to carefully revise a bunch of functions.
So before we can do this, we'll need to carefully revise a bunch of functions.
Carefully revised a bunch of functions in #27629 (moved).
Thanks! See my review in that ticket.
Also, there's no way this is making the 0.3.5 milestone, is there? Same with parent ticket.
We're focusing on sponsored work until the feature freeze on Friday. This ticket is sponsored work, so it might get in to 0.3.5.
Please be patient with us during the next few weeks. We appreciate your contributions, but we don't have enough people to review everything this week. We'll review your branches as soon as we get time.
If you'd like to speed up reviews on your code, you can:
run Travis or Appveyor CI on your branches, and link to the CI in your tickets
make sure there are unit tests for any code that you add or modify
This will read past the end of the mmap into invalid memory. It's missing some changes that are in the branch parselen1-on-ticket27247 (rebased on #27247 (moved) obviously).
Conversely, that branch was missing another mmap usage added in yours. (Added now.)
It looks like there are unrelated test fixups in the Stop memcpy'ing uncompressed consensuses when making diffs commit?
Github also seems to be wrongly showing the commit Replace "read consensus from disk" with "map consensus from disk". as being before Revise networkstatus parsing code to use lengths, when tpo shows the opposite.
This will read past the end of the mmap into invalid memory. It's missing some changes that are in the branch parselen1-on-ticket27247 (rebased on #27247 (moved) obviously).
Could you elaborate a bit on what and where the issue is Nick's patches?
networkstatus_set_current_consensus() passes s to a few functions (dirserv_set_cached_consensus_networkstatus, consdiffmgr_add_consensus) which don't have length arguments. They still assume the string is null-terminated and will read past the end of the mmap.
I think these patches looks good. Let's get them merged.
@cyberpunks: Thanks a lot for the work you have put into this ticket (via #27629 (moved)). I think your code looks good and I would have accepted this if there wasn't two implementations that was practically the same.
Let me try to explain a little bit why I ended up picking up the patches in PR #316 (moved) from Nick: When I review code I start out by looking at the "accumulated diff" (I ignore all the individual commits, but just focus on what was there before this set of changes vs. what is there after this set of changes). Once I'm happy with that, I walk over each individual patch to get an idea about the process on how you got there. In Tor, we very often use the "fixup" feature of Git for this very reason, so we can easily spot which things have been changed since last time we reviewed it. I looked at your changes and Nick changes and Nick's changes was easier (for me) to follow based on the commits themselves. I'm very happy you helped identifying the issues with the missing strlen() calls in Nick's changes, which was indeed bugs in the code. I hope this won't discourage you from contributing in the future, since your changes were in fact very good.