The collection of traffic statistics from routers is quite common. Recently, there was a minor scandal when a university network administrator upstream of UtahStateExits (and UtahStateMeekBridge) posted to boingboing that they had collected over 360G of netflow records:
https://lists.torproject.org/pipermail/tor-relays/2015-August/007575.html
Unfortunately, the comment has since disappeared, but the tor-relays archives preserve it.
This interested me, so I asked some questions about the defaults and record resolution, and did some additional searching. It turns out that Cisco IOS routers have an "inactive flow timeout" that defaults to 15 seconds and can't be set lower than 10 seconds. This timeout causes the router to emit a new netflow "record" for a connection that has been idle for that long, even if the connection stays open. Several other routers have similar settings. The Fortinet default is also 15 seconds. For Juniper, the default is 30 seconds (though Juniper routers can set it as low as 4 seconds).
With this information, I decided to write a patch that sends padding on a client's Tor connection bidirectionally at a random interval that we can control from the consensus, with a default of 4s-14s. It only sends padding if the connection is idle. It does not pad connections that are used only for tunneled directory traffic.
It also gives us the ability to control how long we keep said connections open. Since the default netflow settings for Cisco also generate a record for active flows after 30 minutes, it doesn't make a whole lot of sense to pad beyond that point.
This should mean that the total overhead for this defense is very low, especially since we have recently moved to only one guard. Well under 50 bytes/second for at most 30 minutes.
I still have a few questions, though, which is why I put so many people in Cc to this ticket. I will put my questions in the first comment.
Current head is 0c29e83ede329728d4b606a9a2af1a858b517880. My plan is add fixup commits there and then squash again before merge.
I've tested it on Chutney testing networks, and it appears to be behaving fine, at least based on log lines.
My questions are highlighted in the patch with "XXX:" comments, but here is also a brief summary.
Questions for Karsten/Roger:
Are the rephist.c stats enough information to graph the overhead for this and other padding defenses? Note that I only output exactly 24 hours worth of data, and no history.
Do you want other stats, such as the average lifespan of client OR connections, in case we want to tune that aspect of the defense, too?
Questions for Athena/Nickm:
I had to create a high-res version of channel_t::timestamp_active (channel_t::timestamp_active_highres). Am I setting it in the right places to correspond to when packets are actually being sent/received on the channel, or should I set it elsewhere?
Should I completely replace timestamp_active with my version? I had to remove a timestamp in channel_timestamp_drained() though, because that definitely could get called when no packets were actually sent.
In Chutney testing networks, I sometimes saw circuits manage to open without a current channel. This is marked with XXX in circuit_send_next_onion_skin(). Is this a bug?
I also had to create a channel_t::used_for_nondir_circs member to indicate that a channel was being used for something other than tunneled dirconns. Did I set this one in the right place, or is there a better place?
Yawning, you're on the Cc to tell me how much my C sucks (and also answer any other questions above :).
The patch is not terribly complicated, and can be tuned/disabled via consensus params. I was aiming for 0.2.7, unless we suspect there will be need for major changes/overhaul.
Oh, we may also want to tweak the Wgx weights in the consensus for this, depending on the amount of padding overhead that we notice, since it will mean more traffic to Guard-flagged nodes but not other nodes. I'm not sure where asn is at with respect to updating those weights for the switch to 1 guard node. Are those changes also on track for 0.2.7, or will they slip to 0.2.8?
FWIW, I think we'll want to update these Wgx weights once more, because I think we should be switching from 3 directory guards to 2-hop tunneled dir conns through the main guard. So that is also something to consider with respect to timelines for these weight updates and this patch.
Trac: Cc: andrea, karsten, nickm, yawning, arma to andrea, karsten, nickm, yawning, arma, asn
Preliminary review, attempts to answer questions, etc. (Can't promise anything with an 0.2.7 merge yet, got to see what other folks think of the code and design. The timeframe is very tight. :)
Preliminary review, high-level requirements stuff. This doesn't all need to be on Mike, but it does all need to get done.
Needs-proposal. There should be one in enough detail (and there may already be one in enough detail!) that somebody else could make code that works the same. I bet that this would be a nice short proposal that wouldn't take too long. But it needs a spec-level proposal all the same. The proposal also needs to get review and attention.
All the functions and structures need documentation.
All new stuff in options needs documentation in doc/tor.1.txt.in
Needs a changes file.
Needs tests.
Nitty-gritty code stuff:
In C, functions that take no arguments are declared int foo(void), not int foo().
Why does rep_hist_get_padding_count_lines() return NULL when the read or write count is 0? Should that be an && ?
You can't format uint64_t as %ld. Use the magic U64_FORMAT and U64_PRINTF_ARG macros instead.
I'd suggest "has_been_used_for_nondir_circs" instead of "used_for_nondir_circs". The latter could be confused to mean "should only be used for...".
connection_run_housekeeping is already huge; it would be great if we could extract as much as possible of the new code into a function.
Looks like there's a memory leak on pad_event and conn_id_ptr in launch_netflow_pending_callback.
"Scheduled a netflow padding cell, but connection already closed." -- this probably shouldn't be notice; I bet it will trigger often.
Are you casting away the const on chan in send_netflow_padding? That's a bit scary to me.
DFLT_NETFLOW_INACTIVE_KEEPALIVE_MAX should probably be a function so it's clear that something is happening inside it.
Medium-level design stuff:
How badly does this fail when we don't have monotonic time?
gettimeofday() is basically free on Linux, but it's more expensive elsewhere. We need to figure out how expensive it is, and whether (say) cached_
Libevent doesn't like it so much when we have tens of thousands of timers added in random order. It's O(lg n) to add or remove one, and IIRC the algorithm starts to get sad around that point. We'd better make sure this doesn't matter.
I think connection_get_by_global_id() does a linear search. I bet that won't be affordable.
Overall comments, first impressions:
I dig the idea of tricking netflow hardware. A pox upon them!
I wonder if we can abstract this code so that the logic of when to generate packets is separated a bit more from the logic that sends them, so we can drop in other algorithms more modularly in the future.
Trac: Cc: andrea, karsten, nickm, yawning, arma, asn to andrea, karsten, nickm, yawning, arma
Nick asked me to opine on the urgency of this patch. I haven't looked at the design or patch in detail yet. Here's a slightly-cleaned-up paste of my answer to him.
Big picture answer: yes, I think we should experiment with padding approaches, with the goal of stymying some of the potential traffic analysis attacks out there -- website fingerprinting, end-to-end correlation, and the things in between. Padding between the guard and the client is especially appealing because a) it looks like it can provide pretty good mileage, and also b) I expect that we'd have an easier time raising more capacity at guards (compared to exits) if we publicize the reason why we need it.
I think this is a huge research area where we need to get the whole PETS community thinking about it. We partly contributed to some potential misunderstandings about the efficacy of end-to-end correlation attacks at scale, by saying "Assume the correlation attack works perfectly and instantaneously; I don't know if it does, but it might" and having that turn into "Everybody knows the correlation attack works perfectly and instantaneously".
I've been envisioning even like a grand challenge: "Hey everybody, here are five attacks, they sure seem hard to resolve, especially all at once, but let's think about ways to increase the false positives at scale." For some of them even a small bump in false positive rate would be huge in practice. It would be neat to get two different designs and then have people analyze the heck out of them. Ideally even more than two.
I think picking the first one Mike ran across is a fine thing to deploy in the mean time, but we shouldn't rush to deploy it, or put too much stock in its being right.
For a little while I was thinking "man, this is just going to cause some research group to write a paper about how we're morons because look, this padding thing doesn't help here and here." But then I realized, that's great! Whatever it takes to get them to write the paper.
I am in the process of addressing Nick's comments. I am also relocating all of this code to channelpadding.c, and refactoring it to try to use the channel abstraction layer instead of connections (where possible).
Roger - I'm in complete agreement with your statements, save for hesitation on moving quickly. This is a narrow case where it's really easy to do what we want from a technical POV. So long as we ensure that this patch is doing what we intend (which is just to send at least one cell on a connection every 15s), then I think getting this patch out there faster will move everything you said forward quicker - mobilizing the research community, making people excited to run more fast guard nodes, etc. And if we find out it isn't doing what we intend, or causing too much load, we turn it off from the consensus. Release early, release often! Move fast and break stuff (yeah I just said that). Etc etc.
Some third-rate researchers will be sure to deliberately misinterpret this defense so they can get a cheap publication, but I also suspect that some good researchers will tell us what else we could do against the more complicated, higher-resolution cases than default-configuration netflow records.
I also believe that future defenses will be completely orthogonal to the netflow defense code and can be completely ignorant of it in their implementation and still remain optimal, since if they decide to send padding for any reason, then the netflow defense won't (since the netflow defense only sends padding if the connection is idle).
I think picking the first one Mike ran across is a fine thing to deploy in the mean time, but we shouldn't rush to deploy it, or put too much stock in its being right.
It is for these reasons that if Nick wants to delay this feature until 0.2.8 (and it looks like he does), I will support him on it. This isn't an "oh, we just do this simple thing and then everybody is clearly safer" situation, and these sorts of "emergent behavior" designs often have surprising side effects (or heck, effects) as they get rolled out more widely.
This is not to say that I don't think we should do the feature. We should. We just shouldn't screw up everything else in 0.2.7 that's been getting good testing for months now.
Ok, well I am going to withhold arguing about risking a delay of 0.2.7 or otherwise impacting it in an unrecoverable way, and instead just try to get this done. We can make the call on Sept 1st if I've accomplished that. It seems premature to make that call now.
I'd also prefer it if the people who I've asked questions of not assume "Well, I guess I can ignore Mike's questions now", because that will impact the quality of what I manage to get done by Sept 1st.
I've been envisioning even like a grand challenge: "Hey everybody, here are five attacks, they sure seem hard to resolve, especially all at once, but let's think about ways to increase the false positives at scale." For some of them even a small bump in false positive rate would be huge in practice. It would be neat to get two different designs and then have people analyze the heck out of them. Ideally even more than two.
Another question we should pose to the research community is "we know this netflow thing is done, we know that a certain amount of metrics tracking is required as part of the administering a large scale network, is there a way to get an acceptable amount of information in a privacy preserving manner?".
Everything has been refactored and reorganized into channelpadding.{c,h}, and the code is generally a lot more organized and properly abstracted to use channel_t. Channel lookup during timer invocation is now O(1), but this will only work for TLS-based channels (because connection_t has an index into the connection array that allows O(1) lookups, while channel_t does not provide O(1) lookups based on its global_identifier).
The following issues remain, which I will fix later:
Still needs a proposal
Still needs tests
Still needs a changes file
Still has lots of questions for folks listed in comment:1 and in the XXX comments in the code.
The following were non-issues as far as I can tell:
rep_hist_get_padding_count_lines() deliberately omits the extra-info lines if either the read or write cells were empty, because in either case we don't have enough info to safely publish stats.
I don't think I actually leaked conn_id_ptr. In any case, I refactored that code and now pass an allocated struct. I don't think I leak that one either.
I don't think I was casting away a const in send_netflow_padding(). That function is now called channelpadding_send_padding_cell_callback(). The cast has been replaced by proper macro usage, but there was no const there anyway, unless I missed something?
If the clock jumps, at worst we would have emitted a warn about the padding time being in the past. Now we only emit a notice. Should we double-check here anyway somehow?
I'm still not sure what to do about the following:
Am I still leaking pad_event from tor_evtimer_new in channelpadding_schedule_padding()? I cargo-culted that from dns_launch_correctness_checks() in dns.c, so if I'm leaking, that function probably is too.
Is there a way to test for a slow gettimeofday()? I noticed some TIME_IS_FAST ifdefs in util.c, but nothing sets that define. We can use time() if we need to, and it will still work fine, but we'll end up sending more padding due to truncation error in that case (which is why I added timestamp_active_highres in the first place).
How should we check if there are too many libevent timers scheduled? Note that the code didn't (and still doesn't) schedule a callback unless we're within 1 second of the padding timeout. It just waits for the next invocation in those cases. That should mean that even if all connections are always idle, only 1/10 of them are scheduling timer callbacks (because the function is called once per second, and the timeout range is 10 seconds wide). We can still check for too many timer callbacks anyway, and call directly in that case, but how do I do that? I've added an XXX in the code where we'd need to do this.
There are still lots of XXX's in the code for my other questions.
I'll look at this when I have a moment, which realistically will probably be sometime next week. But chiming in a bit on your comments...
I'd like to see how this interacts with the circuit scheduler since we can avoid sending padding if we have user payload to transmit instead. If you do that already, great. If not, interactions here need to be carefully considered (and the relevant optimizations made).
[snip]
I'm still not sure what to do about the following:
Is there a way to test for a slow gettimeofday()? I noticed some TIME_IS_FAST ifdefs in util.c, but nothing sets that define. We can use time() if we need to, and it will still work fine, but we'll end up sending more padding due to truncation error in that case (which is why I added timestamp_active_highres in the first place).
Systems with a slow gettimeofday() will likely have a slow time() as well, so I don't see much of a point here. The only systems I can think of that fall under this and run relays are old Linux (who cares) and some of the BSD variants (FreeBSD in certain virtualization environments in particular; may be fixed).
What matters here is "does the OS vDSO certain libc calls".
How should we check if there are too many libevent timers scheduled? Note that the code didn't (and still doesn't) schedule a callback unless we're within 1 second of the padding timeout. It just waits for the next invocation in those cases. That should mean that even if all connections are always idle, only 1/10 of them are scheduling timer callbacks (because the function is called once per second, and the timeout range is 10 seconds wide). We can still check for too many timer callbacks anyway, and call directly in that case, but how do I do that? I've added an XXX in the code where we'd need to do this.
If too many libevent timers are a problem, then why not use 1 libevent timer, and a doubly linked list of padding events? (Or a more sophisticated data structure if insertion is too expensive)...
I also pushed some changes based on Karsten's rephist.c review to both netflow_padding-squashed and netflow_padding-squashed2. I force-pushed netflow_padding-squashed2 so that it remains a single diff, but kept the history in netflow_padding-squashed. Both branches should be identical in delta.
Yawning: I took your comment about gettimeofday() into account, and pushed a new commit (to both squashed and squashed2) to change all the places where I need high resolution to simply use the tv_sec timeval field instead of also calling time(). That way, the new code will be no slower than the old code, at least.
However, as I said on IRC yesterday, I believe this code is independent of the circuit scheduler. Not only are we padding at the connection level (and only when there are no other packets to send), but we also need to pad on a connection long after all circuits are gone from it. I think involving circuit information is not right (in addition to being a complicated layer coordination issue) for these reasons.
Your suggestion about making a data structure to compensate for libevent's timer scaling issues is interesting and may ultimately be the right plan, but right now I can't think of anything that won't still max out at 1000 timers (since I still want to preserve millisecond resolution on actual packet delivery). I also worry that this complexity may be a bit error-prone and fragile. I'd be much happier just giving up, emitting a notice, and sending the packet directly from channelpadding_schedule_padding() without a timer in the case that we have more than, say, 50k timers (or whatever we think libevent will start breaking at), and optimizing later if that actually happens in the wild (which I still think is unlikely at our current client load and timeout values).
It may also be possible that the common case will be even slower as a result of this data structure, unless we do something really simple like make an array of 1000 smartlists that we can index into based on the current millisecond. But even that may end up slower in the common case.
The old netflow_padding-squashed branch still preserves history, in case Nick wants to see the delta since last review (though it is probably actually larger and harder to read than the single-commit version, due to refactoring into channelpadding.c). I will continue to preserve history in netflow_padding-squashed, to ease future reviews.
Here is a summary of the issues in the code that I'd like some input on, if possible. Each of these has one or more matching XXX comments in the patch:
Still not sure if I'm setting timestamp_active_highres in the places that accurately reflect network activity
Not sure how to best handle libevent timers. Can we check how many are outstanding? How? Should we remove timers in the case where traffic arrives before the timer expires (which eliminates the need for it), or will that be slower? Do we really need an auxiliary data structure?
Am I actually leaking pad_event still? Do I need to free it myself in the callback?
Am I setting has_been_used_for_nondir_circs in the right places? (And how/why does Chutney complete a circuit without a valid channel for it???)
I still need to test this on PT bridges, to see if the is_canonical test fails for them.
I added a ReducedConnectionPadding torrc option to reduce padding (for mobile users). Unfortunately, since padding is bidirectional, I don't currently have a way to fully disable padding that the server sends other than closing the OR connection earlier. With the current values, this should reduce the worst-case per-connection padding overhead from ~180KB to ~18KB (while still preventing multiple netflow records from being created for the duration of the shorter connection if the relay sends padding). Is this a problem, or a feature? The only alternative seems to be to create sub-fields of CELL_PADDING to communicate padding preferences to the relay.
Do we want any more statistics (like average per-orconn padding stats) exported in extra-info?
Karsten is still working with me to tune the rephist.c stat parameters for optimal information reduction.
I think all of the other comments from nickm, Yawning, and Karsten have been addressed.