Opened 13 months ago

Closed 6 months ago

#28634 closed defect (implemented)

Design a first useful padding machine (hiding client-side intro/rend circuits)

Reported by: asn Owned by: mikeperry
Priority: Very High Milestone: Tor: 0.4.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: wtf-pad, tor-relay, tor-cell, padding, 041-proposed, network-team-roadmap-2019-Q1Q2, 041-should, nickm-merge, dgoulet-merge
Cc: nickm Actual Points:
Parent ID: Points: 8
Reviewer: mikeperry Sponsor: Sponsor2

Description

#28142 will get merged with no active padding machine. This ticket is about designing a useful padding machine that can act as a decent testbed for the WTF-PAD project.

Child Tickets

TicketStatusOwnerSummaryComponent
#28693closedAdd an option to disable circuit paddingCore Tor/Tor
#28780closedcircpadding: Add machine flag for not closing circuit if machine is activeCore Tor/Tor
#29085closedWTF-PAD: Reduce monotime usage because of performance issuesCore Tor/Tor
#29203closedAdd a way to specify machines as reduced circuit paddingCore Tor/Tor
#29204closedInspect circuit queue during padding decisionsCore Tor/Tor
#29231closedRelays vastly underreport write-total in padding-counts line in extrainfo descriptorCore Tor/Tor
#30173closedEnsure circuit padding can be safely disabled from consensusCore Tor/Tor

Change History (44)

comment:1 Changed 13 months ago by nickm

IMO it doesn't need to be a useful padding machine if we can't get a useful one done in time. If the best we can do is "cheap and harmless" instead of "useful", that would still give us a testbed.

comment:2 Changed 12 months ago by asn

Reviewer: mikeperry

Here are some thoughts https://github.com/torproject/tor/pull/461#discussion_r239838732

The above plan depends on getting #28780 and #28633 done in time.

comment:3 Changed 12 months ago by mikeperry

#28633 is done, I believe. #28780 could technically be called nice-to-have gravy. The machines won't be effective against any real adversary without it since lifetimes give away too much info, but we could at least test the machines.

The harmless piece is the tricky bit, though. The Padding TODO file has items for sending an ordered preference list of machine choices in the negotiation, and a way to stop re-trying negotiation if it keeps failing. One (or both) of these must be done to meet the harmless property. I *think* that having an ordered preference will be sufficient for safety, if we specify a "null" machine that is the last preference/fallback in case of error, and if you negotiate a "null" machine, you don't try to negotiate anything more on that circuit.

So if we really want something that can work in 0.4.0, we definitely need to implement this preference ordering idea with explicit "null" fallback. And then we can try to get #28780 done after that.

comment:4 Changed 11 months ago by asn

OK I have some WIP here: https://github.com/torproject/tor/pull/635

tl;dr I made a machine that obfuscates client-side IP circuits. I did some initial testing on chutney and it seems to work okay-ish. Needs more work and feedback.

Version 0, edited 11 months ago by asn (next)

comment:5 Changed 11 months ago by mikeperry

Re "XXX This is not good enough since this randomization is global...This should be randomized for each circuit" in +setup_machine_states_for_hiding_ip_circuits().. This is what the circpad_state_t.length_dist is for -- it allows you to specify a probability distribution that governs how many cells the state should last for (and length_includes_nonpadding specifies if that should count nonpadding cells too).

comment:6 Changed 11 months ago by asn

Milestone: Tor: 0.4.0.x-finalTor: unspecified

comment:7 Changed 11 months ago by nickm

Keywords: 041-proposed added

comment:8 Changed 11 months ago by mikeperry

Points: 4

comment:9 Changed 11 months ago by mikeperry

Priority: MediumVery High

comment:10 Changed 11 months ago by nickm

Sponsor: Sponsor2

comment:11 Changed 8 months ago by asn

Status: newneeds_review

OK I have an initial PoC here, as well as a spec patch:
https://github.com/torproject/torspec/pull/72

Check out the code here:
https://github.com/torproject/tor/pull/868

I'm setting this to needs_review to receive some comments from Mike.
I'd also like some suggestions on how to do the bandwidth overhead analysis in the spec.

Thanks!

Here is how an intro point circuit looks like with this patch in chutney (had to change the latency to fit chutney):

IP circuit 6 was created at 20:49:49.973
Relay cells:
0: 20:49:49.978 -> EXTEND2
1: 20:49:50.000 <- EXTENDED2
2: 20:49:50.001 -> EXTEND2
3: 20:49:50.023 <- EXTENDED2
4: 20:50:25.520 -> PADDING_NEGOTIATE
5: 20:50:25.520 -> EXTEND
6: 20:50:25.528 -> DROP
7: 20:50:25.532 <- PADDING_NEGOTIATED
8: 20:50:25.532 <- DROP
9: 20:50:25.537 -> DROP
10: 20:50:25.545 -> DROP
11: 20:50:25.552 -> DROP
12: 20:50:25.553 <- DROP
13: 20:50:25.554 <- EXTENDED
14: 20:50:25.555 -> INTRODUCE1
15: 20:50:25.556 <- DROP
16: 20:50:25.556 <- DROP
17: 20:50:25.561 -> DROP
18: 20:50:25.565 -> DROP
19: 20:50:25.569 -> DROP
20: 20:50:25.573 -> DROP
21: 20:50:25.578 <- DROP
22: 20:50:25.578 <- DROP
23: 20:50:25.584 -> DROP
24: 20:50:25.592 -> DROP
25: 20:50:25.597 -> DROP
26: 20:50:25.601 <- DROP
27: 20:50:25.601 <- DROP
28: 20:50:25.602 <- DROP
29: 20:50:25.625 <- DROP
30: 20:50:25.625 <- DROP
31: 20:50:25.626 <- DROP
32: 20:50:25.626 <- INTRODUCE_ACK

and here is a rendezous circuit:

RP circuit 8 was created at 20:49:51.978
Relay cells:
0: 20:49:51.986 -> EXTEND2
1: 20:49:52.116 <- EXTENDED2
2: 20:49:52.117 -> EXTEND2
3: 20:49:52.162 <- EXTENDED2
4: 20:50:25.519 -> PADDING_NEGOTIATE
5: 20:50:25.519 -> ESTABLISH_RENDEZVOUS
6: 20:50:25.525 -> DROP
7: 20:50:25.532 <- PADDING_NEGOTIATED
8: 20:50:25.535 -> DROP
9: 20:50:25.540 -> DROP
10: 20:50:25.545 -> DROP
11: 20:50:25.552 -> DROP
12: 20:50:25.553 <- RENDEZVOUS_ESTABLISHED
13: 20:50:25.554 <- DROP
14: 20:50:25.555 <- DROP
15: 20:50:25.560 -> DROP
16: 20:50:25.564 -> DROP
17: 20:50:25.569 -> DROP
18: 20:50:25.578 <- DROP
19: 20:50:25.578 <- DROP
20: 20:50:25.601 <- DROP
21: 20:50:25.601 <- DROP
22: 20:50:25.601 <- DROP
23: 20:50:25.625 <- DROP
24: 20:50:25.625 <- DROP
25: 20:50:25.625 <- DROP
26: 20:50:25.626 <- DROP
27: 20:50:25.635 <- DROP
28: 20:50:25.677 <- DROP
29: 20:50:25.677 <- DROP
30: 20:50:25.681 <- DROP
31: 20:50:25.704 <- RENDEZVOUS2
32: 20:50:25.705 -> BEGIN
33: 20:50:25.749 <- CONNECTED
34: 20:50:25.837 -> DATA
35: 20:50:26.753 -> DATA

comment:12 Changed 8 months ago by asn

Points: 48

(This was marked for 8 points, not 4)

comment:13 Changed 8 months ago by teor

Reviewer: mikeperry

Remove Mike as reviewer, because he's overloaded.
We'll work out what to do with these tickets in the weekly meeting.

comment:14 Changed 8 months ago by teor

Reviewer: mikeperry

Mike is the only possible reviewer for this draft.

comment:15 Changed 8 months ago by mikeperry

Status: needs_reviewneeds_revision

I have some comments on the PR. Some need fixups; some may just need discussion.

Also I found that Rob Jansen and Marc Juarez actually did make a proper classifier out of Kwon's decision tree algorithm, so we might need to be a little more careful with considering how such a classifier will behave (I have some comments about how we might do that in the PR). Also maybe we can get them to re-run their classifier for us on their beefy Shadow simulator box: https://www.robgjansen.com/publications/insidejob-ndss2018.pdf

For these reasons, we might also want #30092 for this. I could do it if we agree.

Last edited 8 months ago by mikeperry (previous) (diff)

comment:16 Changed 8 months ago by asn

OK I have a new branch in bug28634: https://github.com/torproject/tor/pull/635
and I also updated the spec PR: https://github.com/torproject/torspec/pull/72

This branch takes the simplistic approach of only trying to make the initial circuit setup cell sequence blend in with general circuits. I find this to be more sophisticated than a look-like-nothing and it also has minimal overhead. I also addressed your comments in the previous PR.

Also, this is a fine start if we want to build this into a more advanced machine. We can just create new states that handle obfuscation after the initial circuit setup sequence.

Last edited 8 months ago by asn (previous) (diff)

comment:17 Changed 8 months ago by asn

Status: needs_revisionneeds_review

comment:18 Changed 8 months ago by asn

Summary: Design a useful padding machine that we can enableDesign a first useful padding machine (hiding client-side intro/rend circuits)

comment:19 Changed 7 months ago by mikeperry

Status: needs_reviewneeds_revision

Some minor comments on the PR for the code and the spec.

High level questions+comments:

  1. Should we #define CIRCPAD_STATE_SPOOF_GENERAL CIRCPAD_STATE_BURST, and use a different terminology (and variables) for that first state? I think using the wtf-pad names can be confusing, since we're not using the histograms in the same way it did.
  2. Should we flag these machines as reduced padding using #29203?
  3. It's hard to follow force-pushed PRs like this. I personally would prefer a fresh PR squashed down rather than worrying about which version of the force-push I click on for a commit when doing the review.


I can do the renaming of the burst state in the code and spec if we agree on a name for the first state of the intro and rend machines.

Last edited 7 months ago by mikeperry (previous) (diff)

comment:20 in reply to:  19 Changed 7 months ago by asn

Replying to mikeperry:

Some minor comments on the PR for the code and the spec.

Thanks for the review!

High level questions+comments:

  1. Should we #define CIRCPAD_STATE_SPOOF_GENERAL CIRCPAD_STATE_BURST, and use a different terminology (and variables) for that first state? I think using the wtf-pad names can be confusing, since we're not using the histograms in the same way it did.

I agree that the BURST and GAP state names are kinda random right now. There is also the question of how circpad_state_to_string() should work in case we rename them.

Perhaps for these machines, we could have more specific names like CIRCPAD_STATE_BURST can become CIRCPAD_STATE_OBFUSCATE_CIRC_SETUP? SPOOF_GENERAL also seems like a decent generic name, but not sure what kind of machine would like that name, instead of defining their own.

  1. Should we flag these machines as reduced padding using #29203?

Yes, that's a good question. Right now "reduced padding" is a very vague concept...

Right now these machines are very low overhead at 15 padding cells in average, but perhaps we should not flag them as reduced padding until we learn what exactly reduced padding means by doing proper bandwidth overhead examinations.

I'd like to do these examinations as part of my PhD research, so perhaps we can avoid markin them as such, and wait for the data? I don't think the world is gonna go crazy if we have to disable these machines in case we find that the bandwidth is too much.

  1. It's hard to follow force-pushed PRs like this. I personally would prefer a fresh PR squashed down rather than worrying about which version of the force-push I click on for a commit when doing the review.


Sorry about that. Will have that in mind for next time.

I can do the renaming of the burst state in the code and spec if we agree on a name for the first state of the intro and rend machines.

Also replied to the comments on the PR.

comment:21 Changed 7 months ago by mikeperry

Ok, to summarize the above discussion and the plan related to child tickets, here's what I think we should do for forward progress:

  1. #define CIRCPAD_STATE_OBFUSCATE_CIRC_SETUP CIRCPAD_STATE_BURST and also change the one info log line that uses circpad_state_to_string() to use %d for state number instead (I think info lines don't need string names here, and it will simplify state re-naming for custom machines).
  2. Let's not make use of the reduced_padding flag from #29203 until we measure true overhead, as you say.
  3. Let's keep the safety check for no tokens left in circpad_machine_schedule_padding(), but let's remove the burst->burst transition in the rend machine that might have required it.
  4. In addition to the "two infinity bin" hack in the origin side intro circuit machine, let's *also* have no tokens in the INFINITY-1 bin slot, so that only CIRCPAD_DELAY_INFINITE can ever get chosen by that machine (unless we can think of a cleaner way to do this).
  5. Let's remove usage of should_negotiate_end.
  6. Let's always use CIRCPAD_TOKEN_REMOVAL_NONE (and never use_rtt_estimate), so I can write a version of #29085 that does not cause us to call monotime when these modes are used (so we don't have to worry about that overhead for these machines). If you want to add a CIRCPAD_DIST_FIXED that always returns the value of param1 for use as a length_dist, that might be the cleanest way to avoid token removal for 1 token (rather than making a hacky use of CIRCPAD_DIST_UNIFORM to return rand(1,1)).

I will work on #28780 and #29085 this week under the above assumptions. Can you take care of the above for the machines in this ticket? (Note the above items are not ordered in any kind of importance or priority nor am I trying to make binding decrees; I'm just trying to summarize. In particular: the state naming thing is not as important as the others and it if it is annoying and time consuming or you disagree, just skip it).

Last edited 7 months ago by mikeperry (previous) (diff)

comment:22 in reply to:  21 Changed 7 months ago by asn

Status: needs_revisionneeds_review

Replying to mikeperry:

Ok, to summarize the above discussion and the plan related to child tickets, here's what I think we should do for forward progress:

  1. #define CIRCPAD_STATE_OBFUSCATE_CIRC_SETUP CIRCPAD_STATE_BURST and also change the one info log line that uses circpad_state_to_string() to use %d for state number instead (I think info lines don't need string names here, and it will simplify state re-naming for custom machines).
  2. Let's not make use of the reduced_padding flag from #29203 until we measure true overhead, as you say.
  3. Let's keep the safety check for no tokens left in circpad_machine_schedule_padding(), but let's remove the burst->burst transition in the rend machine that might have required it.
  4. In addition to the "two infinity bin" hack in the origin side intro circuit machine, let's *also* have no tokens in the INFINITY-1 bin slot, so that only CIRCPAD_DELAY_INFINITE can ever get chosen by that machine (unless we can think of a cleaner way to do this).
  5. Let's remove usage of should_negotiate_end.
  6. Let's always use CIRCPAD_TOKEN_REMOVAL_NONE (and never use_rtt_estimate), so I can write a version of #29085 that does not cause us to call monotime when these modes are used (so we don't have to worry about that overhead for these machines). If you want to add a CIRCPAD_DIST_FIXED that always returns the value of param1 for use as a length_dist, that might be the cleanest way to avoid token removal for 1 token (rather than making a hacky use of CIRCPAD_DIST_UNIFORM to return rand(1,1)).

I will work on #28780 and #29085 this week under the above assumptions. Can you take care of the above for the machines in this ticket? (Note the above items are not ordered in any kind of importance or priority nor am I trying to make binding decrees; I'm just trying to summarize. In particular: the state naming thing is not as important as the others and it if it is annoying and time consuming or you disagree, just skip it).

OK! Here is a branch that does all the above (I think):
https://github.com/torproject/tor/pull/1008
It's also rebased on the latest version of your #28780 branch (squashed).

It seems to work well in my chutney testing. I will pass it to alex so that I can also test it on his relay!

Last edited 7 months ago by asn (previous) (diff)

comment:23 Changed 7 months ago by mikeperry

Status: needs_reviewneeds_revision

This looks good! We're getting real close IMO.

Last thing I can think of: As an extra safety mechanism, I think we should use the machine-specific padding limits allowed_padding_count = 20 and max_padding_percent = 1 on these two machines. That way we know that even if they go haywaire somehow, we have the padding limits to stop meltdown (and to log that we're hitting the limit and something is going wrong).

comment:24 Changed 7 months ago by mikeperry

Status: needs_revisionneeds_review

https://github.com/mikeperry-tor/tor/tree/bug28634_latest_28780 has https://github.com/mikeperry-tor/tor/commit/3a2f97d6908d9dc6a73bf05d9e26bcda03bc00bb with this limit count.

I think we're good now, but I want to give this another deep review Monday.

comment:25 Changed 7 months ago by mikeperry

Parent ID: #28632

comment:26 Changed 7 months ago by asn

Looks good to me! I also successfuly tested that the rate limiting rules apply correctly, and also cannot trigger on the current machines.

One last thing we might want to fix is that I've been getting spammed by this log message in chutney even for machines were RTT is not enabled (like these ones):

May 14 12:25:18.162 [notice] Got two cells back to back on a circuit before estimating RTT.

comment:27 Changed 7 months ago by gaba

Keywords: network-team-roadmap-2019-Q1Q2 added

comment:28 in reply to:  26 ; Changed 7 months ago by mikeperry

Replying to asn:

Looks good to me! I also successfuly tested that the rate limiting rules apply correctly, and also cannot trigger on the current machines.

One last thing we might want to fix is that I've been getting spammed by this log message in chutney even for machines were RTT is not enabled (like these ones):

May 14 12:25:18.162 [notice] Got two cells back to back on a circuit before estimating RTT.

I believe this should be fixed by #29085. While optimizing to avoid monotime, I added a fast-path that also has the effect of avoiding the branch that contains this log message if the feature is disabled.

Which chutney tests are you testing with and how are you doing verification?

Do you think we could add something simple to the chutney tests to check for any obvious weirdness automatically?

comment:29 in reply to:  28 Changed 7 months ago by asn

Replying to mikeperry:

Replying to asn:

Looks good to me! I also successfuly tested that the rate limiting rules apply correctly, and also cannot trigger on the current machines.

One last thing we might want to fix is that I've been getting spammed by this log message in chutney even for machines were RTT is not enabled (like these ones):

May 14 12:25:18.162 [notice] Got two cells back to back on a circuit before estimating RTT.

I believe this should be fixed by #29085. While optimizing to avoid monotime, I added a fast-path that also has the effect of avoiding the branch that contains this log message if the feature is disabled.

Hm, I don't think that's the case. At least in my testing I got both the above msg and Circuit sent two cells back to back before estimating RTT with the branch from comment:24, plus the latest #28780 and #29085.

Which chutney tests are you testing with and how are you doing verification?

I've been testing with chutney hs-v3 template. Here are some useful queries:
grep -R relay_ip net/nodes/*/*log to find which relay got the relay-side IP machine
$ grep marked net/nodes/008c/info.log to check that the circuits get closed as intended. For example:

$ grep marked net/nodes/008c/info.log 
May 15 17:11:34.411 [warn] Circuit 10 is not marked for close because of a  pending padding machine.
...
May 15 17:14:53.195 [info] circuit_mark_for_close_(): Circuit 3868216775 (id: 10) marked for close at src/core/or/circuituse.c:1509 (orig reason: 9, new reason: 0)

where the above is with MaxCircuitDirtiness 3 minutes on the client-side.

Do you think we could add something simple to the chutney tests to check for any obvious weirdness automatically?

I doubt we can test this feature automatically with chutney with a simple mod, but I don't know chutney enough to give a comprehensive answer.

Mike this whole thing looks good to me, apart from the annoying RTT log messages. Can you fix the RTT log messages, and give it a test on chutney, and then we can mark as merge_ready? I will check-in online at night.

comment:30 Changed 7 months ago by mikeperry

Ok, so close! I fixed some things up and made a new pull request: https://github.com/torproject/tor/pull/1029

Here's what I did:

  • Rebased a few times due to conflicts
  • Fixed the RTT log message (this message was due to our decision to switch to counting padding negotiation as non-padding earlier in this ticket)
  • Improved comments in the machines about the packet traces
  • After the RTT fix, I used chutney hs-v3 test to check packet patterns with https://github.com/asn-d6/padanalyzer to make sure I didn't break anything
  • Demoted some logs and ensured correct log domains.
  • Remove the log messages that https://github.com/asn-d6/padanalyzer needs (they should be info/debug level rather than warn, but demoting them would break padanalyzer anyway, so...)

I am convinced that this branch functionally does what we want now. However, I think it would be useful to do the following:

  • It bothers me that the client and middle-relay states are in the same function for each machine. I would rather have each machine be in its own function, even if they are mostly (or even exactly) the same machine (not sure if this is heretical wrt code duplication, but I think code readability should trump that)
  • We *maybe* should find some way to write a unit or chutney test to ensure that future changes to circuit setup don't change out packet patterns (if these machines actually do survive a research cycle, then they risk being broken by things like my RTT fix)

comment:31 Changed 7 months ago by asn

Status: needs_reviewmerge_ready

OK, I took Mike's branch from above and rebased and squashed it so that it's nicer to merge: https://github.com/torproject/tor/pull/1030

I also posted the proposal that explains the machines in https://lists.torproject.org/pipermail/tor-dev/2019-May/013822.html

Mike, I did not split the machine states in different functions due to lack of time, and being afraid of causing a bug. I agree with you it will be better this way, but perhaps we can do it post-merge as a refactoring commit.

comment:32 Changed 7 months ago by nickm

Milestone: Tor: unspecifiedTor: 0.4.1.x-final

comment:33 Changed 7 months ago by nickm

Keywords: 0411-alpha added

comment:34 Changed 7 months ago by nickm

This probably needs a changes file, right?

As a user, how would I turn this on?

comment:35 Changed 7 months ago by mikeperry

Status: merge_readyneeds_revision

I will write a changes file and see if I can do that refactoring that I mentioned safely.

comment:36 Changed 7 months ago by nickm

Keywords: 041-should added

comment:37 Changed 7 months ago by mikeperry

Status: needs_revisionmerge_ready

Ok! Refactored, and added changes file that also explains how this feature gets enabled/disabled:
https://github.com/torproject/tor/pull/1033

comment:38 Changed 7 months ago by nickm

CI passes, and I see an ack from asn on IRC. Merged! There is still an open child ticket, though. Could somebody please take appropriate action on that depending on its status and relation to this ticket?

comment:39 Changed 7 months ago by mikeperry

I asked dgoulet about how severe the interaction between #29034 and this functionality is on IRC and in the #29034.

#29034 is also a pre-existing problem/leak, fwiw. So in any case, we're only triggering it more often than before.

comment:40 Changed 7 months ago by teor

Owner: set to mikeperry
Status: merge_readyassigned

comment:41 Changed 7 months ago by teor

Status: assignedmerge_ready

comment:42 Changed 7 months ago by nickm

Keywords: 0411-alpha removed

comment:43 Changed 6 months ago by dgoulet

Keywords: nickm-merge dgoulet-merge added

comment:44 Changed 6 months ago by nickm

Resolution: implemented
Status: merge_readyclosed

Merged the last child -- marking this one as closed too.

Note: See TracTickets for help on using tickets.