Opened 4 years ago

Closed 22 months ago

#17857 closed enhancement (implemented)

Create a consensus param to disable (netflow) padding if RSOS is enabled

Reported by: teor Owned by: mikeperry
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, single-onion, review-group-20, review-group-23
Cc: mikeperry Actual Points:
Parent ID: Points: 1
Reviewer: dgoulet Sponsor:

Description (last modified by mikeperry)

Because RSOS and Proposal 252 both appear to cause an onion service to make a lot of orconns to many relays, the total overhead of the netflow padding defense will be much larger for these services.

In an ideal world, we'd find some way to minimize the number of orconns that these services make, because the increased level of connection multiplexing would be better against traffic analysis attacks, and thus also mean less overhead for padding defenses (including netflow padding, but beyond that as well).

In the meantime, however, we should provide the ability to disable netflow padding via the consensus for these services. With a consensus param, we can monitor the netflow padding overhead in relay extra-info descriptors, and experiment with turning padding for RSOS/SOS on and off while observing the change in total overhead at relays.

Child Tickets

Change History (43)

comment:1 Changed 4 years ago by mikeperry

Cc: mikeperry added

comment:2 Changed 4 years ago by mikeperry

Description: modified (diff)
Summary: Disable (netflow) padding if RSOS is enabledCreate a consensus param to disable (netflow) padding if RSOS is enabled

comment:3 Changed 4 years ago by mikeperry

Cc: special added

comment:4 Changed 4 years ago by teor

Keywords: rsos sos added

This applies to both RSOS and SOS.

comment:5 Changed 3 years ago by nickm

Milestone: Tor: 0.2.8.x-finalTor: 0.2.9.x-final

It is impossible that we will fix all 226 currently open 028 tickets before 028 releases. Time to move some out. This is my second pass through the "new" and tickets, looking for things to move to 0.2.9.

comment:6 Changed 3 years ago by asn

Keywords: tor-hs added
Points: small

comment:7 Changed 3 years ago by isabela

Points: small1

comment:8 Changed 3 years ago by teor

This awaits both #16861 and #17178 being merged.

comment:9 Changed 3 years ago by teor

Parent ID: #17178

comment:10 Changed 3 years ago by isabela

Keywords: isaremoved added
Milestone: Tor: 0.2.9.x-finalTor: 0.2.???

comment:11 Changed 3 years ago by teor

A reminder that if we merge Single Onion Services (#17178) and Netflow Padding (#16861) in 0.2.9, we'll likely want a Consensus Parameter to Disable Netflow Padding For Single Onion Services (#17857) as well.

I think we should do a consensus parameter for Tor2web as well. It also makes a lot of connections to many HSDir and intro point relays (and rend point relays if Tor2webRendezvousPoints is not set).

The parameters could be named:

  • nf_pad_single_onion
  • nf_pad_tor2web

Individual operators can override the consensus parameter using the torrc option.

And I think both parameters should operate as follows:

  • 0: No Padding
  • 1: Full Padding (Default)

Do we want to have a value for reduced padding, or is that a client-controlled option only?

comment:12 Changed 3 years ago by asn

I get bad feelings about adding more torrc options or consensus parameters, but the bandwidth overhead here is indeed worth reducing.

BTW, if we introduce the consensus parameter, why turn it on by default? Does the netflow padding offer any additional security to SOS services or tor2web clients? IIUC, if the client of the SOS service does netflow padding, then they are protected. I'm not sure how that works out for tor2web.

Also, reading the top post:

In the meantime, however, we should provide the ability to disable netflow padding via the consensus for these services. With a consensus param, we can monitor the netflow padding overhead in relay extra-info descriptors, and experiment with turning padding for RSOS/SOS on and off while observing the change in total overhead at relays.

I think that asking the dirauths to toggle this switch just for us to do statistical measurements is not going to work out. Also, I feel that the stat noise will cover any traffic produced by the few SOS services on the network.

comment:13 Changed 3 years ago by mikeperry

For KISS principles: I think the simplest thing to do here is to add a consensus parameter that disables netflow padding for all Tor processes have RSOS or tor2web enabled (ie: for the service/server side, not for clients). It should be an emergency-use only thing, for us to set if the overhead looks exceptionally high, and we suspect RSOS/tor2web to be a major source of the problem, and we're looking to stop the bleeding.

For all other non-emergency tuning, it should be enough to advise RSOS services to disable or reduce netflow padding themselves via torrc, or make a new release with different default behavior for them.

Furthermore: I'm guessing that the overhead for most RSOS and tor2web users won't actually be very high. The netflow padding doesn't add overhead to connections that stay active, and both RSOS and tor2web server-side instances are likely to be under heavy use. This also makes me lean in the direction of only providing an emergency "shut-er-down" switch (which we already have for netflow padding as a whole).

comment:14 Changed 3 years ago by teor

Milestone: Tor: 0.2.???Tor: 0.3.???

Milestone renamed

comment:15 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.???Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:16 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:17 Changed 2 years ago by nickm

Keywords: isaremoved removed

comment:18 Changed 2 years ago by teor

Milestone: Tor: unspecifiedTor: 0.3.1.x-final

Mike, did you still want to do this?
We merged netflow padding into 0.3.1.

comment:19 Changed 2 years ago by nickm

Owner: set to mikeperry
Status: newassigned

comment:20 Changed 2 years ago by mikeperry

FYI: I am working on this. I am going to implement the simple emergency killswitch I described in comment:13. Clients won't be able to override the consensus, unless they recompile to remove the check (it will be client-side enforced, because relays don't know if we're RSOS or Tor2Web).

To check for tor2web and RSOS, I'm going to check get_options()->Tor2WebMode and get_options()->HiddenServiceSingleHopMode. Let me know if I should be checking something else.

comment:21 in reply to:  20 Changed 2 years ago by teor

Replying to mikeperry:

FYI: I am working on this. I am going to implement the simple emergency killswitch I described in comment:13. Clients won't be able to override the consensus, unless they recompile to remove the check (it will be client-side enforced, because relays don't know if we're RSOS or Tor2Web).

To check for tor2web and RSOS, I'm going to check get_options()->Tor2WebMode and get_options()->HiddenServiceSingleHopMode. Let me know if I should be checking something else.

rend_non_anonymous_mode_enabled() will tell you if either Tor2web or v2 single onion services are enabled.

I opened #22694 in 0.3.2 to make sure we do this for v3 single onion services. (v3 tor2web isn't being implemented.)

comment:22 Changed 2 years ago by mikeperry

Status: assignedneeds_review

Ok, I actually used the torrc options so we can turn these off independently.

The branch is mikeperry/ticket17857, commit 1273047465f80bfc5a5d06332b30ba58c93bfa90.

comment:23 in reply to:  22 Changed 2 years ago by teor

Status: needs_reviewneeds_revision

Replying to mikeperry:

Ok, I actually used the torrc options so we can turn these off independently.

T1. Please use rend_service_allow_non_anonymous_connection(): it does consistency checks on HiddenServiceNonAnonymousMode and HiddenServiceSingleHopMode. (Also, those options are only referenced directly in config.c and rendservice.c, and the tests, let's keep it that way.)

T2. And use rend_client_allow_non_anonymous_connection(), it does consistency checks on NON_ANONYMOUS_MODE_ENABLED and Tor2webMode.

T3. There's duplicate code like this in two different locations for single onion and Tor2web:

  • networkstatus_get_param(NULL, "nf_pad_tor2web", 1, 0, 1)

Please use an abstraction function, so that the defaults are set in one place.

Some other things to be aware of:

  • if a Tor2web client is using Tor2webRendezvousPoints, it will directly connect to just a few nodes for the rendezvous step (unless they go down). So Tor2web rendezvous padding is config-dependent. And the stats on it may be high on just a few relays.
  • this code unconditionally disables padding for every Tor2web or single onion circuit, even if that circuit is a multi-hop circuit. That's probably ok, because Tor2web makes single hop connections even to HSDirs (which is a denial of service risk), and Single Onion Services disables EntryNodes even on its 3-hop HSDir connections.

comment:24 Changed 2 years ago by special

Cc: special removed

comment:25 Changed 2 years ago by teor

Keywords: single-onion added; rsos removed

Now there's only one kind of single onion service, change rsos to single-onion

comment:26 Changed 2 years ago by teor

Keywords: sos removed

Now there's only one kind of single onion service, remove all references to old sos tag

comment:27 Changed 2 years ago by mikeperry

Status: needs_revisionneeds_information

Ok, I added #defines for the consensus params and defaults in a fixup commit, but as for using the functions instead of the options directly, there is a problem: I don't see an easy way to use the tor2web function, because it depends on a #define for NON_ANONYMOUS_MODE. This makes it impossible to unit test the behavior unless I rebuild all of Tor (or somehow rebuild only the the Tor files that check the NON_ANONYMOUS_MODE #define) just for my test.

Do any other tests rebuild various c files with different #define values for cases like this? I didn't see any.. Do we need to write custom makefile rules now, or is that just crazy? It seems crazy this late in the 0.3.1 game for sure, and seems a little crazy generally...

IMO options validation like this should be done at torrc load time and during a control port config update, not at every single use in the runtime.

Given that, I'm inclined to leave it as torrc options only.

comment:28 in reply to:  27 Changed 2 years ago by teor

Status: needs_informationneeds_revision

Replying to mikeperry:

Ok, I added #defines for the consensus params and defaults in a fixup commit, but as for using the functions instead of the options directly, there is a problem: I don't see an easy way to use the tor2web function, because it depends on a #define for NON_ANONYMOUS_MODE. This makes it impossible to unit test the behavior unless I rebuild all of Tor (or somehow rebuild only the the Tor files that check the NON_ANONYMOUS_MODE #define) just for my test.
Do any other tests rebuild various c files with different #define values for cases like this? I didn't see any.. Do we need to write custom makefile rules now, or is that just crazy? It seems crazy this late in the 0.3.1 game for sure, and seems a little crazy generally...

Our unit tests don't work for Tor2web. And it will be deprecated in 0.3.2 or 0.3.3 when v2 hidden services are deprecated.

IMO options validation like this should be done at torrc load time and during a control port config update, not at every single use in the runtime.

It's done at torrc load and on use. Please open a separate ticket if you want to remove the validation on every use.

Given that, I'm inclined to leave it as torrc options only.

I don't mind what we do with Tor2web, it's going to be removed soon. And the current option usage is not consistent in the codebase.

But it's important that we use accessors for single onion services, so it's easier to implement #22694 and similar tickets. And so that access to the single onion options is consistent across the codebase. (If you want to change that, please open a separate ticket.)

Also, there's a more fundamental issue with this ticket:

When a relay that's on 0.3.1 or later activates padding at its end for a single onion service or Tor2web client that's on 0.3.0 or earlier, that client doesn't know about padding. So it can't turn the padding off when this option is set. I known this impacts the largest single onion service, which doesn't upgrade versions very often. I also suspect it will impact many Tor2web services on old Tor versions.

What happens if we can't turn off padding on some Tor2web and single onion services?
Do we need an overall padding kill switch for non-padding aware clients?

comment:29 Changed 2 years ago by mikeperry

Status: needs_revisionneeds_review

I added a fixup to use the rend_service accessor.

For the rest of it, we're a bit late in the game to decide that we want to have a new negotiation mechanism to disable one-ended padding. Furthermore, if the padding overhead starts crushing us even before all clients and relays have upgraded to use padding fully yet, we should just disable the entire netflow padding feature and figure out how to deal with that problem, rather than polluting the code with more complicated kill switches and negotiation methods based on guesses about what might cause problems, especially for issues that will only be a problem for a short period of time.

comment:30 in reply to:  29 Changed 2 years ago by teor

Status: needs_reviewneeds_revision

Replying to mikeperry:

I added a fixup to use the rend_service accessor.

Thanks!

There's a redundant check of get_options()->HiddenServiceSingleHopMode in channel_do_open_actions():

-        || (get_options()->HiddenServiceSingleHopMode &&
+        || (rend_service_allow_non_anonymous_connection(get_options()) &&
+            get_options()->HiddenServiceSingleHopMode &&

Once that's fixed, please flip the ticket to merge_ready.

For the rest of it, we're a bit late in the game to decide that we want to have a new negotiation mechanism to disable one-ended padding. Furthermore, if the padding overhead starts crushing us even before all clients and relays have upgraded to use padding fully yet, we should just disable the entire netflow padding feature and figure out how to deal with that problem, rather than polluting the code with more complicated kill switches and negotiation methods based on guesses about what might cause problems, especially for issues that will only be a problem for a short period of time.

Yes, that would require HSDirs and Intro Points and Rend Points to know when they are connected to a single onion service or Tor2web, which is possible (except for single onion -> HSDir, which is 3 hops). But that code hasn't been written yet (see #22688 and #22689). And then we'd need to add another kill switch for these cases.

If you don't think that's necessary, that's ok. But it could be a year or two before this code is running on most single onion services. And we might deprecate v2 hidden services and Tor2web before it's ever running on most Tor2web instances.

comment:31 Changed 2 years ago by mikeperry

Status: needs_revisionmerge_ready

Ugh, fixup pushed.

That's what I get for trying to do this in a hurry while in another meeting...

comment:32 Changed 2 years ago by dgoulet

Quick look at this (because it is relevant for prop224), this if condition is kind of *mad* ;). Can we put it in a function (or extract some of it) or at least break it down with more semantic? Anything that takes a while to parse and understand the flow of the condition is imo error prone overtime.

+    if (!get_options()->ConnectionPadding ||
+        (get_options()->Tor2webMode &&
+         !networkstatus_get_param(NULL,
+             CHANNELPADDING_TOR2WEB_PARAM,
+             CHANNELPADDING_TOR2WEB_DEFAULT, 0, 1))
+        || (rend_service_allow_non_anonymous_connection(get_options()) &&
+            !networkstatus_get_param(NULL,
+                CHANNELPADDING_SOS_PARAM,
+                CHANNELPADDING_SOS_DEFAULT, 0, 1))) {

comment:33 Changed 2 years ago by nickm

Status: merge_readyneeds_revision

comment:34 Changed 2 years ago by mikeperry

Status: needs_revisionmerge_ready

Ok, I broke apart that condition into separate if..else statements and commented.

comment:35 Changed 2 years ago by nickm

Keywords: review-group-20 added

Creating review-group-20

comment:36 Changed 2 years ago by dgoulet

Reviewer: dgoulet
Status: merge_readyneeds_revision

Wait. I have a question here before we merge this.

Why is channel_do_open_actions() looking at the consensus param by itself but yet we have a dedicated function that updates static value with the param values channelpadding_new_consensus_params() and then we have channelpadding_decide_to_pad_channel() that basically do the checks that are done in channel.c.

Could channel_do_open_actions() just use that decide function which would remove duplicated code and make channel ask the "padding subsystem" directly instead of taking decision on its own?

comment:37 in reply to:  36 Changed 2 years ago by mikeperry

Status: needs_revisionneeds_information

Replying to dgoulet:

Wait. I have a question here before we merge this.

Why is channel_do_open_actions() looking at the consensus param by itself but yet we have a dedicated function that updates static value with the param values channelpadding_new_consensus_params() and then we have channelpadding_decide_to_pad_channel() that basically do the checks that are done in channel.c.

Could channel_do_open_actions() just use that decide function which would remove duplicated code and make channel ask the "padding subsystem" directly instead of taking decision on its own?

Well, the decide function also will schedule and/or send padding, which we should not be doing upon open.. But, we could remove the channel.c checks entirely an rely only on the channelpadding_decide_to_pad_channel() logic getting called about a second after channel open.. There is some chance that this would mean that the relay side might send a padding packet or two before the client side decides to send a disable command in that case, but a single packet doesn't matter all that much.. If we'd rather keep channel_do_open_actions() short and fast for some reason, we can rip the checks out of it.

Let me know what you think. I don't care either way, so long as the bikeshed is magenta :).

I also need to defer testing any change to this code for a bit so I can test some other stuff for researchers before PETS, so if we want to change it again, it won't be my top priority.

comment:38 Changed 2 years ago by teor

I don't think we need to change this code, what do you think, dgoulet?

comment:39 Changed 2 years ago by dgoulet

Thanks Mike/teor, I understand better the difference.

channel_do_open_actions() still has too much code regarding padding. It has 4 if/else branches which 3 of them disable padding and one of them reduces it but all of them look at torrc options *specific* to padding. IMO, this is a "violation" of separation between subsystems and I'm no big fan­.

I would have like to see this big if/else in some sort of "channelpadding_consider_on_open()" or sth which would not clobber channel.c with padding stuff and then we can take care of code duplication much easily afterwards. And afaict, that padding code in channel_do_open_actions() is not tested by the unit test.

I understand the time constraint here, that "it is tested and any other changes delay++" so I'm fine for now but my concerns will at least be on the record ;).

Nick, it's your call now :).

comment:40 Changed 23 months ago by dgoulet

Status: needs_informationneeds_review

Back in needs_review, I think it's mostly nickm call at this point.

comment:41 Changed 23 months ago by nickm

Keywords: review-group-23 added

Put 0.3.1.x needs_review tickets into review-group-23

comment:42 Changed 23 months ago by dgoulet

Status: needs_reviewmerge_ready

Arf... that should be in nickm's court after my last comment. If you like it nickm, it's lgtm.

comment:43 Changed 22 months ago by nickm

Resolution: implemented
Status: merge_readyclosed

I think this is okay. I agree with dgoulet about refactoring being desirable, but IMO that doesn't have to be for 0.3.1.x. I am squashing and merging this in 0.3.1.x.

FWIW, this kind of thing is an example of why I believe that SoS and anonymous onion services shouldn't share a tor instance: it's not clear to me how we would want to handle padding in those cases.

Note: See TracTickets for help on using tickets.