Opened 3 years ago

Closed 3 years ago

#20262 closed defect (wontfix)

Onion services startup time always gets revealed

Reported by: twim Owned by: twim
Priority: Medium Milestone: Tor: 0.3.0.x-final
Component: Core Tor/Tor Version:
Severity: Major Keywords: tor-hs, research, review-group-11
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor: SponsorR-can

Description

Due to dead code in rend_consider_services_upload() startup time of onion services always gets revealed.

If service descriptor is not uploaded yet we add random delay from [rendinitialpostdelay;rendinitialpostdelay+rand(2*rendpostperiod)] ( [30s;30s+2h] ):

if (!service->next_upload_time) {
  service->next_upload_time =
        now + rendinitialpostdelay + crypto_rand_int(2*rendpostperiod);

But this delay is useless when we're checking whether we should upload:

    if (intro_points_ready &&
        (service->next_upload_time < now ||
        (service->desc_is_dirty &&
         service->desc_is_dirty < now-rendinitialpostdelay))) {
      /* Upload */

Because descriptor is dirty for never-yet-uploaded services it always gets uploaded after being stable for rendinitialpostdelay seconds. next_upload_time is further in future than rendinitialpostdelay stabilization stuff.
So it goes.

I made a patch to unbork this function to work properly.
But it raised a problem. We got used to expect that descriptors are going to be uploaded pretty soon. But 'now' they will be uploaded with delay up to 2h. That's not okay. Should we make a torrc option like RevealOnionServiceStartupTime that defaults to 1?

Child Tickets

Attachments (1)

Unborked-rend_consider_services_upload.patch (3.9 KB) - added by twim 3 years ago.

Download all attachments as: .zip

Change History (20)

comment:1 Changed 3 years ago by twim

Component: - Select a componentCore Tor/Tor
Severity: NormalMajor
Status: newneeds_review

comment:2 Changed 3 years ago by asn

Keywords: tor-hs research added
Milestone: Tor: 0.3.0.x-final
Sponsor: SponsorR-can

I think the plan here is to remove the delay for now, since the current 30s delay is useless. So having no delay is the same as the current situation.

If in the future we want to add a random delay for security purposes, someone should write a proposal with security analysis.

comment:3 Changed 3 years ago by cypherpunks

Why do you say that is the plan? Who decided it? Enough is enough of trying to launder orders from bloodthirsty external US military people as rational community decisions.

comment:4 in reply to:  3 Changed 3 years ago by asn

Replying to cypherpunks:

Why do you say that is the plan? Who decided it? Enough is enough of trying to launder orders from bloodthirsty external US military people as rational community decisions.

Hm, this escalated quickly here.

When I say this is the plan, I meant that this is my opinion, and it also aligns with the opinion of other devs when we discussed it in Seattle.

In a few words, I think that a deterministic 30s delay offers absolutely nothing in terms of security, and it actually harms reachability for some use cases (e.g. onionshare). So I think reducing it from deterministic 30s to deterministic 0s changes nothing in terms of security, and actually improves Tor performance.

Now, if we think that probablistic delays actually offer something in terms of security, I'd like someone to do a proper security analysis of what they offer, and how long the delays will be.

You disagree? Please make your point but also include a convincing security analysis. Personally, I don't take orders from external US military people, and who knows this might even include yourself.

comment:5 in reply to:  2 ; Changed 3 years ago by twim

Replying to asn:

I think the plan here is to remove the delay for now, since the current 30s delay is useless. So having no delay is the same as the current situation.

Yes, we do want to remove this delay (#20082 and tor-dev@ thread). But there is code that claims that it works certain way while it doesn't. This ticket is to make it work as it should and supposed to.
So we have to either drop thus dead code or resurrect it (my patch). Resurrecting will break our lovely assumptions about how descriptors gets published.

If in the future we want to add a random delay for security purposes, someone should write a proposal with security analysis.

See ticket description about saving this behavior as torrc option.

comment:6 in reply to:  5 ; Changed 3 years ago by teor

Replying to twim:

Replying to asn:

If in the future we want to add a random delay for security purposes, someone should write a proposal with security analysis.

See ticket description about saving this behavior as torrc option.

A torrc option like this will be confusing if merged into the current master, because it contains a function rend_service_reveal_startup_time() which defaults to 0.

comment:7 in reply to:  6 Changed 3 years ago by twim

Replying to teor:

A torrc option like this will be confusing if merged into the current master, because it contains a function rend_service_reveal_startup_time() which defaults to 0.

Right. It's because it does

return rend_service_non_anonymous_mode_enabled(options);

When introducing this torrc option this func should be changed to something like

return options->RevealOnionServiceStartupTime ||
       rend_service_non_anonymous_mode_enabled(options);

comment:8 Changed 3 years ago by teor

(I added a note to #20082 about removing that function, because that patch makes it unused. We can always add it back later.)

comment:9 Changed 3 years ago by twim

I think that because of #20082 and that startuptime should be revealed by default (that's what users expect), it's reasonable to introduce ObfuscateOnionServiceStartupTime instead that defaults to 0. Thus it would not conflict with Single Onions and SOS operators can obfuscate startup time also (maybe pointless).
If implemented, it should be built on top of #20082 (without lower bound for initial delay) just with crypto_rand_int(2*rendpostperiod) delay instead of immediate first upload.

comment:10 Changed 3 years ago by teor

The original commit was intended to reduce linkability between hidden service instances on the same tor instance, rather than hide the startup time itself:

commit 4b76fe8
Author: Roger Dingledine <arma@torproject.org>
Date:   Mon Nov 15 09:05:54 2004 +0000
....

    Also make sure the hidden service descriptors are at a random
    offset from each other, to hinder linkability.

comment:11 Changed 3 years ago by twim

Linkability hindering is more relevant to #20082.
One can find code for this ticket in obfuscate-startup-time branch at https://github.com/nogoegst/tor or https://gitlab.com/nogoegst/tor. It's being rebased against branch from #20082.

Last edited 3 years ago by twim (previous) (diff)

comment:12 Changed 3 years ago by nickm

Keywords: review-group-11 added

comment:13 Changed 3 years ago by nickm

Owner: set to twim
Status: needs_reviewassigned

setting owner

comment:14 Changed 3 years ago by nickm

Status: assignedneeds_review

comment:15 Changed 3 years ago by asn

Initial review:

  • I handle dozens of trac tickets every week. The fact that #20262 and #20082 change the exact same part of the codebase for different reasons and are being developed at the same time, confuses me _a lot_ since I can't remember what's the rationale for each change and which ticket I should consult.

I think these two tickets should be merged into a single "Revisit onion service startup time logic" ticket and a single branch that rules them all should be created.

  • If the only point here is to reduce linkability between hidden service instances on the same tor instance why do we do it for hosts that only have a single onion service?

Also, why not do it for ephemeral services then as well? Isn't that a threat there?

And talking about threats, if we don't insert a delay, who is the attacker here? Who can actually link the multiple HS instances? Their descriptors will go to different HSDirs most probably, so it's probably not the HSDIr. Who is the attacker? I still have not seen a proper security analysis here...

  • I'm still sad that this branch does not simplify that part of the code. I was assuming that the result of these two tickets would simplify the code there and potentially split it into multiple tidy functions. Instead it seems like we add more inline spaggheti code and torrc options. I'm inclined to not accept any code here that does not make that function easy to follow just by skimming it. Also let's add some tests.
  • The changes file mentions that Now there is a random delay up to 5m for each on-disk service, but in the code I see +#define DISKSERVICE_SHUFFLING_PERIOD_TESTING (5*60). Isn't that only applied for testing network?
  • What's the point of the torrc option? It's not documented, and what does it even do?Who will ever toggle that thing on? Please let's reduce the torrc option bloat.

I'm looking at the code here:

if (!service->last_upload_time) {
     /* We do obfuscate onion services startup time. Set random delay */
     /* before the first upload. */
      if (options->ObfuscateOnionStartupTime) {
        service->next_upload_time +=
                           (time_t) crypto_rand_int(2*options->RendPostPeriod);
      } else {
        /* Non-ephemeral services are started at the same time that links */
        /* them and thus reveals that they are operated by same entity. */
        /* Randomizing initial delay for each of these services. */
        if (!is_ephemeral) {
          service->next_upload_time +=
            (time_t) crypto_rand_int(diskservice_shuffling_period);
        }
      }
    }

and the flow is really confusing to me, and I also don't understand why these things happen. Why do we use diskservice_shuffling_period) in one point and 2*options->RendPostPeriod on the other point.

  • Nitpicking: What's the point of +#define DISKSERVICE_SHUFFLING_PERIOD (0)? Seems to me it adds more code and complexity without increasing readability.
  • Nitpicking: There is no nesting or brackets in the else clause, which makes that block of code very hard to read.
  • This branch seems to be editing the changelog of #20082?? Please let's have a single branch for this whole project.

All in all, I feel like the security rationale for this bugfix is sprinkled over two different trac tickets and a mailing list thread, and this makes it very hard to understand why the changes are as is. Also the code patches seem more complex than they need to be and a bit hard to follow.

comment:16 Changed 3 years ago by asn

Status: needs_reviewneeds_revision

comment:17 in reply to:  15 Changed 3 years ago by twim

Replying to asn:

I handle dozens of trac tickets every week. The fact that #20262 and #20082 change the exact same part of the codebase for different reasons and are being developed at the same time, confuses me _a lot_ since I can't remember what's the rationale for each change and which ticket I should consult.
All in all, I feel like the security rationale for this bugfix is sprinkled over two different trac tickets and a mailing list thread, and this makes it very hard to understand why the changes are as is. Also the code patches seem more complex than they need to be and a bit hard to follow.

Sorry for leaving this ticket as needs_review because it supposed to be based on the code after fixing rend_consider_services_upload() code in #20082. This one and #20082 were closely related in the beginning but after it all bifurcated due to the fact that there are two purposes of initial post delay:

  • (part of #20082) Unlink multiple onion services running on the same tor instance by not uploading all the descriptors at the same time (following spec)
  • (#20262) Obfuscate/hide exact startup time in [0:2*RendPostPeriod]. It should be an optional feature. That's why there is torrc option here.

And talking about threats, if we don't insert a delay, who is the attacker here? Who can actually link the multiple HS instances? Their descriptors will go to different HSDirs most probably, so it's probably not the HSDIr. Who is the attacker? I still have not seen a proper security analysis here...

It's the purpose of #20082 and this discussion should be done there (see above).

Sorry for the mess. :\

comment:18 Changed 3 years ago by asn

Hey twim, how about we close this ticket and work on #20082 for things related to HS startup time from now on?

comment:19 Changed 3 years ago by twim

Resolution: wontfix
Status: needs_revisionclosed

Okay, it seems to be that we don't starve for this option. And for now it just would make everything bloated.
We can reopen it in case we need it afterwards.

Note: See TracTickets for help on using tickets.