Opened 11 months ago

Last modified 2 months ago

#31851 new task

Allow Tor to be compiled without support for relay mode

Reported by: teor Owned by:
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-design, network-team-roadmap-2020Q1, 043-deferred, 044-deferred
Cc: nickm Actual Points: 0.2
Parent ID: Points: 5
Reviewer: nickm Sponsor:

Description

Let's make some more optional modules.

Our target set of modules might include:

  • dirauth - the code only used by directory authorities (including bridge authorities)
  • dircache - the code only used by directory caches and directory authorities
  • relay - the code only used by relays and directory authorities
  • common - the code used by all roles

I'll do a design, and a proposed CI build strategy, and then get it reviewed.

Child Tickets

TicketStatusOwnerSummaryComponent
#30860closedteorAdd a chutney job that runs on macOS, so that IPv6 chutney tests workCore Tor/Tor
#32123closedteorImplement minimal --disable-relay-modeCore Tor/Tor
#32137closednickmSplit {feature,core,app}/*/include.am out of core/include.amCore Tor/Tor
#32139acceptednickmDisable all dirauth options when those modules are disabledCore Tor/Tor
#32162newMake router.c relay-onlyCore Tor/Tor
#32163closedteorStop using HAVE_MODULE_{DIRAUTH,RELAY} inside functionsCore Tor/Tor
#32213closedteorPhase 0: Disable minimal dirauth and relay options when those modules are disabledCore Tor/Tor
#32244closedteorPhase 0: Disable the relay_periodic.c entry point in --disable-module-relayCore Tor/Tor
#32245closedteorPhase 0: Disable the relay_sys.c entry point in --disable-module-relayCore Tor/Tor
#32395newDisable all relay options when those modules are disabledCore Tor/Tor
#32487closednickmPhase 1: Stop compiling "acting as a directory cache" in --disable-module-relayCore Tor/Tor
#32488newPhase 1: Stop compiling "Responding to CREATE and EXTEND cells" in --disable-module-relayCore Tor/Tor
#32489newPhase 1: Stop compiling "Responding to BEGIN cells" in --disable-module-relayCore Tor/Tor
#32490newPhase 1: Stop compiling "Listening for OR and Dir connections " in --disable-module-relayCore Tor/Tor
#33366closednickmDisable dns.c when relay mode is disabledCore Tor/Tor
#33368closednickmDon't compile ext_orport.c when relay mode is disabled.Core Tor/Tor
#33370closednickmDon't build selftest.c when relay mode is disabledCore Tor/Tor
#33389closednickmDisable routerkeys.c and part of connection_or.c when building without relay mode.Core Tor/Tor
#33414needs_revisionnickmSplit router.c, and disable it (mostly) when building without relay support.Core Tor/Tor

Change History (19)

comment:1 Changed 11 months ago by teor

Actual Points: 0.2
Status: assignedneeds_review

Here is my proposed design:

Existing:

  • --disable-module-dirauth
    • Build tor without the Directory Authority module: tor can not run as a directory authority.
    • Disables AuthoritativeDirectory (minimal)

Partially Implemented:

  • --disable-module-dirauth (continued)
    • Disables *AuthoritativeDir*, and MinUptimeHidServDirectoryV2 options
      • Maybe these options should move under Directory Authority Server Options in the man page
    • Disables all the options under Directory Authority Server Options

New:

  • --disable-module-dircache
    • Build tor without the Directory Cache module: tor can not run as a directory cache or authority. Implies --disable-module-dirauth.
    • Disables DirPort and DirCache (minimal)
    • Disables all the other options under Directory Server Options
    • Tor can't currently run as a dirauth without a DirPort, so we need the dircache to dirauth dependency
  • --disable-module-relay
    • Build tor without the Relay module: tor can not run as a relay, bridge, directory cache, or authority. Implies --disable-module-dircache and --disable-module-dirauth.
    • Disables ORPort and sets ClientOnly to 1 (minimal)
    • Disables all the other options under Server Options
    • Disables the --list-fingerprint, RelayBandwidth*, MaxAdvertisedBandwidth, PerConnBW*, and ServerTransportPlugin options
      • Maybe some of these options should move under Server Options in the man page
    • Tor can't currently run as a dircache without an ORPort, so we need the relay to dircache dependency

Out of scope:

  • an onion service module
  • a module for code that is used by clients but not relays (for example, address reachability)
  • splitting bridge and non-bridge code

Here's the CI design:

  • make all the CI jobs explicit using include (rather than the mix of matrix and non-matrix jobs we have right now)
    • we might want to backport this change, so future CI backports are easier
  • delete one of each pair of similar jobs with no options, rust, distcheck, and module-dirauth
  • add dircache and relay jobs

Here's how I want to proceed:

  1. Make all CI jobs explicit, delete similar jobs, and backport
  2. Implement minimal --disable-module-dircache which disables DirPort and DirCache, with CI, but don't disable any code
  3. Implement minimal --disable-module-relay which disables ORPort and sets ClientOnly 1, with CI, but don't disable any code
  4. For each source code module, decide if it can be disabled when relay, dircache, or dirauth is disabled, and implement that change
  5. Depending on the config or control refactors, also disable the config or control for those modules

Should we avoid confusion by calling these options --disable-relay-mode ?

Last edited 11 months ago by teor (previous) (diff)

comment:2 Changed 11 months ago by nickm

Status: needs_reviewnew

I think that this proposal is reasonable. Let's make a new ticket or set of tickets for disabling relay/cache stuff specifically, or change this ticket to be about relay/cache stuff specifically.

I think that it is reasonable to have the ability to disable dircache and relay code inside the codebase, but I do wonder whether it makes sense to expose them separately in the configuration. Right now, there's not really any way to be a useful dircache without being a relay, and we assume that nearly any relay is potentially a dircache. Maybe having a --disable-relay-mode makes sense here.

These modules are intended to be relay-only:

  • feature/relay

These modules are intended to be dircache-only:

  • feature/dircache

These modules are _mostly_ relay-only:

  • feature/hibernate
  • feature/stats

These modules have significant parts that are relay-only:

  • feature/hs
  • feature/hs_common
  • feature/rend
  • core/crypto
  • lib/compress
  • core/mainloop
  • core/or

(In general, if a module is _mostly_ relay only, we should split it into two modules, one of which is completely disabled and one of which is not.)

I also expect will also find parts of src/core and src/app that are relay-specific.


In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.

Last edited 11 months ago by nickm (previous) (diff)

comment:3 in reply to:  2 ; Changed 11 months ago by teor

Summary: Implement some more optional modules in TorAllow Tor to be compiled without support for relay mode

Replying to nickm:

I think that this proposal is reasonable. Let's make a new ticket or set of tickets for disabling relay/cache stuff specifically, or change this ticket to be about relay/cache stuff specifically.

I've made this ticket the parent ticket for the work.

I think that it is reasonable to have the ability to disable dircache and relay code inside the codebase, but I do wonder whether it makes sense to expose them separately in the configuration. Right now, there's not really any way to be a useful dircache without being a relay, and we assume that nearly any relay is potentially a dircache. Maybe having a --disable-relay-mode makes sense here.

I think it's helpful to expose one new option. Adding only one new option also simplifies our CI and our testing matrix. And I'd like to make the name consistent with the existing dirauth module.

Here are the new descriptions:

  • --disable-dirauth-mode
    • hidden alias --disable-module-dirauth
    • Build tor with authority mode disabled: tor can not run as a directory authority or bridge authority.
  • --disable-relay-mode
    • Build tor with relay mode disabled: tor can not run as a relay, bridge, or authority. Implies --disable-dirauth-mode.

I'm not sure if it's worth keeping the relay/dircache distinction when we name the split modules. In any case, we shouldn't distinguish between them when compiling (for example, with separate relay and dircache macros). Because we are not going to test that distinction.

These modules are intended to be relay-only:

  • feature/relay

These modules are intended to be dircache-only:

  • feature/dircache

These modules are _mostly_ relay-only:

  • feature/hibernate

What happens if you try to use AccountingMax on a client or onion service?
If it works, maybe we should keep it enabled on clients for now.

  • feature/stats

The relay stats can go in stats_relay.
DirReqStatistics is dircache-only, we could put it in stats_dircache.

These modules have significant parts that are relay-only:

  • feature/hs
  • feature/hs_common
  • feature/rend

I think we'll end up extracting hs_relay and rend_relay. Maybe also hs_dircache and rend_dircache.

  • lib/compress

I think we'll end up extracting compress_dircache.

  • core/crypto
  • core/mainloop
  • core/or

I think we'll end up extracting crypto_relay, mainloop_relay, and or_relay. Maybe also the corresponding dircache modules.

(In general, if a module is _mostly_ relay only, we should split it into two modules, one of which is completely disabled and one of which is not.)

I also expect will also find parts of src/core and src/app that are relay-specific.

Yes, we'll need to go through all the code :-)

In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.

Is there a quick guide or picture of our higher-level and lower-level modules?

How should we prioritise the modules?

  1. Quick wins
  2. Easy stubs for lower-level modules
  3. Stubs and remove low-high dependencies for higher-level modules

comment:4 in reply to:  3 Changed 11 months ago by nickm

Replying to teor:

In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.

Is there a quick guide or picture of our higher-level and lower-level modules?

There is a little of this in CodeStructure.md , but it needs to be more explicit. I hope we'll get to this with the tor-guts revisions.

The layers are, from higher to lower level:

App -> Feature -> Core -> Lib

The Lib layer is well-factored internally. The core, feature, and app layers are still not.

How should we prioritise the modules?

  1. Quick wins
  2. Easy stubs for lower-level modules
  3. Stubs and remove low-high dependencies for higher-level modules

I think "quick wins" is a good place to start. If we can go lower-level towards higher-level, we will be glad we did, but we won't always have the option. I also think that it might be a good idea to start with "feature/dircache" and "feature/relay" themselves, and move outwards from there.

When I approached this kind of refactoring with lib, I found that it was often hard to anticipate what would be hard to extract and what would be easy to extract, so I had a lot of false starts where I started on one approach, and then binned it in favor of another. I think a similar exploratory approach might be helpful to me here.

comment:5 Changed 11 months ago by teor

If we can go lower-level towards higher-level, we will be glad we did, but we won't always have the option.

How can I discover the modules (or files) that are only used by a particular set of modules, and no other modules?

For example, I want to answer questions like this:

  • which modules are only used by src/feature/dirauth?
    • are they disabled when src/feature/dirauth is disabled? Or do I need to add them to --disable-module-dirauth?
  • which header files are only used by dirauth, dircache, and relay?
    • are they disabled when relay is disabled?

I think practracker/includes.py toposort almost does this task. But it doesn't have a concept of "exclusively used by this set of modules".

comment:6 Changed 11 months ago by nickm

How can I discover the modules (or files) that are only used by a particular set of modules, and no other modules?

I usually use shell pipelines for this. For example, to find all the headers that are included from dirauth, I would do:

git grep '#include "' src/feature/dirauth/ | awk '{print $2}' |sort  |uniq > HEADERS

Then to find out which ones are not used elsewhere, I would do something like

for h in $(cat headers); do 
  printf "$h "; git grep -l "#include $h" |grep -v feature/dirauth |grep -v src/test | wc -l
done |sort -n -k2

That could probably be cleaner, but you get the idea. We can and should make scripts for this once we have more experience with exactly which tests are useful.

comment:7 Changed 11 months ago by teor

Keywords: network-team-roadmap-? added

comment:8 Changed 10 months ago by gaba

Keywords: network-team-roadmap-october added; network-team-roadmap-? removed

comment:9 Changed 10 months ago by teor

We want to write a list of code-chunks to disable, in rough order of ease-of-implementation.

For each code chunk, we want to:

  • create stronger code boundaries
  • make those boundaries clear in code naming/layout

We should do these steps:

  1. Move code into a directory called *_relay, *_dircache, or *_dirauth
  2. Move code between files as needed
  3. Remove upwards dependencies, and other unnecessary dependencies
  4. Refactor code, so code boundaries are clearer:
    • split functions
    • rename functions
  5. Make any other changes that improve modularity

comment:10 Changed 10 months ago by nickm

Here's my attempt to list the parts of the relay code that could be modularized, in rough order of operation. Within the phases, things could be done in any order. I've tried to handle things on a roughly "top down" basis, removing each thing before removing the things that it depends on.

PHASE 0.

  1. The relay_periodic.c entry point.
  2. The relay_sys.c entry point.

PHASE 1.

  1. Acting as a directory cache.
  2. Responding to CREATE and EXTEND cells
  3. Responding to BEGIN cells
  4. Listening for OR connections
  5. Accounting
  6. Generating and uploading descriptors.
  7. Self-testing
  8. Responding to introduce/establish_intro/establish_rend cells.

PHASE 2.

  1. Server-side DNS
  2. Key management.
  3. Statistics backend code.
  4. TLS responder code.
  5. Pluggable transport code (particularly src/feature/client/transports.c)

PHASE 3.

  1. Whatever is left.

At each stage, we should work to minimize layer-violations: there should generally not be calls from src/core/ into relay-specific code, and we should plan to refactor as needed to minimize them. We can reduce layer violations in parallel with the above.

Last edited 6 months ago by teor (previous) (diff)

comment:11 Changed 10 months ago by teor

I opened #32244 and #32245 for Phase 0, and also labelled #32213 as Phase 0.

Edit: Phase 0

Last edited 9 months ago by teor (previous) (diff)

comment:12 Changed 9 months ago by teor

I opened 4 child tickets for the first 4 steps of Phase 1.

Unlike phase 0, I think we may be able to disable some core, ext?, lib, and trunnel code, along with these features.

comment:13 Changed 8 months ago by teor

Owner: teor deleted
Status: newassigned

This is a top-level ticket, and Sponsor 31 is over.
We should assign child tickets as we decide to work on them.

comment:14 Changed 8 months ago by gaba

Keywords: network-team-roadmap-2020Q1 added; network-team-roadmap-october removed

comment:15 Changed 7 months ago by nickm

Keywords: 043-deferred added

All 0.4.3.x tickets without 043-must, 043-should, or 043-can are about to be deferred.

comment:16 Changed 7 months ago by nickm

Milestone: Tor: 0.4.3.x-finalTor: 0.4.4.x-final

comment:17 Changed 7 months ago by gaba

Sponsor: Sponsor31-can

No more sponsor 31. All this tickets remained open after sponsor 31 ended.

comment:18 Changed 6 months ago by teor

Status: assignednew

Change tickets that are assigned to nobody to "new".

comment:19 Changed 2 months ago by nickm

Keywords: 044-deferred added
Milestone: Tor: 0.4.4.x-finalTor: unspecified

Bulk-remove tickets from 0.4.4. Add the 044-deferred label to them.

Note: See TracTickets for help on using tickets.