Opened 3 weeks ago

Last modified 6 days ago

#31851 new task

Allow Tor to be compiled without support for relay mode

Reported by: teor
Owned by: teor
Priority: Medium
Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor
Version:
Severity: Normal
Keywords: tor-design, network-team-roadmap-october
Cc: nickm
Actual Points: 0.2
Parent ID:
Points: 5
Reviewer: nickm
Sponsor: Sponsor31-can

Description

Let's make some more optional modules.

Our target set of modules might include:

  • dirauth - the code only used by directory authorities (including bridge authorities)
  • dircache - the code only used by directory caches and directory authorities
  • relay - the code only used by relays and directory authorities
  • common - the code used by all roles

I'll do a design, and a proposed CI build strategy, and then get it reviewed.

Child Tickets

Ticket | Status | Owner | Summary | Component
#30860 | merge_ready | teor | Add a chutney job that runs on macOS, so that IPv6 chutney tests work | Core Tor/Tor

Change History (8)

comment:1 Changed 3 weeks ago by teor

Actual Points: 0.2
Status: assigned → needs_review

Here is my proposed design:

Existing:

  • --disable-module-dirauth
    • Build tor without the Directory Authority module: tor can not run as a directory authority.
    • Disables AuthoritativeDirectory (minimal)

Partially Implemented:

  • --disable-module-dirauth (continued)
    • Disables the *AuthoritativeDir* and MinUptimeHidServDirectoryV2 options
      • Maybe these options should move under Directory Authority Server Options in the man page
    • Disables all the options under Directory Authority Server Options

New:

  • --disable-module-dircache
    • Build tor without the Directory Cache module: tor can not run as a directory cache or authority. Implies --disable-module-dirauth.
    • Disables DirPort and DirCache (minimal)
    • Disables all the other options under Directory Server Options
    • Tor can't currently run as a dirauth without a DirPort, so we need the dircache to dirauth dependency
  • --disable-module-relay
    • Build tor without the Relay module: tor can not run as a relay, bridge, directory cache, or authority. Implies --disable-module-dircache and --disable-module-dirauth.
    • Disables ORPort and sets ClientOnly to 1 (minimal)
    • Disables all the other options under Server Options
    • Disables the --list-fingerprint, RelayBandwidth*, MaxAdvertisedBandwidth, PerConnBW*, and ServerTransportPlugin options
      • Maybe some of these options should move under Server Options in the man page
    • Tor can't currently run as a dircache without an ORPort, so we need the relay to dircache dependency
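The "implies" chain in this proposal (relay implies dircache, dircache implies dirauth) can be sketched as a small shell function. This is only an illustration of the proposed dependency rules, not Tor's actual configure logic; the function name `effective_disabled_modules` is hypothetical, and the flag names are the ones proposed above.

```shell
# Hypothetical helper: print the modules that end up disabled for a given
# set of proposed configure flags, applying the "implies" rules above.
# Variables are global because plain POSIX sh has no "local".
effective_disabled_modules() {
  dirauth=no; dircache=no; relay=no
  for flag in "$@"; do
    case "$flag" in
      --disable-module-dirauth)  dirauth=yes ;;
      --disable-module-dircache) dircache=yes ;;
      --disable-module-relay)    relay=yes ;;
    esac
  done
  # Disabling relay implies disabling dircache,
  # and disabling dircache implies disabling dirauth.
  if [ "$relay" = yes ]; then dircache=yes; fi
  if [ "$dircache" = yes ]; then dirauth=yes; fi
  out=""
  if [ "$dirauth" = yes ]; then out="$out dirauth"; fi
  if [ "$dircache" = yes ]; then out="$out dircache"; fi
  if [ "$relay" = yes ]; then out="$out relay"; fi
  # Strip the leading space before printing.
  printf '%s\n' "${out# }"
}
```

For example, `effective_disabled_modules --disable-module-dircache` lists both dirauth and dircache, because a directory authority currently needs a DirPort.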

Out of scope:

  • an onion service module
  • a module for code that is used by clients but not relays (for example, address reachability)
  • splitting bridge and non-bridge code

Here's the CI design:

  • make all the CI jobs explicit using include (rather than the mix of matrix and non-matrix jobs we have right now)
    • we might want to backport this change, so future CI backports are easier
  • delete one of each pair of similar jobs with no options, rust, distcheck, and module-dirauth
  • add dircache and relay jobs

Here's how I want to proceed:

  1. Make all CI jobs explicit, delete similar jobs, and backport
  2. Implement minimal --disable-module-dircache which disables DirPort and DirCache, with CI, but don't disable any code
  3. Implement minimal --disable-module-relay which disables ORPort and sets ClientOnly 1, with CI, but don't disable any code
  4. For each source code module, decide if it can be disabled when relay, dircache, or dirauth is disabled, and implement that change
  5. Depending on the config or control refactors, also disable the config or control for those modules

Should we avoid confusion by calling these options --disable-relay-mode?

Last edited 3 weeks ago by teor

comment:2 Changed 3 weeks ago by nickm

Status: needs_review → new

I think that this proposal is reasonable. Let's make a new ticket or set of tickets for disabling relay/cache stuff specifically, or change this ticket to be about relay/cache stuff specifically.

I think that it is reasonable to have the ability to disable dircache and relay code inside the codebase, but I do wonder whether it makes sense to expose them separately in the configuration. Right now, there's not really any way to be a useful dircache without being a relay, and we assume that nearly any relay is potentially a dircache. Maybe having a --disable-relay-mode makes sense here.

These modules are intended to be relay-only:

  • feature/relay

These modules are intended to be dircache-only:

  • feature/dircache

These modules are _mostly_ relay-only:

  • feature/hibernate
  • feature/stats

These modules have significant parts that are relay-only:

  • feature/hs
  • feature/hs_common
  • feature/rend
  • core/crypto
  • lib/compress
  • core/mainloop
  • core/or

(In general, if a module is _mostly_ relay only, we should split it into two modules, one of which is completely disabled and one of which is not.)

I also expect we will find parts of src/core and src/app that are relay-specific.


In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.

Last edited 3 weeks ago by nickm

comment:3 in reply to: 2; Changed 3 weeks ago by teor

Summary: Implement some more optional modules in Tor → Allow Tor to be compiled without support for relay mode

Replying to nickm:

I think that this proposal is reasonable. Let's make a new ticket or set of tickets for disabling relay/cache stuff specifically, or change this ticket to be about relay/cache stuff specifically.

I've made this ticket the parent ticket for the work.

I think that it is reasonable to have the ability to disable dircache and relay code inside the codebase, but I do wonder whether it makes sense to expose them separately in the configuration. Right now, there's not really any way to be a useful dircache without being a relay, and we assume that nearly any relay is potentially a dircache. Maybe having a --disable-relay-mode makes sense here.

I think it's helpful to expose one new option. Adding only one new option also simplifies our CI and our testing matrix. And I'd like to make the name consistent with the existing dirauth module.

Here are the new descriptions:

  • --disable-dirauth-mode
    • hidden alias --disable-module-dirauth
    • Build tor with authority mode disabled: tor can not run as a directory authority or bridge authority.
  • --disable-relay-mode
    • Build tor with relay mode disabled: tor can not run as a relay, bridge, or authority. Implies --disable-dirauth-mode.

I'm not sure if it's worth keeping the relay/dircache distinction when we name the split modules. In any case, we shouldn't distinguish between them when compiling (for example, with separate relay and dircache macros), because we are not going to test that distinction.

These modules are intended to be relay-only:

  • feature/relay

These modules are intended to be dircache-only:

  • feature/dircache

These modules are _mostly_ relay-only:

  • feature/hibernate

What happens if you try to use AccountingMax on a client or onion service?
If it works, maybe we should keep it enabled on clients for now.

  • feature/stats

The relay stats can go in stats_relay.
DirReqStatistics is dircache-only, so we could put it in stats_dircache.

These modules have significant parts that are relay-only:

  • feature/hs
  • feature/hs_common
  • feature/rend

I think we'll end up extracting hs_relay and rend_relay. Maybe also hs_dircache and rend_dircache.

  • lib/compress

I think we'll end up extracting compress_dircache.

  • core/crypto
  • core/mainloop
  • core/or

I think we'll end up extracting crypto_relay, mainloop_relay, and or_relay. Maybe also the corresponding dircache modules.

(In general, if a module is _mostly_ relay only, we should split it into two modules, one of which is completely disabled and one of which is not.)

I also expect we will find parts of src/core and src/app that are relay-specific.

Yes, we'll need to go through all the code :-)

In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.

Is there a quick guide or picture of our higher-level and lower-level modules?

How should we prioritise the modules?

  1. Quick wins
  2. Easy stubs for lower-level modules
  3. Stubs and remove low-high dependencies for higher-level modules

comment:4 in reply to: 3; Changed 3 weeks ago by nickm

Replying to teor:

In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.

Is there a quick guide or picture of our higher-level and lower-level modules?

There is a little of this in CodeStructure.md, but it needs to be more explicit. I hope we'll get to this with the tor-guts revisions.

The layers are, from higher to lower level:

App -> Feature -> Core -> Lib

The Lib layer is well-factored internally. The core, feature, and app layers are still not.
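Given this ordering, one rough way to spot inverted dependencies is to look for files in a lower layer that include headers from a higher one. The sketch below is not an existing Tor script: the function name `inverted_includes` is hypothetical, and it assumes the layers live in src/app, src/feature, src/core, and src/lib, with headers included by their path relative to src/.

```shell
# Hypothetical helper: list includes that point from a lower layer up to a
# higher one, given the ordering App -> Feature -> Core -> Lib.
# Usage: inverted_includes [SRC_DIR]   (SRC_DIR defaults to "src")
inverted_includes() {
  src=${1:-src}
  # Each pair is "lower-layer:alternation of higher layers it should not include".
  for pair in 'lib:app|feature|core' 'core:app|feature' 'feature:app'; do
    lower=${pair%%:*}
    higher=${pair#*:}
    if [ -d "$src/$lower" ]; then
      # Prints file:line:text for every upward include found.
      grep -rnE "#include \"($higher)/" "$src/$lower"
    fi
  done
  # grep exits non-zero when nothing matches; that is not an error here.
  return 0
}
```

Each line of output is a candidate for the "remove the inverted dependency" treatment rather than a stub.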

How should we prioritise the modules?

  1. Quick wins
  2. Easy stubs for lower-level modules
  3. Stubs and remove low-high dependencies for higher-level modules

I think "quick wins" is a good place to start. If we can go lower-level towards higher-level, we will be glad we did, but we won't always have the option. I also think that it might be a good idea to start with "feature/dircache" and "feature/relay" themselves, and move outwards from there.

When I approached this kind of refactoring with lib, I found that it was often hard to anticipate what would be hard to extract and what would be easy to extract, so I had a lot of false starts where I started on one approach, and then binned it in favor of another. I think a similar exploratory approach might be helpful to me here.

comment:5 Changed 3 weeks ago by teor

If we can go lower-level towards higher-level, we will be glad we did, but we won't always have the option.

How can I discover the modules (or files) that are only used by a particular set of modules, and no other modules?

For example, I want to answer questions like this:

  • which modules are only used by src/feature/dirauth?
    • are they disabled when src/feature/dirauth is disabled? Or do I need to add them to --disable-module-dirauth?
  • which header files are only used by dirauth, dircache, and relay?
    • are they disabled when relay is disabled?

I think practracker/includes.py toposort almost does this task. But it doesn't have a concept of "exclusively used by this set of modules".

comment:6 Changed 3 weeks ago by nickm

How can I discover the modules (or files) that are only used by a particular set of modules, and no other modules?

I usually use shell pipelines for this. For example, to find all the headers that are included from dirauth, I would do:

# List every header included from dirauth, one per line (surrounding quotes kept)
git grep '#include "' src/feature/dirauth/ | awk '{print $2}' | sort | uniq > HEADERS

Then to find out which ones are not used elsewhere, I would do something like

# For each header, count the files outside dirauth and the tests that include it
for h in $(cat HEADERS); do
  printf '%s ' "$h"; git grep -l "#include $h" | grep -v feature/dirauth | grep -v src/test | wc -l
done | sort -n -k2

That could probably be cleaner, but you get the idea. We can and should make scripts for this once we have more experience with exactly which tests are useful.
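As a rough sketch of what such a script might look like, the function below generalizes the pipeline to a set of modules and prints only the headers that nothing else includes. It is an assumption-laden illustration, not an existing Tor tool: the name `exclusive_headers` is hypothetical, it uses plain `grep -r` instead of `git grep` so that it also runs outside a git checkout, and it assumes tests live under SRC_DIR/test.

```shell
# Hypothetical helper: print headers that are included only from the given
# module directories (tests are ignored), one per line, quotes kept.
# Usage: exclusive_headers SRC_DIR MODULE_DIR...
#   e.g. exclusive_headers src feature/dirauth feature/dircache
# SRC_DIR must be given without a trailing slash.
exclusive_headers() {
  src=$1; shift
  # Collect every header included from any of the listed modules.
  for mod in "$@"; do
    grep -rh '#include "' "$src/$mod"
  done | awk '{print $2}' | sort -u |
  while read -r h; do
    # Files under SRC_DIR that include this header, excluding tests...
    users=$(grep -rlF "#include $h" "$src" | grep -v "^$src/test/" || true)
    # ...and excluding files that belong to the listed modules.
    for mod in "$@"; do
      users=$(printf '%s\n' "$users" | grep -v "^$src/$mod/" || true)
    done
    # No remaining users: the header is exclusive to the listed modules.
    if [ -z "$users" ]; then printf '%s\n' "$h"; fi
  done
}
```

Passing several module directories answers the "only used by dirauth, dircache, and relay" form of the question; passing one reproduces the dirauth-only case from the pipeline above.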

comment:7 Changed 13 days ago by teor

Keywords: network-team-roadmap-? added

comment:8 Changed 6 days ago by gaba

Keywords: network-team-roadmap-october added; network-team-roadmap-? removed