Build tor without the Directory Authority module: tor can not run as a directory authority.
Disables AuthoritativeDirectory (minimal)
Partially Implemented:
--disable-module-dirauth (continued)
Disables the AuthoritativeDir and MinUptimeHidServDirectoryV2 options
Maybe these options should move under Directory Authority Server Options in the man page
Disables all the options under Directory Authority Server Options
New:
--disable-module-dircache
Build tor without the Directory Cache module: tor can not run as a directory cache or authority. Implies --disable-module-dirauth.
Disables DirPort and DirCache (minimal)
Disables all the other options under Directory Server Options
Tor can't currently run as a dirauth without a DirPort, so we need the dircache to dirauth dependency
--disable-module-relay
Build tor without the Relay module: tor can not run as a relay, bridge, directory cache, or authority. Implies --disable-module-dircache and --disable-module-dirauth.
Disables ORPort and sets ClientOnly to 1 (minimal)
Disables all the other options under Server Options
Disables the --list-fingerprint, RelayBandwidth*, MaxAdvertisedBandwidth, PerConnBW*, and ServerTransportPlugin options
Maybe some of these options should move under Server Options in the man page
Tor can't currently run as a dircache without an ORPort, so we need the relay to dircache dependency
Out of scope:
an onion service module
a module for code that is used by clients but not relays (for example, address reachability)
splitting bridge and non-bridge code
Here's the CI design:
make all the CI jobs explicit using include (rather than the mix of matrix and non-matrix jobs we have right now)
we might want to backport this change, so future CI backports are easier
delete one of each pair of similar jobs with no options, rust, distcheck, and module-dirauth
add dircache and relay jobs
Here's how I want to proceed:
0. Make all CI jobs explicit, delete similar jobs, and backport
1. Implement minimal --disable-module-dircache which disables DirPort and DirCache, with CI, but don't disable any code
2. Implement minimal --disable-module-relay which disables ORPort and sets ClientOnly 1, with CI, but don't disable any code
3. For each source code module, decide if it can be disabled when relay, dircache, or dirauth is disabled, and implement that change
4. Depending on the config or control refactors, also disable the config or control for those modules
Should we avoid confusion by calling these options --disable-relay-mode ?
Trac: Actualpoints: N/A to 0.2 Status: assigned to needs_review
I think that this proposal is reasonable. Let's make a new ticket or set of tickets for disabling relay/cache stuff specifically, or change this ticket to be about relay/cache stuff specifically.
I think that it is reasonable to have the ability to disable dircache and relay code inside the codebase, but I do wonder whether it makes sense to expose them separately in the configuration. Right now, there's not really any way to be a useful dircache without being a relay, and we assume that nearly any relay is potentially a dircache. Maybe having a --disable-relay-mode makes sense here.
These modules are intended to be relay-only:
feature/relay
These modules are intended to be dircache-only:
feature/dircache
These modules are mostly relay-only:
feature/hibernate
feature/stats
These modules have significant parts that are relay-only:
feature/hs
feature/hs_common
feature/rend
core/crypto
lib/compress
core/mainloop
core/or
(In general, if a module is mostly relay only, we should split it into two modules, one of which is completely disabled and one of which is not.)
I also expect we will find parts of src/core and src/app that are relay-specific.
In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.
I think that this proposal is reasonable. Let's make a new ticket or set of tickets for disabling relay/cache stuff specifically, or change this ticket to be about relay/cache stuff specifically.
I've made this ticket the parent ticket for the work.
I think that it is reasonable to have the ability to disable dircache and relay code inside the codebase, but I do wonder whether it makes sense to expose them separately in the configuration. Right now, there's not really any way to be a useful dircache without being a relay, and we assume that nearly any relay is potentially a dircache. Maybe having a --disable-relay-mode makes sense here.
I think it's helpful to expose one new option. Adding only one new option also simplifies our CI and our testing matrix. And I'd like to make the name consistent with the existing dirauth module.
Here are the new descriptions:
--disable-dirauth-mode
hidden alias --disable-module-dirauth
Build tor with authority mode disabled: tor can not run as a directory authority or bridge authority.
--disable-relay-mode
Build tor with relay mode disabled: tor can not run as a relay, bridge, or authority. Implies --disable-dirauth-mode.
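To make the relationship between the two options concrete, here is a minimal sketch of how the hidden alias and the "implies" rule could be resolved. It is only a model of the descriptions above, not tor's configure logic; the names ALIASES, IMPLIES, and resolve_disabled are hypothetical.

```python
# Hypothetical model of the two proposed configure options.
# This is an illustration only, not tor's build system.
ALIASES = {"--disable-module-dirauth": "--disable-dirauth-mode"}  # hidden alias
IMPLIES = {"--disable-relay-mode": ["--disable-dirauth-mode"]}


def resolve_disabled(flags):
    """Return the canonical options in effect after aliases and implications."""
    pending = [ALIASES.get(flag, flag) for flag in flags]
    resolved = set()
    while pending:
        flag = pending.pop()
        if flag in resolved:
            continue
        resolved.add(flag)
        # Anything this option implies is processed as if it had been passed.
        pending.extend(IMPLIES.get(flag, []))
    return resolved


print(sorted(resolve_disabled(["--disable-relay-mode"])))
```

With only one new option, the testing matrix stays at three builds: everything enabled, dirauth disabled, and relay (plus dirauth) disabled.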
I'm not sure if it's worth keeping the relay/dircache distinction when we name the split modules. In any case, we shouldn't distinguish between them when compiling (for example, with separate relay and dircache macros), because we are not going to test that distinction.
These modules are intended to be relay-only:
feature/relay
These modules are intended to be dircache-only:
feature/dircache
These modules are mostly relay-only:
feature/hibernate
What happens if you try to use AccountingMax on a client or onion service?
If it works, maybe we should keep it enabled on clients for now.
feature/stats
The relay stats can go in stats_relay.
DirReqStatistics is dircache-only, so we could put it in stats_dircache.
These modules have significant parts that are relay-only:
feature/hs
feature/hs_common
feature/rend
I think we'll end up extracting hs_relay and rend_relay. Maybe also hs_dircache and rend_dircache.
lib/compress
I think we'll end up extracting compress_dircache.
core/crypto
core/mainloop
core/or
I think we'll end up extracting crypto_relay, mainloop_relay, and or_relay. Maybe also the corresponding dircache modules.
(In general, if a module is mostly relay only, we should split it into two modules, one of which is completely disabled and one of which is not.)
I also expect we will find parts of src/core and src/app that are relay-specific.
Yes, we'll need to go through all the code :-)
In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.
Is there a quick guide or picture of our higher-level and lower-level modules?
How should we prioritise the modules?
Quick wins
Easy stubs for lower-level modules
Stubs and remove low-high dependencies for higher-level modules
Trac: Summary: Implement some more optional modules in Tor to Allow Tor to be compiled without support for relay mode
In general, as we are disabling code, we should remember that it is better to only stub out code that is called from higher-level modules. When code is called from lower-level modules, we should look for a way to remove the inverted dependency entirely.
Is there a quick guide or picture of our higher-level and lower-level modules?
There is a little of this in CodeStructure.md , but it needs to be more explicit. I hope we'll get to this with the tor-guts revisions.
The layers are, from higher to lower level:
App -> Feature -> Core -> Lib
The Lib layer is well-factored internally. The core, feature, and app layers are still not.
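As a rough illustration of the layering above, the following sketch flags an "inverted" include, where a file in a lower layer includes a header from a higher one. It assumes source files live under src/<layer>/ and that include paths omit the src/ prefix; the helper names and the example paths are for illustration only.

```python
# Layers from higher to lower level, as listed above.
LAYERS = ["app", "feature", "core", "lib"]


def layer_of(path):
    """Return the layer for a path like 'src/core/or/x.c' or 'core/or/x.h'."""
    parts = path.split("/")
    if parts and parts[0] == "src":
        parts = parts[1:]  # assumption: include paths omit the 'src/' prefix
    if parts and parts[0] in LAYERS:
        return parts[0]
    return None


def is_layer_violation(includer, included):
    """True if a lower-level file includes a header from a higher layer."""
    low, high = layer_of(includer), layer_of(included)
    if low is None or high is None:
        return False  # outside the four layers; nothing to say
    return LAYERS.index(low) > LAYERS.index(high)


# A core file reaching up into feature code is the kind of call to remove.
print(is_layer_violation("src/core/or/circuitbuild.c", "feature/relay/router.h"))
```

A check like this only sees includes, not actual calls, so it would over-report a little; it is a starting point for finding the dependencies worth removing rather than stubbing.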
How should we prioritise the modules?
Quick wins
Easy stubs for lower-level modules
Stubs and remove low-high dependencies for higher-level modules
I think "quick wins" is a good place to start. If we can go lower-level towards higher-level, we will be glad we did, but we won't always have the option. I also think that it might be a good idea to start with "feature/dircache" and "feature/relay" themselves, and move outwards from there.
When I approached this kind of refactoring with lib, I found that it was often hard to anticipate what would be hard to extract and what would be easy to extract, so I had a lot of false starts where I started on one approach, and then binned it in favor of another. I think a similar exploratory approach might be helpful to me here.
Then to find out which ones are not used elsewhere, I would do something like
for h in $(cat headers); do printf '%s ' "$h"; git grep -l "#include $h" | grep -v feature/dirauth | grep -v src/test | wc -l; done | sort -n -k2
That could probably be cleaner, but you get the idea. We can and should make scripts for this once we have more experience with exactly which tests are useful.
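One possible shape for such a script, sketched in Python. To stay self-contained it takes a mapping from file path to file text instead of calling git grep, and it assumes the headers are listed without surrounding quotes. The function name, the excluded-path defaults, and the header names in the usage example are hypothetical.

```python
import re

# Matches lines of the form: #include "some/header.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def count_outside_users(headers, files, excluded=("feature/dirauth", "src/test")):
    """Count, per header, the files outside the excluded paths that include it.

    `files` maps a path to that file's text (a stand-in for git grep output).
    """
    counts = {header: 0 for header in headers}
    for path, text in files.items():
        if any(fragment in path for fragment in excluded):
            continue
        # Use a set so a file including a header twice is counted once.
        for included in set(INCLUDE_RE.findall(text)):
            if included in counts:
                counts[included] += 1
    # Least-used headers first, like `sort -n -k2`.
    return sorted(counts.items(), key=lambda item: item[1])


# Hypothetical inputs for illustration only.
example_files = {
    "src/feature/dirauth/a.c": '#include "feature/dirauth/example_a.h"\n',
    "src/core/or/b.c": '#include "feature/dirauth/example_b.h"\n',
}
for header, users in count_outside_users(
    ["feature/dirauth/example_a.h", "feature/dirauth/example_b.h"], example_files
):
    print(header, users)
```

Headers with a count of zero are only used inside the module or its tests, so they are the likeliest candidates for disabling without stubs.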
Here's my attempt to list the parts of the relay code that could be modularized, in rough order of operation. Within the phases, things could be done in any order. I've tried to handle things on a roughly "top down" basis, removing each thing before removing the things that it depends on.
PHASE 0.
The relay_periodic.c entry point.
The relay_sys.c entry point.
PHASE 1.
Acting as a directory cache.
Responding to CREATE and EXTEND cells
Responding to BEGIN cells
Listening for OR connections
Accounting
Generating and uploading descriptors.
Self-testing
Responding to introduce/establish_intro/establish_rend cells.
PHASE 2.
Server-side DNS
Key management.
Statistics backend code.
TLS responder code.
Pluggable transport code (particularly src/feature/client/transports.c)
PHASE 3.
Whatever is left.
At each stage, we should work to minimize layer-violations: there should generally not be calls from src/core/ into relay-specific code, and we should plan to refactor as needed to minimize them. We can reduce layer violations in parallel with the above.
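The "top down" rule from the phase list (remove each thing before the things it depends on) amounts to a reverse topological order. Here is a minimal sketch using Python's standard graphlib module (Python 3.9 or later). The dependency edges in the usage example are made up for illustration and are not a claim about tor's actual call graph.

```python
from graphlib import TopologicalSorter


def removal_order(depends_on):
    """Order components so each one comes before everything it depends on.

    `depends_on` maps a component to the set of components it uses.
    """
    # static_order() yields dependencies first; reversing gives "top down".
    bottom_up = list(TopologicalSorter(depends_on).static_order())
    return list(reversed(bottom_up))


# Hypothetical edges, for illustration only.
example = {
    "descriptor upload": {"key management"},
    "directory cache": {"statistics backend"},
    "key management": set(),
    "statistics backend": set(),
}
print(removal_order(example))
```

If the graph contains a cycle, TopologicalSorter raises graphlib.CycleError, which is itself a useful signal that a dependency needs to be broken before either side can be removed cleanly.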