Opened 2 months ago

Closed 6 weeks ago

Last modified 6 days ago

#22106 closed enhancement (fixed)

Initial Rust support

Reported by: Sebastian Owned by:
Priority: Medium Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: rust
Cc: chelseakomlo, nickm, ahf, mikeperry, isis, teor, asn Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by Sebastian)

Hi there, requesting review and advice for an initial branch to lay some groundwork for Rust support in Tor.

Here's the question: We're preventing cargo from contacting the internet during build/tests (and I think we definitely should do that). That means we will have to vendor the dependencies we're relying on. I see three possible options:

1) Just commit them along with the Rust and C source code
2) Use a separate repository with a git submodule to have them in an external repository, but have a somewhat tight coupling as well as a consistent path inside the source tree for builds from git/builds from a tarball
3) Use a separate repository, no git submodule. Use configure magic to ensure we have the dependencies available (either via educated guess next to the tor.git repo, via env variable or - if building from a tarball - inside the tree)

I have no clear preference, as all options have upsides and issues of their own. As a data point, the libc crate which we'll definitely depend on is 860KB in size (1.3MB uncompressed), the tiny_keccak crate is 40KB compressed (80KB uncompressed).

The branch is add_rust in my repo. This branch has been reviewed by komlo (thanks!), but more review cannot hurt. It currently does not pass make distcheck if Rust is enabled (--enable-rust configure flag) unless the dependencies are already in cargo's cache - it is pending a resolution to the question above. To try it out yourself, edit out the --frozen parameter from the cargo invocation and let it talk to the internet. Doing that means make distcheck passes with Rust enabled.

Once this branch is reviewed, potentially amended and merged, we're ready to have two more branches to base on top of this work. A partial reimplementation of the protover code, and a complete reimplementation of the consdiff code. Both make use of the rust_str_t/RustString API we're introducing here. Next up is a document of the "so you want to use Rust for Tor hacking?" variety.

Child Tickets

Change History (24)

comment:1 Changed 2 months ago by Sebastian

To see a browsable example of the generated documentation, check out http://sebastianhahn.net/tor/doc/tor_util/

comment:2 Changed 2 months ago by Sebastian

  • Description modified (diff)

comment:3 Changed 2 months ago by chelseakomlo

  • Cc chelseakomlo added; ckomlo removed

comment:4 in reply to: ↑ description ; follow-up: Changed 2 months ago by dgoulet

Replying to Sebastian:

Here's the question: We're preventing cargo from contacting the internet during build/tests (and I think we definitely should do that). That means we will have to vendor the dependencies we're relying on. I see three possible options:

Interesting... so Rust is also a thing with a separate "package manager". My main worry about this is how we are going to handle security updates (or even just updates) with Rust code dependencies shipped with Tor. How does that work once we ship a tor and a Rust dependency update is needed? We need to make a new stable or we can just tell our operators "$ rust upgrade" or "$ apt upgrade" (whatever the command) ?

More information to understand how that will play out would be appreciated because seems whatever option we choose here, we'll have this potential problem of doing a new stable release?

1) Just commit them along with the Rust and C source code
2) Use a separate repository with a git submodule to have them in an external repository, but have a somewhat tight coupling as well as a consistent path inside the source tree for builds from git/builds from a tarball
3) Use a separate repository, no git submodule. Use configure magic to ensure we have the dependencies available (either via educated guess next to the tor.git repo, via env variable or - if building from a tarball - inside the tree)

Uncertain here... I would go with git submodule (2) before having it outside (3), that's for sure but it is my personal preference. We should avoid (1) if we can imo (for Rust dependencies ofc), git history is important :).

[snip]

Once this branch is reviewed, potentially amended and merged, we're ready to have two more branches to base on top of this work. A partial reimplementation of the protover code, and a complete reimplementation of the consdiff code. Both make use of the rust_str_t/RustString API we're introducing here. Next up is a document of the "so you want to use Rust for Tor hacking?" variety.

About this (and taking protover for the sake of the example but it applies to the other reimplementation). I strongly believe that we should either have Rust code do protover or the C code but not both. Maintaining two code base for one single feature won't be fun and adds much more work on the maintainer/testing/bugs side of things.

IMO, if we embrace Rust for a subsystem, let's go 100% with it and dump the C one. Having two implementations for the same thing is not bad as a concept but I think it's bad when both are maintained and put in production in the same code base. So having a subsystem in Rust thus implies that to build/run Tor, Rust is needed, period. I'm aware of the transition period between C and Rust making it unavoidable for maintaining two code base but that's the price to pay for any scenarios but my point is really about not having the C one released once we transition, only maintained for LTS.

My two cents.

comment:5 follow-up: Changed 2 months ago by teor

  • Status changed from new to needs_revision

1) Just commit them along with the Rust and C source code

We typically have tightly-bound dependencies in src/ext, this would be a good place for tor_util and maybe tiny_keccak (as we already have a C keccak in ext).

I think it's ok to expect people to install rust's libc: we already do this with libevent and {open,libre,*}SSL. They'll have to install rust, so installing libc is a reasonable ask.

The code looks reasonable to me, but I have only known rust for a few weeks.

Specific commits:

9a96733a2dab56342d6b3de1f2c2915429b21725

Should we run the following rust tests during make check?

  • tiny_keccak (yes, if we include it in ext)
  • libc (no, but we might want to run it on platforms with poor rust support, so let's say that in the instructions)

57838056b20eb2fc555ff8dc9148fca1e22be3d3

src/or/config.c:

This violates the version-spec, which allows a single EXTRA_INFO with no spaces in brackets.
https://gitweb.torproject.org/torspec.git/tree/version-spec.txt#n22

This matters for get_version(), because it's sent in response to GETINFO version. This will at least break stem's version parsing, see:
https://gitweb.torproject.org/stem.git/tree/stem/version.py#n168

(Also see #22110 for a rare case where we break the version spec already, and #22109 for adding unit tests for parsing our own version.)

(It would also matter for get_short_version(), which is in the descriptor, but we're not adding "rust" to that.)

I'd prefer get_version add "(rust-*version*)", if possible.
But maybe it doesn't matter than much, we don't give C compiler versions.

In tor_int in main.c, we have existing code that does:

    if (strstr(version, "alpha") || strstr(version, "beta"))
      log_notice(LD_GENERAL, "This version is not a stable Tor release. "
                 "Expect more bugs than usual.");

Do we want to log some message about rust being experimental as well?

src/test/test_rust.c:

I'd tt_assert() the string is non-null before taking its length.

comment:6 in reply to: ↑ 4 Changed 2 months ago by Sebastian

Replying to dgoulet:

Replying to Sebastian:

Here's the question: We're preventing cargo from contacting the internet during build/tests (and I think we definitely should do that). That means we will have to vendor the dependencies we're relying on. I see three possible options:

Interesting... so Rust is also a thing with a separate "package manager". My main worry about this is how we are going to handle security updates (or even just updates) with Rust code dependencies shipped with Tor. How does that work once we ship a tor and a Rust dependency update is needed? We need to make a new stable or we can just tell our operators "$ rust upgrade" or "$ apt upgrade" (whatever the command) ?

More information to understand how that will play out would be appreciated because seems whatever option we choose here, we'll have this potential problem of doing a new stable release?

We link Rust statically, any required updates to a shipped library will require a rebuild at least. We're intentionally limiting our dependencies on external crates dramatically to make this less of a problem, but yes - if one of the libraries we ship has a security update, we'll need to ship a new version. Note that the same is true for a few C dependencies we ship as well. It's annoying, but with the current state of Rust crates in linux distributions a different model seems not feasible at all.

To make it clear: Yes, security updates in Rust dependencies require a new stable version.

Once this branch is reviewed, potentially amended and merged, we're ready to have two more branches to base on top of this work. A partial reimplementation of the protover code, and a complete reimplementation of the consdiff code. Both make use of the rust_str_t/RustString API we're introducing here. Next up is a document of the "so you want to use Rust for Tor hacking?" variety.

About this (and taking protover for the sake of the example but it applies to the other reimplementation). I strongly believe that we should either have Rust code do protover or the C code but not both. Maintaining two code base for one single feature won't be fun and adds much more work on the maintainer/testing/bugs side of things.

IMO, if we embrace Rust for a subsystem, let's go 100% with it and dump the C one. Having two implementations for the same thing is not bad as a concept but I think it's bad when both are maintained and put in production in the same code base. So having a subsystem in Rust thus implies that to build/run Tor, Rust is needed, period. I'm aware of the transition period between C and Rust making it unavoidable for maintaining two code base but that's the price to pay for any scenarios but my point is really about not having the C one released once we transition, only maintained for LTS.

My two cents.

We're not ready to rely on Rust. This would immediately make Tor unbuildable on a bunch of architectures in Debian, for example. This is supposed to be an experiment that we can stop if it doesn't work out for us, or embrace fully later. We're not planning to perpetually maintain two implementations in the same codebase.

comment:7 in reply to: ↑ 5 ; follow-up: Changed 2 months ago by Sebastian

Replying to teor:

1) Just commit them along with the Rust and C source code

We typically have tightly-bound dependencies in src/ext, this would be a good place for tor_util and maybe tiny_keccak (as we already have a C keccak in ext).

I am not sure about that. if we want to use cargo for this (and I think we do, otherwise we have to mirror all the dependencies etc in our make system), there's a support to have a local crates.io "mirror". Having that in a separate location seems good (maybe ext/rust/ would be good).

I think it's ok to expect people to install rust's libc: we already do this with libevent and {open,libre,*}SSL. They'll have to install rust, so installing libc is a reasonable ask.

I don't know what you mean with install rust'c libc. It's a crate that needs to be available during building, not a dynamic library you can link to or something. The crate provides bindings for different host libc implementations.

Specific commits:

9a96733a2dab56342d6b3de1f2c2915429b21725

Should we run the following rust tests during make check?

  • tiny_keccak (yes, if we include it in ext)
  • libc (no, but we might want to run it on platforms with poor rust support, so let's say that in the instructions)

often, crates have more dependencies for running their tests or might even require an unstable version of the compiler to run them. Maybe we should add a new make target to run tests for rust dependencies?

57838056b20eb2fc555ff8dc9148fca1e22be3d3

src/or/config.c:

This violates the version-spec, which allows a single EXTRA_INFO with no spaces in brackets.
https://gitweb.torproject.org/torspec.git/tree/version-spec.txt#n22

This matters for get_version(), because it's sent in response to GETINFO version. This will at least break stem's version parsing, see:
https://gitweb.torproject.org/stem.git/tree/stem/version.py#n168

(Also see #22110 for a rare case where we break the version spec already, and #22109 for adding unit tests for parsing our own version.)

(It would also matter for get_short_version(), which is in the descriptor, but we're not adding "rust" to that.)

I'd prefer get_version add "(rust-*version*)", if possible.
But maybe it doesn't matter than much, we don't give C compiler versions.

In tor_int in main.c, we have existing code that does:

    if (strstr(version, "alpha") || strstr(version, "beta"))
      log_notice(LD_GENERAL, "This version is not a stable Tor release. "
                 "Expect more bugs than usual.");

Do we want to log some message about rust being experimental as well?

Great catch, thanks! Perhaps rust_welcome_string() should instead just give a line of its own that we only emit if we're building with Rust, and we don't reflect it in the version at all?

src/test/test_rust.c:

I'd tt_assert() the string is non-null before taking its length.

Thanks, fixup pushed.

comment:8 in reply to: ↑ 7 ; follow-up: Changed 2 months ago by teor

Replying to Sebastian:

Replying to teor:

1) Just commit them along with the Rust and C source code

We typically have tightly-bound dependencies in src/ext, this would be a good place for tor_util and maybe tiny_keccak (as we already have a C keccak in ext).

I am not sure about that. if we want to use cargo for this (and I think we do, otherwise we have to mirror all the dependencies etc in our make system), there's a support to have a local crates.io "mirror". Having that in a separate location seems good (maybe ext/rust/ would be good).

I think it's ok to expect people to install rust's libc: we already do this with libevent and {open,libre,*}SSL. They'll have to install rust, so installing libc is a reasonable ask.

I don't know what you mean with install rust'c libc. It's a crate that needs to be available during building, not a dynamic library you can link to or something. The crate provides bindings for different host libc implementations.

Oh, ok, then yes, a local crates mirror seems sensible.
And a make target to set it up. I can't quite work out how to do it!

Specific commits:

9a96733a2dab56342d6b3de1f2c2915429b21725

Should we run the following rust tests during make check?

  • tiny_keccak (yes, if we include it in ext)
  • libc (no, but we might want to run it on platforms with poor rust support, so let's say that in the instructions)

often, crates have more dependencies for running their tests or might even require an unstable version of the compiler to run them. Maybe we should add a new make target to run tests for rust dependencies?

Seems like a good idea.

Is there a linter (?) or style checker (make check-spaces) that we might want to run?

comment:9 in reply to: ↑ 8 ; follow-up: Changed 2 months ago by Sebastian

Replying to teor:

Replying to Sebastian:

I think it's ok to expect people to install rust's libc: we already do this with libevent and {open,libre,*}SSL. They'll have to install rust, so installing libc is a reasonable ask.

I don't know what you mean with install rust'c libc. It's a crate that needs to be available during building, not a dynamic library you can link to or something. The crate provides bindings for different host libc implementations.

Oh, ok, then yes, a local crates mirror seems sensible.
And a make target to set it up. I can't quite work out how to do it!

There's a subcommand for cargo called vendor (not installed automatically) that can do that. I can add a make target for it once we decided which of the options for mirroring we're taking.

Specific commits:

9a96733a2dab56342d6b3de1f2c2915429b21725

Should we run the following rust tests during make check?

  • tiny_keccak (yes, if we include it in ext)
  • libc (no, but we might want to run it on platforms with poor rust support, so let's say that in the instructions)

often, crates have more dependencies for running their tests or might even require an unstable version of the compiler to run them. Maybe we should add a new make target to run tests for rust dependencies?

Seems like a good idea.

Is there a linter (?) or style checker (make check-spaces) that we might want to run?

There's both, clippy (a linter, requires Rust nightly) and rustfmt. Not yet sure how to best integrate them. Both are rapidly evolving.

comment:10 in reply to: ↑ 9 ; follow-up: Changed 2 months ago by teor

Replying to Sebastian:

Replying to teor:

Replying to Sebastian:

I think it's ok to expect people to install rust's libc: we already do this with libevent and {open,libre,*}SSL. They'll have to install rust, so installing libc is a reasonable ask.

I don't know what you mean with install rust'c libc. It's a crate that needs to be available during building, not a dynamic library you can link to or something. The crate provides bindings for different host libc implementations.

Oh, ok, then yes, a local crates mirror seems sensible.
And a make target to set it up. I can't quite work out how to do it!

There's a subcommand for cargo called vendor (not installed automatically) that can do that. I can add a make target for it once we decided which of the options for mirroring we're taking.

When I ran the rust build without the crates, it failed halfway through make.
Can we check at configure time instead?
That's when we fail if we don't have other dependencies installed.

Then we would need the rust dependencies to be a shell script (or a line in the rust install doc), not a make target.

...

Is there a linter (?) or style checker (make check-spaces) that we might want to run?

There's both, clippy (a linter, requires Rust nightly) and rustfmt. Not yet sure how to best integrate them. Both are rapidly evolving.

My experience with rustfmt is that it wraps to 100 characters by default. We probably want 79 to match our C standards:
https://gitweb.torproject.org/tor.git/tree/doc/HACKING/CodingStandards.md#n125

And we want to wrap comments as well.

We could actually enforce formatting, which would be nice.

comment:11 in reply to: ↑ 10 Changed 2 months ago by Sebastian

Replying to teor:

Replying to Sebastian:

Replying to teor:

Replying to Sebastian:

I think it's ok to expect people to install rust's libc: we already do this with libevent and {open,libre,*}SSL. They'll have to install rust, so installing libc is a reasonable ask.

I don't know what you mean with install rust'c libc. It's a crate that needs to be available during building, not a dynamic library you can link to or something. The crate provides bindings for different host libc implementations.

Oh, ok, then yes, a local crates mirror seems sensible.
And a make target to set it up. I can't quite work out how to do it!

There's a subcommand for cargo called vendor (not installed automatically) that can do that. I can add a make target for it once we decided which of the options for mirroring we're taking.

When I ran the rust build without the crates, it failed halfway through make.
Can we check at configure time instead?
That's when we fail if we don't have other dependencies installed.

Of course we will do that, but since different mechanisms are required for the different potential solutions I outlined above, none of them are implemented right now. See the note about it failing unless dependencies are in Cargo's cache, in the first post here. This seems unrelated to making vendoring easier.

Then we would need the rust dependencies to be a shell script (or a line in the rust install doc), not a make target.

...

Is there a linter (?) or style checker (make check-spaces) that we might want to run?

There's both, clippy (a linter, requires Rust nightly) and rustfmt. Not yet sure how to best integrate them. Both are rapidly evolving.

My experience with rustfmt is that it wraps to 100 characters by default. We probably want 79 to match our C standards:
https://gitweb.torproject.org/tor.git/tree/doc/HACKING/CodingStandards.md#n125

We already do that via a config file. I forgot to add it to the commits here, but it's added now.

And we want to wrap comments as well.

We could actually enforce formatting, which would be nice.

This is harder to do than it appears, because it requires everyone to be in sync with rustfmt versions. Since a tool exist that can reformat, I would like to not require it at first except at the review stage, and then explore our options.

comment:12 Changed 2 months ago by Sebastian

Pushed a branch with a fixup commit to fix the version incompatibility and instead just log a line of its own

comment:13 Changed 8 weeks ago by Sebastian

Wow. This took half a day and a lot of ugly hacks, but this is ready for review now. It can build without rust (default), with rust and locally supplied dependencies via submodule (default if using rust), with rust and locally supplied dependencies in a given directory (via configure env variable) or with rust and using crates.io (via configure switch).

We detect Rust compiler version (requiring 1.14 as of now, which will be included in stretch) as well as the presence of all required crates during configure.

comment:14 Changed 8 weeks ago by Sebastian

  • Status changed from needs_revision to needs_review

comment:15 Changed 8 weeks ago by asn

  • Cc asn added

comment:16 Changed 8 weeks ago by chelseakomlo

  • Keywords rust added

comment:17 Changed 7 weeks ago by nickm

(I hope that other Rust specialists will review this with Sebastian, since I'm going to have to be pretty focused on 0.3.1 this week.)

comment:18 Changed 7 weeks ago by chelseakomlo

I tested the functionality of this, and it works well. Having someone more familiar with Tor's build system look at this would be great.

A couple initial thoughts:

  • Having setup instructions would be helpful for those testing it/setting up Rust for the first time.
  • I imagine using Sebastian's repository as a git submodule is a good short term solution to get us up and running, but having a longer-term solution (not tied to a particular user) may be good.

comment:19 Changed 6 weeks ago by tedmielczarek

Just for reference, in Firefox we decided to vendor our dependencies into our main repository using the cargo vendor command[1]. They're in third_party/rust in mozilla-central[2]. We added a mach vendor rust command to our mach build interface to run vendoring with the right parameters[3]. We do a number of checks there, including checking license compatibility of vendored crates and checking that we're not accidentally committing huge files contained in vendored crates. Our build system writes out a .cargo/config file[4] in the object directory after running configure that uses cargo source replacement[5] to point cargo at the directory containing vendored crates instead of crates.io. We then build our Rust code by passing --frozen to cargo[6], which ensures that cargo doesn't fetch anything from the network.

One annoyance here is that this makes local development that adds new dependencies from crates.io a little more frustrating, as developers have to vendor new deps before building. We could omit --frozen in builds done outside of automation, but that would mean that developers would wind up with cargo fetching crates from crates.io when the crates they need are sitting right there in the repository, and that seems silly. I opened a cargo issue[7] a while ago to ask for some way to better support this workflow but I haven't pushed on getting it fixed.

  1. https://github.com/alexcrichton/cargo-vendor/
  2. https://dxr.mozilla.org/mozilla-central/source/third_party/rust
  3. https://dxr.mozilla.org/mozilla-central/rev/41958333867b0f537271dbd4cb4ba9e8a67a85a8/python/mozbuild/mozbuild/vendor_rust.py#276
  4. https://dxr.mozilla.org/mozilla-central/source/.cargo/config.in
  5. https://github.com/rust-lang/cargo/blob/master/src/doc/source-replacement.md
  6. https://dxr.mozilla.org/mozilla-central/rev/41958333867b0f537271dbd4cb4ba9e8a67a85a8/config/rules.mk#911
  7. https://github.com/rust-lang/cargo/issues/3066

comment:20 Changed 6 weeks ago by Sebastian

Thank you for your input. We have a pretty similar situation here, with the big difference being that we're using a separate repository. I actually looked at firefox when thinking about how we might do it. Our solution for the annoyance you describe is to have a configure switch that can toggle the --frozen switch and source replacement on and off (that's the biggest chunk of the complexity) to aid local dev while allowing internet access.

comment:21 follow-up: Changed 6 weeks ago by nickm

Small questions. We shouldn't take any of these as blocking.

  • Have you tested this with out-of-tree builds?
  • Have you tried "make distcheck DISTCHECK_CONFIGURE_FLAGS='--enable-rust'" to make sure that make distcheck still works when building rust?
  • Do the various "make clean" targets work right?
  • What invokes clippy for us?
  • where do we check rustc/cargo version?
  • How will cross-compilation work?

Small suggestions:

  • The "please report bugs" message .. should say "how" and "where" to report, and remind people not to report ancient bugs. Otherwise we'll get swamped with junk.
  • test_rust.sh -- should it use ls to find the list of crates, rather than having them hardwired?
  • Let's label the autoconf option as "experimental" in the help message.

comment:22 in reply to: ↑ 21 Changed 6 weeks ago by Sebastian

Replying to nickm:

Small questions. We shouldn't take any of these as blocking.

  • Have you tested this with out-of-tree builds?
  • Have you tried "make distcheck DISTCHECK_CONFIGURE_FLAGS='--enable-rust'" to make sure that make distcheck still works when building rust?

Yes, that still works.

  • Do the various "make clean" targets work right?

All targets tested as a part of distcheck work, are there others?

  • What invokes clippy for us?

Nothing yet, it's not a part of this branch

  • where do we check rustc/cargo version?

there's a check during configure. We only test the rust version because cargo versions "belong" to certain rust versions.

  • How will cross-compilation work?

Not at all yet.

Small suggestions:

  • The "please report bugs" message .. should say "how" and "where" to report, and remind people not to report ancient bugs. Otherwise we'll get swamped with junk.

Ok.

  • test_rust.sh -- should it use ls to find the list of crates, rather than having them hardwired?

we could do that i guess, will add it.

  • Let's label the autoconf option as "experimental" in the help message.

Ok

comment:23 Changed 6 weeks ago by nickm

  • Resolution set to fixed
  • Status changed from needs_review to closed

Squashed and merged; let's see how it goes! If you can get me patches for the remaining issues noted above, I'll be glad to take them when they're done.

comment:24 Changed 6 days ago by teor

  • Type changed from defect to enhancement
Note: See TracTickets for help on using tickets.