Rust in Tor

What & why

We are currently investigating integrating Rust as a first-class language in Tor. We decided upon Rust due to the benefits of memory safety and the ability to directly integrate Rust and C. To read more about how and why this started, see our meeting notes from the 2017 meeting in Amsterdam.

Current status

In Tor 0.3.3.x, we include a Rust implementation of our "protover" module. It is not enabled by default, but we encourage people to try it.

Starting with Tor 0.3.4.x, we will accept Rust-only features, if they can be safely merged to the network with only some Tor instances supporting them.

We will continue to build out our modularity efforts, and our Rust infrastructure, to make it easier to write modules in Rust and replace parts of Tor.

We encourage ALL downstream packagers to try building Tor with Rust enabled, to help identify any compatibility or usability issues as soon as possible.

Future steps

At some point in the future, when we judge that our Rust support is sufficiently mature, we will announce a release and release date after which Rust will be required. We have not picked such a release or date yet.

What we are currently working on

  1. Understand alignment between Rust and Tor supported platforms. This is a list of which platforms we aim to support; it would be helpful to understand the intersection with Rust. (#22771)
  2. Adding automated tooling for code quality tools. (#22156)
  3. Build Tor with Rust for Windows. (#22839)
  4. Investigate the reproducibility of Rust binaries. (#22769)
  5. Implementing existing submodules in Rust as a proof of concept. Two that are currently in progress are consdiff (#24609) and protover (#22840).
  6. Add Rust-enabled build to the Tor CI. (#22636 and #22768)

Interested in helping out?

Please see doc/HACKING/ (rendered) in the tor.git repo.

Coding Standards

Please see doc/HACKING/ (rendered) in the tor.git repo.

All current, non-closed, Rust in Tor tickets

Ticket Summary
#24265 Fuzz all rust functions that are used by authorities to make sure they match C

We could break consensus if some authorities are running the rust version of the code, and some are running the C version of the code, and their outputs differ on any input.

This is like #24029, but with arbitrary inputs that may or may not be UTF-8.

#27194 Reject protover extra commas in protover

Like how it handles spaces, protover.c rejects leading commas (Link=,1-5 or Link=,) while it accepts and ignores extra commas elsewhere (Link=1-5, and Link=1,,,2-5 are valid).

The Rust version accepts and ignores all extra commas, including leading commas.
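
As a rough illustration of the C behaviour described above, here is a minimal sketch. The function name split_versions and the error string are made up for this example; it is not the code in protover.c or in the Rust protover crate, and it only covers the comma handling.

```rust
/// Split the version part of a protover entry (the text after `=`) into
/// its comma-separated pieces. Hypothetical helper, not Tor's code.
///
/// Mirrors the C behaviour described in the ticket: a leading comma is an
/// error, while empty pieces anywhere else are skipped.
pub fn split_versions(list: &str) -> Result<Vec<&str>, &'static str> {
    if list.starts_with(',') {
        return Err("leading comma");
    }
    Ok(list
        .split(',')
        .filter(|piece| !piece.is_empty())
        .collect())
}
```

With this sketch, split_versions("1,,,2-5") yields the pieces "1" and "2-5", while split_versions(",1-5") is an error. Whether both implementations should instead reject every extra comma is the open question in this ticket.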

#26265 A proposal and demo for a fuzzing system that works with Rust through C code

I've implemented a demo for fuzzing Rust code and C code at the same time. I hope I can address #25386 with that by using cargo afl. Though I would like to have this system approved first before I write code for a PR.

#26941 Privcount blinding and encryption: review dependencies

External dependency review from

#26970 Privcount: plan the modules and components

Replying to [26953#comment:3 chelseakomlo]:

Is the idea that this project will remain external to core tor, or will this one day be merged into the core codebase? Definitely having CI in the short term seems wise either way.

That's a good question, nickm and I haven't discussed it yet. And I think we'd benefit from your advice.

For PrivCount in Tor, we need to produce the following components:

  • a Rust "Data Collector" module in Tor that does blinding, encryption, and noise, based on a config
  • a separate "Tally Reporter" binary that does unblinding, decryption, aggregation, and reporting, based on a config
  • some tools for creating and validating configurations

One possible design is:

  • Rust modules for blinding/encryption, noise, aggregation, reporting, and config
  • Glue code and module imports for the Tor Data Collector
  • Application code and module imports for the Tally Reporter
  • Application code and module imports for the tools

Split off

#26944 Privcount blinding and encryption: Add tests

Add tests as recommended in

#26945 Privcount blinding and encryption: always enable i128


#26957 Privcount blinding and encryption: Derive copy and debug where possible

The privcount shamir structs are missing copy and debug implementations.

komlo: are these useful?

If they are, we should derive them, and then enable the missing_copy_implementations and missing_debug_implementations warnings.
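
A small sketch of what that could look like. The Share struct is a hypothetical stand-in, not one of the real privcount shamir structs, and the lints are applied to a module here only to keep the example self-contained; in a real crate they would more likely be set at the crate root.

```rust
// Both lints are allow-by-default in rustc; `deny` turns a public type
// that could be `Copy`, or that lacks `Debug`, into a compile error.
#[deny(missing_copy_implementations, missing_debug_implementations)]
pub mod shamir_sketch {
    /// Hypothetical stand-in for one of the privcount shamir structs.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct Share {
        pub x: u64,
        pub y: u64,
    }
}
```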

#26958 Privcount blinding and encryption: run clippy on travis rust nightly

We'll need to fix or disable a lot of warnings for clippy.

#25841 Write test for Rust fragile hardening

We should write some tests to ensure that asan is working to check memory leaks, dangling pointers, and so on. I know that #25386 links asan, but I don't know if it actually checks stuff that we want checked. We should check that it checks what we want checked.

I envision this looking like a test_fragile_hardening.c (or a series of such files) and some helper Rust code that does some bad behavior and a wrapper that ensures that it crashes.

Landing this requires #25386, but writing it does not; just assume that Rust and C can arbitrarily call each other (with the proper boilerplate), and asan is supposed to catch everything that it catches in C.

#27785 shouldn't use realpath
Description uses the realpath command. macOS doesn't seem to have it, and POSIX doesn't specify it.

Maybe we should use something like (cd $x && pwd -P) instead for greater portability. (Though older shells and OSes might not have the -P option to pwd, it is POSIX.)

Alternatively, confirm that the current directory is a "reasonable" place to run the script from, and use relative pathnames.

#28079 Stop returning the empty string when the cstr! macro fails

Split off #28077:

from_bytes_with_nul returns an error if there is more than one nul byte in the string. ... Returning an empty string could be a source of subtle bugs.

Misuse is extremely unlikely to slip in since this is only used on string literals. But yeah, the ideal solution would be statically asserting at compile-time that the passed literal has no NUL bytes in it, so the only one is the byte being appended.

But defaulting to an empty string(in a case that is basically impossible to get) is the intentional documented behavior of the macro ever since it was first merged in #25185. Improving on that seems like a separate ticket.
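
For comparison, here is a sketch of a macro that hands the error back to the caller instead of substituting an empty string. The name try_cstr! is hypothetical and this is not Tor's cstr! macro; it only shows the from_bytes_with_nul behaviour discussed above.

```rust
/// Hypothetical variant of a cstr!-style macro (not Tor's actual macro).
/// It appends the trailing NUL to a string literal and returns the
/// `Result` from `CStr::from_bytes_with_nul` unchanged, so an interior
/// NUL byte shows up as an `Err` rather than as an empty string.
macro_rules! try_cstr {
    ($s:expr) => {
        ::std::ffi::CStr::from_bytes_with_nul(concat!($s, "\0").as_bytes())
    };
}
```

A caller would then have to decide what to do with the error, for example try_cstr!("abc").unwrap() for literals that are known to be fine. This is still a runtime check, not the compile-time assertion mentioned above.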

#31443 Rust's default allocator is the system allocator since rust 1.32

We could remove some code and comments about allocator usage, which would hopefully make it seem less daunting to contribute some rust code :)

#23351 Create a rustfmt.toml defining our whitespace/formatting standards

We currently have no style consensus for Rust code. It would be good to agree on something! We could agree on whatever the Rust people like (still a WIP last I checked) or we could modify that by creating a `rustfmt.toml`.

We should also probably add a pre-commit hook for running rustfmt, since we have a pretty clean slate and we should keep it clean. :)

#22816 Run tests for single Rust module

In Tor, we currently have the ability to run tests for a single C module (or even a single unit test). As specified in doc/HACKING/WritingTests, running tests for the cell format module (for example) can be done via ./src/test/test cellfmt/..

Rust modules should have a similar option. Currently 'cargo test' can be run within a single Rust module, but this will not link against C modules. It would be good to be able to do this and retain the ability to test a single Rust module. Also, it would be nice to make this similar to running single C module tests, to minimize developer confusion.

#27191 handling double spaces in protover

protover.c accepts trailing spaces and extra spaces between subprotocol entries like "Link=1-4 LinkAuth=1 ", but rejects leading spaces like " Link=1-4". It has done so since its introduction.

The Rust implementation rejects all extra spaces in any position. It's at least consistent.

#24609 consdiff implementation in Rust

In my public repo in branch rust4, there's a pretty much complete consdiff implementation in Rust (only missing some logging and testing from the C side iirc). I won't have time to pick it up anytime soon I'm afraid but I hope someone finds it useful. Note it looks a bit different compared to the C code as we were trying very hard to come up with something without any unsafe code and no external dependencies, as this was some of the first rust code ever written for tor. It should be straight-forward, though.

#27915 Make rust doctests get linked in same way as other rust tests
#27201 rust/protover doesn't forbid version zero

Per the spec, version integers can't begin with, or be, zero:

       Int = NON_ZERO_DIGIT
       Int = Int DIGIT
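
A minimal sketch of a parser that follows this grammar. The name parse_version and the choice of u32 as the integer type are assumptions made for the example, not taken from the protover crate.

```rust
/// Parse a version integer according to the grammar quoted above: the
/// first character must be a digit from 1 to 9 and every following
/// character must be an ASCII digit. Hypothetical helper, not Tor's code.
pub fn parse_version(s: &str) -> Option<u32> {
    let mut chars = s.chars();
    // NON_ZERO_DIGIT: rejects "", "0", "01", "+5", and so on.
    match chars.next() {
        Some('1'..='9') => {}
        _ => return None,
    }
    // Remaining characters: DIGIT only.
    if !chars.all(|c| c.is_ascii_digit()) {
        return None;
    }
    // Values that do not fit in a u32 are also rejected here.
    s.parse().ok()
}
```
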
#27189 cleanup rust code

There are low-hanging fruit for silencing clippy lints, removing unnecessary allocations, and writing a more efficient version of .retain().

#27190 disparate duplicate subproto handling in protover

protover.c treats Link=1 Link=1 and Link=1 identically, allowing duplicate entries without complaint, though it does explicitly check for duplicates to avoid double-counting it as two votes for the same version.

protover.c also treats Link=1 Link=2 and Link=1-2 the same, while the rust implementation of protover treats Link=1 Link=2 as if it were Link=2.

#24249 Create automated mechanism for C/Rust types to stay in sync

In transitioning parts of tor to Rust, some parts of the code will either need to temporarily exist in both C and Rust (such as protover), or will be highly coupled (such as enums that are passed between the FFI boundary).

It would be good to automatically verify these areas of the code don't get out of sync. This could either be a post-hoc verifier, or a generator that takes a higher-level specification and generates both C and Rust types.

Ideally, the coupling between C and Rust will be as minimal as possible, so this probably does not need to be a heavyweight solution.

#28309 Log the rust version when printing other library versions

tor --library-versions prints the versions of Tor's library dependencies.

Would it be useful for it to print the rust version as well? Or the versions of our rust library dependencies?

Tor also prints the OS version and library versions when it starts up. We could add the rust version and significant rust library versions to that log message.

#31390 --enable-rust with pre-downloaded Rust dependencies fails: no .cargo-checksum.json files

In the FreeBSD port I am downloading the dependencies and supplying digest-0.7.2 libc-0.2.39 directories as they appear on GitHub pointed by the TOR_RUST_DEPENDENCIES environment variable.

The build fails:

error: failed to load source for a dependency on `digest`

Caused by:
  Unable to update registry ``

Caused by:
  failed to update replaced source registry ``

Caused by:
  failed to load checksum `.cargo-checksum.json` of libc v0.2.39

Caused by:
  failed to read `/usr/ports/security/tor/work/tor-`

Is some command supposed to be run that would build .cargo-checksum.json ?

#22769 Investigate the reproducibility of Rust binaries

If we are going to start writing more Tor things in Rust, it would be nice to understand the reproducibility of binaries created with rustc. I suspect the Tor Browser Team would also be interested in having these results, since parts of Firefox are now written in Rust, and soon (ESR 58?) it will no longer be optional to use them.

Note: this ticket is not about the reproducibility of rustc itself. That is an extremely deep rabbit hole (trust me, I have a rustc chained back to the OCaml days). Someday we may need to explore that, but that time is not now.

My approach for this task would probably be to create a Docker instance which builds some trivial Rust program, and then run the Docker instance on different machines and compare the hashes of the binaries (then optionally investigate the differences using whatever tools like running strings and moving up to Ida or whatever).

#25504 Find more generic ways to handle smartlist_t/Vec<T> between C and Rust

From #25368, we discussed having a possibly more generic and/or more rusty way to handle our smartlist_ts in C (and whatever underlying types the smartlist contains). Right now we have a Stringlist type in src/rust/smartlist/, which is a Rust representation of smartlist_t using C types, and then we have a conversion between that and a Vec<String>:

pub trait Smartlist<T> {
    fn get_list(&self) -> Vec<T>;
}

pub struct Stringlist {
    pub list: *const *const c_char,
    pub num_used: c_int,
    pub capacity: c_int,
}

impl Smartlist<String> for Stringlist {
    fn get_list(&self) -> Vec<String> {
        // [...]
    }
}

I have not thought about this nearly as much as komlo has, but maybe one way to do it is to have direct conversion between a smartlist_t and a Vec<T>, where T is probably an opaque pointer to whatever type in C, or T is only allowed to be a String which we've copied from a non-NULL char* (e.g. impl From<Stringlist> for Vec<String>, or something, and then keep Stringlist private since internally it's a bunch of C types that we don't want propagating into our more Rusty code).
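
To make the first idea a little more concrete, here is a sketch of copying a Stringlist into a Vec<String>. The struct repeats the layout quoted above with #[repr(C)] added as an assumption, and stringlist_to_vec is a hypothetical name. It is written as an unsafe fn and not as a From impl, because nothing in the type can guarantee that the raw pointers are valid.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Mirror of the Stringlist layout quoted above (repr(C) is assumed).
#[repr(C)]
pub struct Stringlist {
    pub list: *const *const c_char,
    pub num_used: c_int,
    pub capacity: c_int,
}

/// Copy every non-NULL entry into an owned String. Hypothetical helper.
///
/// Safety: `sl.list` must be NULL or point to at least `num_used`
/// pointers, each of which is NULL or a valid NUL-terminated string.
pub unsafe fn stringlist_to_vec(sl: &Stringlist) -> Vec<String> {
    let mut out = Vec::new();
    if sl.list.is_null() || sl.num_used <= 0 {
        return out;
    }
    for i in 0..sl.num_used as usize {
        // Read the i-th `char *` out of the C array.
        let entry = unsafe { *sl.list.add(i) };
        if !entry.is_null() {
            let text = unsafe { CStr::from_ptr(entry) };
            // Non-UTF-8 bytes are replaced, not rejected, in this sketch.
            out.push(text.to_string_lossy().into_owned());
        }
    }
    out
}
```

The strings are copied, so the resulting Vec<String> does not depend on the C memory staying alive. Whether lossy conversion is acceptable, or whether non-UTF-8 entries should be an error, is a design choice this sketch does not settle.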

Another idea might be to only handle Vec<T>-like things in Rust (if/when we move to the Rust-is-required phase), since we already have a nice datatype there, and then provide safe interfaces for C code to do all the things with/to the vectors that it currently does. (This sounds easier and more maintainable to me.)

We should probably brainstorm other ideas of how we're going to do this generically moving forward, because our C code uses smartlists everywhere.

#27130 rust dependency updating instructions don't work

None of the instructions mention updating Cargo.lock, which is required. The script doesn't update that file, either.

#27052 document rust/crypto

And add #![deny(missing_docs)] to the top of the files to enforce it.

Attempted in af182d4ab51d6a1a70559bbdcd4ab842aa855684 and b6059297d7cb76f0e00e2098e38d6677d3033340 but forgot the exclamation point.

#27207 Examples in are wrong

The section on CString is incorrect:

  • CString::new("bl\x00ah").unwrap().into_raw() will panic in the 'unwrap' call, it will never return a pointer of any kind, dangling or otherwise.

Also, 12cf04646c571646b726e697d66ecafad7886cf2 seems to be the result of some miscommunication with withoutboats:

  • .expect() is literally '.unwrap(), but with a custom panic message,' it doesn't return an Option and is no safer than unwrap, but it is self-documenting.
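
The first point can be checked with a few lines. The helper name interior_nul_position is made up for this sketch; CString::new and NulError::nul_position are the standard library calls.

```rust
use std::ffi::CString;

/// Return the position of the first interior NUL byte that makes
/// `CString::new` fail, or None if the conversion succeeds.
/// Hypothetical helper for illustration only.
pub fn interior_nul_position(s: &str) -> Option<usize> {
    CString::new(s).err().map(|e| e.nul_position())
}
```

For "bl\x00ah" this reports position 2, which is the error that unwrap() would panic on before into_raw() is ever reached.
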
#25269 Set codegen-units to 1 in src/rust/Cargo.toml to eke out every last drop of performance

Rust 1.24 now sets codegen-units to 16 by default to speed up compilation time, but it makes the final binary slower.

For maximum speed, setting codegen-units to 1 in your Cargo.toml is needed to eke out every last drop of performance.

So src/rust/Cargo.toml should be changed with that to squeeze the most perf.

#27722 rust protover doesn't canonicalize adjacent and overlapping ranges

protover.c accepts both "Foo=1-3,4-5" and "Foo=1-3,2-5" and then canonicalizes them into "Foo=1-5" with contract_protocol_list(). Rust rejects the 2nd one as malformed.
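
A sketch of the kind of merging described above, written in Rust. This is not a port of contract_protocol_list(); the name contract_ranges and the representation of ranges as (low, high) pairs are assumptions made for the example.

```rust
/// Merge adjacent and overlapping inclusive ranges, so that 1-3,4-5 and
/// 1-3,2-5 both become 1-5. Hypothetical helper, not Tor's code.
pub fn contract_ranges(mut ranges: Vec<(u32, u32)>) -> Vec<(u32, u32)> {
    // Sort by lower bound so that mergeable ranges end up next to each other.
    ranges.sort();
    let mut merged: Vec<(u32, u32)> = Vec::new();
    for (low, high) in ranges {
        if let Some(last) = merged.last_mut() {
            // Overlapping, or directly adjacent to the previous range.
            if low <= last.1.saturating_add(1) {
                last.1 = last.1.max(high);
                continue;
            }
        }
        merged.push((low, high));
    }
    merged
}
```

The input is assumed to hold well-formed ranges with low <= high; validating that is left out of the sketch.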

#28081 rust protover discards all votes if one is not UTF-8
#27739 rust protover_all_supported() accepts too-long protocol names

569b4e57e23d728969a12751afc6b45f32d0f093 was fixing #25517 but kept the old behavior of allowing protocol names of any length for protover_all_supported(). That's the reason this unit test was failing, and ended up being disabled on rust builds with a ??? comment of confusion.

The reason given in the commit for this behavior was in order to maintain compatibility with consensus methods older than 29, but the corresponding C code change never made any exception like this.

#27229 Create fuzzing harness to compare C/Rust Functionality

In porting over functionality to Rust, it can be useful to compare functionality between C/Rust. While ideally unit tests should catch most behavior, having a fuzzer to catch edge cases can be handy.

We should write a test harness that fuzzes C/Rust similar functions and compares their output. Ideally, a test would look something like this:

  1. Setup C test case
  2. Set up Rust test case
  3. Provide both functions with the same generated arbitrary input
  4. Compare results

It is worth noting that in most cases we will want to improve behavior when porting to Rust, but this tool can be useful for small cases where we want bitwise identical functions.
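
The comparison step could be sketched as below. This is only the core loop, with both implementations passed in as closures; find_mismatches is a hypothetical name, and a real harness would get its inputs from a fuzzer and call the C side through FFI.

```rust
/// Feed the same inputs to two implementations and collect every input
/// on which their outputs differ. Hypothetical helper for illustration.
pub fn find_mismatches<I, O, F, G>(inputs: I, reference: F, candidate: G) -> Vec<Vec<u8>>
where
    I: IntoIterator<Item = Vec<u8>>,
    O: PartialEq,
    F: Fn(&[u8]) -> O,
    G: Fn(&[u8]) -> O,
{
    inputs
        .into_iter()
        // Keep only the inputs where the two outputs disagree.
        .filter(|input| reference(input.as_slice()) != candidate(input.as_slice()))
        .collect()
}
```

Inputs are raw bytes, not strings, so that cases which are not valid UTF-8 can be exercised as well.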

Alex Crichton recommended looking at as one option- it is worth looking at what a simple test harness should be and how to have code be reusable between tests.

#27805 Update with allocate_and_copy_string()

We have allocate_and_copy_string() in Rust, which uses tor_malloc().

We need to update our rust coding standards to mention allocate_and_copy_string().

#23880 Build tor with --enable-rust in Orbot and OnionBrowser

Hello! During our Rust discussions at the Montréal meeting, we discussed that it would be extremely useful to know — before we enable Rust by default — if doing so will cause issues for our packagers and downstreams, particularly on mobile. Would it be possible, please, for someone to create an experimental build of Orbot (and OnionBrowser!) building with ./configure --enable-rust [--enable-cargo-online-mode], and let us know any issues you encounter here?

#23882 Investigate implementing a Rust allocator wrapping tor_malloc

We should look into implementing the Rust alloc::allocator::Alloc trait as a wrapper around tor_malloc as a way to have a cleaner allocator interface in Rust moving forward (which still works with our current legacy C code).

This is what the Rust code in Firefox has done, and the alloc crate is supposed to be stabilised "soon" (as in, within the next six months) because FF is using it.

#23886 Write FFI bindings and function pointers for ed25519-dalek

As part of our efforts to get a few modules in Tor written in Rust for 0.3.3, an exceptionally easy candidate is our ed25519 code, given that the current code is already highly modularised, taking function pointers to implement an interface. I wrote ed25519-dalek, and I recently revised the API to be a very close match to what tor expects, so I believe this task should be extremely easy, and a prime candidate for someone newer to Rust who wishes to learn about writing FFI. (I'm happy to pair program on this too! Also on anything else, but this too.)

#27161 Add make check-rustfmt to make check

In #26972, we discovered that stable and nightly rustfmt disagree about the formatting of some of Tor's Rust code. Beta currently follows nightly.

On 13 September, the current beta branch will become stable, and the formatting differences should go away. (Rust releases happen every 6 weeks, and the last one was on 2 August 2018.)

Here's what we need to do:

#26337 Investigate making rust error types use the failure crate

As our Rust code increases, we'll eventually want a nicer way to convert between error types than we currently have. We'll probably want to use boats's failure crate. They mentioned a while ago that they were going to make a 1.0.0 release soon, and afaict there's not really anything about the current release that is expected to change, so we can probably start working on this now-ish.

[update] This ticket is pending stability of the failure crate and the direction of the error trait in Rust

#26373 should detect when it's being invoked improperly and error out

While attempting to test #26258, I noticed that running src/test/ from the top of the source tree exited with status 0 and no output. It should probably detect that it failed to find any Cargo.toml files and exit with a failure status with an error message. (This seems to happen because some necessary environment variables aren't set.)

Prior to #26258, the find invocation failing would probably have taken care of this, so this change should probably get back ported to the same releases.

#31862 Add a beta RUST_VERSION build to Travis CI

In #31859, we removed the beta Rust build to speed up CI.

We should add it back in, when we are actively developing Rust.

#27162 Travis: consider running CI on beta, nightly, and tor's lowest supported rust

At the moment, Tor's Travis CI runs on stable rust, and our Appveyor (Windows) CI doesn't run rust at all (#26954).

But in our privcount_shamir work (#25669), we've discovered some important bugs by running beta and nightly.

Since Rust releases every 6 weeks, we should run CI on rust stable and beta, so that we catch any show-stoppers before they are stable.

We might also want to consider an allow_failures nightly build.

If we choose to support a lower version of rust than stable (when a major distro freezes on a lower version), we should also CI that version.

#25628 Document our Rust coding standards for error/failure types

Every crate which returns Result<T, E>s or Option<T> anywhere in its public interface should have a module containing error types which implement either Display or Debug. See the addition to the protover crate from #24031 for an example.

In the future, when failure is 1.0.0, we should also require ::failure::Fail for making errors easier to work with between crates.
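
A minimal sketch of such an error module. The names errors, ParseError and parse_u32 are invented for this example and are not taken from the protover crate.

```rust
pub mod errors {
    use std::fmt;

    /// Hypothetical error type for a small parsing crate.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum ParseError {
        Empty,
        InvalidVersion,
    }

    impl fmt::Display for ParseError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                ParseError::Empty => write!(f, "input was empty"),
                ParseError::InvalidVersion => write!(f, "version is not a valid integer"),
            }
        }
    }

    // Implementing the standard Error trait lets callers box or chain it.
    impl std::error::Error for ParseError {}
}

/// Example of a public function that returns the crate's own error type.
pub fn parse_u32(s: &str) -> Result<u32, errors::ParseError> {
    if s.is_empty() {
        return Err(errors::ParseError::Empty);
    }
    s.parse().map_err(|_| errors::ParseError::InvalidVersion)
}
```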

#25386 Link Rust Tests to C Dependencies in Tor (allow integration testing from Rust to C)

currently, it is not possible to call C Tor, directly or indirectly, from rust tests. one of the following must be done:

  1. provide rust stubs for all C functions that may be needed for tests (impractical)
  2. test rust functions from C (so we will have C tests calling Rust functions calling C functions)
  3. link C functions into rust doctests (preferred)
  4. never call C-using rust functions in tests (leads to poor test coverage, very bad)

my branch implements option 3 poorly. this is a bad solution firstly because it is very ugly, and secondly because it does not properly pass the system linking arguments, e.g. -L/opt/ssl. thirdly, it may hide problems in dependency crates or cause them to be compiled incorrectly.

this ticket blocks a number of rust improvements, since of course we would like to actually test the improvements, and doctests are the best way to do it in rust.

#23878 Attempt rewriting buffers.c in Rust

In buffers.c, we define buf_t, which is essentially a doubly-linked list comprised of chunks of contiguously-allocated memory. During the Montréal meeting, we identified buf_t as a potentially good candidate datatype for reimplementation in Rust.

My understanding of possibly the ideal way to do this (after talking with Alex Crichton, withoutboats, nickm, and Nika Layzell) would be to entirely rethink the implementation in terms of a VecDeque<Bytes> using VecDeque from the stdlib and Bytes or another buffer type from the bytes crate. If this is something which works out, we could then (hopefully!) expose a similar API as to the C interface. (If that doesn't work out, there's only a couple points in the code which appear to rely on the current implementation of buf_t.)
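
A rough sketch of the chunked-buffer idea using only the standard library. Vec<u8> stands in for Bytes from the bytes crate, and ChunkBuf with its methods is a hypothetical API, not the buf_t interface.

```rust
use std::collections::VecDeque;

/// Chunked byte buffer. Vec<u8> is a stand-in for bytes::Bytes here.
#[derive(Debug, Default)]
pub struct ChunkBuf {
    chunks: VecDeque<Vec<u8>>,
    len: usize,
}

impl ChunkBuf {
    pub fn new() -> Self {
        Self::default()
    }

    /// Total number of bytes currently held.
    pub fn len(&self) -> usize {
        self.len
    }

    pub fn is_empty(&self) -> bool {
        self.len == 0
    }

    /// Append `data` as a new chunk at the tail.
    pub fn push(&mut self, data: &[u8]) {
        if !data.is_empty() {
            self.len += data.len();
            self.chunks.push_back(data.to_vec());
        }
    }

    /// Remove and return up to `n` bytes from the head.
    pub fn drain(&mut self, n: usize) -> Vec<u8> {
        let mut out = Vec::with_capacity(n.min(self.len));
        while out.len() < n {
            let mut chunk = match self.chunks.pop_front() {
                Some(c) => c,
                None => break,
            };
            let want = n - out.len();
            if chunk.len() > want {
                // Keep the first `want` bytes and put the rest back.
                let rest = chunk.split_off(want);
                self.chunks.push_front(rest);
            }
            out.extend_from_slice(&chunk);
        }
        self.len -= out.len();
        out
    }
}
```

Unlike Bytes, splitting a Vec<u8> copies the remainder, so this sketch shows the shape of the data structure and not its performance.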

#24033 Require all directory documents to be UTF-8

There are only a few places that directory documents can have arbitrary bytes today, and almost nobody is using them to encode anything besides UTF-8. Let's standardize on UTF-8 while we still can.

Step one will be for the authorities to start rejecting these documents. Once they're rejecting them, everybody else can begin rejecting them too.
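
On the Rust side, the check itself is a single standard library call. The wrapper name document_as_utf8 is hypothetical; where exactly such a check would sit in Tor's parsing code is not covered here.

```rust
use std::str::Utf8Error;

/// Accept a directory document only if all of it is valid UTF-8.
/// On failure, the error records how many leading bytes were valid.
/// Hypothetical wrapper for illustration.
pub fn document_as_utf8(raw: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(raw)
}
```

A caller that wants to log the offending offset can use valid_up_to() on the returned error.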

#27368 Authorities should reject non-UTF-8 in votes and consensuses

Part of #24033.

#27369 All Tor roles should reject non-UTF-8 in all directory documents

Part of #24033.

#27374 Bridge clients should reject non-UTF-8 in descriptors


#27414 Relays should check that their descriptors are UTF-8 before uploading them

#27415 All Tor instances should attempt to parse directory documents before uploading them

This helps us catch bugs like #16858 and #24821.

#25669 Privcount: blinding and encryption should be finished up

We're supposed to do this in 0.3.4. I don't know if there is anything here left to do, but in case there is (like merging to tor, testing more, etc) this is the master ticket.

#26161 Design and implement a Rust dirauth module

Some of our protoceratops* functions are only used when dirauths vote:

  • protover_compute_vote
  • protover_compute_for_old_tor

This function is implemented in Rust and C:

  • protover_compute_vote

We should work out how to split protover in Rust and C, and put the dirauth parts in a separate module.

* I blame autocorrect

#25381 Add crypto_rand_double_sign() in C and Rust

We need a function that returns 1.0 or -1.0 with equal probability, so we can avoid weird tricks that waste floating point precision.

Since we want to use this in the Laplace and Gaussian distributions, it needs to be implemented in both C and Rust.
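
One possible shape for the Rust side, assuming the caller already has a uniformly random byte from a cryptographic RNG. The name double_sign_from_bit and the idea of passing the byte in are assumptions for this sketch; the proposed crypto_rand_double_sign() would presumably draw the randomness itself.

```rust
/// Map the lowest bit of a uniformly random byte to +1.0 or -1.0.
/// The byte must come from a cryptographically secure RNG; this sketch
/// does not generate randomness itself. Hypothetical helper.
pub fn double_sign_from_bit(random_byte: u8) -> f64 {
    if random_byte & 1 == 0 {
        1.0
    } else {
        -1.0
    }
}
```

Because exactly half of all byte values have the low bit set, both signs are equally likely, and the result is an exact floating point value with no precision lost.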

#24116 Torsocks deadlocks every Rust program

Any Rust program that is run with torsocks will deadlock. This has nothing to do with networking, even the program 'fn main() { }' compiled with a recent rustc will deadlock when run as 'torsocks ./rust_torsocks'.

This is a backtrace I got when attaching to the deadlocked process:

#0  0xb7713cf9 in __kernel_vsyscall ()
#1  0xb76b9d92 in __lll_lock_wait ()
  at ../sysdeps/unix/sysv/linux/i386/lowlevellock.S:144
#2  0xb76b38de in __GI___pthread_mutex_lock (mutex=0xb770d024)
  at ../nptl/pthread_mutex_lock.c:80
#3  0xb77001ed in tsocks_mutex_lock ()
  from /usr/lib/i386-linux-gnu/torsocks/
#4  0xb7700334 in tsocks_once ()
  from /usr/lib/i386-linux-gnu/torsocks/
#5  0xb76fa25e in tsocks_initialize ()
  from /usr/lib/i386-linux-gnu/torsocks/
#6  0xb76fd02d in syscall ()
  from /usr/lib/i386-linux-gnu/torsocks/
#7  0x004a5049 in os_overcommits_proc ()
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/pages.c:252
#8  je_pages_boot ()
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/pages.c:297
#9  0x004745dd in malloc_init_hard_a0_locked ()
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/jemalloc.c:1366
#10 0x00474768 in malloc_init_hard ()
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/jemalloc.c:1493
#11 0x00489b95 in malloc_init ()
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/jemalloc.c:317
#12 ialloc_body (slow_path=true, usize=<synthetic pointer>,
  tsdn=<synthetic pointer>, zero=true, size=20)
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/jemalloc.c:1583
#13 calloc (num=1, size=20)
  at /checkout/src/liballoc_jemalloc/../jemalloc/src/jemalloc.c:1824
#14 0xb76d23ec in _dlerror_run (
  operate=operate@entry=0xb76d1b80 <dlopen_doit>, args=args@entry=0xbfb8cd10)
#15 0xb76d1c9e in __dlopen (file=0xb7703345 "", mode=1) at dlopen.c:87
#16 0xb76fa44f in ?? () from /usr/lib/i386-linux-gnu/torsocks/
#17 0xb7700352 in tsocks_once ()
  from /usr/lib/i386-linux-gnu/torsocks/
#18 0xb76fa25e in tsocks_initialize ()
  from /usr/lib/i386-linux-gnu/torsocks/
#19 0xb7724c65 in call_init (l=<optimized out>, argc=argc@entry=1,
  argv=argv@entry=0xbfb8ce74, env=0xbfb8ce7c) at dl-init.c:72
#20 0xb7724d8e in call_init (env=0xbfb8ce7c, argv=0xbfb8ce74, argc=1,
  l=<optimized out>) at dl-init.c:30
#21 _dl_init (main_map=<optimized out>, argc=1, argv=0xbfb8ce74,
  env=0xbfb8ce7c) at dl-init.c:120
#22 0xb7715a5f in _dl_start_user () from /lib/

It looks like tsocks_initialize() is called when libtorsocks is loaded, it calls tsocks_once() which locks a mutex and then calls dlopen() to get the libc symbols, dlopen() tries to allocate some memory which leads jemalloc (the default allocator for Rust programs) to try to call syscall() (it wants to open a proc file to see if the system overcommits memory or not), which is intercepted by libtorsocks, which leads to another call to tsocks_initialize()... and since the mutex is already locked, it deadlocks.

One way to fix this might be to just let through any syscall() calls that happen during bootstrapping, but I don't know the torsocks code well enough to know if this could cause any dangerous leaks.

Last modified on Mar 20, 2018, 11:56:10 AM