We should wrap our usage of hash digest functions (and XOFs) in Rust types which implement the appropriate traits, yet still exposes the same API functionality we currently have in C. To keep this task small, I think we should start off with just the sha2 code for now. (Later, it's probably some copy-paste and a bit of refactoring to provide the same interface for other digests, and similar for XOFs.)
I have a prototype (please don't merge! I have some cleanup work described in a couple XXX comments that I'd like to do first) just for SHA256. It could use a look over, just to make sure folks agree with the direction I've taken. See my bug24659branch (TravisCI results pass mostly, except for cargo in offline mode, because I added a dependency which isn't in the tor-rust-dependencies repo).
Note on dependencies: the digest crate only contains traits, so that cryptographic code can implement the same interfaces and be inter-operable (with hundreds of other crates!). Slightly unfortunately, at the moment, digest has a dependency on generic_array, but that'll be removed as soon as const generics are a thing in Rust (there's an accepted RFC, which is also IIUC available as a shepherded/mentored project for people wanting to contribute to Rust).
Trac: Points: N/Ato 1 Sponsor: N/Ato Sponsor3-can Status: new to needs_review
I've done the cleanup work, and also gone ahead and implemented SHA2-512 wrappers. XOFs still need to be done, and maybe a bit of utility work to wrap C functions that don't fit nicely into any of the provided traits.
There's a problem with linking. We already have some Rust code which calls C code in src/rust/external: it is used in protover.rs and it calls tor_version_as_new_as, the latter is in src/or/protover.c and is built into src/or/libtor.a. My new code needs to link to the functions in src/common/crypto_digest.h (from #24658 (moved)), which are eventually built into src/common/libor-crypto.a. The Rust code is compiled eventually into src/rust/target/*/libtor-rust.a, and at the very end of the build (I think?) it is linked to both libtor.a and libor-crypto.a to make the final executable. The issue was that it does not have symbols available for the functions in crypto_digest.h:
= note: /home/travis/build/isislovecruft/tor/src/rust/target/debug/deps/libexternal-792b24e27067f1d1.rlib(external-792b24e27067f1d1.3fgl34tvxe66etyo.rcgu.o): In function `external::crypto_digest::CryptoDigest::new::h0e98658512d8b319': /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:225: undefined reference to `crypto_digest_new' /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:233: undefined reference to `crypto_digest_new' /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:234: undefined reference to `crypto_digest256_new' /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:235: undefined reference to `crypto_digest256_new' /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:236: undefined reference to `crypto_digest512_new' /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:237: undefined reference to `crypto_digest512_new' /home/travis/build/isislovecruft/tor/src/rust/target/debug/deps/libexternal-792b24e27067f1d1.rlib(external-792b24e27067f1d1.3fgl34tvxe66etyo.rcgu.o): In function `external::crypto_digest::CryptoDigest::add_bytes::h3401b4d82d21aa04': /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:280: undefined reference to `crypto_digest_add_bytes' /home/travis/build/isislovecruft/tor/src/rust/target/debug/deps/libexternal-792b24e27067f1d1.rlib(external-792b24e27067f1d1.3fgl34tvxe66etyo.rcgu.o): In function `external::crypto_digest::get_256_bit_digest::h126330ad84e55865': /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:318: undefined reference to `crypto_digest_get_digest' /home/travis/build/isislovecruft/tor/src/rust/target/debug/deps/libexternal-792b24e27067f1d1.rlib(external-792b24e27067f1d1.3fgl34tvxe66etyo.rcgu.o): In function `external::crypto_digest::get_512_bit_digest::h0ac55a1ee3c0a1d3': /home/travis/build/isislovecruft/tor/src/rust/external/crypto_digest.rs:360: undefined reference to `crypto_digest_get_digest' collect2: error: ld returned 1 exit status
If I add a #[link!] directive to my code, like so, in src/rust/external/crypto_digest.rs:
= note: /usr/bin/ld: cannot find -ltor /usr/bin/ld: cannot find -lor-crypto collect2: error: ld returned 1 exit status
So, as far as I understand it (I'm not the greatest at build systems engineering), what probably needs to happen is something like this:
crypto_digest.[ch] are built into a new library, say libor-crypto-digest.a.
libor-crypto-digest.a is linked to when libtor-rust.a is being built.
libtor-rust.a still needs to somehow be able to find the symbol for tor_version_as_new_as from libtor.a??? (I have no idea how it was ever doing this!!)
libor-crypto.a, libor-crypto-digest.a, libtor-rust.a, and libtor.a all get linked together in the final stage.
(Also at some point we may need to have talks about avoiding circular dependencies if we have Rust calling C and C calling Rust. However, this doesn't currently involve any circular dependencies which can be avoided through further library modularisation.)
I took a stab at this some more in this commit but I'm still failing to get linking to happen for tests.
I took a break from this for a bit to work on other stuff, but I'm beginning to agree with the thing that Chelsea said in #23881 (moved) with the Rust logging stuff, about "The approach of 'write C integration tests and test Rust in isolation' is an approach taken by Servo." If I remove my tests for the correctness of the SHA-2 implementation here, and test only in C (which I'm not actually sure if we're doing yet?) then it'll link fine. The caveat though, which is quite steep, would be that anything that wished to test its Rust code and simultaneously use hashes in a non-generic way would be unable to do so. :/
-L takes a directory, not a file. see rustc --help
src/rust/external/.cargo/config is not read. you can verify with strace or inotify (or your platform equivalent). must use src/rust/.cargo/config (which already exists)
rust does not depend on the libraries in the makefile. if you want it to work this way, you'd need to add that, otherwise building tor fails. I think it would be better to use it only for tests, i.e. use RUSTFLAGS="-L... -l..." in src/test/test_rust.sh, and for the main tor build, let it do it automagically.
I've thought about this a bit more, and—since our C code that I'm wrapping is currently tested for correctness in C—I think that an easy workaround for this, would be temporarily (assuming we're eventually only going to require implementations in C or Rust, not both) doing what Chelsea Komlo's Rust tor logging code is doing: the wrapping implementation in release mode, and a dummy implementation in testing mode. For this, that would probably amount to a dev-dependency (Rust distinguishes between hard dependencies, optional dependencies, and "dev" or testing dependencies in Cargo.toml files) on the sha2 crate. This way, Rust code that is higher level (eventually) will actually be able to test correctness of results of functions which rely on the output of a hash function. (Probably we'll want to do the same for the rand crate when we wrap that. Except for the rand crate, it's a bit messed up, because we'll extremely want the rand::Rng trait in release/production in order to use generic code throughout the Rust ecosystem. Possibly we should nicely ask its maintainers to split the trait off into a separate crate, or otherwise rely on the fact that the crate is scheduled to be absorbed into libstd/libcore in Rust and just use it entirely, including calling it from C.)
I've thought about this a bit more, and—since our C code that I'm wrapping is currently tested for correctness in C—I think that an easy workaround for this, would be temporarily (assuming we're eventually only going to require implementations in C or Rust, not both) doing what Chelsea Komlo's Rust tor logging code is doing: the wrapping implementation in release mode, and a dummy implementation in testing mode. For this, that would probably amount to a dev-dependency (Rust distinguishes between hard dependencies, optional dependencies, and "dev" or testing dependencies in Cargo.toml files) on the sha2 crate. This way, Rust code that is higher level (eventually) will actually be able to test correctness of results of functions which rely on the output of a hash function. (Probably we'll want to do the same for the rand crate when we wrap that. Except for the rand crate, it's a bit messed up, because we'll extremely want the rand::Rng trait in release/production in order to use generic code throughout the Rust ecosystem. Possibly we should nicely ask its maintainers to split the trait off into a separate crate, or otherwise rely on the fact that the crate is scheduled to be absorbed into libstd/libcore in Rust and just use it entirely, including calling it from C.)
Err, sorry, I'm worried I wasn't explicit/clear enough. Let me try again, please. :)
So, what I'm worried about isn't that somehow our hash functions in C are somehow incorrect. That would be absurd. Hash functions either pass test vectors or they don't, and it would be inconceivable for them to pass some vectors and fail others, that would undermine all cryptography based on the random oracle model.
What I worry about is that, with the current code, there's no way to test higher-level, pure-Rust functionality. So for example, if I were to implement ed25519 signing in Rust for tor, with my ed25519-dalekcrate, which takes the hash function as a generic, i.e.:
/// Sign a message with this `ExpandedSecretKey`. pub fn sign<D>(&self, message: &[u8], public_key: &PublicKey) -> Signature where D: Digest<OutputSize = U64> + Default {
Hence, if we ever needed to test e.g. a Rust parser for descriptors, which contain ed25519 ntor-onion-key-crosscert signatures, we'd be unable to, since the test code can't call the signatures code because the signature code won't be able to call the digest code which wraps the C. This seems kind of bad moving forward, since so much of our code relies on hash functions. One long-term solution is to say that some things are only Rust or C, not both. Another solution is to do something like #25386 (moved), but I had a stab at that in this ticket, and it seems like we're not really moving forward in a sustainable way with testing linker issues.
Added some initial thoughts on the code. I don't have much in the way of thinking about the linking issues though -- let me know if you want me to hack on those, or if we should merge without them and have them be a separate issue?
Added some initial thoughts on the code. I don't have much in the way of thinking about the linking issues though -- let me know if you want me to hack on those, or if we should merge without them and have them be a separate issue?
not test any code which needs to call/use hash functions
add a dev-dependency to the Cargo.toml for the sha2 crate for use only in tests (e.g. do #[cfg(test)] pub type Sha256 = sha2::Sha256; or something)
write a crappy, slow sha2 implementation, for use only in tests
Not sure what's the least painful, but I really dislike !#2 (closed).
I guess we could merge now, if you think that's okay, and say that the XOF stuff is a new ticket, and also the linking/testing thing is another issue? I'm somewhat inclined to go with !#3 (closed), mostly because I don't really want to have to do !#4 (closed) (and then also implement keccak when the time comes).