Opened 3 months ago

Closed 3 weeks ago

Last modified 3 weeks ago

#25639 closed enhancement (wontfix)

think about Rust crate boundaries

Reported by: Hello71 Owned by: Hello71
Priority: Low Milestone:
Component: Core Tor/Tor Version: Tor: unspecified
Severity: Normal Keywords: rust
Cc: chelseakomlo, isis, manishearth@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

pros:

  • simpler is better
  • makes testing Rust a lot easier, otherwise there are problems with dependencies. see #25386 for more details
  • makes it easier for newbies. I was certainly very confused as to why each crate only has a single real file.

cons:

  • I discussed this on IRC with Sebastian, since it looks like he was the one who started splitting them up in the first place. he said he doesn't remember exactly why, but possibly it had something to do with a plan to swap in and out each crate individually, or possibly something to do with technical reasons regarding allocation or some such thing. I can't find anything regarding the latter, and I'm not convinced that the former is actually helpful.

therefore, if nobody has any objections, I propose moving all the Rust code into one crate. I asked on IRC, and somebody said that a ticket is better than mailing list, but I'm happy to copy and paste this into an email if that's better.

setting priority High since this blocks fixing #25386 with asan (unless anybody can think of better ways to do that), and that blocks proper Rust development.

Child Tickets

Change History (19)

comment:1 Changed 3 months ago by nickm

Cc: chelseakomlo isis added

(I'd like to know what the other rust people think here too)

comment:2 Changed 3 months ago by chelseakomlo

I don't think is a good idea. I agree this makes it simpler in the short term but this won't scale well. I.e, we definitely want more modularity in Rust/new code, not less.

Have you looked at how Servo handles this problem? https://github.com/servo/servo I think we should consult more Mozilla people about how they have handled issues like running tests, linking, etc.

Last edited 3 months ago by chelseakomlo (previous) (diff)

comment:3 Changed 3 months ago by chelseakomlo

Have you looked at how building is handled in tor_rust/lib.rs?

comment:4 Changed 3 months ago by chelseakomlo

Keywords: rust added

comment:5 Changed 3 months ago by chelseakomlo

comment:6 in reply to:  5 ; Changed 3 months ago by Hello71

Owner: set to Hello71
Status: newassigned
Summary: merge Rust cratesthink about Rust crate boundaries

Replying to nickm:

(I'd like to know what the other rust people think here too)

yeah, I meant this as kind of an RFC (that I was hoping there wouldn't be any objections to :P)

Replying to chelseakomlo:

I don't think is a good idea. I agree this makes it simpler in the short term but this won't scale well. I.e, we definitely want more modularity in Rust/new code, not less.

Have you looked at how Servo handles this problem? https://github.com/servo/servo I think we should consult more Mozilla people about how they have handled issues like running tests, linking, etc.

I will investigate. AIUI their situation is different though: servo is intended to be a fully enclosed module, if primarily intended for use in a single application, whereas in Tor we have tight coupling between everything. possibly we should consider having several distinct modules, but in that case we need to discuss how we draw those lines; if they're "everything depends on everything else" then it only causes problems to have them as separate crates, but if the dependencies are actually limited then it might make sense. oh, or if they've already been discussed, I think it needs to be better documented.

doc/HACKING/CodingStandardsRust.md:

If your Rust code must call out to parts of Tor's C code, you must
declare the functions you are calling in the external crate, located
at .../src/rust/external.

if this was actually followed, maybe we could use it for tests? not sure though.

Replying to chelseakomlo:

Have you looked at how building is handled in tor_rust/lib.rs?

I think you mean tor_rust/Cargo.toml? Yeah, I understand how it works now, I just don't think it's a good system. I'm open to being convinced otherwise though.

Replying to chelseakomlo:

For more context about the Rust build structure, see: https://trac.torproject.org/projects/tor/ticket/22840#comment:11

And how Gecko includes/builds Rust crates: https://github.com/mozilla/gecko-dev/blob/master/toolkit/library/rust/shared/lib.rs#L5-L24

I will investigate.

comment:7 in reply to:  6 Changed 3 months ago by chelseakomlo

Replying to Hello71:

Replying to chelseakomlo:

I don't think is a good idea. I agree this makes it simpler in the short term but this won't scale well. I.e, we definitely want more modularity in Rust/new code, not less.

Have you looked at how Servo handles this problem? https://github.com/servo/servo I think we should consult more Mozilla people about how they have handled issues like running tests, linking, etc.

I will investigate. AIUI their situation is different though: servo is intended to be a fully enclosed module, if primarily intended for use in a single application, whereas in Tor we have tight coupling between everything. possibly we should consider having several distinct modules, but in that case we need to discuss how we draw those lines;

Yes, that is fair. We most likely won't get these lines exactly correct right now though, so let's plan on having several distinct modules, and being able to evolve this as we learn/refactor code, etc. I think this ticket should be an ongoing discussion, as opposed to "let's hammer out our architecture right now." Flexibility will be a good asset here :)

It is a good point about documentation for module planning- I have some rough notes, as does a few other people. We will compile these and make them available.

doc/HACKING/CodingStandardsRust.md:

If your Rust code must call out to parts of Tor's C code, you must
declare the functions you are calling in the external crate, located
at .../src/rust/external.

Yes, this should be followed. And we can talk about whether modifying this approach makes sense- maybe every crate has an external.rs, for example. But isolating our external C dependencies is a good idea for enforcing clean Rust/C boundaries.

Replying to chelseakomlo:

Have you looked at how building is handled in tor_rust/lib.rs?

I think you mean tor_rust/Cargo.toml? Yeah, I understand how it works now, I just don't think it's a good system. I'm open to being convinced otherwise though.

This is taken from Gecko project structure, it would be worth talking to them about this setup and any issues we've had with it thus far- maybe they have already ran into these problems and have ideas about them?

I will investigate.

Thank you! This is definitely a lot of work, but getting this right will help us have a sane/maintainable project structure into the future. Thanks for the effort and looking further into what our options are.

comment:8 Changed 3 months ago by chelseakomlo

Cc: manishearth@… added

comment:9 Changed 3 months ago by manish.earth

From Firefox's point of view, there's basically a single crate (the "shared"[1] crate you see) that links in everything. The crate itself is (almost) empty aside from extern crate declarations. The individual crate dependencies can be cfgd on or off, but all of it must go through this single crate.

Firefox doesn't run Rust tests (i.e. cargo test tests). Servo does, in its CI, however Servo is a standalone application that doesn't need Firefox C code to run. The Servo compilation mode that's used in Firefox ("libgeckoservo") *does* need Firefox C code to run. There are some tests for the libgeckoservo mode, however they only test very specific things. The understanding is that Firefox's tests will take care of bugs in libgeckoservo, so we don't need to specifically test libgeckoservo (commits get CId on both sides).

If Firefox wishes to test Rust code, it gets put in the gktest crate, which integrates with the GTest framework which Firefox already uses.

The way it works is that the "shared" crate is linked to by an empty "gkrust"[2] crate, as well as another empty "gktest" crate[3]. The gkrust crate is what's actually included in the Firefox build, whereas the gktest crate additionally contains some extra test crates which have a bunch of extern "C" Rust test functions that get called.

You can see an example of some tests here[4], which get called from a GTest file here[5].


Applying this to Tor, the way to make it work would be to have a single toplevel crate, which optionally includes some testing crates which export test symbols that get called from a C test file for whatever framework y'all use. This isn't great; I'm hoping in the future cargo test will be better at this (we're working on custom test frameworks which should make this easier).

[1]: https://hg.mozilla.org/mozilla-central/file/tip/toolkit/library/rust/shared/Cargo.toml
[2]: https://hg.mozilla.org/mozilla-central/file/tip/toolkit/library/rust/Cargo.toml
[3]: https://hg.mozilla.org/mozilla-central/file/tip/toolkit/library/gtest/rust/Cargo.toml
[4]: https://hg.mozilla.org/mozilla-central/file/tip/xpcom/rust/gtest/nsstring/test.rs
[5]: https://hg.mozilla.org/mozilla-central/file/tip/xpcom/rust/gtest/nsstring/Test.cpp

comment:10 Changed 3 months ago by Hello71

yeah, I found https://github.com/rust-lang/rfcs/pull/2318. I haven't gone through the comments yet, but:

I have heard that Rust people want it to be really easy to use both Rust and C together in the same project, at least during a (potentially extended) migration. For projects that actually test though (hopefully all of them!), interlanguage testing needs to be really solid. In particular, the majority of C projects (i.e. the ones that are medium-sized like Tor and not huge like Gecko) have semantic rather than strictly-defined module separation. Tor is certainly modular, but (as far as I know) they have many cyclic dependencies that get worked out by the magic of static archives. This doesn't work in the following case though (AFAIK): suppose there are Rust crates A and B and C blob C. A and B don't depend on each other, but C depends on both. We want to test A. If we just add C to link, it doesn't work, because C has a hidden dependency on B. But if we add our Rust blob (we call it tor_rust), then we get multiple definition errors. Maybe I'm doing it wrong, but I think this is a key use case for Rust (gradual migration from full-C/C++ projects), so I feel like this should be really easy and not require that I re-architect all my C module boundaries or give up and just write tests in C.

Applying this to Tor, the way to make it work would be to have a single toplevel crate, which optionally includes some testing crates which export test symbols that get called from a C test file for whatever framework y'all use. This isn't great; I'm hoping in the future cargo test will be better at this (we're working on custom test frameworks which should make this easier).

I considered doing this; I assume this involves some sort of custom macro or (minor) boilerplate though, and either way we'd have to write or find a non-"standard" test framework, when we really just want to link in a few extra libraries. Really I was hoping that I could just do rustc --crate-type staticlib --test and have it actually work (the output would have a main or _start or whatever Rust does), and just link that normally. That would require --print native-static-libs to work through Cargo though.

comment:11 Changed 2 months ago by Hello71

Priority: HighLow

I have discovered that #25386 can in fact use shared libraries, so we don't have to do this right now.

nevertheless, I think we have a fairly reasonable system already. I think the problem is in our C modules. as I understand, those currently have complex dependencies. if true, it will be difficult to architect our Rust modules in a spaghetti-free manner. we should try to untangle those dependencies in tandem with moving them to Rust. also, we should have some documentation on which modules do what in a central location rather than in header files. these can both be done together with Rust migration and would both be useful even if Tor decides not to use Rust.

OTOH, I don't know how useful it is to actually make separate crates instead of one crate with modules. Rust already enforces module boundaries and supports "features" for conditional compilation. I'm not totally sure what features do, but I'm pretty sure they would work for us here. I talked to #rust about it at some length. I gathered that it's fine to dump everything together first (in separate modules of course) and then separate into crates when it becomes necessary, e.g. want to publish separately (on crates.io or otherwise), want to make a program that links against only a few, etc.

comment:12 Changed 3 weeks ago by chelseakomlo

Closing as this is we have settled on creating a shared crate as described by Manish above.

comment:13 Changed 3 weeks ago by chelseakomlo

Resolution: wontfix
Status: assignedclosed

comment:14 Changed 3 weeks ago by Hello71

so, as I understand... Manish, your suggested solution for #25386 is "write your tests in C (for now)"?

comment:15 Changed 3 weeks ago by manish.earth

I didn't suggest a solution there, I'm still not very sure on what the current form of the problem is.

comment:16 Changed 3 weeks ago by Hello71

basically comment:10, summary: "how do C programs that are trying to integrate Rust do tests (unit and integration)? assume we want to migrate gradually, but want to look at using all Rust eventually."

comment:17 in reply to:  16 ; Changed 3 weeks ago by chelseakomlo

Replying to Hello71:

basically comment:10, summary: "how do C programs that are trying to integrate Rust do tests (unit and integration)? assume we want to migrate gradually, but want to look at using all Rust eventually."

Currently tor has C unit and integration tests, and Rust unit tests. The problem we are currently facing is how to write Rust integration tests that call C- see #25386.

From https://trac.torproject.org/projects/tor/ticket/25639#comment:9, it sounds like FireFox/Servo do not have Rust integration tests that call C dependencies, and instead only have Rust unit tests and C unit/integration tests.

comment:18 in reply to:  17 Changed 3 weeks ago by chelseakomlo

Replying to chelseakomlo:

Replying to Hello71:

basically comment:10, summary: "how do C programs that are trying to integrate Rust do tests (unit and integration)? assume we want to migrate gradually, but want to look at using all Rust eventually."

Currently tor has C unit and integration tests, and Rust unit tests. The problem we are currently facing is how to write Rust integration tests that call C- see #25386.

From https://trac.torproject.org/projects/tor/ticket/25639#comment:9, it sounds like FireFox/Servo do not have Rust integration tests that call C dependencies, and instead only have Rust unit tests and C unit/integration tests.

For clarification, we are creating a shared crate for linking per discussion above, hence why this particular issue can be closed.

comment:19 Changed 3 weeks ago by manish.earth

Yeah, the suggestion there was to write your tests in Rust (as extern C functions) and drive them from C :)

Note: See TracTickets for help on using tickets.