Opened 9 months ago

Closed 4 months ago

#24031 closed defect (fixed)

Protover.rs could use a better algorithm

Reported by: nickm Owned by: isis
Priority: Very High Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version: Tor: 0.3.3.1-alpha
Severity: Normal Keywords: rust, 033-must, protover, security, 033-triage-20180326, 033-included-20180326
Cc: chelseakomlo, teor Actual Points: 5
Parent ID: Points: 1
Reviewer: nickm Sponsor: SponsorM-can

Description

This probably doesn't matter in practice, but: it would be cool if protover.rs used a smarter representation for sets of protocol versions than HashSet.  Maybe a BTreeSet of (low,high) tuples?

Child Tickets

Attachments (2)

protocol-vote.py (1.6 KB) - added by nickm 5 months ago.
O(n*m) algorithm to compute consensus of n votes containing m protocol ranges each
main.rs (813 bytes) - added by isis 4 months ago.

Download all attachments as: .zip

Change History (34)

comment:1 Changed 5 months ago by teor

Keywords: 033-must protover added
Milestone: Tor: unspecifiedTor: 0.3.3.x-final
Points: 1

It does matter in practice - using hashsets to list every supported protocol is a DoS risk:
https://trac.torproject.org/projects/tor/ticket/25252#comment:1

comment:2 Changed 5 months ago by teor

Priority: MediumHigh

comment:3 Changed 5 months ago by nickm

Milestone: Tor: 0.3.3.x-finalTor: 0.3.4.x-final

An ordered vec of ranges would also be fine.

There are better voting algorithms we could use too, so that we could remove the 65536-version limit. I'm attaching a prototype of one such algorithm in Python.

Changed 5 months ago by nickm

Attachment: protocol-vote.py added

O(n*m) algorithm to compute consensus of n votes containing m protocol ranges each

comment:4 Changed 5 months ago by isis

Owner: set to isis
Status: newaccepted

Assigning some 033-must tickets to myself.

comment:5 Changed 5 months ago by nickm

(FWIW, I think it would be okay to defer this to 0.3.4 if it isn't fast. We can just say "don't use Rust on dirauths yet", right?)

comment:6 in reply to:  5 Changed 4 months ago by isis

Replying to nickm:

(FWIW, I think it would be okay to defer this to 0.3.4 if it isn't fast. We can just say "don't use Rust on dirauths yet", right?)


I think it really does matter to get it in, even if we advise dirauths against using Rust, due to the potential for DoS that teor mentioned. Wouldn't the attack vector look something like this?:

  1. A bad relay with alternate/modified tor implementation says Link 1-65536,LinkAuth 1-65536,Relay 1-65536,HSIntro 1-65536,HSRend 1-65536,HSDir 1-65536,DirCache 1-65536,Desc 1-65536,Microdesc 1-65536,Cons 1-65536.
  1. The dirauths parse that in C and allow it.
  1. A Rust relay/client gets this consensus and suddenly it's got a HashSet with 10 keys, each one with 65536 32-bit integers in it, so that's 2560.8594 kilobytes allocated per relay that does this. (according to this [XXX this tiny rust program])
  1. Bad relay operator spins up ~400 relays and suddenly every Rust-enabled relay/client in the network is using over 1 GB of memory.
Version 0, edited 4 months ago by isis (next)

Changed 4 months ago by isis

Attachment: main.rs added

comment:7 in reply to:  5 Changed 4 months ago by isis

Replying to nickm:

(FWIW, I think it would be okay to defer this to 0.3.4 if it isn't fast. We can just say "don't use Rust on dirauths yet", right?)


Although, also yes, we should advise authorities not to use Rust until #24265 is finished, otherwise we risk splitting the vote.

comment:8 Changed 4 months ago by isis

Okay, I have an WIP (done actually, but I need to split it nicely into commits) implementation in my bug24031 branch.

Warning that I uh… also went a bit nuts on refactoring to have actual types for different things, moving functions into them as methods, and making them more rusty e.g. implementing proper traits like impl std::str::FromStr for ProtoEntry instead of having stuff like SupportedProtocols::from_proto_entries_string. The types are much nicer however, they really helped me reason about the code, and also I found several differences in behaviour to the C (which were due to us not having as much type safety as we could, e.g. if something doesn't parse correctly into a type, we want to make sure we're treating it the same way as when C can't parse it into a proto_entry_t or whatever, instead of just saying "this is a String" and "this is a HashMap").

Caveat emptor that even though I found some differences in return codes/parsing/etc to the C, I'm still not fully convinced there aren't more, and we should definitely do #24265 soon.

comment:9 Changed 4 months ago by isis

Status: acceptedneeds_revision

comment:10 Changed 4 months ago by isis

Actual Points: 5
Keywords: security added
Milestone: Tor: 0.3.4.x-finalTor: 0.3.3.x-final
Priority: HighVery High
Sponsor: SponsorM-can
Status: needs_revisionneeds_review
Version: Tor: 0.3.3.1-alpha

I've cleaned up (mostly! sorry sorry!) my code into more understandable commits in my bug24031_r4 branch.

The changes to the unittests might be a little hard to follow, that commit is not quite pleasant (although the integration test changes in that same commit should be simple to follow to see where/when behaviour has changed, so I'd argue that the internal unittests changing precariously isn't as much of an issue?). If test changes (or any changes) are difficult for the reviewer to understand, please feel free to ask questions, or make me split that commit up better.

The other thing is that one test I've added intentionally fails: protover_all_supported_should_include_version_we_actually_do_support. The behavioural difference is this: if we take a protover string "Link=3-999" and ask if it is supported, when "Link=1-5" is supported:

  • the C version returns "Link=3-999" (which is, IMHO, wrong, since we do support 3, 4, and 5).
  • the Rust version returns "Link=6-999", which I believe to be the correct implementation of the spec?

Upping the priority because potential DoS.

Upping the actual points to 5, although only ~3 were spent on this ticket, and another 2 finding/fixing related bugs elsewhere.

Marking as SponsorM-can, although that may be wrong; feel free to correct.

Changing the version to reflect inclusion in 0.3.3, along with the appropriate keywords.

Last edited 4 months ago by isis (previous) (diff)

comment:11 Changed 4 months ago by nickm

Cc: teor added

comment:12 Changed 4 months ago by dgoulet

Reviewer: nickm

Assign Reviewer to nickm for the 26/03/18 week.

comment:13 Changed 4 months ago by nickm

Keywords: 033-triage-20180326 added

Second batch of triage for 0.3.3: tickets that we didn't cover the first time.

comment:14 Changed 4 months ago by nickm

Keywords: 033-included-20180326 added

Marking 033-must tickets as included. Round 2.

comment:15 Changed 4 months ago by chelseakomlo

Hi Isis,

This looks great! Thanks for doing this and making these improvements, it is much cleaner!

There are a few things I noticed when looking at this, they are mostly nits. I'll do a deeper look by the end of this week, but this is looking significantly better.

  1. If possible, it would be ideal to keep all FFI/C related code in ffi.rs. This way we can keep Rust/C code cleanly separated and if Rust needs to use these functions in the future, we don't need to do any refactoring or translation (for example, compute_for_old_tor and get_supported_protocols). It looks like in at least once place we get a string as a cstring in Rust (in supported for ProtoEntry) and then translate into a string- we should just keep this all as strings in Rust and then only translate in ffi.rs when we actually want to send something to C.
  2. I really like how you broke out errors into their own file. Is this something we should add to our guide to writing Rust in Tor?
  3. Is generating ToString implementations at runtime a common Rust practice? If not, document impl_to_string_for_proto_entry
  4. supported for ProtoEntry maybe should return an actual error instead of an empty string

comment:16 Changed 4 months ago by chelseakomlo

Status: needs_reviewneeds_revision

comment:17 in reply to:  15 ; Changed 4 months ago by isis

Status: needs_revisionneeds_review

Replying to chelseakomlo:

Hi Isis,

This looks great! Thanks for doing this and making these improvements, it is much cleaner!


Hey! Thanks for the review!

There are a few things I noticed when looking at this, they are mostly nits. I'll do a deeper look by the end of this week, but this is looking significantly better.

  1. If possible, it would be ideal to keep all FFI/C related code in ffi.rs. This way we can keep Rust/C code cleanly separated and if Rust needs to use these functions in the future, we don't need to do any refactoring or translation (for example, compute_for_old_tor and get_supported_protocols). It looks like in at least once place we get a string as a cstring in Rust (in supported for ProtoEntry) and then translate into a string- we should just keep this all as strings in Rust and then only translate in ffi.rs when we actually want to send something to C.


So the reasoning for this is that static strings should live in the module which owns them. (This code is from #25185.) Since we needed a generalised way to work with static strings between both C and Rust without (re)allocations, in that ticket I created the cstr! macro, but a caveat is that it must take a string literal in order to create the CStr (&str is to String as &CStr is to CString, i.e. the static, not-heap-allocated equivalent), meaning that we can't assign it to a variable first and then put it into the macro elsewhere. E.g. it would be nicer if the protover::get_supported_protocols_cstr() function actually lived in the protover::ffi module, but then we'd need to call something from protover::ffi in the main protover.rs module, and that seemed way backwards to me?

So for the ProtoEntry::supported() method, it internally gets the &CStr from protover::get_supported_protocols_cstr() and immediately calls .to_str() on it so that we're working with the underlying &str in Rust, then it parses that.

Basically, the problem is that the macro needs to actually have literally the &static str inside it, it cannot work at compile-time if the &static str and the call to the macro live in different places, because at compile-time when the macro is expanded there isn't yet a symbol table available for looking up the &static str if it were to live somewhere else.

  1. I really like how you broke out errors into their own file. Is this something we should add to our guide to writing Rust in Tor?


Probably! I was going to draft it into a guideline, but this will also become much better when we can use failure which is supposed to be 1.0.0 any day now. Adding it for ProtoverError is a one line addition:

impl ::failure::Fail for ProtoverError {};

I've made #25628 for tracking this.

  1. Is generating ToString implementations at runtime a common Rust practice? If not, document impl_to_string_for_proto_entry


It's compile-time! :) To see what it's generating, add this to the top of lib.rs (and use a nightly compiler):

#![feature(trace_macros)]

trace_macros!(true);

I don't know if it's standard to do it for ToString, but people generally use macros to generate trait implementations for types that should have the same implementation to avoid copy-pasting code that could get out of sync. (I see people do it a lot for Display/Debug/Clone, etc.)

  1. supported for ProtoEntry maybe should return an actual error instead of an empty string


It is! ProtoEntry::supported() returns ProtoverError because it sets the supported variable to an empty string if there was an error, and then returns supported.parse().

impl ProtoEntry {
    // [...]
    pub fn supported() -> Result<Self, ProtoverError> {
        let supported_cstr: &'static CStr = get_supported_protocols_cstr();
        let supported: &str = match supported_cstr.to_str() {
            Ok(x)  => x,
            Err(_) => "",
        };
        supported.parse()
    }
}

(supported.parse() is calling impl FromStr for ProtoEntry, and a blank string is not a valid ProtoEntry as the test_protoentry_from_str_empty() shows.)

Do you think it might read better if I refactored setting supported to be like this?

        let supported: &str = supported_cstr.to_str().unwrap_or("");

Setting as needs_review again? Not sure what the correct ticket state is? Feel free to change it back if you still think revision is necessary somewhere!

comment:18 in reply to:  17 Changed 4 months ago by isis

Replying to isis:

Do you think it might read better if I refactored setting supported to be like this?

        let supported: &str = supported_cstr.to_str().unwrap_or("");


I think this is easier, and it's also shorter? I added it to my bug24031_r4 branch in commit 8614ec7b5.

comment:19 Changed 4 months ago by isis

Another thing that would be nice to have feedback on is that this unittest fails, intentionally, because I interpreted the spec differently to the C implementation:

---- protover_all_supported_should_include_version_we_actually_do_support stdout ----
	thread 'protover_all_supported_should_include_version_we_actually_do_support' panicked at 'assertion failed: `(left == right)`
  left: `"Link=6-999"`,
 right: `"Link=3-999"`', protover/tests/protover.rs:327:5

If a relay supports "Link=1-5", and you ask it if it supports "Link=3-999" then the C code will return what's on the right, and the Rust returns what's on the left. (My reasoning was that if we actually support "Link=3-5", then they shouldn't be in the set of unsupported things returned.)

comment:20 Changed 4 months ago by nickm

On the unit test for protover_all_supported(): Both behaviors seem fine to me, since we're only using the results for a human-readable log. I'm fine with the behavior change, but if we're going to make it, we should make it in both versions.

comment:21 Changed 4 months ago by nickm

Also -- could I have this as a github pull request, or failing that, as a gitlab pull request? I'd like to try reviewing this with a UI, since the patch series is >3k lines long :)

comment:22 in reply to:  21 ; Changed 4 months ago by isis

Status: needs_reviewneeds_revision

Replying to nickm:

Also -- could I have this as a github pull request, or failing that, as a gitlab pull request? I'd like to try reviewing this with a UI, since the patch series is >3k lines long :)


Yep! Will submit to Github? (Wow, this is new.) Should I make an actual PR to https://github.com/torproject/tor? Or would you prefer oniongit's interface?

I found a few more small problems with differences to the C implementation that I want to fix, but I should have the patches/tests done today.

comment:23 in reply to:  22 Changed 4 months ago by isis

Status: needs_revisionneeds_review

Replying to isis:

Replying to nickm:

Also -- could I have this as a github pull request, or failing that, as a gitlab pull request? I'd like to try reviewing this with a UI, since the patch series is >3k lines long :)


Yep! Will submit to Github? (Wow, this is new.) Should I make an actual PR to https://github.com/torproject/tor?


Okay, more things fixed, it's passing all tests now and I ported all the missing tests that we had in C also to the Rust, which found some bugs that I fixed, and added more tests to both C and Rust and fixed differences in both. Travis passes.

Branch: bug24031_r4 (on git.tpo or github
PR: https://github.com/torproject/tor/pull/33

comment:24 Changed 4 months ago by isis

One potential difference to the C that I didn't test behaviourally or investigate (but it was mentioned in the documentation/comments in a C function) was that C tried to use "as few ranges as possible". I think that means e.g. "Foo=1,2" instead of "Foo=1-2"? If so, the Rust does the opposite.

comment:25 in reply to:  24 Changed 4 months ago by teor

Replying to isis:

One potential difference to the C that I didn't test behaviourally or investigate (but it was mentioned in the documentation/comments in a C function) was that C tried to use "as few ranges as possible". I think that means e.g. "Foo=1,2" instead of "Foo=1-2"? If so, the Rust does the opposite.

The Rust and C behaviour is currently the same, they both prefer "Foo=1-2".
(I think comment about "as few ranges as possible" refers to merging ranges like "Foo=1-2,3-4" into "Foo=1-4".)

The current authorities produce consensus lines like:

pr Cons=1-2 Desc=1-2 DirCache=1-2 HSDir=1-2 HSIntro=3-4 HSRend=1-2 Link=1-5 LinkAuth=1,3 Microdesc=1-2 Relay=1-2

http://154.35.175.225/tor/status-vote/current/consensus-microdesc

comment:26 Changed 4 months ago by nickm

Okay, I've taken a pass over the code. I'm trusting the tests a bunch here, but I'm generally liking what I'm seeing here. I left some specific comments and questions on the github branch; please let me know if they're visible?

comment:27 Changed 4 months ago by nickm

Update: I like the fixups you made last night, and I think I buy your argument about the new C protover_all_supported implementation. There are a couple of comments on some of your later commits that you didn't get to yet, however -- I'd especially want to know about the testing-mode build one on f6377a4.

Once you're happy with that, our next step is to make a new _r5 branch, and get the branch into usable condition. There are two changes that will be needed for that:

  • It needs to be based on maint-0.3.3 if we're going to try to merge it into 0.3.3; the current version of this branch is baased on master.
  • It needs to be squashed.

I tried rebasing and squashing it myself, using 'git rebase master --onto maint-0.3.3 --autosquash -i', but I ran into conflicts that I'd rather not try to resolve myself.

comment:28 Changed 4 months ago by nickm

Status: needs_reviewneeds_revision

comment:29 in reply to:  27 Changed 4 months ago by isis

Status: needs_revisionmerge_ready

Replying to nickm:

Update: I like the fixups you made last night, and I think I buy your argument about the new C protover_all_supported implementation. There are a couple of comments on some of your later commits that you didn't get to yet, however -- I'd especially want to know about the testing-mode build one on f6377a4.


Yep! I just hadn't finished getting to all the comments yet. They should be addressed now!

Once you're happy with that, our next step is to make a new _r5 branch, and get the branch into usable condition. There are two changes that will be needed for that:

  • It needs to be based on maint-0.3.3 if we're going to try to merge it into 0.3.3; the current version of this branch is baased on master.
  • It needs to be squashed.

I tried rebasing and squashing it myself, using 'git rebase master --onto maint-0.3.3 --autosquash -i', but I ran into conflicts that I'd rather not try to resolve myself.


Okay, I made a bug24031_r5 branch (CI passes).

For the 0.3.4 changes, I made a squashed version of it in my bug24031_r5_squashed branch (CI passes).

For the 0.3.3 changes, I took the squashed 0.3.4 branch above and did the rebase command you suggested; that branch is bug24031_r5_squashed_033 (CI passes).

comment:30 Changed 4 months ago by nickm

Status: merge_readyneeds_revision

whoops -- one last thing. Could you add a changes file to those branches?

comment:31 Changed 4 months ago by isis

oops, changes file added.

comment:32 Changed 4 months ago by nickm

Resolution: fixed
Status: needs_revisionclosed

Merged!

Note: See TracTickets for help on using tickets.