#25573 closed defect (fixed)

Track half-closed stream IDs

Reported by: mikeperry Owned by:
Priority: Medium Milestone: Tor: 0.3.4.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: guard-discovery-stats 034-backport
Cc: dmr, asn Actual Points:
Parent ID: #25574 Points:
Reviewer: teor Sponsor: SponsorV-can

Description (last modified by dmr)

In order to eliminate a side channel attack described in https://petsymposium.org/2018/files/papers/issue2/popets-2018-0011.pdf ("DropMark" attack) we need a way to determine if a stream id is invalid.

Many clients (particularly Firefox) will hang up on streams that still have data in flight. In this case, Tor clients send RELAY_COMMAND_END when they are done with a stream, and immediately remove that stream ID from their valid stream mapping. The remaining application data continues to arrive, but is silently dropped by the Tor client. The result is that this ignored stream data currently can't be distinguished from injected dummy traffic with completely random stream IDs, and this fact can be used to mount side channel attacks.

A similar situation exists for spurious RELAY_ENDs.

Child Tickets

TicketStatusOwnerSummaryComponent
#27686closednickmOOM code should count space used in a circuit's half-closed listCore Tor/Tor

Change History (27)

comment:1 Changed 21 months ago by mikeperry

Parent ID: #25574

comment:2 Changed 19 months ago by nickm

Keywords: 035-roadmap-proposed needs-proposal added
Milestone: Tor: unspecified

comment:3 Changed 17 months ago by dmr

Cc: dmr added

comment:4 Changed 17 months ago by mikeperry

Summary: Create a RELAY_COMMAND_END_ACK cell typeTrack half-closed stream IDs

Better idea: since malicious endpoints can withhold the ACK, we might as well just introduce a half-closed state for edge connections. With a simple struct, we can keep track of the outstanding END, SENDME windows, and data windows, and decrement those, and treat packets within the window as valid, until we get an END from the other side.

This also has the advantage that we don't have to wait for anyone to upgrade.

comment:5 Changed 17 months ago by mikeperry

Description: modified (diff)

Wrt DropMark, we can still count very early invalid cells as dropped in the vanguards addon (and eventually in tor-core), and react to those. This behavior won't prevent that, since there will be no streams at that point.

comment:6 in reply to:  5 Changed 17 months ago by dmr

Description: modified (diff)

Replying to mikeperry:

Wrt DropMark, [...]

Adding parenthetical to tie that term 'DropMark' to the paper (it might not otherwise be obvious by context).

comment:7 Changed 17 months ago by mikeperry

Cc: asn added
Keywords: 034-backport added; 035-roadmap-proposed needs-proposal removed
Milestone: Tor: unspecifiedTor: 0.3.4.x-final
Status: newneeds_review

https://github.com/torproject/tor/pull/262

Removing the needs-proposal and 035 tags because this version of the fix does not change the protocol or even change client behavior. It only changes the values reported to the CIRC_BW event for the vanguards controller to react to.

As such it would be great if we could backport this to 0.3.4 so that we can eliminate these classes of false positives in the vanguard addon. I have tested it with vanguards and it fixes both https://github.com/mikeperry-tor/vanguards/issues/3 and https://github.com/mikeperry-tor/vanguards/issues/25, which I am able to easily reproduce otherwise.

comment:8 Changed 17 months ago by asn

Reviewer: teor

comment:9 Changed 17 months ago by mikeperry

Status: needs_reviewneeds_revision

teor: If you care, this needs another block of the same checks when the circuit is in purpose PATH_BIAS_TESTING. I also want to reorg the branch a bit when I do that.

Despite the backport snafu, I would like review on 0.3.4 branch until we're satisfied with it. I imagine review will generate a bunch of changes that will be a pain to fix on both branches, and I'd like to do 0.3.4 first so I can keep testing it with the vanguards addon as we go. Then we can merge a rebased version of that result forward to master with just one merge commit for a spot-check as opposed to juggling fixups on both branches and generating a lot of confusion.

comment:10 Changed 16 months ago by mikeperry

Status: needs_revisionneeds_review

https://github.com/torproject/tor/pull/270 - Ok new reorganized pull request.

comment:11 Changed 16 months ago by teor

Status: needs_reviewneeds_revision

Please see my comments on the pull request.

Overall, this code seems ok, but it's very large for a backport to a release candidate.

(I understand you've already spoken with Nick on IRC about backporting the code. I don't think this ticket is a good place for backport conversations. Perhaps sending an email to tor-dev would be a good way of addressing concerns?)

comment:12 Changed 16 months ago by mikeperry

Status: needs_revisionneeds_review

Ok I have pushed fixup commits for your code review comments and linked them in reply on the github pull request review.

Additionally, I pushed a squashed branch to mikeperry/ticket25573-v2-squashed, for any fresh review and for reference in backport discussion (which I think I'll take up on network-team).

Last edited 16 months ago by mikeperry (previous) (diff)

comment:13 Changed 16 months ago by mikeperry

Ok I also decided to go ahead and sort the list of half-closed connections, so we can bsearch them on the common operations of cell arrival, and also guard against duplicate stream ids. I added tests for duplicates and list management correctness as well. These are additional commits on each branch (the pull request with the fixups from the first review, and also as separate commits on top of mikeperry/ticket25573-v2-squashed).

When we think this is merge-ready, I can do the rebase dance so we actually merge to master, but I'd still like to avoid doing that until we're sure we like it.

comment:14 Changed 16 months ago by teor

Status: needs_reviewneeds_revision

Please make check-spaces:

perl ./scripts/maint/checkSpace.pl -C \
		./src/common/*.[ch] \
		./src/or/*.[ch] \
		./src/test/*.[ch] \
		./src/test/*/*.[ch] \
		./src/tools/*.[ch]
...
 DoubleNL:./src/test/test_relaycell.c:530
make[1]: *** [check-spaces] Error 1
make[1]: *** Waiting for unfinished jobs....

https://travis-ci.org/torproject/tor/jobs/421209942#L3699

comment:15 Changed 16 months ago by teor

The branch looks good now.

Here are my remaining minor concerns:

  • Please make check-spaces
  • Do you test what happens when stream ids wrap? Do you think that's important now with all the other changes?

comment:16 in reply to:  15 Changed 16 months ago by mikeperry

Replying to teor:

The branch looks good now.

Here are my remaining minor concerns:

  • Please make check-spaces

Fixed, sorry about that.

  • Do you test what happens when stream ids wrap? Do you think that's important now with all the other changes?

There are a couple of different types of "wrapping". One is if we pick the initial stream_id for a circuit near UINT16_MAX, and that wraps around. This wrap will happen in get_unique_stream_id_by_circ(), which I don't currently test, but should be fine. I can easily test this, and I will in a fixup. (I also just realized I forgot to switch to bsearch in get_unique_stream_id_by_circ(), so I should do that and test it).

connection_half_edge_add() itself should succeed until the opened list is empty and the half-closed list is full of 65536 IDs. Since connection_half_edge_add() enforces uniqueness (which I already test for), by pigeon-hole principle with streamid_t (uint16_t), it will thus fail due when full to a duplicate stream ID in the list. So wrapping in the half-closed code itself is currently effectively tested to not be possible.

The other way that they can "wrap" is if more than 65536 streams total are split between the opened list and/or the half-closed list. Since get_unique_stream_id_by_circ() enforces uniqueness, by pigeon-hole principle, all stream IDs will be taken, so get_unique_stream_id_by_circ() should fail upon new stream creation.

However, I do not test for the uniqueness property of get_unique_stream_id(), and I don't think anything else does either. I can test get_unique_stream_id_by_circ() with a full (and full-1) half-closed array and ensure it succeeds and then fails, as a way of testing the uniqueness property as well as a full array.

I also do not test the consequences of this failure (there are 3 places in the codebase that check for it), but failure means that the circuit is marked unusable for new streams, and the stream is retried. Going farther out than one call layer from the change seems like test-creep though, so I don't plan to test all of these places. Hopefully they are covered in their respective component tests. They look simple and solid anyway tho.

Last edited 16 months ago by mikeperry (previous) (diff)

comment:17 Changed 16 months ago by mikeperry

Status: needs_revisionneeds_review

Ok the following fixups have been pushed to mikeperry/ticket25573-v2:

The space is fixed in c2dbb09dc3e4ed2b9d5a845a592a2d59d2d5e225.

Using bsearch in get_unique_stream_id_by_circ() is fixed in 99153878cb4b7ae309eb7f5109747e3a49ac00e9.

Tests for stream id wrapping and other get_unique_stream_id_by_circ() cases are in b8ae08caa059dc8f937a4afe9f929035b8beb712.

I have pushed a cleanly reorganized and squashed branch to mikeperry/ticket25573-v2-squashed2 (aka https://github.com/torproject/tor/pull/296). Its has 0 delta to mikeperry/ticket25573-v2, and everything has been squashed into 4 commits (a const for smartlist_bsearch; all half-stream tracking code+tests; prevent stream-id reuse code+tests; and truncate accounting code+tests). Only the last two commits change client behavior.

comment:18 in reply to:  17 Changed 16 months ago by teor

Replying to mikeperry:

Ok the following fixups have been pushed to mikeperry/ticket25573-v2:

The space is fixed in c2dbb09dc3e4ed2b9d5a845a592a2d59d2d5e225.

Using bsearch in get_unique_stream_id_by_circ() is fixed in 99153878cb4b7ae309eb7f5109747e3a49ac00e9.

These commits seem fine.

Tests for stream id wrapping and other get_unique_stream_id_by_circ() cases are in b8ae08caa059dc8f937a4afe9f929035b8beb712.

I suggest you use tor's crypto_rand() for these tests.

Here are some reasons from our ChangeLog:

    - Use our own weak RNG when we need a weak RNG. Windows's rand() and
      Irix's random() only return 15 bits; Solaris's random() returns more
      bits but its RAND_MAX says it only returns 15, and so on. Motivated
      by the fix for bug 7801; bugfix on 0.2.2.20-alpha.

Also, I'm pretty sure some compilers on some OSes will warn if you use rand().

I have pushed a cleanly reorganized and squashed branch to mikeperry/ticket25573-v2-squashed2 (aka https://github.com/torproject/tor/pull/296). Its has 0 delta to mikeperry/ticket25573-v2, and everything has been squashed into 4 commits (a const for smartlist_bsearch; all half-stream tracking code+tests; prevent stream-id reuse code+tests; and truncate accounting code+tests). Only the last two commits change client behavior.

The old branch was cancelled on appveyor, and the new branch timed out:

So I pushed your branch to my github to trigger another build:

And made myself an appveyor admin so I could trigger future builds more easily (#27372).

comment:19 Changed 16 months ago by teor

Status: needs_reviewneeds_revision

comment:20 Changed 16 months ago by mikeperry

Status: needs_revisionneeds_review

Ok, I pushed b13db83b7db00d9dd8d4cd6842e5e2f5c9558542 to mikeperry/ticket25573-v2, which is e5c5f90b6311cc469befbdeb2bb31abc7d53da04 on top of mikeperry/ticket25573-v2-squashed2.

The new fully squashed branch is mikeperry/ticket25573-v2-squashed3 with pull request at https://github.com/torproject/tor/pull/297. So glad I'm not trying to also juggle a branch on top of master yet.

I chose not to bother to seed the rng because we don't have direct helpers for that (other than a small maze of AES mocking) and it doesn't affect coverage anyhow.

comment:21 Changed 16 months ago by teor

Status: needs_reviewmerge_ready

Ok, looks good to me.

Start the merge discussion :-)

comment:22 Changed 16 months ago by mikeperry

Ok. I have renamed the 034 branch to ticket25573-034. It is 0 delta to ticket25573-v2 and ticket25573-v2-squashed3. It lives at: https://github.com/torproject/tor/pull/299

I created a merge commit to master from that branch as ticket25573-master: https://github.com/torproject/tor/pull/298

Let me know if that merge commit to master is how we want to do things. This is my first merge across the reorg border.

If anyone wants further changes, please advise how you want me to manage the branches/merge commits (ie fixups on the merge commit, or on the previous commits, or squashed or rebased or not, etc).

As for 034 backport, I guess the key question is: are any of these things "Smaller bugs that significantly impact user experience", as per https://trac.torproject.org/projects/tor/wiki/org/teams/NetworkTeam/CoreTorReleases. If vanguards usage counts as user experience, then I think yes. The stream id data corruption issue that this branch fixes might count, too, but it's probably rare in practice.

comment:23 Changed 15 months ago by nickm

This looks really clean and comprehensible. Nicely done.

I'm going to merge ticket25573-master to master, and if it doesn't cause any trouble over the first couple of 0.3.5-alpha releases, we should backport.

comment:24 Changed 15 months ago by Hello71

src/test/test_relaycell.c:41:5: warning: type of ‘pathbias_count_valid_cells’ does not match original declaration [-Wlto-type-mismatch]
 int pathbias_count_valid_cells(origin_circuit_t *circ,
     ^
src/feature/client/circpathbias.c:930:1: note: return value type mismatch
 pathbias_count_valid_cells(circuit_t *circ, const cell_t *cell)
 ^
src/feature/client/circpathbias.c:930:1: note: type ‘void’ should match type ‘int’
src/feature/client/circpathbias.c:930:1: note: ‘pathbias_count_valid_cells’ was previously declared here
src/feature/client/circpathbias.c:930:1: note: code may be misoptimized unless -fno-strict-aliasing is used
Last edited 15 months ago by Hello71 (previous) (diff)

comment:25 Changed 15 months ago by nickm

Added a fix in bug25573-034-typefix; merged to master.

comment:26 Changed 15 months ago by nickm

ow; there a bunch of those mismatches. I've done a better fix as part of #27772 and backported it to bug25573-034-typefix. Not yet merged in master or 0.3.4.

comment:27 Changed 13 months ago by nickm

Resolution: fixed
Status: merge_readyclosed

Backported to 0.3.4. This will take some testing to make sure we backported everything right, but it's a good time to let it burn in.

Note: See TracTickets for help on using tickets.