Opened 5 months ago

Closed 4 months ago

#25903 closed defect (implemented)

Add OVERHEAD and DELIVERED fields to CIRC_BW events

Reported by: mikeperry Owned by: mikeperry
Priority: Medium Milestone: Tor: 0.3.4.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: 034-roadmap-proposed
Cc: Actual Points:
Parent ID: #25546 Points:
Reviewer: isis Sponsor:

Description

I want to have information about how much data is actually for applications, so that external controllers can check for circuit side channel attacks.

To do this, I want to add two fields to CIRC_BW events: a PACKAGED field that reports the sum of all rh.length values for all relay cells, and a DELIVERED field that reports the sum of all rh.length for *valid* RELAY_DATA cells.

That way, controllers can check for large differences between PACKAGED and DELIVERED, to see if lots of data is being sent on a circuit that is getting silently dropped. They can also check for differences between the raw CIRC_BW value and DELIVERED, to see if an adversary is underfilling cells (though this may be far more prone to false positives).

Child Tickets

Change History (17)

comment:2 Changed 5 months ago by mikeperry

Status: needs_reviewneeds_revision

Ok I forgot that rh.length will be 0 or small for many relay commands. So we're not able to get a dropped data count with the PACKAGED stat as-is. I might just remove that and only have DELIVERED, unless I can think of something clever.

Ideally the values reported here would allow us to differentiate between excess data on the circuit due to cells being valid but not full vs excess data due to dropped cells.

comment:3 Changed 5 months ago by mikeperry

Status: needs_revisionneeds_review

Second try:

https://oniongit.eu/mikeperry/tor/commits/bug25903_v2 for code changes.
https://gitweb.torproject.org/user/mikeperry/torspec.git/log/?h=bug25903_v2 for spec changes.

In those two, I add DELIVERED and OVERHEAD. DELIVERED is the same as before (sum of RELAY_COMMAND_DATA rh.length fields). OVERHEAD is the sum of the extra unused space in RELAY_COMMAND_DATA fields.

The idea is that the original READ and WRITTEN fields can be used to count all of the cell data sent on the circuit. Then you can add DELIVERED+OVERHEAD, and subtract that from the original CIRC_BW fields and get the count of bytes consumed due to all non-application cells. This should be an approximation of the dropped cell data on the circuit (not counting legit non-application data like SENDME cells, HS state machine cells).

The plan is to use this to set limits in the vanguards script on the ratio of RELAY_COMMAND_DATA to other stuff on the circuit.

comment:4 Changed 5 months ago by mikeperry

Status: needs_reviewneeds_revision
Summary: Add PACKAGED and DELIVERED fields to CIRC_BW eventsAdd OVERHEAD and DELIVERED fields to CIRC_BW events

Ok tested this for a while with the vanguards script and realized that I will also need to count stream BEGIN, RESOLVE, and END as delivered application data, or this thing is going to have false positives wrt dropped data.

comment:5 Changed 5 months ago by mikeperry

Status: needs_revisionneeds_review

Third time's a charm:
​​https://oniongit.eu/mikeperry/tor/commits/bug25903_v3 for code changes.
​​https://gitweb.torproject.org/user/mikeperry/torspec.git/log/?h=bug25903_v3 for spec changes.

This version counts any relay cell the client decides to write as valid delivered written data. For received data, it counts relay cells as valid delivered data if they do not trigger an error condition in the code that handles them.

I've been testing this with https://github.com/mikeperry-tor/vanguards/blob/master/vanguards/bandguards.py (which is a prototype of #23980) on client and service side and have gotten it down to no false positives on service side. On client side, if the client closes the connection while data is in flight, then that stream id will be unknown, and those packets will be dropped data (need #25573 for that).

Last edited 5 months ago by mikeperry (previous) (diff)

comment:6 Changed 5 months ago by dgoulet

Milestone: Tor: 0.3.4.x-final

Without a Milestone, that ticket is good as lost ;).

comment:7 Changed 5 months ago by dgoulet

Reviewer: isis

comment:8 Changed 5 months ago by mikeperry

FYI, I updated the changes file and one of the commit messages here. The new branch is bug25903_v4, which is also in a pull request at: https://github.com/torproject/tor/pull/71

The torspec branch is still
​​​https://gitweb.torproject.org/user/mikeperry/torspec.git/log/?h=bug25903_v3

Last edited 5 months ago by mikeperry (previous) (diff)

comment:9 Changed 5 months ago by isis

Status: needs_reviewneeds_revision

Mike mentioned earlier in the week that this work has known issues remaining and review should be delayed… not sure if that's still accurate? lmk

comment:10 Changed 5 months ago by mikeperry

Owner: set to mikeperry
Status: needs_revisionaccepted

As I said the issues were in the changes file and the commits, which were fixed in the pull request you requested.

comment:11 Changed 5 months ago by mikeperry

Status: acceptedneeds_review

comment:12 Changed 5 months ago by mikeperry

Ok I think I found a bug in tor. I am not sure, but I think there should be a return in connection_edge_process_relay_cell_not_open() when the connected cell can't be parsed. The side effect of the parse failure is the addr will be AF_UNSPEC later, so I moved the accounting call inside that block.

New pull request, this time with unit tests: https://github.com/torproject/tor/pull/101

comment:13 in reply to:  12 Changed 5 months ago by isis

Status: needs_reviewneeds_revision

Replying to mikeperry:

Ok I think I found a bug in tor. I am not sure, but I think there should be a return in connection_edge_process_relay_cell_not_open() when the connected cell can't be parsed. The side effect of the parse failure is the addr will be AF_UNSPEC later, so I moved the accounting call inside that block.

New pull request, this time with unit tests: https://github.com/torproject/tor/pull/101


Okay, left a review for you on PR#101. There's just one small thing I'm worried about with an assert() not necessarily being true all the time in the future.

comment:14 Changed 5 months ago by mikeperry

Status: needs_revisionneeds_information

Hrmm. Right now there are no relay cell payloads larger than RELAY_PAYLOAD_SIZE, and checks against this max are earlier in all of the code that calls this function.

Is this your only issue with the branch? IMO, in that case we should merge this, and then change this and the other checks on relay payload length to whatever mechanism we have for specifying the new max length depending on cell type once that invariant changes. Right now, we don't seem to have an alternative mechanism.

comment:15 Changed 5 months ago by mikeperry

Status: needs_informationneeds_review

Ok nickm says:

16:52 <+nickm> mikeperry: are you asking about wide-create stuff?
16:52 <+nickm> that doesn't affect relay payload sizes; they remain the same as before

This makes sense. It isn't the relay payload size that will change with wide creates. You simply have the ability to split a create over multiple relay cells, whose payload body max will still be RELAY_PAYLOAD_SIZE.

So this code will continue to work correctly in that case, properly measuring overhead and delivered data.

Setting to needs_review in case you have other issues. If not, please set to merge_ready.

comment:16 Changed 4 months ago by isis

Status: needs_reviewmerge_ready

comment:17 Changed 4 months ago by nickm

Resolution: implemented
Status: merge_readyclosed

Okay, lgtm. Squashed and merged (along with spec branch).

Note: See TracTickets for help on using tickets.