Implement Bridge Guards and other anti-enumeration defenses

changed milestone to %Tor: unspecified

added censorship component::core tor/tor ex-sponsor-19 milestone::Tor: unspecified points::10 priority::high severity::normal sponsor::30-can status::new tor-bridge tor-guard type::project labels

Trac:
Keywords: SponsorZ deleted, SponsorZ tor-bridge added

Worth looking at during 0.2.7 triage IMO

Trac:
Milestone: Tor: unspecified to Tor: 0.2.7.x-final

Trac:
Status: new to assigned

Trac:
Cc: nickm, andrea to nickm, andrea, yawning

Trac:
Cc: nickm, andrea, yawning to nickm, andrea, yawning, isis

Marking triaged-out items from first round of 0.2.7 triage.

Trac:
Keywords: SponsorZ tor-bridge deleted, SponsorZ, tor-bridge, 027-triaged-1-out added

Make all non-needs_review, non-needs_revision, 027-triaged-1-out items belong to 0.2.???

Trac:
Milestone: Tor: 0.2.7.x-final to Tor: 0.2.???

In an effort to publish early and often, my bug7144 branch contains an implementation of this which is not quite finished yet. There are some unittests in there for the added code (but only ~60% coverage of the new code so far).

Trac:
Status: assigned to needs_revision

Note that we'll need to remember to revise consider_testing_reachability() and dirserv_single_reachability_test() to not use random middle nodes to connect to bridges when testing their reachability. (Otherwise all this work for bridge guards is still a bit wasted.)

Also, I've diverged slightly from prop!#188 (closed): rather than sending a CREATE_FAST cell to the bridge guard, we use circuit_should_use_create_fast_for_circuit() (which has been slightly modified) to attempt to do use the same CREATE* cell type as a normal client.

My latest code is in my bug7144_r1 branch. I know it's a lot of code, and I apologise for that… but if you review it, I'll buy you a beverage of your choice at the next dev meeting. :)

Trac:
Status: needs_revision to needs_review

Trac:
Milestone: Tor: 0.2.??? to Tor: 0.2.8.x-final

Yes, we should try to get this reviewed before the dev meeting. :)

Trac:
Keywords: 027-triaged-1-out deleted, 027-triaged-1-out TorCoreTeam201509 added

Review is happening on this experimental Phabricator instance that doesn't belong to us and is probably going to eat all the data: https://phablab.krautspace.de/D1

Because the data is going to get eaten, I just wanted to archive this:

Trac:
Milestone: Tor: 0.2.8.x-final to Tor: 0.2.???
Keywords: 027-triaged-1-out TorCoreTeam201509 deleted, 027-triaged-1-out added

Trac:

Adding back nickm's tags that got clobbered by trac.

Trac:
Milestone: Tor: 0.2.??? to Tor: 0.2.8.x-final
Keywords: N/A deleted, TorCoreTeam201509 added

I have revised the original prop!#188 (closed) text to reflect some (mostly minor) details discovered during implementation. The changes are in my prop188 branch; however, I have already merged them into the common torspec repo.

I hope it was okay to for me to merge that! I felt like I shouldn't merge my own commits… then I thought it'd be okay because it's only documentation and it's now sort of "my" proposal. In the end, I decided that it was probably okay, since it seems easier for others to audit my code and follow my specification side-by-side — with the spec already updated — than needing to review both separately. If this was not okay, please let me know, and I'll revert my commit and never do that again.

The changes include:

ADD a new section detailing loose-source routed circuits, including:
- How the circuit should appear to the OP, the bridge, and the bridge guard,
- How the hop(s) in the path is/are chosen,
- How the first hop is handled and how circuit extension is handled, in a generalised sense (i.e. not specific to a bridge using bridge guards),
- Which cell types are used and how they are chosen,
- When the original create cell is answered (in relation to when cells are sent to the other additional hops),
- What should be done when relay cells are received when the additional hops in the loose-source routed circuit are not fully constructed,
- How the circuit crypto and cell digests for incoming/outgoing relay cells works (again, in a generalised sense, i.e. not specific to bridges using bridge guards),
- How and whether a given relay cell is treated as "recognized",
- Under what circumstances (based on whether a stream is attached and which relay command was sent) should cause a node which uses loose-source routed circuits to drop a relay cell or mark a circuit for close, and,
- When and what statistics should be gathered for loose-source routed circuits.
UPDATE and clarify the example loose-source routed circuit construction (the original one from the proposal which was specific to a client using a bridge that uses bridge guards).
CHANGE the "Make it harder to tell clients from bridges" section which described the tradeoffs between using CREATE versus CREATE_FAST cells for the first additional hop (i.e. the bridge guard). Those concerns is no longer entirely relevant, since, in the meantime, we've introduced the NTor handshake and it's extremely likely that both the client and the bridge will use CREATE2 for all their hops.
ADD a new section on enumerating bridges via the behaviours inherent to bridge reachability testing.
REMOVE open question (from the very end of the proposal) regarding whether the standard guard selection algorithms are adequate. They are. (Even if they are convoluted and overly-complicated.)

Trac:
Keywords: N/A deleted, 028-triage added

Trac:
Keywords: N/A deleted, 028-triaged added

(medium remaining, not medium-total)

Trac:
Points: N/A to medium
Sponsor: N/A to N/A

Trac:
Priority: Medium to High

Okay, I came here to drink coffee and review code, and my doctor tells me I shouldn't drink so much coffee. I'll look at the smaller ones first.

fbb21bbd9d21 Refactor loop over cpath for getting nicknames into separate function.

lgtm! 43670da13937 Generalise logic for whether a circuit_t supports ntor.
Yes but we should also open a ticket here for removing *_supports_ntor() entirely; we no-longer allow TAP-only relays on the network. (Opened as #17882 (moved)) 05d5eaa18985 Generalize logic for calculating cpath length.
lgtm 04fd0097cd9a Refactor circuit_get_cpath_hop() to operate on just the cpath.
lgtm 1568e1449278 Redefine CIRCUIT_IS_ORIGIN to use ORIGIN_CIRCUIT_MAGIC, not purpose.
lgtm. There is a too-wide line here, I think, but please don't fix it now; I'll get it when I do "make check-spaces" after merge. 6daf9165951d Make logic for choosing create cell type be agnostic to circuit type.
hmm. I know this isn't new, but the !cpath->extend_info->onion_key check looks poor to me, since it will fail once no-TAP relays are a reality. Probably doesn't need to get fixed on this branch though. b5546456b415 Check circuit types before casting in relay_send_command_from_edge_().
If we're going to make this change, we need to recognize that this function isn't really "from_edge" any more -- a cell sent outwards from a bridge is not sent "from_edge". Renaming this function might be overkill, but we should document its new semantics in its comments. e81acaf5f33e Implement Bridge Guards (prop188).
I'll come back to this in my next comment. It's the big one. 45d2457abd5c Add unittests for loose.c.
Changes to non-test code all lgtm
Tests seem okay after a quick skim. If you haven't already done so, please run them under valgrind to make sure they don't leak.

Trac:
Severity: N/A to Normal

Okay, now to review "e81acaf5f33e Implement Bridge Guards (prop188).".

The big files are command.c and loose.c. I'll review the others first.

circuitmux.c:

I think that the changes here should make use of a CIRCUIT_HAS_CPATH macro rather than doing origin || loose; this logic is likely to be important elsewhere.

onion.c:

What does the + // prop#188 comment mean? More detail please.

or.h:

0x13371515, huh? :) Fair enough.
p_chan_relay_cell is a little confusing. How do we know that they won't send two relay early cells, and what do we do if they do? Also consider prop#249 ("Large create cells")
I think some of the tor_asserts() in the OR_TO_LOOSE_CIRCUIT_() functions need to check for LOOSE_CIRCUIT_MAGIC, not OR_CIRCUIT_MAGIC.
Did you grep for other uses of OR_CIRCUIT_MAGIC to see if they needed to change as well?

relay.c:

I wonder whether there is really no shared code with circuit_receive_relay_cell and loose_circuit_process_relay_cell. I guess I'll find out when I review loose.c
What does the + // prop#188 comment mean? More detail please.

Okay, next step is the hard part. :)

Now to review the command.c changes and loose.c in e81acaf5f33e7c6de4205dfdc8d797c3bd9dbd80.

command.c:

command_answer_create_cell has no documentation.

loose.c:

I recommend that after we merge we rename this to loosecirc.c or something.
Arg. Unless I miss my guess, there is waaay to much copied code from origin_circuit_t here. Cut-and-paste programming is a sad thing. I can't tell what's new and what isn't, what's different and what isn't. I wonder, can we extract the common fields from origin_circuit_t and loose_circuit_t into a new type, refactor the origin_circuit_t functions to work on that, and then redefine many/most of these functions in terms of the refactored origin_circuit_t stuff?
If the above change were done, this would be much easier to review, since I wouldn't have to hand-compare it to the origin_circuit_t code.

Trac:
Status: needs_review to needs_revision

Setting isis as the owner of this ticket.

Trac:
Status: needs_revision to assigned
Owner: N/A to isis

Trac:
Status: assigned to needs_revision

Replying to nickm:

Okay, I came here to drink coffee and review code, and my doctor tells me I shouldn't drink so much coffee. I'll look at the smaller ones first.

Thanks, Nick! I owe you a beverage at the next meeting in Valencia. :)

[…]

43670da13937 Generalise logic for whether a circuit_t supports ntor.

Yes but we should also open a ticket here for removing *_supports_ntor() entirely; we no-longer allow TAP-only relays on the network. (Opened as #17882 (moved))

I thought it odd that we still had a function which should only ever return true, but I didn't want to go off refactoring pieces of code when it wasn't strictly necessary (since these patches are already large).

[…]

1568e1449278 Redefine CIRCUIT_IS_ORIGIN to use ORIGIN_CIRCUIT_MAGIC, not purpose.

lgtm. There is a too-wide line here, I think, but please don't fix it now; I'll get it when I do "make check-spaces" after merge.

Whoopsies.

6daf9165951d Make logic for choosing create cell type be agnostic to circuit type.

hmm. I know this isn't new, but the !cpath->extend_info->onion_key check looks poor to me, since it will fail once no-TAP relays are a reality. Probably doesn't need to get fixed on this branch though.

What if I were to move it below the if (server_mode(options)) check? That way, (at least in the future) we'll never accidentally use CREATE_FAST cells between ORs, where one has no onion_key. Separate ticket?

b5546456b415 Check circuit types before casting in relay_send_command_from_edge_().

If we're going to make this change, we need to recognize that this function isn't really "from_edge" any more -- a cell sent outwards from a bridge is not sent "from_edge". Renaming this function might be overkill, but we should document its new semantics in its comments.

I made a coccinelle script for the transformation. It's added, applied, and cleaned up (whitespace fixes and renaming the actual function declaration and implementation ) in:

* [https://gitweb.torproject.org/user/isis/tor.git/commit/?h=bug7144_r1&id=781c27fb238cc8148763b5e9113113851ebd6d9f 781c27fb] Add coccinelle script for renaming relay_send_command_from_edge().
* [https://gitweb.torproject.org/user/isis/tor.git/commit/?h=bug7144_r1&id=cd3d7f9c0b8cc9a8142471bd9b5e6a72dd2ea8c6 cd3d7f9c] Apply relay_send_command_from_edge.cocci transformations.
* [https://gitweb.torproject.org/user/isis/tor.git/commit/?h=bug7144_r1&id=376f3b41f227c8c725500e3ac212ca0388978e7a 376f3b41] Manual changes after relay_send_command_from_edge.cocci transformations.

The coccinelle script can easily be applied to any current/further patches which call either relay_send_command_from_edge() or relay_send_command_from_edge_().

[…]

45d2457abd5c Add unittests for loose.c.

Changes to non-test code all lgtm

Tests seem okay after a quick skim. If you haven't already done so, please run them under valgrind to make sure they don't leak.

I ran them under valgrind while writing them. There's no leaks, AFAICT.

Replying to nickm:

Okay, now to review "e81acaf5f33e Implement Bridge Guards (prop188).".

The big files are command.c and loose.c. I'll review the others first.

circuitmux.c:

I think that the changes here should make use of a CIRCUIT_HAS_CPATH macro rather than doing origin || loose; this logic is likely to be important elsewhere.

Did you mean this check?

       if ((circ->magic == OR_CIRCUIT_MAGIC) ||
           (circ->magic == LOOSE_OR_CIRCUIT_MAGIC))

That check is for discovering if the circuit has a p_circ_id and maybe having a p_chan. Should I make a CIRCUIT_HAS_PCHAN macro? Also, I agree that CIRCUIT_HAS_CPATH sounds useful, but I don't understand where in circuitmux.c it should be applied.

onion.c:

What does the + // prop#188 comment mean? More detail please.

Whoops. That should not have been committed. That was me making a mark in the code at the very beginning of hacking on this, to tell myself that this is a spot I might need to pay attention to. I ended up not needing to though. The comment has been removed in:

* [https://gitweb.torproject.org/user/isis/tor.git/commit/?h=bug7144_r1&id=f076f639f685775368ba1e7b1d5d7c568a3807b9 f076f639] Remove prop!#188 comment from onion.c.

or.h:

0x13371515, huh? :) Fair enough.

It works as magic because it's highly improbable. :)

p_chan_relay_cell is a little confusing. How do we know that they won't send two relay early cells, and what do we do if they do? Also consider prop#249 ("Large create cells")

My reading of the spec and testing caused me to believe that only one relay_early should/would be sent, and that the OR should simply drop additional cells while the channel is not yet fully constructed:

5.5. Routing relay cells
      
   When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's
   circID and determines whether it has a corresponding circuit along that
   connection.  If not, the OR drops the cell.

   Otherwise, if the OR is not at the OP edge of the circuit (that is,
   either an 'exit node' or a non-edge node), it de/encrypts the payload
   with the stream cipher, as follows:
        'Forward' relay cell (same direction as CREATE):
            Use Kf as key; decrypt.
        'Back' relay cell (opposite direction from CREATE):
            Use Kb as key; encrypt.
   Note that in counter mode, decrypt and encrypt are the same operation.

   The OR then decides whether it recognizes the relay cell, by
   inspecting the payload as described in section 6.1 below.  If the OR
   recognizes the cell, it processes the contents of the relay cell.
   Otherwise, it passes the decrypted relay cell along the circuit if
   the circuit continues.  If the OR at the end of the circuit
   encounters an unrecognized relay cell, an error has occurred: the OR
   sends a DESTROY cell to tear down the circuit.

   When a relay cell arrives at an OP, the OP decrypts the payload
   with the stream cipher as follows:
         OP receives data cell:
            For I=N...1,
                Decrypt with Kb_I.  If the payload is recognized (see
                section 6..1), then stop and process the payload.

Without taking prop!#249 (moved) into account, my logic was that we should store one relay_early and drop the rest of them:

  if (PREDICT_UNLIKELY(!loose_circ->has_opened)) {                         
    /* Store the first relay_early cell for later, after our loose circuit is
     * fully constructed. */
    tor_assert(cell);

    if (cell_direction == CELL_DIRECTION_OUT) {
      log_debug(LD_CIRC,
                "Received %s command on loose circuit %d, but we haven't "
                "finished constructing the extra hops in this circuit! "
                "Saving for later.",
                cell_command_to_string(cell->command),
                LOOSE_TO_CIRCUIT(loose_circ)->global_circuitlist_idx);

      if (!loose_circ->p_chan_relay_cell) {
        ++num_seen;
        loose_circ->p_chan_relay_cell = tor_malloc_zero(sizeof(cell_t));
        *loose_circ->p_chan_relay_cell = *cell;
        return 0;
      } else {
        /* OPs should not be sending additional relay cells until the first
         * has been answered, so if we already have one stored, then the OP is
         * misbehaving.  Mark this circuit for close. */
        log_warn(LD_CIRC,
                 "Received another relay cell on loose circuit %d, which "
                 "is not done constructing.  Closing this circuit.",
                 LOOSE_TO_CIRCUIT(loose_circ)->global_circuitlist_idx);
        return -END_CIRC_REASON_TORPROTOCOL;
      }
    } else {
      /* If the first relay cell we saw on this circuit was incoming, then
       * something really weird is happening.  Drop it the cell. */
      log_warn(LD_OR,
               "Received incoming relay cell with command %s on loose circuit "
               "%d, but we haven't finished construction the extra hops in "
               "this circuit!  This is odd.  Dropping the cell.  Path is:",
                cell_command_to_string(cell->command),
                LOOSE_TO_CIRCUIT(loose_circ)->global_circuitlist_idx);
      loose_circuit_log_path(LOG_WARN, LD_OR, loose_circ);
      return 0;
}

Although, now that I look at it again, it looks wrong. The part which does return -END_CIRC_REASON_TORPROTOCOL should actually do return 0 to drop extra outgoing relay_early from the OP side. Also, the return 0 part should instead do something like:

if (crypto_cipher_crypt_inplace(loose_circ->cpath->b_crypto, cell,
                                CELL_PAYLOAD_SIZE) && (cell->recognised)) {
  // XXX I would need to double check that loose_circ_process_relay_cell
  // would still do the right thing in this case.
  loose_circuit_process_relay_cell(loose_circ, NULL, cell, CELL_DIRECTION_IN, 1);
} else {
  // Avoid passing any relay_early cells from the loose hops backwards to the OP.
  return -END_CIRC_REASON_TORPROTOCOL;
}

Aside from all this, this logic becomes totally incorrect when we have prop!#249 (moved) large/split create cells. I'll make two patches, one to fix the above mistakes, and another to revise the code to handle post-prop!#249 (moved) create cells. Does that sound okay?

I think some of the tor_asserts() in the OR_TO_LOOSE_CIRCUIT_() functions need to check for LOOSE_CIRCUIT_MAGIC, not OR_CIRCUIT_MAGIC.

Yikes! Good catch. O_o'

* [https://gitweb.torproject.org/user/isis/tor.git/commit/?h=bug7144_r1&id=749d559ac3cea1a0ab5591b94385a238a9eb40dc 749d559a] Change [CONST_]OR_TO_LOOSE_CIRCUIT to check correct magic.

Did you grep for other uses of OR_CIRCUIT_MAGIC to see if they needed to change as well?

It's used in to see if a rend_splice is actually an or_circuit_t in circuit_free(). And also in circuitmux.c (as mentioned in my last comment) to as part of a check to see if a circuit should have a p_circ_id.

relay.c:

I wonder whether there is really no shared code with circuit_receive_relay_cell and loose_circuit_process_relay_cell. I guess I'll find out when I review loose.c

There's plenty. I wasn't sure what to do about that, since otherwise I would be refactoring large sections of the cell processing code.

What does the + // prop#188 comment mean? More detail please.

Removed in * [https://gitweb.torproject.org/user/isis/tor.git/commit/?h=bug7144_r1&id=46de8ce45a50d2b460cccf1e77f728247c4969af 46de8ce4] Remove prop!#188 (closed) comment from relay.c.

Replying to isis:

Replying to nickm:

Okay, I came here to drink coffee and review code, and my doctor tells me I shouldn't drink so much coffee. I'll look at the smaller ones first.

Thanks, Nick! I owe you a beverage at the next meeting in Valencia. :)

Thanks, Isis! I do enjoy me some beverage! Almost as much as I enjoy me code.

In any case where I don't respond below, either your code looks fine, or no change is needed, or a change can wait till later.

[...]

6daf9165951d Make logic for choosing create cell type be agnostic to circuit type.

hmm. I know this isn't new, but the !cpath->extend_info->onion_key check looks poor to me, since it will fail once no-TAP relays are a reality. Probably doesn't need to get fixed on this branch though.

What if I were to move it below the if (server_mode(options)) check? That way, (at least in the future) we'll never accidentally use CREATE_FAST cells between ORs, where one has no onion_key. Separate ticket?

Sounds okay. Separate ticket, I think.

Replying to isis:

Replying to nickm:

Okay, now to review "e81acaf5f33e Implement Bridge Guards (prop188).".

The big files are command.c and loose.c. I'll review the others first.

circuitmux.c:

I think that the changes here should make use of a CIRCUIT_HAS_CPATH macro rather than doing origin || loose; this logic is likely to be important elsewhere.

Did you mean this check? {{{ if ((circ->magic == OR_CIRCUIT_MAGIC) || (circ->magic == LOOSE_OR_CIRCUIT_MAGIC)) }}}

That check is for discovering if the circuit has a p_circ_id and maybe having a p_chan. Should I make a CIRCUIT_HAS_PCHAN macro? Also, I agree that CIRCUIT_HAS_CPATH sounds useful, but I don't understand where in circuitmux.c it should be applied.

CIRCUITH_HAS_PCHAN would be great.

p_chan_relay_cell is a little confusing. How do we know that they won't send two relay early cells, and what do we do if they do? Also consider prop#249 ("Large create cells")

My reading of the spec and testing caused me to believe that only one relay_early should/would be sent, and that the OR should simply drop additional cells while the channel is not yet fully constructed: {{{ 5.5. Routing relay cells

When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's circID and determines whether it has a corresponding circuit along that connection. If not, the OR drops the cell.

Otherwise, if the OR is not at the OP edge of the circuit (that is, either an 'exit node' or a non-edge node), it de/encrypts the payload with the stream cipher, as follows: 'Forward' relay cell (same direction as CREATE): Use Kf as key; decrypt. 'Back' relay cell (opposite direction from CREATE): Use Kb as key; encrypt. Note that in counter mode, decrypt and encrypt are the same operation.

The OR then decides whether it recognizes the relay cell, by inspecting the payload as described in section 6.1 below. If the OR recognizes the cell, it processes the contents of the relay cell. Otherwise, it passes the decrypted relay cell along the circuit if the circuit continues. If the OR at the end of the circuit encounters an unrecognized relay cell, an error has occurred: the OR sends a DESTROY cell to tear down the circuit.

When a relay cell arrives at an OP, the OP decrypts the payload with the stream cipher as follows: OP receives data cell: For I=N...1, Decrypt with Kb_I. If the payload is recognized (see section 6..1), then stop and process the payload. }}}

Without taking prop!#249 (moved) into account, my logic was that we should store one relay_early and drop the rest of them:

{{{ if (PREDICT_UNLIKELY(!loose_circ->has_opened)) {
/* Store the first relay_early cell for later, after our loose circuit is * fully constructed. */ tor_assert(cell);
if (cell_direction == CELL_DIRECTION_OUT) {
  log_debug(LD_CIRC,
            "Received %s command on loose circuit %d, but we haven't "
            "finished constructing the extra hops in this circuit! "
            "Saving for later.",
            cell_command_to_string(cell->command),
            LOOSE_TO_CIRCUIT(loose_circ)->global_circuitlist_idx);

  if (!loose_circ->p_chan_relay_cell) {
    ++num_seen;
    loose_circ->p_chan_relay_cell = tor_malloc_zero(sizeof(cell_t));
    *loose_circ->p_chan_relay_cell = *cell;
    return 0;
  } else {
    /* OPs should not be sending additional relay cells until the first
     * has been answered, so if we already have one stored, then the OP is
     * misbehaving.  Mark this circuit for close. */
    log_warn(LD_CIRC,
             "Received another relay cell on loose circuit %d, which "
             "is not done constructing.  Closing this circuit.",
             LOOSE_TO_CIRCUIT(loose_circ)->global_circuitlist_idx);
    return -END_CIRC_REASON_TORPROTOCOL;
  }
} else {
  /* If the first relay cell we saw on this circuit was incoming, then
   * something really weird is happening.  Drop it the cell. */
  log_warn(LD_OR,
           "Received incoming relay cell with command %s on loose circuit "
           "%d, but we haven't finished construction the extra hops in "
           "this circuit!  This is odd.  Dropping the cell.  Path is:",
            cell_command_to_string(cell->command),
            LOOSE_TO_CIRCUIT(loose_circ)->global_circuitlist_idx);
  loose_circuit_log_path(LOG_WARN, LD_OR, loose_circ);
  return 0;
}
}}}

Although, now that I look at it again, it looks wrong. The part which does return -END_CIRC_REASON_TORPROTOCOL should actually do return 0 to drop extra outgoing relay_early from the OP side. Also, the return 0 part should instead do something like:

{{{ if (crypto_cipher_crypt_inplace(loose_circ->cpath->b_crypto, cell, CELL_PAYLOAD_SIZE) && (cell->recognised)) { // XXX I would need to double check that loose_circ_process_relay_cell // would still do the right thing in this case. loose_circuit_process_relay_cell(loose_circ, NULL, cell, CELL_DIRECTION_IN, 1); } else { // Avoid passing any relay_early cells from the loose hops backwards to the OP. return -END_CIRC_REASON_TORPROTOCOL; } }}}

Aside from all this, this logic becomes totally incorrect when we have prop!#249 (moved) large/split create cells. I'll make two patches, one to fix the above mistakes, and another to revise the code to handle post-prop!#249 (moved) create cells. Does that sound okay?

Yes, please!

I think some of the tor_asserts() in the OR_TO_LOOSE_CIRCUIT_() functions need to check for LOOSE_CIRCUIT_MAGIC, not OR_CIRCUIT_MAGIC.

Yikes! Good catch. O_o'

I'm a little worried that the testing didn't catch this. :/

relay.c:

I wonder whether there is really no shared code with circuit_receive_relay_cell and loose_circuit_process_relay_cell. I guess I'll find out when I review loose.c

There's plenty. I wasn't sure what to do about that, since otherwise I would be refactoring large sections of the cell processing code.

Yeah; I think we need to combine that if possible, either as part of merging this or right after. Having duplicate code leads to maintenance troubles.

(Note to self : waiting to see what Isis says about last round of comments, then I need to look over loose.c to figure out what's different, if it's not getting refactored just yet.)

These seem like features, or like other stuff unlikely to be possible this month. Bumping them to 0.2.9

Trac:
Milestone: Tor: 0.2.8.x-final to Tor: 0.2.9.x-final

Removing TorCoreTeam201509 from these tickets, since we do not own a time machine.

Trac:
Keywords: TorCoreTeam201509 deleted, N/A added

Trac:
Reviewer: N/A to N/A
Keywords: N/A deleted, isis201604, TorCoreTeam201604 added

Tagging these bridge- and PT- items as S-can.

Trac:
Sponsor: N/A to SponsorS-can

April is over; calling these april tickets postponed into may.

Trac:
Keywords: TorCoreTeam201604 deleted, TorCoreTeam-postponed-201604, TorCoreTeam201605 added

Trac:
Keywords: N/A deleted, isis201605 added

Trac:
Points: medium to 3

Code review:

make gives these warnings:

src/or/circuitbuild.c:304:65: warning: unused parameter 'verbose_names'
      [-Wunused-parameter]
circuit_list_path_impl(origin_circuit_t *circ, int verbose, int verbose_names)
...
src/or/loose.c:1077:49: warning: unused parameter 'layer_hint'
      [-Wunused-parameter]
                                  crypt_path_t *layer_hint, cell_t *cell)
                                                ^
src/or/loose.c:79:1: warning: unused function
      'loose_note_that_we_maybe_cant_complete_circuits' [-Wunused-function]
loose_note_that_we_maybe_cant_complete_circuits(void)
...
src/test/test_loose.c:404:12: warning: incompatible pointer to integer
      conversion initializing 'circid_t' (aka 'unsigned int') with an expression
      of type 'void *' [-Wint-conversion]
  circid_t circ_id = NULL;

The last one is repeated a few times.

make test asserts and fails on:

scheduler/channel_states: [forking] May 23 15:18:46.937 [err] tor_assertion_fail
ed_: Bug: src/or/circuitlist.c:420: circuit_get_all_pending_on_channel: Assertio
n circ->state == CIRCUIT_STATE_CHAN_WAIT failed; aborting. (on Tor 0.2.8.2-alpha
-dev )
...
2   libsystem_c.dylib             	0x00007fff8bf376e7 abort + 129
3   test                          	0x00000001084dbecb circuit_get_all_pending_on_channel + 299 (circuitlist.c:409)
4   test                          	0x00000001084d85ca circuit_n_chan_done + 90 (circuitbuild.c:630)
5   test                          	0x00000001084c903b channel_closed + 91 (channel.c:1338)
6   test                          	0x000000010846025c test_scheduler_channel_states + 2876 (test_scheduler.c:417)
7   test                          	0x00000001084bf4bf testcase_run_one + 863 (tinytest.c:106)
8   test                          	0x00000001084bf9d3 tinytest_main + 531 (tinytest.c:432)
9   test                          	0x00000001084bee0f main + 639 (testing_common.c:300)
10  libdyld.dylib                 	0x00007fff93e5f5ad start + 1

make test-network-all fails everything:

FAIL: basic-min
FAIL: bridges-min
FAIL: hs-min
FAIL: bridges+hs
FAIL: bridges+ipv6-min
FAIL: ipv6-exit-min
FAIL: mixed

I'll try to work out why.

Let's call the last 3 issues T1, T2, and T3.

T4: Generalize logic for calculating cpath length. The code used to check if (circ && circ->cpath) and return 0 if circ is NULL. Now circuit_get_cpath_len will dereference circ without that check. (This might not be an issue if every caller makes sure circ is not NULL.)

T5: In loose_circuit_pick_cpath_entry, extend_info_from_node should be called with node, 1, because we're connecting to it directly. (This is a nitpick, as bridge relays don't currently use ReachableAddresses and ClientPreferIPv6ORPort to pick the preferred OR address to extend to. At the moment, only clients use the preferred address code.)

  } else {              /* We should pick an entry node */
    node = choose_good_entry_server(CIRCUIT_PURPOSE_OR,
                                    loose_circ->build_state);
    if (!node) {
      log_warn(LD_CIRC, "Failed picking suitable first hop for loose "
                        "circuit.");
      return NULL;
    }
    entry_ei = extend_info_from_node(node, 0);
    tor_assert(entry_ei);
  }

T6: In loose_circuit_populate_cpath, I think we don't care about ntor for one-hop circuits because they were originally for directory fetches only, which are authenticated by signature, and don't contain any private information. Maybe we should change this now loose source routing and (soon) single onion services will use one-hop circuits. (Or maybe it's ok as-is!)

 // XXXprop#188 Why do we not care if it's ntor if it's only one hop?

I'll think about this for single onion services in #19163 (moved).

The code is great - it's well-structured and a joy to read.

But I wonder about the code copied from the base OR code:

have the changes to the OR code in 0.2.8 also been made in the copied code? (T7) (I wonder if this could be causing subtle breakage. But I have no evidence either way.)

OK, now I'm done reading, time to debug:

Once this patch is applied, chutney clients never download a microdesc consensus, even though the authorities have one. (But this works ok in maint-0.2.8.) I need to look into the code that's executed when a client asks an authority for a consensus for the first time. And I need to look at the code that this patch adds to existing tor OR code.

Remove "TorCoreTeam201605" keyword. The time machine is broken.

Trac:
Keywords: TorCoreTeam201605 deleted, N/A added

Deferring many tickets that are in needs_revision with no progress noted for a while, where I think they could safely wait till 0.3.0 or later.

Please feel free to move these back to 0.2.9 if you finish the revisions soon.

Trac:
Milestone: Tor: 0.2.9.x-final to Tor: 0.2.???
Keywords: N/A deleted, nickm-deferred-20160905 added

Milestone renamed

Trac:
Milestone: Tor: 0.2.??? to Tor: 0.3.???

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

Trac:
Milestone: Tor: 0.3.??? to Tor: unspecified
Keywords: N/A deleted, tor-03-unspecified-201612 added

A question about bridge guards in general: can't they be used by the next hop after the guard to passively verify that the bridge guard really is a bridge guard? Consider a client C and a bridge B with a bridge guard G. C doesn't know that G is a bridge guard of B, so at some point it tries to open the circuit C -> B -> R -> G where R is some arbitrary relay. B inserts a loose-source routed hop to G, since that's its bridge guard, so we get

C -> B -> G -> R -> G

Since no one will ever try to create a circuit like that directly, since R will refuse to extend through G, R now knows that the first hop through G must be loose-source routed, and thus G must be a bridge guard. Neither C nor B can prevent this since C doesn't know which guards B are using, and B can't read the encrypted extend cells intended for R.

Is this bad, or do we assume that bridge guards are detectable anyway? Or is it actually prevented by the path-choosing algorithms somehow?

Trac:
Username: larsl

Remove an old triaging keyword.

Trac:
Keywords: tor-03-unspecified-201612 deleted, N/A added

Implement Bridge Guards and other anti-enumeration defenses

Child items ...

Activity