Opened 11 years ago

Last modified 7 years ago

#878 closed defect (Fixed)

Running out of RELAY_EARLY cells.

Reported by: phobos Owned by:
Priority: Low Milestone: 0.2.1.x-final
Component: Core Tor/Tor Version: 0.2.1.7-alpha
Severity: Keywords:
Cc: phobos, karsten, arma, nickm Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

I wasn't even at my computer when this showed up in the log file:
Nov 29 13:10:06.102 [Warning] relay_send_command_from_edge(): Bug: Uh-oh. We're sending a RELAY_COMMAND_EXTEND cell, but we have run out o
f RELAY_EARLY cells on that circuit.
Nov 29 13:10:56.356 [Warning] relay_send_command_from_edge(): Bug: Uh-oh. We're sending a RELAY_COMMAND_EXTEND cell, but we have run out o
f RELAY_EARLY cells on that circuit.

[Automatically added by flyspray2trac: Operating System: All]

Child Tickets

Attachments (1)

patch878.txt (4.9 KB) - added by karsten 11 years ago.

Download all attachments as: .zip

Change History (16)

comment:1 Changed 11 years ago by nickm

14:23 < armadev> karsten: ah, phobos's bug was known to me and rovv. our theory
is that it's an intro circuit that has been extended 4 times.
14:23 < armadev> ah. you said that above. i believe nobody has written it down.
or fixed it.

comment:2 Changed 11 years ago by nickm

23:08 < armadev> my thoughts on 878 is that it's a client-side hidden service

attempt, and it tried a lot of extends and was refused at each
intro point so it ended up trying enough that it triggered the
limit.

23:08 < armadev> not sure what 'not even at my computer' contributes to my

assessment. but what i described is in fact a plausible bug.

23:09 < nickm> Possibly. Is the solution to raise the limit; to dump the

circuit if we hit the limit; or other.

23:09 < armadev> dump
23:09 < armadev> it'sa pretty dreadful circuit at the point anyway
23:10 < nickm> ok

comment:3 Changed 11 years ago by nickm

07:47 < karsten> nickm, armadev: in general, dumping the circuit seems to be a

safe solution. however, i'd also be interested in what exactly
happens when we run into this case. so, if we want to spend
some time on understanding what happens, i'd like to
investigate the bug some more.

07:47 < karsten> as far as i understand (and please correct me if i'm wrong),

all outgoing command cells reduce the local relay-early
counter by one. in case of client-side introduction circuits i
wonder what cells are sent along that circuit reducing the
counter.

07:47 < karsten> in theory, when accessing a hidden service we extend more than

one introduction circuit, but we only send a single INTRODUCE1
cell. that means that in case of the bug the circuit is
already roughly 8 hops long when we run out of RELAY_EARLY
cells. this seems quite unlikely to me, so maybe it's
something else?

07:47 < karsten> should we keep a history of previous commands reducing the

counter to see what happens? we could add a tiny data
structure (int[10]) to circuits to remember command numbers.
if it's really a 8-hop circuit that we try to extend, we
should dump it for sure. if it's just a matter of counting
cells that don't really extend the circuit, we could change
that.

...
11:16 < nickm> karsten: cool. So, wrt 878, do you feel like implementing any

of the debugging stuff you mention? It all seems like basically
a good idea to me.

11:18 < karsten> nickm: sure, i can do that.

comment:4 Changed 11 years ago by rovv

it's an intro circuit that has been extended 4 times,
only for non-standart HS.
(standart 3 intro points)
So, if standart HS, it is really maximum 8 relay_early cells.

But from code:

circ->remaining_relay_early_cells = MAX_RELAY_EARLY_CELLS_PER_CIRCUIT;
circ->remaining_relay_early_cells -= crypto_rand_int(2);

So client can choose from 7(?) to 8.

(mistake? crypto_rand_int(2); returns 0 or 1 only.

should be from 6 to 8, proposal 110 says about [K-2, K]).

Then, only 2 intro points can be asked before bug triggered
if remaining_relay_early_cells == 7
Example:
build_some_circ(2 relay_early)+extend_to_fisrt(1)
introduce_to_fisrt(1)(back NAK)+extend_to_second(1)+
introduce_to_second(1)(back NAK)+extend_to_third(1) == 7
Can't to hold introduce_to_third.

(And non-standart HS with 3+ intro-points can still happens in theory.)

comment:5 Changed 11 years ago by rovv

Sorry, ignore my comment. It's wrong.

Changed 11 years ago by karsten

Attachment: patch878.txt added

comment:6 Changed 11 years ago by karsten

It appears that all kinds of RELAY_COMMAND cells count down the RELAY_EARLY
counter. Examples are as follows (see patch to reproduce these log
statements):

Dec 29 20:01:44.154 [info] relay_send_command_from_edge(): Sending relay
command SENDME as RELAY_EARLY cell. 7 sent, 1 remaining. Commands sent so
far: BEGIN_DIR, DATA, DATA, DATA, DATA, DATA, SENDME

Dec 29 20:03:58.826 [info] relay_send_command_from_edge(): Sending relay
command DATA as RELAY_EARLY cell. 8 sent, 0 remaining. Commands sent so
far: EXTEND, EXTEND, BEGIN, DATA, BEGIN, END, BEGIN, DATA

comment:7 Changed 11 years ago by nickm

Partially fixed in r17814 and r17815. Now we never cannibalize a circuit that's out of RELAY_EARLY cells, and we
never use a RELAY_EARLY cell to send anything but an EXTEND cell that ends at the first hop.

comment:8 Changed 11 years ago by rovv

circuit_find_to_cannibalize() never returned circuits
with timestamp_dirty or non-CIRCUIT_PURPOSE_C_GENERAL.
Any begin_dir means timestamp_dirty = time(NULL),
and no one non-dirty rend circ can not be returned too.

Do you know any such circuits: non-dirty general with empty
remaining_relay_early_cells? What for r17814?
Let you have such circ and cannibalize for C_INTRODUCING purpose,
with remaining_relay_early_cells == 1, than any NAK ruins all changes.

any sense of changes?

comment:9 Changed 11 years ago by nickm

[Deleted]: The point of the patch is to handle the case in the future if we decide that it's okay to cannibalize
circuits that have been used for begin_dir only.

On the other hand, we could probably get smarter here: We should not only forebear to cannibalize circuits that are
out of relay_early cells, but also we should not cannibalize ones that don't have enough relay_early cells to do
what we want.

comment:10 Changed 11 years ago by nickm

...though afaict, that number is always 1. So nothing is needed here atm.

comment:11 Changed 11 years ago by nickm

It looks like the right thing is to change rend_client_introduction_acked so that it doesn't try to extend the
circuit if it is too long or it is out of relay_early cells, and instead launches a new circuit.

Karten: do you want to tackle this? It shouldn't be too hard.

comment:12 Changed 11 years ago by nickm

Here's Karsten's draft patch, so I don't lose it. He says he'll try it out to make sure it works. It looks
okay to me.

Index: src/or/rendclient.c
===================================================================
--- src/or/rendclient.c (revision 18451)
+++ src/or/rendclient.c (working copy)

-274,13 +274,25 @@

circuit_mark_for_close(TO_CIRCUIT(circ), END_CIRC_REASON_INTERNAL);
return -1;

}

  • log_info(LD_REND,
  • "Got nack for %s from %s. Re-extending circ %d, "
  • "this time to %s.",
  • escaped_safe_str(circ->rend_data->onion_address),
  • circ->build_state->chosen_exit->nickname, circ->_base.n_circ_id,
  • extend_info->nickname);
  • result = circuit_extend_to_new_exit(circ, extend_info);

+ if (circ->remaining_relay_early_cells) {
+ log_info(LD_REND,
+ "Got nack for %s from %s. Re-extending circ %d, "
+ "this time to %s.",
+ escaped_safe_str(circ->rend_data->onion_address),
+ circ->build_state->chosen_exit->nickname, circ->_base.n_circ_id,
+ extend_info->nickname);
+ result = circuit_extend_to_new_exit(circ, extend_info);
+ } else {
+ log_info(LD_REND,
+ "Got nack for %s from %s. Building a new introduction "
+ "circuit, this time to %s.",
+ escaped_safe_str(circ->rend_data->onion_address),
+ circ->build_state->chosen_exit->nickname,
+ extend_info->nickname);
+ circuit_mark_for_close(TO_CIRCUIT(circ), END_CIRC_REASON_FINISHED);
+ circuit_launch_by_extend_info(CIRCUIT_PURPOSE_C_INTRODUCING,
+ extend_info, CIRCLAUNCH_IS_INTERNAL);
+ }

extend_info_free(extend_info);
return result;

}

comment:13 Changed 11 years ago by nickm

Patch applied in r18459. Closing bug.

comment:14 Changed 11 years ago by nickm

flyspray2trac: bug closed.

comment:15 Changed 7 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.