Audit all calls to connection_mark_for_close()

changed milestone to %Tor: 0.2.5.x-final

added component::core tor/tor milestone::Tor: 0.2.5.x-final owner::andrea priority::medium resolution::fixed status::closed tor-relay type::task version::tor 0.2.4.5-alpha labels

Trac:
connection_mark_for_close.txt

Audit for connection_mark_for_close() as of commit 2cb82c33bcb59ce65c9738ff8ff4977f39fa3d9f

I've attached my analysis of all occurrences of connection_mark_for_close(); the ones potentially meriting a closer look are:

connection.c:2099: connection_mark_for_close(conn);

This is in connection_mark_all_noncontrol_connections(), could be an erroneous close of an orconn. What calls this?

connection.c:2804: connection_mark_for_close(linked);

Also in connection_handle_read_impl(); tries to flush from linked conn and then connection_flushed_some() fails. Could possibly be an orconn (when do linked conns get created?); comments suggest this case might be a can't-happen, though.

connection.c:3051: connection_mark_for_close(conn);

This is in connection_handle_read_cb(), and gets called if connection_process_inbuf() fails; could be an orconn? Comment suggests this might be a can't-happen.

connection.c:3063: connection_mark_for_close(conn);

This is closely parallel to the above case, but in the connection_handle_write_cb() function.

connection.c:3120: connection_mark_for_close(conn);

This is the error-handling case in connection_handle_event_cb(); if the error is during connect and conn is an orconn it will call connection_or_connect_failed(), but it looks like it probably should handle other orconn cases.

connection.c:3270: connection_mark_for_close(conn);

Error with getsockopt() in connection_handle_write_impl(); this should probably call connection_or_notify_error() too if it's an orconn.

connection.c:3391: connection_mark_for_close(conn);

I think this case can't be an active orconn, but it might be worth a test here anyway. This one can only happen if !(connection_speaks_cells(conn) && conn->state > OR_CONN_STATE_PROXY_HANDSHAKING)

connection.c:3538: connection_mark_for_close(conn);

Error in write_to_buf() or write_to_buf_zlib() from connection_write_to_buf_impl_(); this should check for orconn, I think.

main.c:731: connection_mark_for_close(conn);

This is conn_read_callback() when connection_handle_read() returns an error; can this ever happen for an orconn?

main.c:769: connection_mark_for_close(conn);

This is conn_write_callback() when connection_handle_write() returns an error; can this ever happen for an orconn?

Replying to andrea:

I've attached my analysis of all occurrences of connection_mark_for_close(); the ones potentially meriting a closer look are:

connection.c:2099: connection_mark_for_close(conn);

This is in connection_mark_all_noncontrol_connections(), could be an erroneous close of an orconn. What calls this?

It's used when the DisableNetwork option is turned on while Tor is running.

connection.c:2804: connection_mark_for_close(linked);

Also in connection_handle_read_impl(); tries to flush from linked conn and then connection_flushed_some() fails. Could possibly be an orconn (when do linked conns get created?); comments suggest this case might be a can't-happen, though.

Linked connections are currently always an edge (ap or exit) connection linked to a directory connection.

connection.c:3051: connection_mark_for_close(conn);

This is in connection_handle_read_cb(), and gets called if connection_process_inbuf() fails; could be an orconn? Comment suggests this might be a can't-happen.

oops. That comment is asking, "Should the second argument to connection_process_inbuf really always be 1 here"? It's entirely possible for connection_process_inbuf() to return -1 on an orconn; check out connection_or_process_inbuf() for examples.

connection.c:3063: connection_mark_for_close(conn);

This is closely parallel to the above case, but in the connection_handle_write_cb() function.

connection.c:3120: connection_mark_for_close(conn);

This is the error-handling case in connection_handle_event_cb(); if the error is during connect and conn is an orconn it will call connection_or_connect_failed(), but it looks like it probably should handle other orconn cases.

Yeah; note that the last 3 are bufferevent-specific.

connection.c:3270: connection_mark_for_close(conn);

Error with getsockopt() in connection_handle_write_impl(); this should probably call connection_or_notify_error() too if it's an orconn.

agreed

connection.c:3391: connection_mark_for_close(conn);

I think this case can't be an active orconn, but it might be worth a test here anyway. This one can only happen if !(connection_speaks_cells(conn) && conn->state > OR_CONN_STATE_PROXY_HANDSHAKING)

Right. connection_speaks_cells() is an alias for "is an orconn" right now. So any orconn, other than one doing a HTTP or SOCKS handshake with proxy, will hit the first clause of the conditional statement.

connection.c:3538: connection_mark_for_close(conn);

Error in write_to_buf() or write_to_buf_zlib() from connection_write_to_buf_impl_(); this should check for orconn, I think.

Yes. In practice, I don't think these functions can fail, but f they're documented to have an error return, we should check for it.

main.c:731: connection_mark_for_close(conn);

This is conn_read_callback() when connection_handle_read() returns an error; can this ever happen for an orconn? main.c:769: connection_mark_for_close(conn);

This is conn_write_callback() when connection_handle_write() returns an error; can this ever happen for an orconn?

These two should never happen for any kind of connection, since they only happen if the connection is not already marked, and connection_handle_{read,write} are supposed to mark the connection every time they return -1. But that said, there's no reason to think that a bug where we forget to mark an orconn is less likely than a bug where we forget to mark some other kind of conn.

Trac:
Keywords: N/A deleted, tor-relay added

We haven't heard much recently about bugs related to this, and some of these are bufferevent-specific, and bufferevents are broken anyway. This should wait for 0.2.5.

Trac:
Milestone: Tor: 0.2.4.x-final to Tor: 0.2.5.x-final

conn_read_callback():

I think there may be a few paths along which connection_handle_read() can fail to mark the connection. In particular, connection_handle_read() -> connection_handle_read_impl() -> connection_process_inbuf() -> connection_or_process_inbuf() -> connection_tls_start_handshake(), and then tor_tls_new() fails in connection_tls_start_handshake(). This may be a can't-happen for other reasons and I don't think it can lead to any bad interactions with channels because the new or_connection is not yet associated with a channel at that point.

conn_read_callback():
- Correction: channel_tls_handle_incoming() gets called in connection_tls_start_handshake() before tor_tls_new(). I still believe this path is safe because connection_or_process_inbuf() will connection_or_close_for_error() if connection_tls_start_handshake() fails.

conn_write_callback():
- This is safe iff connection_handle_write() always marks for close when it fails. The only cases where it returns failure without marking for close itself are when it passes along failure from connection_finished_connecting() or connection_finished_flushing().

conn_write_callback():
- In the case of an or_connection_t, connection_finished_connecting() and connection_finished_flushing() always pass through to connection_or_finished_connecting() and connection_or_finished_flushing(), respectively. The connection_or_finished_connecting() function always calls connection_or_close_for_error() correctly when it returns an error, and connection_or_finished_flushing() should never return an error unless the or_connection_t is in a bad state, and asserts first before it would in that case anyway.

Therefore, I believe conn_write_callback() is always channel-safe.

connection_mark_all_noncontrol_connections()
- This gets called from options_act_reversible() when the DisableNetwork config option is set. There isn't any check for orconns and I believe this may be a bug. Will test further.

connection_handle_read_impl()
- This case with connection_mark_for_close(linked) after connection_flushed_some(linked) fails is safe because a linked conn can't be an orconn.

connection_handle_read_cb() and connection_handle_write_cb():
- Bufferevents don't quite work anyway, so testing or further investigation is difficult here.
- I believe the connection_handle_read_cb() case is safe because connection_process_inbuf() will always correctly mark an orconn and notify the controlling channel when it fails.
- The connection_handle_write_cb() case is safe because connection_flushed_some() always calls connection_or_flushed_some() on an orconn, and connection_or_flushed_some() never returns an error.

connection_handle_event_cb():
- I think there's probably a bug here which could be triggered by an eof on an orconn which has already finished connecting, among other possible causes.
- I believe the fix would be easy: move the conn->state == OR_CONN_STATE_CONNECTING test into an inner if clause inside the conn->type == CONN_TYPE_OR test, and have an else clause that calls connection_or_close_for_error() rather than connection_or_connect_failed().
- Demonstrating the bug or testing the proposed fix would be difficult due to this being bufferevent code, though.

connection_handle_write_impl():
- I think it's a bug to call connection_mark_for_close() on an orconn after getsockopt() fails. This is all in an if (connection_state_is_connecting()), so I think there should be a test for orconn and connection_or_connect_failed() rather than connection_mark_for_close() in that case.

connection_handle_write_impl():
- The case formerly on connection.c:3391 and now on connection.c:3825 appears to be safe for the reasons nickm stated in comment #2 (closed).

connection_write_to_buf_impl_:
- connection_mark_for_close(conn); now on connection.c:3997 should have an orconn test, although it may be a can't-happen case.

Replying to andrea:

connection_handle_write_impl():

I think it's a bug to call connection_mark_for_close() on an orconn after getsockopt() fails. This is all in an if (connection_state_is_connecting()), so I think there should be a test for orconn and connection_or_connect_failed() rather than connection_mark_for_close() in that case.

Created ticket #11302 (moved) for this.

Audit all calls to connection_mark_for_close()

Child items ...

Activity