Opened 11 years ago

Closed 10 years ago

Last modified 8 years ago

#1362 closed defect (fixed)

log does not record at NOTICE level when Tor is really ready

Reported by: keb Owned by:
Priority: Low Milestone: Tor: 0.2.2.x-final
Component: Core Tor/Tor Version:
Severity: Keywords:
Cc: keb, Sebastian, nickm, chris@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by arma)

the Notice level Tor log said

Apr 18 00:37:00.072 [Notice] We now have enough directory information to build circuits.

but i was getting these errors in IRC:

[00:37] SOCKS Proxy failed to connect to host (error 1).
[00:37] Proxy traversal failed.

so i turned on Info level logging and
tried to connect to IRC server again a few times
with same results as above until the last Tor log line below:

Apr 18 00:38:39.453 [Info] exit circ (length 3): jceaovh(open) $B2B00F568DEAA608C9C9A2DFCABD3A2622460743(open) $476418056BAEEFFB0354CB7DA0B0D50010264820(open)
Apr 18 00:38:39.455 [Info] connection_ap_handshake_send_begin(): Address/port sent, ap socket 22, n_circ_id 47188
Apr 18 00:38:39.455 [Info] connection_edge_process_inbuf(): data from edge while in 'waiting for connect response' state. Leaving it on buffer.
Apr 18 00:38:46.557 [Info] connection_ap_process_end_not_open(): Edge got end (internal error at server) before we're connected. Marking for close.
Apr 18 00:38:46.559 [Info] exit circ (length 3): jceaovh(open) $B2B00F568DEAA608C9C9A2DFCABD3A2622460743(open) $476418056BAEEFFB0354CB7DA0B0D50010264820(open)
Apr 18 00:39:03.432 [Info] update_consensus_router_descriptor_downloads(): 0 router descriptors downloadable. 0 delayed; 1363 present (0 of those were in old_routers); 0 would_reject; 1 wouldnt_use; 0 in progress.
Apr 18 00:39:18.460 [Info] routerlist_remove_old_routers(): We have 1363 live routers and 0 old router descriptors.
Apr 18 00:39:18.461 [Info] buf_shrink_freelists(): Cleaning freelist for 4096-byte chunks: keeping 9, dropping 8.
Apr 18 00:39:18.462 [Info] buf_shrink_freelists(): Cleaning freelist for 32768-byte chunks: keeping 3, dropping 1.
Apr 18 00:39:38.583 [Info] command_process_versions_cell(): Negotiated version 2 with [scrubbed]:9001; sending NETINFO.
Apr 18 00:39:38.584 [Info] command_process_netinfo_cell(): Got good NETINFO cell from [scrubbed]:9001; OR connection is now open, using protocol version 2

immediately after that last message, the IRC connection worked.
There were no Notice level messages indicating any problem.
Therefore i suggest promoting that last message, or some cleaned up version of it,
to Notice level so the log shows when Tor is really ready to pass user traffic.

[Automatically added by flyspray2trac: Operating System: All]

Child Tickets

Change History (11)

comment:1 Changed 11 years ago by Sebastian

True, it seems we should delay the "Bootstrapped 100%: Done."
notice a little bit.

comment:2 Changed 10 years ago by cjb

Cc: chris@… added

Is this bug still valid? I see the messages in the correct order with Git HEAD:

Aug 01 16:26:47.166 [info] command_process_netinfo_cell(): Got good NETINFO cell from [scrubbed]:443; OR connection is now open, using protocol version 2
Aug 01 16:26:47.179 [notice] Bootstrapped 90%: Establishing a Tor circuit.
Aug 01 16:26:47.179 [info] command_process_netinfo_cell(): Got good NETINFO cell from [scrubbed]:9001; OR connection is now open, using protocol version 2
Aug 01 16:26:47.766 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Aug 01 16:26:47.766 [notice] Bootstrapped 100%: Done.

comment:3 Changed 10 years ago by arma

Description: modified (diff)

This bug is as valid as it ever was, insofar as the behavior hasn't changed.

I'm not really sure what we should fix though. It works well enough for me.

Perhaps some of the problem is that on slow connections, since Tor tries to preemptively build several circuits, it could be busy building its second and third preemptive circuit when you start trying to use the first one.

Another part of the mystery is why this person got an end cell with reason END_STREAM_REASON_INTERNAL after the first connection attempt. What went wrong at the relay end? Was it a problem with that particular relay?

comment:4 Changed 10 years ago by arma

In any case, the "we got a netinfo cell from our first hop" message is totally not the sign that Tor is ready for application-level connections. Notice how right after that line in the code is:

        "We now have enough directory information to build circuits.");
    control_event_client_status(LOG_NOTICE, "ENOUGH_DIR_INFO");
    control_event_bootstrap(BOOTSTRAP_STATUS_CONN_OR, 0);

BOOTSTRAP_STATUS_CONN_OR corresponds to the 80% mark in bootstrapping. You really want

            "Tor has successfully opened a circuit. "
            "Looks like client functionality is working.");
            control_event_bootstrap(BOOTSTRAP_STATUS_DONE, 0);

So the new question is: when we run out of directory information such that we call

  if (!res && have_min_dir_info) {
    int quiet = directory_too_idle_to_fetch_descriptors(options, now);
    log(quiet ? LOG_INFO : LOG_NOTICE, LD_DIR,
        "Our directory information is no longer up-to-date "
        "enough to build circuits: %s", dir_info_status);
    control_event_client_status(LOG_NOTICE, "NOT_ENOUGH_DIR_INFO");

should we also set has_completed_circuit to 0, so it will do the "Tor has successfully opened a circuit" log line when it next makes a circuit? I think the answer is yes.

comment:5 Changed 10 years ago by arma

Milestone: Tor: 0.2.2.x-final
Priority: trivialminor
Status: newneeds_review

Check out branch bug1362 in my arma.

Putting in the 0.2.2.x milestone, but at 'minor' priority as we can skip it if needed.

comment:6 Changed 10 years ago by nickm

I'm okay with the implementation of the fix, but I'm not merging it as-is. It makes it so that "has_completed_circuit" no longer means "Have we completed a circuit?" If we're going to change the semantics of a variable or function, we should change its name and documentation as well.

comment:7 Changed 10 years ago by arma

See circuit_note_clock_jumped() for an example where we already use it the way I described.

It has meant "Does it look like circuit-building is actually working this time" for quite a while now.

comment:8 Changed 10 years ago by nickm

I still say the name is misleading. Let's merge this, though, and make it less misleading as a separate issue, since this commit doesn't make it any more misleading than it was before.

What do you like for a more accurate name? has_completed_circuit_recently ? observed_to_complete_circuits?

comment:9 Changed 10 years ago by arma

can_complete_circuit maybe?

comment:10 Changed 10 years ago by nickm

Resolution: Nonefixed
Status: needs_reviewclosed

I'll buy that. I'll merge "bug1362 from "[your] arma" and, then tweak it, then close this.

comment:11 Changed 8 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.