Opened 2 years ago

Closed 3 months ago

#16706 closed defect (fixed)

Too many connection_edge_process_relay_cell warnings

Reported by: s7r Owned by: dgoulet
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version: Tor: 0.2.6.10
Severity: Normal Keywords: tor-hs, nickm-deferred-20161017, triage-out-030-201612
Cc: chelseakomlo Actual Points:
Parent ID: Points: 3
Reviewer: nickm Sponsor: SponsorR-can

Description (last modified by s7r)

Hosting multiple hidden services on a Debian server running Tor 0.2.6.10 (no special setup or config options). I see thousands of such lines in the log files:

Jul 31 07:55:39.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 07:57:59.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:00:36.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:02:49.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:05:12.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:07:30.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:09:49.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:12:09.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:14:30.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:16:50.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:19:11.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:21:45.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:24:14.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:26:27.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:29:01.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:31:18.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:33:38.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Jul 31 08:35:58.000 [warn] connection_edge_process_relay_cell (at origin) failed.

All hidden services are working and accessible, didn't reload/restart Tor. What is concerning is that there are so so many of these messages in a very short time. #9635 mentioned this as well, but along with other error messages which were indicating a wrong nTor key. Here we only have a single line, heavily repeated at short time intervals.

Child Tickets

Change History (27)

comment:1 Changed 2 years ago by s7r

  • Description modified (diff)

comment:2 Changed 2 years ago by s7r

After ~10 hours the log file grows fast with so so many lines:

Aug 02 08:53:58.000 [warn] connection_edge_process_relay_cell (at origin) failed.

Decided to look at it closer and changed log level to info. Left Tor to log info level for about ~7 minutes, and I got a log file with 3966 lines, out of which 3364 are the same, with an average of about 25 with the exact same timestamp:

Aug 02 08:39:24.000 [info] cell_ewma_set_scale_factor(): Enabled cell_ewma algorithm because of value in CircuitPriorityHalflifeMsec in consensus; scale factor is 0.793701 per 10 seconds
Aug 02 08:39:24.000 [info] options_act(): Worker-related options changed. Rotating workers.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:39:24.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.

Here are the log messages surrounding the [warn] message:

Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_exit_begin_conn(): begin is for rendezvous. configuring stream.
Aug 02 08:40:03.000 [info] rend_service_set_connection_addr_port(): No virtual port mapping exists for port 8333 on service dioq2yg3l5ptgpge
Aug 02 08:40:03.000 [info] connection_exit_begin_conn(): Didn't find rendezvous service (port 8333)
Aug 02 08:40:03.000 [warn] connection_edge_process_relay_cell (at origin) failed.
Aug 02 08:40:03.000 [info] command_process_relay_cell(): circuit_receive_relay_cell (backward) failed. Closing.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_edge_package_raw_inbuf(): called with package_window 0. Skipping.
Aug 02 08:40:03.000 [info] connection_exit_begin_conn(): begin is for rendezvous. configuring stream.

The same for the other messages - before each one it appears that a client tries to connect to an existent hidden service but at a virtual port which is not configured for that hidden service.

comment:3 Changed 2 years ago by s7r

After trying on 0.2.7.2-alpha as well (2 separate servers), I can confirm that the message:

[warn] connection_edge_process_relay_cell (at origin) failed.

appears in Tor's log file every time I try to connect to an existent hidden service at a different virtual port than the one(s) configured via HiddenServicePort. Does this bother Tor or puts load on it in any way? Seams like a trivial attack for any script kiddie.

comment:4 Changed 2 years ago by nickm

  • Keywords 026-backport added
  • Milestone set to Tor: 0.2.7.x-final

comment:5 Changed 22 months ago by nickm

  • Keywords PostFreeze027 added

I'd merge patches for these for 0.2.7 if they come in on time. In some cases, that will require figuring out an as-yet-unsolved bugs.

comment:6 Changed 22 months ago by nickm

  • Keywords PostFreeze027 removed
  • Milestone changed from Tor: 0.2.7.x-final to Tor: 0.2.8.x-final

Moving these tickets into 0.2.8. Not expecting to take patches for any into 0.2.7 at this late date. As usual, please say something if you disagree! :)

comment:7 Changed 21 months ago by nickm

  • Keywords SponsorR removed
  • Sponsor set to SponsorR

Bulk-replace SponsorR keyword with SponsorR sponsor field in Tor component.

comment:8 Changed 21 months ago by dgoulet

  • Points set to medium

comment:9 Changed 17 months ago by nickm

  • Milestone changed from Tor: 0.2.8.x-final to Tor: 0.2.9.x-final

Throw most 0.2.8 "NEW" tickets into 0.2.9. I expect that many of them will subsequently get triaged out.

comment:10 Changed 15 months ago by dgoulet

  • Sponsor changed from SponsorR to SponsorR-can

Move those from SponsorR to SponsorR-can.

comment:11 Changed 15 months ago by arma

  • Owner set to arma
  • Severity set to Normal
  • Status changed from new to assigned

comment:12 Changed 13 months ago by isabela

  • Points changed from medium to 3

comment:13 Changed 8 months ago by nickm

  • Keywords nickm-deferred-20161017 added
  • Milestone changed from Tor: 0.2.9.x-final to Tor: 0.3.0.x-final

I am fairly sure that these are neither regressions nor major problems. So, deferring from 0.2.9. Please let me know if I'm wrong.

comment:14 Changed 8 months ago by arma

This line occurs every time a client tries to visit a port on your onion service which is not configured. For example, if you put your onion service on virtual port 80, and somebody tries to visit virtual port 22.

So as onion service crawlers get more popular and more aggressive, we'll be seeing more of these lines.

I think there's another ticket for this issue open somewhere too.

comment:15 follow-up: Changed 8 months ago by s7r

Yes, this is true and confirmed (comment 3 on this ticket). What I think we should do is discover why this log message is printed when the described event occurs and see if it causes any load over the Tor process. Requests on non-configured virtual ports should be discarded in a way that spends the lowest amount of resources possible.

comment:16 Changed 6 months ago by dgoulet

  • Keywords triage-out-030-201612 added
  • Milestone changed from Tor: 0.3.0.x-final to Tor: 0.3.1.x-final

Triaged out on December 2016 from 030 to 031.

comment:17 in reply to: ↑ 15 Changed 6 months ago by arma

  • Cc chelseakomlo added

Replying to s7r:

What I think we should do is discover why this log message is printed when the described event occurs

It's because some function deep inside Tor returns -1 for failure, and the calling function thinks -1 means it needs to log a warning. It's a great opportunity for refactoring. (In fact, I hear Chelsea enjoys refactoring, so I am cc'ing her on this ticket. :)

and see if it causes any load over the Tor process. Requests on non-configured virtual ports should be discarded in a way that spends the lowest amount of resources possible.

I think there isn't any extra load, except for the warning.

comment:18 Changed 6 months ago by arma

  • Owner arma deleted

Somehow I was the owner of this ticket. I am marking it as unclaimed so people can step in.

comment:19 Changed 6 months ago by arma

  • Status changed from assigned to new

comment:20 Changed 4 months ago by dgoulet

Seems we know what's happening here so we have a fix? Just fixing the log warning when -1?

comment:21 Changed 3 months ago by dgoulet

  • Owner set to dgoulet
  • Status changed from new to accepted

comment:22 Changed 3 months ago by dgoulet

  • Keywords 026-backport removed

No chance we backport that to 026.

comment:23 Changed 3 months ago by dgoulet

  • Status changed from accepted to needs_review

Very small fix for this which is to not print the warning if an invalid hidden service port is requested. As far as I can tell, it's the only case where the reason END_CIRC_AT_ORIGIN is returned in that code path.

It's not an ideal fix I would say but it's that or a *massive* refactoring of connection_edge_process_relay_cell() and how it's used.

Now, with ProtocolWarnings 1, you see this and set to 0, the warning line is not present.

Apr 05 16:22:53.000 [info] rend_service_set_connection_addr_port(): No virtual port mapping exists for port 18163 on service bdf2bzs2od7ynr33
Apr 05 16:22:53.000 [info] connection_exit_begin_conn(): Didn't find rendezvous service (port 18163)
Apr 05 16:22:53.000 [warn] circuit_receive_relay_cell (backward) failed. Closing.

See branch: bug16706_031_01

comment:24 follow-up: Changed 3 months ago by nickm

  • Status changed from needs_review to merge_ready

If you open a ticket for the refactoring, I'm okay with merging this. (Maybe add some XXXX comments too?)

comment:25 Changed 3 months ago by nickm

  • Reviewer set to nickm

comment:26 in reply to: ↑ 24 Changed 3 months ago by dgoulet

Replying to nickm:

If you open a ticket for the refactoring, I'm okay with merging this. (Maybe add some XXXX comments too?)

Done here #21910

comment:27 Changed 3 months ago by nickm

  • Resolution set to fixed
  • Status changed from merge_ready to closed

Merging!

Note: See TracTickets for help on using tickets.