#24501 closed defect (implemented)

When we hit MaxMemInQueues, make the log message more quantitative

Reported by: arma Owned by:
Priority: Medium Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: review-group-28
Cc: alex_y_xu@… Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

A relay operator on #tor shared these log lines, on a fast exit relay running with 1GByte of memory (so maxmeminqueues is 768MBytes):

Nov 30 03:13:52.000 [notice] We're low on memory.  Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Nov 30 03:13:52.000 [notice] Removed 118288 bytes by killing 124 circuits; 0 circuits remain alive. Also killed 0 non-linked directory connections.
Nov 30 03:13:52.000 [notice] We're low on memory.  Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Nov 30 03:13:52.000 [notice] Removed 25940496 bytes by killing 125 circuits; 0 circuits remain alive. Also killed 0 non-linked directory connections.
Nov 30 03:13:53.000 [notice] We're low on memory.  Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Nov 30 03:13:53.000 [notice] Removed 528 bytes by killing 1 circuits; 0 circuits remain alive. Also killed 0 non-linked directory connections.
Nov 30 03:13:53.000 [notice] We're low on memory.  Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Nov 30 03:13:53.000 [notice] Removed 528 bytes by killing 2 circuits; 0 circuits remain alive. Also killed 0 non-linked directory connections.
Nov 30 03:13:54.000 [notice] We're low on memory.  Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Nov 30 03:13:54.000 [notice] Removed 528 bytes by killing 1 circuits; 0 circuits remain alive. Also killed 0 non-linked directory connections.
Nov 30 03:13:54.000 [notice] We're low on memory.  Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Nov 30 03:13:54.000 [notice] Removed 528 bytes by killing 2 circuits; 0 circuits remain alive. Also killed 0 non-linked directory connections.

I notice in cell_queues_check_size() that we sum up four things to compute alloc:

  size_t alloc = cell_queues_get_total_allocation();
  alloc += buf_get_total_allocation();
  alloc += tor_compress_get_total_allocation();
  const size_t rend_cache_total = rend_cache_get_total_allocation();
  alloc += rend_cache_total;

For operators who are debugging their memory use, and for us poor people trying to help diagnose, it would seem smart for us to expose these four numbers when we say "We're low on memory".

(I thought about this idea in particular because of the tor_compress_get_total_allocation() call and #24368.)

Child Tickets

Attachments (1)

0001-Make-the-log-message-more-quantitative.patch (2.0 KB) - added by ffmancera 12 months ago.

Download all attachments as: .zip

Change History (14)

comment:1 Changed 12 months ago by ffmancera

If everything is fine, I will work on it.

comment:2 Changed 12 months ago by ffmancera

Status: newneeds_review

Done, I hope everything is fine!

comment:3 Changed 11 months ago by ffmancera

I did some changes so check my github branch bug24501. I hope everything is fine!

comment:4 Changed 11 months ago by nickm

Milestone: Tor: 0.3.3.x-final

comment:5 Changed 11 months ago by Hello71

I think the commit summary could be better. "Add numbers to OOM log msg" still leaves you with a bunch of characters for other stuff, like the ticket number.

Tor doesn't use Signed-off-by.

'z' format qualifier isn't used in other places in Tor. I think a discussion might be needed before it's introduced. also, %zd is invalid to print size_t. use %zu.

comment:6 Changed 11 months ago by Hello71

Cc: alex_y_xu@… added
Status: needs_reviewneeds_revision

comment:7 Changed 11 months ago by Hello71

semi-OT, but I just noticed that some tor commits have Signed-off-by and some don't. since Signed-off-by is supposed to be defined per-project, this seems like a bad idea.

comment:8 Changed 11 months ago by ffmancera

Status: needs_revisionneeds_review

Changes done, check my github branch bug24501.

I have kept the Signed-off-by because others use it and it seems that there isn't any problem with it. (Anyway if nick or another one thinks that it is a bad idea, I will remove it.)

About introduce 'z' format qualifier, if they consider to not introduce it then I need a way to print this values.

Thanks!

Last edited 11 months ago by ffmancera (previous) (diff)

comment:9 Changed 11 months ago by nickm

Status: needs_reviewneeds_revision

Per discussion last night, I think we should add a unit test to verify that %zu works if we're going to start using it...but we can probably afford to start using it.

One other thing -- Shouldn't this also log cell_queues_get_total_allocation()?

comment:10 Changed 11 months ago by ffmancera

I have been looking at tests and I noticed that we already have a unit test for cell_queues_check_size() when we are oom on src/test/test_oom.c:161 and  we are calling circuits_handle_oom() on src/or/relay.c:2566.

So I think that we are already testing if %zu works and for me it works well.

Another thing -- I added cell_queues_get_total_allocation() on log message as discussed in #tor-dev.

comment:11 Changed 11 months ago by ffmancera

Status: needs_revisionneeds_review

comment:12 Changed 11 months ago by nickm

Keywords: review-group-28 added

comment:13 Changed 10 months ago by nickm

Resolution: implemented
Status: needs_reviewclosed

ok, looks solid. merging to master.

Note: See TracTickets for help on using tickets.