Opened 12 months ago

Last modified 11 months ago

#32461 new defect

do not write logs on caching servers

Reported by: anarcat Owned by: anarcat
Priority: Medium Milestone:
Component: Internal Services/Service - cache Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

In #32239, a caching system was deployed with nginx. To compute hit ratios, log files are written to disk, with IP addresses and user agents anonymized. That's okay-ish: it's not as well anonymized as our apache log files because it's not possible to have per-day granularity in timestamps.

From there, mtail periodically wakes up, parses those log files, and counts things, which are exposed as metrics scraped by Prometheus. That in turn gives us pretty Prometheus graphs and makes us feel better about ourselves.

Ideally, though, we wouldn't have log files at all and would pipe things directly into mtail. We don't want to hang the webserver while waiting for mtail (which can be a little flaky), so the typical way to deal with this is to send logs to syslog first.

I couldn't immediately figure out how to do this during deployment, so I'm opening this ticket to make sure we eventually make that conversion.

One problem I had is that the syslog-ng config sends all logs to the central logging server. If we start pushing web hits into syslog, this could become unwieldy, to say the least, mostly in terms of performance but also privacy.

It's also not clear to me how to send logs from syslog into mtail without hitting the disk in the first place.

So the checklist is:

  1. how to send logs from nginx to syslog (access_log syslog:server=unix:/dev/log,facility=local3,tag=nginx_access extended; seems to be the magic config in nginx)
  2. how to avoid sending those logs to the central server
  3. how to send those logs (and only those) into mtail

All of this should be automatically configured in Puppet as well.

Change History (2)

comment:1 Changed 12 months ago by anarcat

a/i sends logs from syslog into mtail using an rsyslog rule like this:

ruleset(name="incoming") {
  # [...]
  action(type="ompipe" Pipe="/run/mtail.fifo")
  # [...]
}

Then mtail gets started by systemd using socket activation, with something like this:

[Unit]
Description=MTail input FIFO

[Socket]
ListenFIFO=/run/mtail.fifo
SocketMode=700
SocketUser=mtail
SocketGroup=mtail
PipeSize=1M
RemoveOnStop=on

Finally, there's a service file which has some magic bits to deal with memory leaks that mtail apparently suffers from (at least in stretch):

[Unit]
Description=MTail
Requires=mtail.socket

[Service]
Type=simple
# Systemd will pass mtail.socket as FD 3.
ExecStart=/usr/bin/mtail --progs /etc/mtail --logtostderr --port 3903 --logs /dev/fd/3
Restart=on-failure
User=mtail

# Limit memory leaks
MemoryMax=1G
ExecStartPost=+/bin/sh -c "echo 0 > /sys/fs/cgroup/memory/system.slice/%n/memory.swappiness"

[Install]
WantedBy=multi-user.target

That should about cover it. We need to figure out how that would translate into a syslog-ng config.
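
As a starting point for that translation, here is an untested sketch of what equivalent syslog-ng rules might look like. It assumes the source is named s_src (as in Debian's stock syslog-ng.conf), that nginx logs with the local3 facility and nginx_access tag from the description, and that the FIFO lives at /run/mtail.fifo as in the socket unit above. The f_nginx_access and d_mtail names are made up for illustration.

```
# Sketch only, untested. Assumes nginx logs with facility=local3 and
# tag=nginx_access, and that systemd's mtail.socket owns /run/mtail.fifo.

# Match only the nginx access log messages (hypothetical filter name).
filter f_nginx_access { facility(local3) and program("nginx_access"); };

# Write matching messages to the FIFO read by mtail (hypothetical name).
destination d_mtail { pipe("/run/mtail.fifo"); };

# s_src is an assumed source name; adjust to the local config.
# flags(final) stops later log paths from processing these messages, so
# this block has to come before the one forwarding to the central server.
log { source(s_src); filter(f_nginx_access); destination(d_mtail); flags(final); };
```

If this works as expected, the filter and flags(final) would keep web hits away from the central server, and the pipe destination would feed only those messages into mtail, which covers the last two items of the checklist.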

comment:2 Changed 11 months ago by anarcat

Component: changed from Internal Services/Tor Sysadmin Team to Internal Services/Service - cache
Owner: changed from tpa to anarcat