Opened 9 months ago

Closed 8 weeks ago

Last modified 7 weeks ago

#22255 closed defect (fixed)

Frequent OOM kills of tor process

Reported by: DeS
Owned by:
Priority: High
Milestone: Tor: 0.3.0.x-final
Component: Core Tor/Tor
Version:
Severity: Major
Keywords:
Cc: starlight@…
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

We are currently experiencing frequent out-of-memory kills of our tor processes on two boxes running Ubuntu 14.04.

root@tor1:~# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 14.04.5 LTS
Release:	14.04
Codename:	trusty

We experience this on following versions:

0.3.0.5-rc-1~trusty+1 and 0.2.9.10-1~trusty+1

I changed to the rc after the normal 0.2.9.10-1 version kept having these problems. The 0.3.0.5-rc has the same problems.

From syslog

May 14 10:21:42 tor1 kernel: [123517.565688] UDP: bad checksum. From 82.247.214.125:51413 to 176.10.104.240:21446 ulen 38
May 14 11:02:47 tor1 kernel: [125982.993268] tor invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
May 14 11:02:47 tor1 kernel: [125982.993270] tor cpuset=/ mems_allowed=0
May 14 11:02:47 tor1 kernel: [125982.993272] CPU: 2 PID: 33180 Comm: tor Not tainted 3.13.0-117-generic #164-Ubuntu
May 14 11:02:47 tor1 kernel: [125982.993273] Hardware name: HP ProLiant ML310e Gen8 v2, BIOS P78 03/28/2014
May 14 11:02:47 tor1 kernel: [125982.993274]  0000000000000000 ffff8800a8679990 ffffffff8172d1c9 ffff88013c868000
May 14 11:02:47 tor1 kernel: [125982.993276]  0000000000000000 ffff8800a8679a18 ffffffff81727768 ffffffff8106a926
May 14 11:02:47 tor1 kernel: [125982.993278]  ffff8800a86799f0 ffffffff810cb27c 0000000001320122 0000000000000000
May 14 11:02:47 tor1 kernel: [125982.993279] Call Trace:
May 14 11:02:47 tor1 kernel: [125982.993285]  [<ffffffff8172d1c9>] dump_stack+0x64/0x82
May 14 11:02:47 tor1 kernel: [125982.993286]  [<ffffffff81727768>] dump_header+0x7f/0x1f1
May 14 11:02:47 tor1 kernel: [125982.993289]  [<ffffffff8106a926>] ? put_online_cpus+0x56/0x80
May 14 11:02:47 tor1 kernel: [125982.993291]  [<ffffffff810cb27c>] ? rcu_oom_notify+0xcc/0xf0
May 14 11:02:47 tor1 kernel: [125982.993294]  [<ffffffff81156d81>] oom_kill_process+0x201/0x360
May 14 11:02:47 tor1 kernel: [125982.993297]  [<ffffffff812dde35>] ? security_capable_noaudit+0x15/0x20
May 14 11:02:47 tor1 kernel: [125982.993299]  [<ffffffff81157511>] out_of_memory+0x471/0x4b0
May 14 11:02:47 tor1 kernel: [125982.993301]  [<ffffffff8115d82c>] __alloc_pages_nodemask+0xa6c/0xb90
May 14 11:02:47 tor1 kernel: [125982.993304]  [<ffffffff8119e25a>] alloc_pages_vma+0x9a/0x140
May 14 11:02:47 tor1 kernel: [125982.993307]  [<ffffffff811908eb>] read_swap_cache_async+0xeb/0x160
May 14 11:02:47 tor1 kernel: [125982.993308]  [<ffffffff811909f8>] swapin_readahead+0x98/0xe0
May 14 11:02:47 tor1 kernel: [125982.993320]  [<ffffffff8117e90b>] handle_mm_fault+0xa7b/0xf10
May 14 11:02:47 tor1 kernel: [125982.993322]  [<ffffffff81739344>] __do_page_fault+0x184/0x560
May 14 11:02:47 tor1 kernel: [125982.993325]  [<ffffffff810a3af5>] ? set_next_entity+0x95/0xb0
May 14 11:02:47 tor1 kernel: [125982.993328]  [<ffffffff810135db>] ? __switch_to+0x16b/0x4f0
May 14 11:02:47 tor1 kernel: [125982.993329]  [<ffffffff8173973a>] do_page_fault+0x1a/0x70
May 14 11:02:47 tor1 kernel: [125982.993331]  [<ffffffff81735a68>] page_fault+0x28/0x30
May 14 11:02:47 tor1 kernel: [125982.993350] Mem-Info:
May 14 11:02:47 tor1 kernel: [125982.993351] Node 0 DMA per-cpu:
May 14 11:02:47 tor1 kernel: [125982.993352] CPU    0: hi:    0, btch:   1 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993353] CPU    1: hi:    0, btch:   1 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993353] CPU    2: hi:    0, btch:   1 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993354] CPU    3: hi:    0, btch:   1 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993355] Node 0 DMA32 per-cpu:
May 14 11:02:47 tor1 kernel: [125982.993356] CPU    0: hi:  186, btch:  31 usd:  25
May 14 11:02:47 tor1 kernel: [125982.993356] CPU    1: hi:  186, btch:  31 usd:   2
May 14 11:02:47 tor1 kernel: [125982.993357] CPU    2: hi:  186, btch:  31 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993358] CPU    3: hi:  186, btch:  31 usd:  26
May 14 11:02:47 tor1 kernel: [125982.993358] Node 0 Normal per-cpu:
May 14 11:02:47 tor1 kernel: [125982.993359] CPU    0: hi:  186, btch:  31 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993360] CPU    1: hi:  186, btch:  31 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993360] CPU    2: hi:  186, btch:  31 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993361] CPU    3: hi:  186, btch:  31 usd:   0
May 14 11:02:47 tor1 kernel: [125982.993363] active_anon:0 inactive_anon:0 isolated_anon:0
May 14 11:02:47 tor1 kernel: [125982.993363]  active_file:141 inactive_file:204 isolated_file:0
May 14 11:02:47 tor1 kernel: [125982.993363]  unevictable:0 dirty:0 writeback:0 unstable:0
May 14 11:02:47 tor1 kernel: [125982.993363]  free:20517 slab_reclaimable:7176 slab_unreclaimable:41948
May 14 11:02:47 tor1 kernel: [125982.993363]  mapped:80 shmem:0 pagetables:692 bounce:0
May 14 11:02:47 tor1 kernel: [125982.993363]  free_cma:0
May 14 11:02:47 tor1 kernel: [125982.993365] Node 0 DMA free:15344kB min:268kB low:332kB high:400kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:28kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:8kB mapped:36kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:156kB kernel_stack:0kB pagetables:4kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:18 all_unreclaimable? no
May 14 11:02:47 tor1 kernel: [125982.993368] lowmem_reserve[]: 0 2688 3771 3771
May 14 11:02:47 tor1 kernel: [125982.993370] Node 0 DMA32 free:51080kB min:46532kB low:58164kB high:69796kB active_anon:0kB inactive_anon:0kB active_file:516kB inactive_file:788kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2832272kB managed:2753204kB mlocked:0kB dirty:0kB writeback:0kB mapped:284kB shmem:0kB slab_reclaimable:15612kB slab_unreclaimable:83432kB kernel_stack:312kB pagetables:2064kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:17 all_unreclaimable? no
May 14 11:02:47 tor1 kernel: [125982.993372] lowmem_reserve[]: 0 0 1082 1082
May 14 11:02:47 tor1 kernel: [125982.993373] Node 0 Normal free:15644kB min:18732kB low:23412kB high:28096kB active_anon:0kB inactive_anon:0kB active_file:44kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1179644kB managed:1108484kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:13092kB slab_unreclaimable:84204kB kernel_stack:2072kB pagetables:700kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:203 all_unreclaimable? yes
May 14 11:02:47 tor1 kernel: [125982.993376] lowmem_reserve[]: 0 0 0 0
May 14 11:02:47 tor1 kernel: [125982.993377] Node 0 DMA: 3*4kB (M) 3*8kB (UM) 2*16kB (UM) 0*32kB 1*64kB (U) 1*128kB (M) 1*256kB (M) 1*512kB (M) 2*1024kB (UM) 2*2048kB (MR) 2*4096kB (M) = 15364kB
May 14 11:02:47 tor1 kernel: [125982.993384] Node 0 DMA32: 303*4kB (UE) 662*8kB (UEM) 418*16kB (UEM) 952*32kB (UEM) 51*64kB (UM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 51148kB
May 14 11:02:47 tor1 kernel: [125982.993394] Node 0 Normal: 375*4kB (UEM) 330*8kB (UEM) 649*16kB (UEM) 3*32kB (M) 2*64kB (R) 1*128kB (R) 1*256kB (R) 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 15644kB
May 14 11:02:47 tor1 kernel: [125982.993401] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 14 11:02:47 tor1 kernel: [125982.993401] 385 total pagecache pages
May 14 11:02:47 tor1 kernel: [125982.993402] 20 pages in swap cache
May 14 11:02:47 tor1 kernel: [125982.993403] Swap cache stats: add 2432647, delete 2432627, find 700250/1142133
May 14 11:02:47 tor1 kernel: [125982.993404] Free swap  = 3627940kB
May 14 11:02:47 tor1 kernel: [125982.993404] Total swap = 4026364kB

May 14 11:02:47 tor1 kernel: [125982.993405] 1006973 pages RAM
May 14 11:02:47 tor1 kernel: [125982.993405] 0 pages HighMem/MovableOnly
May 14 11:02:47 tor1 kernel: [125982.993406] 17790 pages reserved
May 14 11:02:47 tor1 kernel: [125982.993406] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
May 14 11:02:47 tor1 kernel: [125982.993422] [32898]     0 32898     5073        0      13      251             0 upstart-udev-br
May 14 11:02:47 tor1 kernel: [125982.993423] [32903]     0 32903    12900        1      27      228         -1000 systemd-udevd
May 14 11:02:47 tor1 kernel: [125982.993425] [33049]     0 33049     3821        0      11       48             0 upstart-file-br
May 14 11:02:47 tor1 kernel: [125982.993426] [33092]   102 33092     9813        1      23       95             0 dbus-daemon
May 14 11:02:47 tor1 kernel: [125982.993436] [33096]   101 33096    63962        0      28      722             0 rsyslogd
May 14 11:02:47 tor1 kernel: [125982.993437] [33119]     0 33119    10864        1      26       88             0 systemd-logind
May 14 11:02:47 tor1 kernel: [125982.993439] [33206]     0 33206     3817        0      12       55             0 upstart-socket-
May 14 11:02:47 tor1 kernel: [125982.993440] [33053]     0 33053     3699        1      12       41             0 getty
May 14 11:02:47 tor1 kernel: [125982.993441] [33056]     0 33056     3699        1      12       41             0 getty
May 14 11:02:47 tor1 kernel: [125982.993442] [33061]     0 33061     3699        1      12       39             0 getty
May 14 11:02:47 tor1 kernel: [125982.993443] [33062]     0 33062     3699        1      11       40             0 getty
May 14 11:02:47 tor1 kernel: [125982.993444] [33064]     0 33064     3699        1      12       39             0 getty
May 14 11:02:47 tor1 kernel: [125982.993446] [33079]     0 33079     1094        0       8       35             0 acpid
May 14 11:02:47 tor1 kernel: [125982.993447] [33089]     0 33089    15347        1      33      171         -1000 sshd
May 14 11:02:47 tor1 kernel: [125982.993448] [33094]     0 33094     5915        0      17       81             0 cron
May 14 11:02:47 tor1 kernel: [125982.993449] [33095]     0 33095     4786        0      12       48             0 atd
May 14 11:02:47 tor1 kernel: [125982.993450] [33155]     0 33155     4801       34      13       73             0 irqbalance
May 14 11:02:47 tor1 kernel: [125982.993457] [33180]   106 33180   111003        0     152    50520             0 tor
May 14 11:02:47 tor1 kernel: [125982.993458] [33317]   108 33317    20352        0      41     9936             0 unbound
May 14 11:02:47 tor1 kernel: [125982.993459] [33369]   105 33369     1871       12       9       33             0 vnstatd
May 14 11:02:47 tor1 kernel: [125982.993460] [32890]     0 32890     3699        1      11       40             0 getty
May 14 11:02:47 tor1 kernel: [125982.993461] [32991]  1002 32991     2867        0      10       84             0 memlog.sh
May 14 11:02:47 tor1 kernel: [125982.993462] [33057]   106 33057    97196        0     128    36538             0 tor
May 14 11:02:47 tor1 kernel: [125982.993464] [57323]  1002 57323     2867        0      10       91             0 memlog.sh
May 14 11:02:47 tor1 kernel: [125982.993465] [57325]  1002 57325     2218       78      10       32             0 grep
May 14 11:02:47 tor1 kernel: [125982.993466] [57326]  1002 57326     2746       44      11       46             0 sed
May 14 11:02:47 tor1 kernel: [125982.993467] Out of memory: Kill process 33180 (tor) score 25 or sacrifice child
May 14 11:02:47 tor1 kernel: [125982.993478] Killed process 33180 (tor) total-vm:444012kB, anon-rss:0kB, file-rss:0kB

notices.log does not show anything around the kill time. The first record is the restart of the process at 20:32.

May 14 06:54:03.000 [notice] Tor 0.3.0.5-rc (git-61f68662421142d2) opening new log file.
May 14 07:13:01.000 [warn] eventdns: All nameservers have failed
May 14 07:13:01.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 07:47:36.000 [warn] eventdns: All nameservers have failed
May 14 07:47:36.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 08:10:38.000 [warn] eventdns: All nameservers have failed
May 14 08:10:38.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 08:12:28.000 [warn] eventdns: All nameservers have failed
May 14 08:12:28.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 08:20:08.000 [warn] eventdns: All nameservers have failed
May 14 08:20:09.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 20:23:18.000 [notice] Tor 0.3.0.5-rc (git-61f68662421142d2) opening log file.
May 14 20:23:18.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
May 14 20:23:18.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
May 14 20:23:18.000 [notice] Configured to measure statistics. Look for the *-stats files that will first be written to the data directory in 24 hours from now.

We have been operating the * Exit Servers for years, and they became unstable about a week ago. No changes other than the normal software updates were performed.

Other than the two tor processes, only a local unbound resolver and ssh are running.

It feels like a memory leak. Since tor is the one getting killed, I assume the problem is in the tor software. Any thoughts?

Child Tickets

Attachments (27)

mem.log (595.7 KB) - added by DeS 9 months ago.
Logfile of memory usage
valgrind.txt (11.2 KB) - added by DeS 9 months ago.
Screenshot_2017-05-26_09-56-09.png (189.7 KB) - added by DeS 9 months ago.
syslog20170528 (23.4 KB) - added by DeS 9 months ago.
SMEM20170528.log (579.4 KB) - added by DeS 9 months ago.
Screenshot_2017-05-30_20-11-49.png (119.7 KB) - added by DeS 9 months ago.
SMEM_Mo.log (1.7 MB) - added by DeS 9 months ago.
PROC_MEMINFO_Mon.log (639.2 KB) - added by DeS 9 months ago.
munin.tar.gz (338.2 KB) - added by DeS 9 months ago.
Screenshot_2017-05-30_23-15-37.png (67.4 KB) - added by DeS 9 months ago.
Screenshot_2017-05-30_23-14-02.png (40.8 KB) - added by DeS 9 months ago.
Screenshot_2017-05-31_20-43-46.png (59.3 KB) - added by DeS 9 months ago.
netstat-pinpoint=1496233206,1496341206.png (57.4 KB) - added by DeS 9 months ago.
memory-pinpoint=1496233206,1496341206.png (104.3 KB) - added by DeS 9 months ago.
netstat-pinpoint=1496233206,1496341206.2.png (57.4 KB) - added by DeS 9 months ago.
memory-pinpoint=1496354400,1496430027.png (83.2 KB) - added by DeS 9 months ago.
netstat-pinpoint=1496354412,1496425827.png (47.7 KB) - added by DeS 9 months ago.
PROC_MEMINFO.log (301.0 KB) - added by DeS 9 months ago.
SMEM.log (825.8 KB) - added by DeS 9 months ago.
mem.2.log (177.2 KB) - added by DeS 9 months ago.
syslog (14.0 KB) - added by DeS 9 months ago.
collectl.log (234.1 KB) - added by DeS 9 months ago.
20170607.tar.gz (339.3 KB) - added by DeS 9 months ago.
syslog.2 (12.9 KB) - added by DeS 8 months ago.
20170615.tar.gz (303.8 KB) - added by DeS 8 months ago.
swap-pinpoint=1497453606,1497561606.png (35.9 KB) - added by DeS 8 months ago.
20170708.tar.gz (355.5 KB) - added by DeS 8 months ago.

Change History (109)

comment:1 Changed 9 months ago by DeS

Summary: Frequent OOM Kill of tor precess → Frequent OOM kills of tor precess

comment:2 Changed 9 months ago by nickm

Have you tried setting the MaxMemInQueues option? It tries to limit the total amount of memory Tor uses.
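
For reference, a rough sketch of what setting it could look like (the torrc path, the 2048 MB value, and the reload-by-SIGHUP step are placeholders/assumptions for a multi-instance setup like this one, not something taken from this report):

# Hypothetical example only: append a cap to one instance's torrc, then make
# that instance re-read its configuration by sending it SIGHUP.
echo "MaxMemInQueues 2048 MB" | sudo tee -a /etc/tor/torrc1
sudo pkill -HUP -f '/etc/tor/torrc1'   # matches the tor started with -f /etc/tor/torrc1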

comment:3 Changed 9 months ago by nickm

Milestone: Tor: 0.3.0.x-final

comment:4 Changed 9 months ago by arma

How much memory does the system have? I'm having trouble reading the log lines for it, but it looks like maybe not very much?

comment:5 Changed 9 months ago by ryru

The system has about 4 GB memory:

# cat /proc/meminfo
MemTotal: 3877580 kB
MemFree: 2250280 kB

We have been running this configuration since mid-2015.

No, we have not tried the MaxMemInQueues setting yet. It is unset, therefore Tor should pick a proper configuration based on the physically available memory, shouldn't it?

Many thanks for your support!

comment:6 in reply to:  5 Changed 9 months ago by arma

Replying to ryru:

we have not tried the MaxMemInQueues setting yet. It is unset, therefore Tor should pick a proper configuration based on the physically available memory, shouldn't it?

There's a notice-level log when Tor starts up, telling you what value it picked, e.g.

May 10 03:34:40.480 [notice] Based on detected system memory, MaxMemInQueues is set to 8192 MB. You can override this by setting MaxMemInQueues by hand.
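
A quick way to check which value a running instance picked is to grep for that notice in its log, e.g. (a sketch; the log path below is an assumption based on the notices.log mentioned in the description):

grep -i 'MaxMemInQueues' /var/log/tor/notices.log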

comment:7 in reply to:  5 Changed 9 months ago by arma

Replying to ryru:

The system has about 4 GB memory:

It looks from your earlier log like it takes 12 hours or something before it dies?

Can you track the memory size over that 12 hours? That is, is it fine until suddenly everything goes bad, or does it slowly grow?

Changed 9 months ago by DeS

Attachment: mem.log added

Logfile of memory usage

comment:8 Changed 9 months ago by DeS

I uploaded the output of a script I made. It shows memory usage as reported by the following commands and runs every 60 seconds.

#!/bin/bash -e

export LC_ALL=C
FILE=/home/user/mem.log
echo "      date     time $(free -m | grep total | sed -E 's/^    (.*)/\1/g')" >>$FILE
while true; do
    echo "$(date '+%Y-%m-%d %H:%M:%S') $(/usr/bin/free -m | grep Mem: | sed 's/Mem://g')" >>$FILE
    echo "$(date '+%Y-%m-%d %H:%M:%S') $(/usr/bin/free -m | grep Swap: | sed 's/Swap://g')" >>$FILE
    sleep 60 
done


You can see it starts 3 days ago. Most of the time the usage is quite stable at around 1563 MByte.
If you fast forward to line 4448 of mem.log, which is about 1 hour before the documented OOM, you will see a fast increase of the memory usage to 3706 MByte before the process is killed.

Many thanks for your support and help

comment:9 Changed 9 months ago by nickm

Component: Core Tor → Core Tor/Tor

comment:10 Changed 9 months ago by DeS

Before the problems occurred, the following Ubuntu updates were installed:

2017-05-06 06:50:53 status installed man-db:amd64 2.6.7.1-1ubuntu1
2017-05-06 06:50:53 status installed login:amd64 1:4.1.5.1-1ubuntu9.4
2017-05-06 06:50:53 status installed man-db:amd64 2.6.7.1-1ubuntu1
2017-05-06 06:50:53 status installed ureadahead:amd64 0.100.0-16
2017-05-06 06:50:53 status installed passwd:amd64 1:4.1.5.1-1ubuntu9.4
2017-05-06 13:29:10 status installed install-info:amd64 5.2.0.dfsg.1-2
2017-05-06 13:29:11 status installed man-db:amd64 2.6.7.1-1ubuntu1
2017-05-06 13:29:11 status installed bash:amd64 4.3-7ubuntu1.6
2017-05-10 06:26:21 status installed librtmp0:amd64 2.4+20121230.gitdf6c518-1ubuntu0.1
2017-05-10 06:26:21 status installed libfreetype6:amd64 2.5.2-1ubuntu2.8
2017-05-10 06:26:22 status installed libc-bin:amd64 2.19-0ubuntu6.11

Do you think one of these updated packages could have triggered this behaviour?

comment:11 Changed 9 months ago by arma

I found this part interesting:

3159    2017-05-13 23:22:01          3931          0       3931
3160	2017-05-13 23:23:01           3786       1852       1934          0         66        518
3161	2017-05-13 23:23:01          3931          0       3931
3162	2017-05-13 23:24:01           3786       2217       1569          0         66        518
3163	2017-05-13 23:24:01          3931          0       3931
3164	2017-05-13 23:25:01           3786       2335       1451          0         66        518
3165	2017-05-13 23:25:01          3931          0       3931
3166	2017-05-13 23:26:01           3786       2527       1259          0         66        496
3167	2017-05-13 23:26:01          3931          0       3931
3168	2017-05-13 23:27:01           3786       2952        833          0         66        496
3169	2017-05-13 23:27:01          3931          0       3931
3170	2017-05-13 23:28:01           3786       3020        766          0         66        496
3171	2017-05-13 23:28:01          3931          0       3931
3172	2017-05-13 23:29:01           3786       3110        675          0         66        496
3173	2017-05-13 23:29:01          3931          0       3931
3174	2017-05-13 23:30:01           3786       3023        763          0         66        496
3175	2017-05-13 23:30:01          3931          0       3931
3176	2017-05-13 23:31:01           3786       2694       1091          0         66        496
3177	2017-05-13 23:31:01          3931          0       3931
3178	2017-05-13 23:32:01           3786       2338       1448          0         66        496
3179	2017-05-13 23:32:01          3931          0       3931
3180	2017-05-13 23:33:01           3786       1893       1893          0         66        496
3181	2017-05-13 23:33:01          3931          0       3931
3182	2017-05-13 23:34:01           3786       1614       2171          0         66        496
3183	2017-05-13 23:34:01          3931          0       3931

It looks like memory load went up, but not enough to kill it, and then went down again on its own.

So it looks like it is not always following the pattern of "grow forever until it's too big".

comment:12 Changed 9 months ago by teor

Summary: Frequent OOM kills of tor precess → Frequent OOM kills of tor process

comment:13 in reply to:  11 Changed 9 months ago by DeS

I checked this time slot. No OOM kill there, so you are right: the process is not always killed.
However, yesterday we had 4-5 OOMs on this machine with two tor processes.

Replying to arma:

I found this part interesting:

3159    2017-05-13 23:22:01          3931          0       3931
3160	2017-05-13 23:23:01           3786       1852       1934          0         66        518
3161	2017-05-13 23:23:01          3931          0       3931
3162	2017-05-13 23:24:01           3786       2217       1569          0         66        518
3163	2017-05-13 23:24:01          3931          0       3931
3164	2017-05-13 23:25:01           3786       2335       1451          0         66        518
3165	2017-05-13 23:25:01          3931          0       3931
3166	2017-05-13 23:26:01           3786       2527       1259          0         66        496
3167	2017-05-13 23:26:01          3931          0       3931
3168	2017-05-13 23:27:01           3786       2952        833          0         66        496
3169	2017-05-13 23:27:01          3931          0       3931
3170	2017-05-13 23:28:01           3786       3020        766          0         66        496
3171	2017-05-13 23:28:01          3931          0       3931
3172	2017-05-13 23:29:01           3786       3110        675          0         66        496
3173	2017-05-13 23:29:01          3931          0       3931
3174	2017-05-13 23:30:01           3786       3023        763          0         66        496
3175	2017-05-13 23:30:01          3931          0       3931
3176	2017-05-13 23:31:01           3786       2694       1091          0         66        496
3177	2017-05-13 23:31:01          3931          0       3931
3178	2017-05-13 23:32:01           3786       2338       1448          0         66        496
3179	2017-05-13 23:32:01          3931          0       3931
3180	2017-05-13 23:33:01           3786       1893       1893          0         66        496
3181	2017-05-13 23:33:01          3931          0       3931
3182	2017-05-13 23:34:01           3786       1614       2171          0         66        496
3183	2017-05-13 23:34:01          3931          0       3931

It looks like memory load went up, but not enough to kill it, and then went down again on its own.

So it looks like it is not always following the pattern of "grow forever until it's too big".

comment:14 Changed 9 months ago by DeS

I added this line to the torrc of both processes:

MaxMemInQueues 3000MB

Since then (about 13 hours ago) everything has been stable. Let's see how it works out.

comment:15 Changed 9 months ago by DeS

Despite the MaxMemInQueues measure, we still experience the problem that memory usage starts rising until the tor process gets killed (OOM).

In the last 24 hours such an OOM took place 4 times on our two servers with 2 tor processes each.

I also had a quick discussion with another operator based on this tweet: https://twitter.com/FrennVunDerEnn/status/864583496072876034

It seems they experience a spontaneous rise in memory consumption as well.

comment:16 Changed 9 months ago by cypherpunks

I experience the same problem on 3 non-exit relays. It started on 2017-05-18 (running Debian 8, 0.3.0.5-rc at the time).

comment:17 Changed 9 months ago by DeS

Please see the tor-relays@… mailing list.

Several exit operators suffer from this problem.

comment:18 Changed 9 months ago by nickm

Is there any commonality between versions reporting this issue? Is it all 0.3.0? Is it all one operating system?

Is anybody able to build with some heap profiling tool (such as the one in gperftools) to see where all the memory is going?
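
For anyone attempting that, one possible approach is gperftools' heap profiler, roughly along these lines (a sketch only; the package names, library path, and output locations are assumptions for a Debian/Ubuntu system and may need adjusting):

# Hypothetical sketch: preload tcmalloc (which contains the heap profiler) and
# ask it to dump heap profiles, then inspect a dump with pprof.
sudo apt-get install libgoogle-perftools4 google-perftools
sudo -u debian-tor env HEAPPROFILE=/tmp/tor.hprof \
    LD_PRELOAD=/usr/lib/libtcmalloc.so.4 \
    /usr/bin/tor -f /etc/tor/torrc1
# The library may live elsewhere, e.g. /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4.
# Once memory has grown, look at the latest numbered dump:
google-pprof --text /usr/bin/tor /tmp/tor.hprof.0001.heap | head -n 30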

comment:19 Changed 9 months ago by nickm

Priority: Medium → High
Severity: Normal → Major

comment:20 in reply to:  19 Changed 9 months ago by DeS

Replying to nickm:
Our system is Ubuntu 14.04.5 LTS, and we tried 0.3.0.5-rc-1~trusty+1 and 0.2.9.10-1~trusty+1.

The feedback from the mailing list was:
Debian Jessie with tor-0.2.9.10

Niftybunny reported the same problems with 0.3.0.5; upgrading to 0.3.0.7 helped on most relays.
(Roger remarked that there would be no relevant change between the 0.3.0.5 and 0.3.0.7 versions.)

So it seems the same across Debian/Ubuntu; both 0.2.9.x and 0.3.x are affected.

comment:21 Changed 9 months ago by DeS

I will try to get valgrind running with the tor process as described by Roger on the mailing list and report back.

MaxMemInQueues 3000MB does not seem to be a solution, since I'm down to 2000MB for two processes on a machine with 4GB physical memory and there has been absolutely no change so far.

comment:22 in reply to:  description ; Changed 9 months ago by arma

Replying to DeS:

May 14 11:02:47 tor1 kernel: [125982.993478] Killed process 33180 (tor) total-vm:444012kB, anon-rss:0kB, file-rss:0kB

This part has been bothering me. Am I reading that right, that the tor process it kills is using 444 megabytes of memory? That is way less than 4 gigs.

Changed 9 months ago by DeS

Attachment: valgrind.txt added

comment:23 Changed 9 months ago by DeS

As proposed by arma, I ran this command on a freshly compiled alpha tor:

nohup valgrind --leak-check=yes --error-limit=no --undef-value-errors=no src/or/tor geoipfile src/config/geoip -f /etc/tor/torrc2 &>valgrind.out &

Somewhere during the last day this process stopped for no known reason. I attach the output:
https://trac.torproject.org/projects/tor/attachment/ticket/22255/valgrind.txt

Hope that helps.

Frankly, our tor service is almost not maintainable anymore. I will try one last thing and check for the processes by script and restart them. But this problem is serious.

comment:24 in reply to:  23 ; Changed 9 months ago by arma

Replying to DeS:

Somewhere during the last day this process stopped for no known reason. I attach the output:
https://trac.torproject.org/projects/tor/attachment/ticket/22255/valgrind.txt

Awesome, you found a bug! I filed it as #22368.

Alas, it isn't enough to explain the OOM issues on this ticket. Once we've fixed that ticket, I'll ask everybody to run the valgrinds again and see what comes up next.

Thanks!

comment:25 in reply to:  24 Changed 9 months ago by arma

Replying to arma:

Once we've fixed that ticket, I'll ask everybody to run the valgrinds again and see what comes up next.

Oh, if you are excited to get going before that, trying valgrind on the "maint-0.3.0" branch will be more stable, but even more interesting if we find bugs in it.

To get to it, after the "git clone" line, do a "git checkout maint-0.3.0" and then proceed like before.
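
Spelled out, the steps might look roughly like this (a sketch; the clone URL and configure flags are assumptions, and the valgrind line simply mirrors the command recorded earlier in this ticket):

# Hypothetical sketch of the build-and-valgrind run on the maint-0.3.0 branch.
git clone https://git.torproject.org/tor.git
cd tor
git checkout maint-0.3.0
./autogen.sh && ./configure --disable-asciidoc && make
nohup valgrind --leak-check=yes --error-limit=no --undef-value-errors=no \
    src/or/tor geoipfile src/config/geoip -f /etc/tor/torrc2 &>valgrind.out &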

comment:26 Changed 9 months ago by DeS

Thanks for your efforts.
I'm not technical enough to read anything out of the valgrind output, and I don't really understand the defects you mention.
Nevertheless, if you need a valgrind test run I can do one.

But our relays went up and down so much that I want to give them some "recovery" time. As I write this, another OOM kill took place and the relay went from previously (3 weeks ago) around 400 Mbit/s to 40 Mbit/s.

Still, my only solution to the problem is to check if the processes are around, and if not, start them.
So if it is a big benefit for you, let me know (you have the email from the mailing list) and I will make another test run.

comment:27 Changed 9 months ago by DeS

Following teor's advice on the tor-relays@… mailing list, I set the following values:
DirPort 0
DirCache 0

comment:28 in reply to:  22 Changed 9 months ago by arma

Replying to arma:

Replying to DeS:

May 14 11:02:47 tor1 kernel: [125982.993478] Killed process 33180 (tor) total-vm:444012kB, anon-rss:0kB, file-rss:0kB

This part has been bothering me. Am I reading that right, that the tor process it kills is using 444 megabytes of memory? That is way less than 4 gigs.

DeS, have you explored this one? A fast Tor relay will use way more than 444 megabytes of memory, so of course it will be dying if that's all it can have. What else is happening on the system such that it can't get any more memory?

Changed 9 months ago by DeS

comment:29 Changed 9 months ago by DeS

Obviously a process will die if it can't get sufficient memory.
But as you saw in the memory logs, the normal usage is way below 400 MB.
And our exits are limited in regards to bandwidth, so we never really exceed 4xx Mbit/s.

But sometimes, as you saw in the memlog, memory consumption starts rising fast, ending with the kill or a completely unresponsive machine.
What we are looking for is the cause of the rising memory consumption.
I repeat: these machines have been running for 1-2 years without such problems, and the problem started occurring some weeks ago. As we have seen on the mailing list, it is also not just these servers; otherwise I would just set them up freshly.

dirk@tor2:~$ ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0  33432  1528 ?        Ss   Mai22   0:00 /sbin/init
root         2  0.0  0.0      0     0 ?        S    Mai22   0:00 [kthreadd]
root         3  0.1  0.0      0     0 ?        S    Mai22   9:18 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kworker/0:0H]
root         7  0.2  0.0      0     0 ?        S    Mai22  16:50 [rcu_sched]
root         8  0.1  0.0      0     0 ?        S    Mai22   7:02 [rcuos/0]
root         9  0.1  0.0      0     0 ?        S    Mai22   7:16 [rcuos/1]
root        10  0.1  0.0      0     0 ?        S    Mai22   6:50 [rcuos/2]
root        11  0.1  0.0      0     0 ?        S    Mai22   7:25 [rcuos/3]
root        12  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/4]
root        13  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/5]
root        14  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/6]
root        15  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/7]
root        16  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/8]
root        17  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/9]
root        18  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/10]
root        19  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/11]
root        20  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/12]
root        21  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/13]
root        22  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/14]
root        23  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/15]
root        24  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/16]
root        25  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/17]
root        26  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/18]
root        27  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/19]
root        28  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/20]
root        29  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/21]
root        30  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/22]
root        31  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/23]
root        32  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/24]
root        33  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/25]
root        34  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/26]
root        35  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/27]
root        36  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/28]
root        37  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/29]
root        38  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/30]
root        39  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/31]
root        40  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/32]
root        41  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/33]
root        42  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/34]
root        43  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/35]
root        44  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/36]
root        45  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/37]
root        46  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/38]
root        47  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/39]
root        48  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/40]
root        49  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/41]
root        50  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/42]
root        51  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/43]
root        52  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/44]
root        53  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/45]
root        54  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/46]
root        55  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/47]
root        56  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/48]
root        57  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/49]
root        58  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/50]
root        59  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/51]
root        60  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/52]
root        61  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/53]
root        62  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/54]
root        63  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/55]
root        64  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/56]
root        65  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/57]
root        66  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/58]
root        67  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/59]
root        68  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/60]
root        69  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/61]
root        70  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/62]
root        71  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuos/63]
root        72  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcu_bh]
root        73  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/0]
root        74  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/1]
root        75  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/2]
root        76  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/3]
root        77  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/4]
root        78  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/5]
root        79  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/6]
root        80  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/7]
root        81  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/8]
root        82  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/9]
root        83  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/10]
root        84  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/11]
root        85  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/12]
root        86  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/13]
root        87  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/14]
root        88  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/15]
root        89  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/16]
root        90  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/17]
root        91  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/18]
root        92  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/19]
root        93  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/20]
root        94  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/21]
root        95  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/22]
root        96  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/23]
root        97  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/24]
root        98  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/25]
root        99  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/26]
root       100  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/27]
root       101  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/28]
root       102  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/29]
root       103  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/30]
root       104  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/31]
root       105  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/32]
root       106  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/33]
root       107  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/34]
root       108  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/35]
root       109  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/36]
root       110  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/37]
root       111  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/38]
root       112  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/39]
root       113  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/40]
root       114  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/41]
root       115  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/42]
root       116  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/43]
root       117  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/44]
root       118  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/45]
root       119  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/46]
root       120  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/47]
root       121  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/48]
root       122  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/49]
root       123  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/50]
root       124  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/51]
root       125  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/52]
root       126  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/53]
root       127  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/54]
root       128  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/55]
root       129  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/56]
root       130  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/57]
root       131  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/58]
root       132  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/59]
root       133  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/60]
root       134  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/61]
root       135  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/62]
root       136  0.0  0.0      0     0 ?        S    Mai22   0:00 [rcuob/63]
root       137  0.0  0.0      0     0 ?        S    Mai22   0:00 [migration/0]
root       138  0.0  0.0      0     0 ?        S    Mai22   0:01 [watchdog/0]
root       139  0.0  0.0      0     0 ?        S    Mai22   0:00 [watchdog/1]
root       140  0.0  0.0      0     0 ?        S    Mai22   0:00 [migration/1]
root       141  0.0  0.0      0     0 ?        S    Mai22   4:14 [ksoftirqd/1]
root       143  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kworker/1:0H]
root       144  0.0  0.0      0     0 ?        S    Mai22   0:01 [watchdog/2]
root       145  0.0  0.0      0     0 ?        S    Mai22   0:00 [migration/2]
root       146  0.1  0.0      0     0 ?        S    Mai22   8:21 [ksoftirqd/2]
root       148  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kworker/2:0H]
root       149  0.0  0.0      0     0 ?        S    Mai22   0:01 [watchdog/3]
root       150  0.0  0.0      0     0 ?        S    Mai22   0:00 [migration/3]
root       151  0.0  0.0      0     0 ?        S    Mai22   3:46 [ksoftirqd/3]
root       153  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kworker/3:0H]
root       154  0.0  0.0      0     0 ?        S<   Mai22   0:00 [khelper]
root       155  0.0  0.0      0     0 ?        S    Mai22   0:00 [kdevtmpfs]
root       156  0.0  0.0      0     0 ?        S<   Mai22   0:00 [netns]
root       157  0.0  0.0      0     0 ?        S<   Mai22   0:00 [writeback]
root       158  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kintegrityd]
root       159  0.0  0.0      0     0 ?        S<   Mai22   0:00 [bioset]
root       160  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kworker/u129:0]
root       161  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kblockd]
root       162  0.0  0.0      0     0 ?        S<   Mai22   0:00 [ata_sff]
root       163  0.0  0.0      0     0 ?        S    Mai22   0:00 [khubd]
root       164  0.0  0.0      0     0 ?        S<   Mai22   0:00 [md]
root       165  0.0  0.0      0     0 ?        S<   Mai22   0:00 [devfreq_wq]
root       166  0.3  0.0      0     0 ?        R    Mai22  20:23 [kworker/0:1]
root       168  0.0  0.0      0     0 ?        S    Mai22   0:00 [khungtaskd]
root       169  2.3  0.0      0     0 ?        S    Mai22 154:44 [kswapd0]
root       170  0.0  0.0      0     0 ?        S<   Mai22   0:00 [vmstat]
root       171  0.0  0.0      0     0 ?        SN   Mai22   0:00 [ksmd]
root       172  0.0  0.0      0     0 ?        SN   Mai22   0:02 [khugepaged]
root       173  0.0  0.0      0     0 ?        S    Mai22   0:00 [fsnotify_mark]
root       174  0.0  0.0      0     0 ?        S    Mai22   0:00 [ecryptfs-kthrea]
root       175  0.0  0.0      0     0 ?        S<   Mai22   0:00 [crypto]
root       187  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kthrotld]
root       189  0.2  0.0      0     0 ?        S    Mai22  13:59 [kworker/1:1]
root       208  0.0  0.0      0     0 ?        S<   Mai22   0:00 [deferwq]
root       209  0.0  0.0      0     0 ?        S<   Mai22   0:00 [charger_manager]
root       264  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kpsmoused]
root       266  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_0]
root       267  0.0  0.0      0     0 ?        S    Mai22   0:00 [hpsa]
root       270  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_1]
root       271  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_2]
root       272  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_3]
root       273  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_4]
root       274  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_5]
root       275  0.0  0.0      0     0 ?        S    Mai22   0:00 [scsi_eh_6]
root       281  0.2  0.0      0     0 ?        S    Mai22  18:31 [kworker/2:1]
root       284  0.4  0.0      0     0 ?        S    Mai22  26:21 [kworker/3:1]
root       293  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kdmflush]
root       294  0.0  0.0      0     0 ?        S<   Mai22   0:00 [bioset]
root       296  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kdmflush]
root       297  0.0  0.0      0     0 ?        S<   Mai22   0:00 [bioset]
root       314  0.0  0.0      0     0 ?        S    Mai22   0:01 [jbd2/dm-0-8]
root       315  0.0  0.0      0     0 ?        S<   Mai22   0:00 [ext4-rsv-conver]
root       440  0.0  0.0      0     0 ?        S<   Mai22   0:00 [ext4-rsv-conver]
root       479  0.0  0.0  19484   196 ?        S    Mai22   0:00 upstart-udev-bridge --daemon
root       506  0.0  0.0  51540     4 ?        Ss   Mai22   0:00 /lib/systemd/systemd-udevd --daemon
root       525  0.0  0.0  15380   208 ?        S    Mai22   0:00 upstart-file-bridge --daemon
root       557  0.0  0.0      0     0 ?        SN   Mai22   0:21 [kipmi0]
message+   617  0.0  0.0  39252   640 ?        Ss   Mai22   0:00 dbus-daemon --system --fork
syslog     639  0.0  0.0 255848  1688 ?        Ssl  Mai22   0:00 rsyslogd
root       644  0.0  0.0      0     0 ?        S<   Mai22   0:00 [kvm-irqfd-clean]
root       704  0.0  0.0  43456   924 ?        Ss   Mai22   0:00 /lib/systemd/systemd-logind
root       738  0.0  0.0  15268   216 ?        S    Mai22   0:00 upstart-socket-bridge --daemon
root      1034  0.0  0.0  17312     8 tty4     Ss+  Mai22   0:00 /sbin/getty -8 38400 tty4
root      1037  0.0  0.0  17312     8 tty5     Ss+  Mai22   0:00 /sbin/getty -8 38400 tty5
root      1042  0.0  0.0  17312     8 tty2     Ss+  Mai22   0:00 /sbin/getty -8 38400 tty2
root      1043  0.0  0.0  17312     8 tty3     Ss+  Mai22   0:00 /sbin/getty -8 38400 tty3
root      1045  0.0  0.0  17312     8 tty6     Ss+  Mai22   0:00 /sbin/getty -8 38400 tty6
root      1060  0.0  0.0  61388   320 ?        Ss   Mai22   0:00 /usr/sbin/sshd -D
root      1069  0.0  0.0   4376     4 ?        Ss   Mai22   0:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
root      1095  0.0  0.0  23660   192 ?        Ss   Mai22   0:00 cron
daemon    1096  0.0  0.0  19144     0 ?        Ss   Mai22   0:00 atd
root      1116  0.0  0.0  19200   400 ?        Ss   Mai22   0:29 /usr/sbin/irqbalance
unbound   1265  2.9  1.3  96720 51748 ?        Ss   Mai22 193:16 /usr/sbin/unbound
vnstat    1317  0.0  0.0   7484   360 ?        Ss   Mai22   0:16 /usr/sbin/vnstatd -d --pidfile /run/vnstat/vnstat.pid
root      1396  0.0  0.0  17312     8 tty1     Ss+  Mai22   0:00 /sbin/getty -8 38400 tty1
root      1410  0.0  0.0      0     0 ?        S    Mai22   0:00 [kauditd]
dirk      1610  0.0  0.0  14100   840 ?        S    Mai22   0:10 /bin/bash -e ./memlog.sh
root      1836  0.0  0.0      0     0 ?        S    Mai26   0:00 [kworker/1:0]
root     13028  0.0  0.0      0     0 ?        S    Mai23   0:00 [kworker/0:0]
root     14819  0.0  0.0      0     0 ?        S<   Mai23   0:00 [kworker/u129:1]
root     15495  0.0  0.0      0     0 ?        S    02:26   0:00 [kworker/u128:1]
root     20329  0.0  0.0      0     0 ?        S    08:06   0:00 [kworker/3:2]
debian-+ 20802 28.6  3.3 309628 128476 ?       Sl   08:11   7:49 /usr/bin/tor --defaults-torrc /usr/share/tor/tor-service-defaults-torrc -f /etc/tor/torrc1 --hush
debian-+ 21033 18.4  3.2 305140 124116 ?       Rl   08:12   4:49 /usr/bin/tor --defaults-torrc /usr/share/tor/tor-service-defaults-torrc -f /etc/tor/torrc2 --hush
root     21243  0.0  0.0      0     0 ?        S    08:23   0:00 [kworker/u128:2]
root     21386  0.0  0.1 105656  4340 ?        Ss   08:34   0:00 sshd: dirk [priv]   
dirk     21451  0.0  0.0 105656  1888 ?        S    08:34   0:00 sshd: dirk@pts/0    
dirk     21452  0.0  0.0  24080  3592 pts/0    Ss   08:34   0:00 -bash
dirk     21521  0.0  0.0   4352   360 ?        S    08:37   0:00 sleep 60
dirk     21523  0.0  0.0  20000  1316 pts/0    R+   08:38   0:00 ps aux
root     47576  0.0  0.0      0     0 ?        S    Mai25   0:00 [kworker/2:2]

I will attach a screenshot https://trac.torproject.org/projects/tor/attachment/ticket/22255/Screenshot_2017-05-26_09-56-09.png of top from when the problem was acute and I was logged in yesterday. I could not investigate further since the machine was very unresponsive (I was kicked out of the shell / local console).

Last edited 9 months ago by DeS (previous) (diff)

comment:30 Changed 9 months ago by cypherpunks

Do you have system monitoring in place that shows you memory usage over time (e.g. munin)? Can you attach the following graphs:

  • memory usage
  • swap usage
  • swap in/out per second

Also consider adding the following to your regularly running data collection scripts, so we might find out what else is using your memory such that tor only gets a small chunk of your 4GB.

ps aux --sort=-%mem

comment:31 Changed 9 months ago by cypherpunks

Also: when you look at "top" output, sort by used memory (press the "M" key after starting top).

Changed 9 months ago by DeS

Attachment: syslog20170528 added

Changed 9 months ago by DeS

Attachment: SMEM20170528.log added

comment:32 in reply to:  30 ; Changed 9 months ago by DeS

Replying to cypherpunks:

Do you have system monitoring in place that shows you memory usage over time (e.g. munin)? Can you attach the following graphs:

  • memory usage
  • swap usage
  • swap in/out per second

Also consider adding the following to your regularly running data collection scripts, so we might find out what else is using your memory such that tor only gets a small chunk of your 4GB.

ps aux --sort=-%mem

I have no munin running up to now. But let me throw at you what I have.

1) Logs which show that a process died and was restarted, from today:

Son Mai 28 03:50:06 CEST 2017 Process tor1 is not running -> starting
Son Mai 28 04:27:01 CEST 2017 Process tor2 is not running -> starting
Son Mai 28 04:32:13 CEST 2017 Process tor2 is not running -> starting

This will give us the rough times of three events today.

2) Next, the syslog (shortened) will show what happened: https://trac.torproject.org/projects/tor/attachment/ticket/22255/syslog20170528
3) I run smem every 5 minutes to provide info about the process memory usage (should be like ps aux); see attachment https://trac.torproject.org/projects/tor/attachment/ticket/22255/SMEM20170528.log

I hope that gives you what you need. If not, let me know which tools would provide the most suitable output for you and I will see to it.

comment:33 in reply to:  32 Changed 9 months ago by cypherpunks

Replying to DeS:

I have no munin running up to now.

I think you want to have system monitoring (CPU, memory, swap, network) on a system anyway for proper management (e.g. recognizing rising memory usage trends, excessive swapping, very high CPU usage, ...).

let me know what tools would provide the best suitible output for you and I see to it.

maybe /proc/meminfo
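
One way to collect that regularly, in the same spirit as the existing memlog.sh, might be the following sketch (the output path and 5-minute interval are assumptions; it also captures the ps aux --sort=-%mem output suggested earlier):

#!/bin/bash -e
# Hypothetical companion to memlog.sh: snapshot /proc/meminfo and the biggest
# memory consumers every 5 minutes, so an OOM can be correlated afterwards.
export LC_ALL=C
OUT=/home/user/meminfo.log
while true; do
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        cat /proc/meminfo
        ps aux --sort=-%mem | head -n 15
    } >>"$OUT"
    sleep 300
done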

comment:34 Changed 9 months ago by DeS

OK, I installed a local munin. Will this give you all the information needed?

comment:35 Changed 9 months ago by DeS

Current /proc/meminfo output; I will add it to be generated every 5 minutes.

MemTotal:        3877580 kB
MemFree:          777392 kB
Buffers:          112468 kB
Cached:           688980 kB
SwapCached:        10748 kB
Active:          1102040 kB
Inactive:         249044 kB
Active(anon):     512100 kB
Inactive(anon):    39020 kB
Active(file):     589940 kB
Inactive(file):   210024 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       4026364 kB
SwapFree:        4007204 kB
Dirty:               128 kB
Writeback:             0 kB
AnonPages:        545892 kB
Mapped:            87924 kB
Shmem:              1484 kB
Slab:             269576 kB
SReclaimable:     112900 kB
SUnreclaim:       156676 kB
KernelStack:        2864 kB
PageTables:         5096 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     5965152 kB
Committed_AS:    1074140 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       84072 kB
VmallocChunk:   34359649392 kB
HardwareCorrupted:     0 kB
AnonHugePages:    428032 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      149388 kB
DirectMap2M:     3878912 kB
DirectMap1G:           0 kB

Changed 9 months ago by DeS

Changed 9 months ago by DeS

Attachment: SMEM_Mo.log added

Changed 9 months ago by DeS

Attachment: PROC_MEMINFO_Mon.log added

comment:36 Changed 9 months ago by DeS

So after I got Munin installed, I waited.
Yesterday two more OOMs occurred:

Mon Mai 29 16:06:07 CEST 2017 Process tor1 is not running -> starting
Mon Mai 29 17:20:01 CEST 2017 Process tor2 is not running -> starting

Syslog:

May 29 16:05:51 tor1 kernel: [502910.776183] Out of memory: Kill process 53045 (tor) score 32 or sacrifice child
May 29 17:17:07 tor1 kernel: [507187.634566] Out of memory: Kill process 53165 (tor) score 25 or sacrifice child

Unfortunately munin was not too successful. It died when things got interesting and only lived very briefly between the two occurrences.
https://trac.torproject.org/projects/tor/attachment/ticket/22255/Screenshot_2017-05-30_20-11-49.png

But I got you the logs from SMEM and cat /proc/meminfo.
SMEM
https://trac.torproject.org/projects/tor/attachment/ticket/22255/SMEM_Mo.log
cat /proc/meminfo
https://trac.torproject.org/projects/tor/attachment/ticket/22255/PROC_MEMINFO_Mon.log

Hope this helps.

comment:37 Changed 9 months ago by cypherpunks

When these extreme spikes in memory usage happen (example: before 2017-05-30 12:00), what do the load/CPU/bandwidth/established connections look like at that point in time?

Changed 9 months ago by DeS

Attachment: munin.tar.gz added

comment:38 Changed 9 months ago by DeS

Here are the graphics. https://trac.torproject.org/projects/tor/attachment/ticket/22255/munin.tar.gz
At that slot (today, 12:00) it does not seem very out of the ordinary.

As said, once the real problem starts, munin fails.

comment:39 Changed 9 months ago by cypherpunks

It looks like a significant increase in connections when the memory spikes happen. Please change the netstat graphs to non-log scale by adding this to your munin.conf (in your host's section, somewhere after [localhost]):

netstat.graph_args -l 0 --base 1000

so we can have a better look.

This part is kind of obvious:

  • when your system was knocked off, I/O wait was extremely high (probably due to swapping)

Please enable the swap munin plugin so we can see swap I/O.
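
For what it's worth, on a stock munin-node install enabling that plugin is typically just a symlink plus a restart (a sketch; the plugin and service paths below are assumptions about the Debian/Ubuntu packaging):

# Hypothetical example: activate the stock "swap" plugin and restart munin-node.
sudo ln -s /usr/share/munin/plugins/swap /etc/munin/plugins/swap
sudo service munin-node restart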

comment:40 in reply to:  39 ; Changed 9 months ago by DeS

Replying to cypherpunks:

It looks like a significant increase in connections when the memory spikes happen. Please change the netstat graphs to non-log scale by adding this to your munin.conf (in your host's section, somewhere after [localhost]):

netstat.graph_args -l 0 --base 1000

so we can have a better look.

Done.

This part is kind of obvious:

  • when your system was knocked off, I/O wait was extremely high (probably due to swapping)

Please enable the swap munin plugin so we can see swap I/O.

Swap is enabled. The picture is in the link: https://trac.torproject.org/projects/tor/attachment/ticket/22255/munin.tar.gz

comment:41 in reply to:  40 Changed 9 months ago by cypherpunks

Replying to DeS:

Replying to cypherpunks:

It looks like a significant increase in connections when the memory spikes happen. Please change the netstat graphs to non-log scale by adding this to your munin.conf (in your host's section, somewhere after [localhost]):

netstat.graph_args -l 0 --base 1000

so we can have a better look.

Done.

Please upload the netstat graph again (the change should take effect once the graphs are regenerated, which happens every 5 min and works retroactively).

Swap is enabled. The picture is in the Link https://trac.torproject.org/projects/tor/attachment/ticket/22255/munin.tar.gz

I see the swap partition file "ubuntu_vg_swap_1-day.png", but I don't see a swap-day.png (pages per second in/out).

Changed 9 months ago by DeS

Changed 9 months ago by DeS

comment:43 in reply to:  42 ; Changed 9 months ago by cypherpunks

Changed 9 months ago by DeS

comment:45 Changed 9 months ago by cypherpunks

Now you should have the tools to see if the spike in tor's memory usage is related to a spike in connections or bandwidth, but you won't have any data to compare with (from times where you didn't have the OOM problem).

On the graph around 2017-05-31 9:00 there is a (usual?) spike from 13k to 19k connections; I guess the memory spiked as well (you didn't provide memory graphs).

Also note: Be aware that connection graphs at that granularity are considered sensitive.

comment:46 in reply to:  45 ; Changed 9 months ago by DeS

Also note: Be aware that connection graphs at that granularity are considered sensitive.

I'm open to other ideas on how to exchange this information. But you will agree this problem needs to be looked at.

Clearly there is a correlation between memory and connections. Please refer to:

https://trac.torproject.org/projects/tor/attachment/ticket/22255/memory-pinpoint%3D1496233206%2C1496341206.png

https://trac.torproject.org/projects/tor/attachment/ticket/22255/netstat-pinpoint%3D1496233206%2C1496341206.png

BUT
1) As you can see in these pictures, the system "dies" once at 19k connections, but two times later everything is fine at more than 20k connections
2) At every time (except the blank, obviously) there is plenty of free memory.
3) Our machines operate limited to ~320-400 Mbit/s and memory has never been an issue in 1-2 years
4) How do we explain that the tor processes just use about 400 MB of memory each (see arma's comment above)?

Something is not adding up here and this is what we have to find out.

comment:47 Changed 9 months ago by DeS

... btw, no OOM today; the last one (blank zone on the images) was yesterday:
Mit Mai 31 16:01:23 CEST 2017 Process tor1 is not running -> starting
Mit Mai 31 16:01:23 CEST 2017 Process tor1 is not running -> starting
Mit Mai 31 17:33:53 CEST 2017 Process tor2 is not running -> starting

comment:48 in reply to:  46 ; Changed 9 months ago by cypherpunks

Replying to DeS:

1) As you can see in these pictures, the system "dies" once at 19k connections, but two times later everything is fine at more than 20k connections

I would not be so sure about that since who knows what amount of connections it reached when it failed.

4) How do we explain that the tor processes just use about 400 MB of memory each (see arma's comment above)?
Something is not adding up here and this is what we have to find out.

You didn't provide the OOM log or your other logs this time, which would tell us how much it used. Please provide all logs you have for all new OOMs (not just munin graphs).

I think running two instances with MaxMemInQueues at 2000MB is too high, because you do not have more than 4 GB of memory.

Let's try setting it to:
x = (4000 MB - (memory_your_os_uses_without_tor + 400 MB safety)) / number_of_tor_instances

Are these physical or virtual machines?


comment:49 in reply to:  48 Changed 9 months ago by DeS

Replying to cypherpunks:

I think running two instances with MaxMemInQueues at 2000MB is too high, because you do not have more than 4 GB of memory.

Let's try setting it to:
x = (4000 MB - (memory_your_os_uses_without_tor + 400 MB safety)) / number_of_tor_instances

Will do so after the next comment, with a hopefully fully documented problem.

Are these physical or virtual machines?

These are physical machines.

Changed 9 months ago by DeS

Attachment: PROC_MEMINFO.log added

Changed 9 months ago by DeS

Attachment: SMEM.log added

Changed 9 months ago by DeS

Attachment: mem.2.log added

Changed 9 months ago by DeS

Attachment: syslog added

Changed 9 months ago by DeS

Attachment: collectl.log added

comment:51 Changed 9 months ago by DeS

And the question really boils down to:
How can a rising number of connections (to tor; nothing else is running) drain the 4 GB of memory from the machine when the tor processes themselves only use about 1 GB?
Where does the memory go, and why?

comment:52 Changed 9 months ago by DeS

x = (4000 MB - (memory_your_os_uses_without_tor + 400 MB safety)) / number_of_tor_instances

4000MB - (300MB +400 MB) = 3300MB
MaxMemInQueues 1650 MB

Trying this on server2 - leaving server1 with current values to make sure we can continue analyzing.
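
For reference, a hedged sketch of what this could look like in each instance's torrc on server2 (the value comes from the calculation above; adjust it if your measured OS baseline differs):

# torrc (one of the two instances on server2)
MaxMemInQueues 1650 MB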

comment:53 Changed 9 months ago by cypherpunks

some observations:

minimal logged free memory was 84MB at 2017-06-02 18:59:11
https://trac.torproject.org/projects/tor/attachment/ticket/22255/mem.2.log#L2256

2017-06-02 18:59:11           3786       3702         84          0          0          4

but https://trac.torproject.org/projects/tor/attachment/ticket/22255/PROC_MEMINFO.log
tells another story (it has a lower polling rate)? The lowest value is at:

Fre Jun  2 18:45:49 CEST 2017
MemTotal:        3877580 kB
MemFree:          139124 kB

already better after that:

Fre Jun  2 18:56:41 CEST 2017
MemTotal:        3877580 kB
MemFree:         3052860 kB

4000MB - (300MB +400 MB) = 3300MB

you have 3786 MB memory (not 4000 MB)

I find percentage (or at least '-k') output with relevant process at the top much more readable/useful:

smem -pr

(you might also reduce the output to relevant columns only)

-c 'command rss swap'

How can a rising number of connections (to tor; nothing else is running) drain the 4 GB of memory from the machine when the tor processes themselves only use about 1 GB?

My feeling is that tor uses about 500MB per process when it gets killed but uses much more before that.
Running 'smem' every 60 seconds might provide some valuable data.

comment:54 in reply to:  53 Changed 9 months ago by DeS

Thanks for the feedback.
smem logging pushed to every 60 seconds.
Options added as described.
Waiting for the next incident to document.

Replying to cypherpunks:

some observations:

minimal logged free memory was 84MB at 2017-06-02 18:59:11
https://trac.torproject.org/projects/tor/attachment/ticket/22255/mem.2.log#L2256

2017-06-02 18:59:11           3786       3702         84          0          0          4

but https://trac.torproject.org/projects/tor/attachment/ticket/22255/PROC_MEMINFO.log
tells another story (it has a lower polling rate)? The lowest value is at:

Fre Jun  2 18:45:49 CEST 2017
MemTotal:        3877580 kB
MemFree:          139124 kB

already better after that:

Fre Jun  2 18:56:41 CEST 2017
MemTotal:        3877580 kB
MemFree:         3052860 kB

4000MB - (300MB +400 MB) = 3300MB

you have 3786 MB memory (not 4000 MB)

I find percentage (or at least '-k') output with relevant process at the top much more readable/useful:

smem -pr

(you might also reduce the output to relevant columns only)

-c 'command rss swap'

How can a rising number of connections (to tor; nothing else is running) drain the 4 GB of memory from the machine when the tor processes themselves only use about 1 GB?

My feeling is that tor uses about 500MB per process when it gets killed but uses much more before that.
Running 'smem' every 60 seconds might provide some valuable data.

comment:55 Changed 9 months ago by cypherpunks

please also add the

-t

parameter (to have a total line)
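
Putting the suggested options together, the periodic logging command might end up looking like this (a sketch combining the flags from comments 53 and 55):

smem -prt -c 'command rss swap'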

Changed 9 months ago by DeS

Attachment: 20170607.tar.gz added

comment:56 Changed 9 months ago by DeS

OK, good news: I added all the parameters to my script, and today we had two OOMs.
Now the bad news: if you look at the relevant time, you will see the logs start to look strange.

Very easy explanation: the script which collects the stats needs more than a minute to run, and cron fires the next one, so we end up with several outputs in the files at the same time.

Stupid me, I did not put a lock file in. But wait: if the machine is that slow, what good would the extra disk access do? So that is no real solution either. Not running it via cron would not really help either; the server does not even have the resources to execute date and write the output to a file.

The fact is I can try some things, but getting clean logs from a completely dying machine is like sending a probe into a black hole and expecting pictures back.

But now, let's step back a little and think.

a) We see the tor processes use around 250-400 MB normally.
b) When they get killed they are bigger, 550-600 MB.
c) We have a machine with 3877580 kB (~3.8 GB).

We can try to collect log files for everything, but if you ask me this will not help. All you would see is the memory usage of tor going up to 600 MB. So what?
The question is: where did around 1-2 GB of memory go?
Even in the messed-up logs you will see no monster process consuming this memory. The biggest and fastest-growing process is tor, and the question is why it behaves like this.
I do not believe a 25% rise in TCP connections will swallow up all that memory, so the problem must be somewhere else - if you ask me, in tor.

Any chance tor has debug features (memory allocation ...)?

https://trac.torproject.org/projects/tor/attachment/ticket/22255/20170607.tar.gz

Last edited 9 months ago by DeS (previous) (diff)

Changed 8 months ago by DeS

Attachment: syslog.2 added

Changed 8 months ago by DeS

Attachment: 20170615.tar.gz added

comment:57 Changed 8 months ago by DeS

It became awfully quiet in this ticket.

Since the last report I changed the measurement script. It now runs permanently and sleeps 60 seconds between iterations. I also added a nice level of -19.
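
A rough sketch of such a permanently running logger (script name, log path, and the exact commands collected are placeholders; the smem flags follow the earlier suggestions):

#!/bin/sh
# memlogger.sh - start once with: nice -n -19 ./memlogger.sh &
while true; do
    {
        date
        cat /proc/meminfo
        smem -prt -c 'command rss swap'
    } >> /var/log/memlogger.log 2>&1
    sleep 60
done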

After the latest update I had two days of hope that the problem was fixed - until today.

I could measure an OOM this morning.
https://trac.torproject.org/projects/tor/attachment/ticket/22255/20170615.tar.gz

The logs look better now, but you will notice the entries are not written every minute. This is because of the machine slowdown once the problem starts.
Anyway, I think you will get a good impression of what happens.

Last edited 8 months ago by DeS (previous) (diff)

comment:58 Changed 8 months ago by cypherpunks

peak time was at Don Jun 15 04:21:40 CEST 2017

your smem logs say that

  • RSS system total: 0.33%
  • tor used very little memory but a lot of swap space (>270MB per process)

(which very likely causes your system to hang completely)
(again you do not provide any swap in/out graphs)

your /proc/meminfo says:

  • 3664 MB memory used (does not show in smem -> kernel memory?)
  • 679 MB SWAP used

What is your swappiness?

Possible next items:

  • add another smem output to list kernel memory:
  • smem -kwt
  • Try disabling tor's swap usage: DisableAllSwap 1

I guess this will kill tor earlier but prevent the extreme swapping/thrashing.

  • try to reproduce on clean system with same relay keys
  • guess I was wrong when I said tor might be using more memory before it was killed

Changed 8 months ago by DeS

comment:60 Changed 8 months ago by cypherpunks

https://en.wikipedia.org/wiki/Swappiness

cat /proc/sys/vm/swappiness

(but not very relevant anyway - I guess)

comment:61 in reply to:  59 Changed 8 months ago by cypherpunks

Replying to DeS:

Swappiness is a new word for me.
Do you mean this
https://trac.torproject.org/projects/tor/attachment/ticket/22255/swap-pinpoint%3D1497453606%2C1497561606.png

This is not the swappiness, but the swap in/out pages per second that I asked for.
And yes, a value >250 will make a system hang.

comment:63 Changed 8 months ago by DeS

Sorry about the late reply - Summer flu got me.

Swappiness

root@tor1:~#  cat /proc/sys/vm/swappiness
60

smem -kwt added to memory logger.

DisableAllSwap 1 configured. Had to add

  capability sys_resource,
  capability ipc_lock,

to the apparmor config for tor in order to start it.
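
For anyone hitting the same apparmor denials: the capability lines go inside the tor profile, which on Debian/Ubuntu is typically /etc/apparmor.d/system_tor, followed by a profile reload. A sketch (paths assumed):

# add inside the profile block of /etc/apparmor.d/system_tor
  capability sys_resource,
  capability ipc_lock,
# then reload the profile
apparmor_parser -r /etc/apparmor.d/system_tor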

Waiting for next event.

Changed 8 months ago by DeS

Attachment: 20170708.tar.gz added

comment:64 Changed 8 months ago by DeS

Finally I got around to checking again. OOMs still happen, but not as bad as before.
The logs should contain the information mentioned above.

The config contains DisableAllSwap 1 and is based on tor 0.3.0.9-1~trusty+1, since that is the current stable.
Here are the logfiles/munin screenshots:
https://trac.torproject.org/projects/tor/attachment/ticket/22255/20170708.tar.gz

comment:65 Changed 8 months ago by cypherpunks

4) How do we explain that the tor processes just use about 400 MB of memory each (see arma's comment above)?
How can a rising number of connections (to tor; nothing else is running) drain the 4 GB of memory from the machine when the tor processes themselves only use about 1 GB?
Where does the memory go, and why?

Sockets' buffers!

comment:66 Changed 7 months ago by cypherpunks

DeS, check your network hardware and uplink. If everything is OK, then try to reduce memory usage via:

net.ipv4.tcp_rmem
net.ipv4.tcp_wmem

Replace the third values (6 and 4 megabytes by default for your RAM size) with 262144 (256 KB).
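
A sketch of what that could look like in /etc/sysctl.conf (the first two values of each triple are the usual kernel defaults; only the third, the auto-tuning maximum, is capped at 256 KB as suggested - check your current values with 'sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem' first):

net.ipv4.tcp_rmem = 4096 87380 262144
net.ipv4.tcp_wmem = 4096 16384 262144
# apply with: sysctl -p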

comment:67 Changed 7 months ago by cypherpunks

try to reduce memory usage

If UDP is involved here, then also look at the net.ipv4.udp_mem values (in pages), and/or net.core.rmem_* and net.core.wmem_*.
TCP + UDP sockets can eat up to 1 GB on your system by default.
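
Along the same lines, a hedged sketch for inspecting and capping the UDP/core buffer knobs (net.ipv4.udp_mem is measured in pages; the example caps are illustrative only):

sysctl net.ipv4.udp_mem                # three values: min, pressure, max (in pages)
sysctl -w net.core.rmem_max=262144     # example cap, in bytes
sysctl -w net.core.wmem_max=262144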

comment:68 Changed 7 months ago by cypherpunks

Collecting the puzzle pieces.

#18580

I'm 80-90% sure it's a bug in the way the Tor daemon interacts with unbound's behavior w/r/t large numbers of timing-out DNS queries. Unbound appears to be perfectly fine with the situation when it occurs. Tor daemon DNS queries lock up wholesale, thus preventing normal exit browsing behavior.

As a result: #21394
unbound
OOM in the end

comment:69 Changed 7 months ago by cypherpunks

Tor daemon DNS queries

I believe Tor handles DNS queries with mostly wrong/broken logic, because of bugs in tor's and libevent's (evdns) code, leading to large queues, errors, and timeouts. Anybody want to discuss/fix it?

comment:70 Changed 7 months ago by cypherpunks

Can confirm OOM kills on two nodes:

  • 1: Debian GNU/Linux 8.8 (jessie), 1G memory, Tor version 0.2.5.14, Exit
  • 2: Debian GNU/Linux 8.8 (jessie), 2G memory, Tor version 0.2.5.14, Non-Exit

No unbound running on either. OOM never occurred in the last 5 years, and now it suddenly happens? Seems very strange to me.

Side question: where did Tor Weather go?

comment:71 Changed 4 months ago by DeS

Long overdue update.
The problem disappeared in August. On one host there were a few occurrences in September.
What helped was:

MaxMemInQueues 1650MB
DisableAllSwap 1

It did not resolve the problem but made it much better. Later on, maybe due to a tor update, it seems to have mostly stopped.

comment:72 Changed 4 months ago by DeS

Resolution: worksforme
Status: new → closed

comment:73 Changed 2 months ago by cypherpunks

Resolution: worksforme
Status: closed → reopened

I think I have to re-open this ticket; it still happens almost daily to my instance, Tor 0.2.9.14-1 on Debian 9.3 (stretch). It's getting to the point where it's not only annoying to have it restart, but the OOM kills sometimes also affect other processes and daemons.

I've tried lowering the MaxMemInQueues setting until the OOM kills stop happening, but it's now at < 500 MB and the problem still occurs.

It would be really nice to solve the root cause of this.

comment:74 Changed 2 months ago by cypherpunks

Hi,
could you upgrade to 0.3.2.8-rc to test if the latest fixes solve your problem?
thanks!

comment:75 Changed 8 weeks ago by DeS

Can someone finally comment on whether this is the same thing? It certainly sounds like it.

https://lists.torproject.org/pipermail/tor-project/2017-December/001604.html

comment:76 Changed 8 weeks ago by teor

Resolution: fixed
Status: reopened → closed

There are many reasons for load spikes in Tor: HSDirs are one of them, millions of added clients are another.
Given the timing, it is unlikely that these have the same cause, but they may trigger similar bugs in Tor.

We are dealing with various bugs in a few different tickets.
Please:

  • Upgrade to tor 0.3.2.8-rc
  • Set MaxMemInQueues to your available memory, minus a few hundred megabytes
  • Give the tor user enough file descriptors (see the sketch below)

Then, if the problem persists, open a new ticket for any out of memory kill issues.
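
For the file-descriptor item, a hedged sketch on a systemd-based system (the tor@default unit name is an assumption from current Debian packaging; on non-systemd setups, ulimit -n or /etc/security/limits.conf plays the same role):

# systemctl edit tor@default  -> creates a drop-in override containing:
[Service]
LimitNOFILE=65536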

comment:77 in reply to:  74 Changed 7 weeks ago by cypherpunks

Replying to cypherpunks:

Hi,
could you upgrade to 0.3.2.8-rc to test if the latest fixes solve your problem?
thanks!

Done; it has been running stable for a couple of days now, so the problem seems to be fixed.

Would it be helpful to do a version bisection to see where the bug was fixed? The root cause still seems to be unknown, if I understood this correctly.

comment:78 Changed 7 weeks ago by starlight

please allow me to jump in here with some advice:

0.3.2.8-rc fixed the problem for you most likely due to #24666, which dramatically reduces memory consumption w/r/t DESTROY cells

however I see above an assumption that on a 4G machine MaxMemInQueues=1650MB is reasonable -- without a doubt this value is much too high

empirical observation indicates the daemon consumes 2x MaxMemInQueues once the actual maximum has been obtained (most likely due to sniper attacks)

therefore to truly harden tor daemon against OOM kills on a 4G machine, one should set MaxMemInQueues=1024MB

for further detail and additional tuning tips see #24737

comment:79 in reply to:  78 ; Changed 7 weeks ago by teor

Replying to starlight:

please allow me to jump in here with some advice:

0.3.2.8-rc fixed the problem for you most likely due to #24666, which dramatically reduces memory consumption w/r/t DESTROY cells

however I see above an assumption that on a 4G machine MaxMemInQueues=1650MB is reasonable -- without a doubt this value is much too high

empirical observation indicates the daemon consumes 2x MaxMemInQueues once the actual maximum has been obtained (most likely due to sniper attacks)

I'd still like to see someone repeat this analysis with 0.3.2.8-rc, and post the results to #24737.
It's going to be hard for us to close that ticket without any idea of the effect of our changes.

comment:80 in reply to:  79 ; Changed 7 weeks ago by starlight

Cc: starlight@… added

Replying to teor:

I'd still like to see someone repeat this analysis with 0.3.2.8-rc, and post the results to #24737.
It's going to be hard for us to close that ticket without any idea of the effect of our changes.

I'm not willing to run a newer version till one is declared LTS, but can say that even when my relay is not under attack memory consumption goes to 1.5G with the 1G max queue setting. Seems to me the 2x max queues memory consumption is a function of the overheads associated with tor daemon queues and related processing, including malloc slack space.

Anyone running a busy relay on an older/slower system and with MaxMemInQueues=1024MB can check /proc/<pid>/status to see how much memory is consumed. Be sure DisableAllSwap=1 is set and the queue limit is not higher since the point is to observe actual memory consumed relative to a limit likely to be approached under normal operation.
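
A minimal sketch of such a check (the fields are the standard /proc/<pid>/status entries; the loop handles multiple tor instances):

for pid in $(pidof tor); do
    echo "== tor pid $pid =="
    grep -E 'VmPeak|VmRSS|VmSwap' /proc/$pid/status
done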

Another idea is to add an option to the daemon to cause queue memory preallocation. This would be a nice hardening feature as it will reduce malloc() calls issued under stress, and of course would allow more accurate estimates of worst-case memory consumption. If OOM strikes with preallocated queues that would indicate memory leakage.

comment:81 Changed 7 weeks ago by starlight

Also allow me to point out that memory is cheap. It is better to set the queue max conservatively (or buy more memory) than to experience OOM crashes.

comment:82 in reply to:  80 Changed 7 weeks ago by teor

Replying to starlight:

Replying to teor:

I'd still like to see someone repeat this analysis with 0.3.2.8-rc, and post the results to #24737.
It's going to be hard for us to close that ticket without any idea of the effect of our changes.

I'm not willing to run a newer version till one is declared LTS, but can say that even when my relay is not under attack memory consumption goes to 1.5G with the 1G max queue setting. Seems to me the 2x max queues memory consumption is a function of the overheads associated with tor daemon queues and related processing, including malloc slack space.

Anyone running a busy relay on an older/slower system and with MaxMemInQueues=1024MB can check /proc/<pid>/status to see how much memory is consumed. Be sure DisableAllSwap=1 is set and the queue limit is not higher since the point is to observe actual memory consumed relative to a limit likely to be approached under normal operation.

Another idea is to add an option to the daemon to cause queue memory preallocation. This would be a nice hardening feature as it will reduce malloc() calls issued under stress, and of course would allow more accurate estimates of worst-case memory consumption. If OOM strikes with preallocated queues that would indicate memory leakage.

I have answered this on #24737, because it will get lost on a closed ticket.
Please move all future discussion there.
