Opened 5 months ago

Closed 5 months ago

Last modified 5 months ago

#34098 closed defect (fixed)

crm-int-01 running out of disk space

Reported by: anarcat
Owned by: anarcat
Priority: Very High
Milestone:
Component: Internal Services/Tor Sysadmin Team
Version:
Severity: Major
Keywords: tpa-roadmap-april
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

We're suddenly at 92% disk use on crm-int-01... It seems that disk usage started growing sharply three days ago:

https://grafana.torproject.org/d/ER3U2cqmk/node-exporter-server-metrics?panelId=31&fullscreen&orgId=1&var-node=crm-ext-01.torproject.org:9100&var-node=crm-int-01.torproject.org:9100&from=1585834161059&to=1588426161060
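
For a quick check outside Grafana, the same node exporter feeding that dashboard can be queried directly; a sketch, assuming the exporter listens on port 9100 as in the URL above (metric names vary a bit across node_exporter versions):

curl -s http://crm-int-01.torproject.org:9100/metrics \
  | grep -E '^node_filesystem_(size|avail)_bytes.*mountpoint="/"'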

The MariaDB server was also stopped this morning. It seems InnoDB crashes with the following assertion:

2020-05-02  9:08:31 41 [ERROR] InnoDB: preallocating 65536 bytes for file ./torcrm_prod/civicrm_acl_contact_cache.ibd failed with error 28
2020-05-02  9:08:31 41 [Warning] InnoDB: Cannot create table `torcrm_prod`.`civicrm_acl_contact_cache` because tablespace full
2020-05-02 09:08:31 0x7f6c50252700  InnoDB: Assertion failure in file /build/mariadb-10.3-qB78gy/mariadb-10.3-10.3.22/storage/innobase/dict/dict0dict.cc line 491
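
For what it's worth, error 28 is the OS-level ENOSPC; if the MariaDB client utilities are installed on the host, perror confirms it:

perror 28
# -> OS error code  28:  No space left on device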

There are also warnings on startup:

2020-05-02 13:23:17 0 [Note] InnoDB: Ignoring data file './torcrm_prod/#sql-ib1158381.ibd' with space ID 1137566. Another data file called ./torcrm_prod/civicrm_acl_contact_cache.ibd exists with the same space ID.
2020-05-02 13:23:17 0 [Note] InnoDB: Ignoring data file './torcrm_prod/civicrm_acl_contact_cache.ibd' with space ID 1137566. Another data file called ./torcrm_prod/#sql-ib1158381.ibd exists with the same space ID.

We have ~1.6GB left on the server:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G   18G  1.6G  92% /

Most of that 18G is in /var, with about 5GB split between the two databases:

    2.4 GiB [##########] /torcrm_prod
    2.3 GiB [######### ] /torcrm_staging

Here are the top 10 tables by disk usage:

  528.0 MiB [##########]  civicrm_mailing_event_queue.ibd
  432.0 MiB [########  ]  civicrm_mailing_recipients.ibd 
  340.0 MiB [######    ]  civicrm_activity_contact.ibd  
  172.0 MiB [###       ]  civicrm_contact.ibd         
  168.0 MiB [###       ]  civicrm_log.ibd    
  148.0 MiB [##        ]  civicrm_mailing_event_delivered.ibd
   80.0 MiB [#         ]  civicrm_activity.ibd               
   60.0 MiB [#         ]  civicrm_group_contact.ibd
   52.0 MiB [          ]  civicrm_subscription_history.ibd
   52.0 MiB [          ]  civicrm_email.ibd               
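
For reference, a similar per-table breakdown can be pulled from information_schema instead of from the .ibd files; the figures won't match the on-disk sizes exactly, since tablespaces carry some free space. A sketch, assuming root can authenticate over the local socket:

mysql -e "SELECT table_name,
                 ROUND((data_length + index_length) / 1048576) AS size_mib
          FROM information_schema.tables
          WHERE table_schema = 'torcrm_prod'
          ORDER BY data_length + index_length DESC
          LIMIT 10;"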

Backups take up the most space, however, at about 10GB. I am not familiar with how the backup system works on that host, but there are about 7.5GB of SHA256-* files in /var/backups/local/mysql:

root@crm-int-01:/var/backups/local/mysql# du -sch SHA256-* | tail -1
7.6G	total

Some of those are fairly old too:

root@crm-int-01:/var/backups/local/mysql# ls -alt SHA256-*  | tail -2
-rw-r----- 2 root root 16130992 Feb  8  2019 SHA256-f6810ff0245807455347d88a1a0d7eaf29368e64188e7c1766b64c0cc143570e
-rw-r----- 2 root root   130308 Feb  8  2019 SHA256-c30b262677ed796d892b888f7f417690ca55dd83bf17fa421fdd32438ca2203a
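
To get a quick picture of how stale those files are, something like the following works (inspection only; the retention logic lives in the backup system itself, so nothing should be deleted by hand based on this):

find /var/backups/local/mysql -maxdepth 1 -name 'SHA256-*' -mtime +90 \
  -printf '%TY-%Tm-%Td %12s %p\n' | sort | head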

It also seems that the database grew quite a bit -- doubled in size -- in the last few months, according to the backup sizes:

  800.9 MiB [##########]  20200415-190301-torcrm_prod                                                                                                                                          
  670.4 MiB [########  ]  20200109-190301-torcrm_prod
  398.9 MiB [####      ]  20191002-190301-torcrm_prod

Obviously, a database server running out of disk space is an... undesirable condition, to say the least. :) Should we expand the disk space on that server (which I would rather avoid doing over the weekend), or is there something you can do on your end to clean stuff up?

Alternatively, maybe we should improve the backup system here so it doesn't take up twice as much disk space as the databases themselves. Or, even better, so that backups don't sit on the same partition as the production data...


Change History (5)

comment:1 Changed 5 months ago by weasel

I removed a few failed mysql backups from /var/backups.

We did hourly backups of MySQL, keeping them for a day or so (then dailies for a week or two, weeklies for months, etc.).

Changed Puppet to only do dumps every four hours instead.
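
Roughly, the dump schedule went from hourly to every four hours. As a cron-style sketch only (the real change lives in Puppet, and the dump script name below is made up for illustration):

# before: 0 * * * *    root  /usr/local/sbin/mysql-backup   # hypothetical script name
# after:  0 */4 * * *  root  /usr/local/sbin/mysql-backup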

comment:2 Changed 5 months ago by peterh

I think we should also up the disk space. The size of the mailings has really increased since the second half of last year, and I expect it to keep growing at the pace we've seen over the last six months.

comment:3 Changed 5 months ago by anarcat

Owner: changed from tpa to anarcat
Status: new → accepted

> I think we should also up the disk space. The size of the mailings has really increased since the second half of last year, and I expect it to keep growing at the pace we've seen over the last six months.

Okay, that's the confirmation I needed. We have enough space in the cluster, I'll just double it.

comment:4 Changed 5 months ago by anarcat

Resolution: fixed
Status: accepted → closed

Actually, there was enough space *inside the server* already, just on a different partition. I have moved the MySQL database to /srv/mysql and bind-mounted it at /var/lib/mysql, which gives a much more reasonable disk usage:

root@crm-int-01:~# df -h  / /srv
Filesystem                       Size  Used Avail Use% Mounted on
/dev/sda1                         20G  9.2G  9.5G  50% /
/dev/mapper/vg_crm--int--01-srv   20G   13G  6.5G  66% /srv

... compared to previously:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G   18G  1.6G  92% /

(I don't have numbers on hand for the /srv disk usage before the switch, but it was minimal, around 2GB.)

So this should give us a few months, if not years.

There was a short downtime (about 5 minutes) during the switchover, and everything should be back online.
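
For the record, the move was roughly of the following shape; this is a sketch only, and the exact commands, flags and ordering from the actual maintenance are not captured in this ticket:

systemctl stop mariadb
mkdir /srv/mysql
rsync -aHAX /var/lib/mysql/ /srv/mysql/
chown mysql:mysql /srv/mysql
mv /var/lib/mysql /var/lib/mysql.old    # keep the old copy until the server is verified
mkdir /var/lib/mysql
echo '/srv/mysql /var/lib/mysql none bind 0 0' >> /etc/fstab
mount /var/lib/mysql
systemctl start mariadb
# once everything checks out: rm -rf /var/lib/mysql.old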

comment:5 Changed 5 months ago by anarcat

Keywords: tpa-roadmap-april added