Opened 7 months ago

Last modified 3 weeks ago

#29987 new project

clear out unowned files on servers

Reported by: anarcat
Owned by: tpa
Priority: Low
Component: Internal Services/Tor Sysadmin Team
Severity: Minor

Description

there is a significant number of unowned files on the servers. this is generally because a user was removed without the associated files being purged as well, but there are also odd corner cases like backup restores and so on.

In #29682, I did the following Cumin run to find such files, expecting to find only problems with the Munin user/group I had just removed, but instead found many more cases (around 300,000 files), mostly belonging to deleted users:

cumin -p 0 -b 5 --force -o txt '*' 'find / -ignore_readdir_race -path /proc -prune -nouser -o -nogroup' | tee unowned-files
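
A note on the expression above: in find, the implicit "and" between tests binds tighter than -o, so as written -nouser is tied to the -path /proc -prune branch and only -nogroup applies to the rest of the tree. A minimal sketch of a grouped variant follows, assuming the intent was "skip /proc, then report anything with no user or no group". The helper name and its two arguments are hypothetical, and GNU find's -ignore_readdir_race from the original run is left out to keep the sketch portable.

```shell
# Sketch only: find_unowned and its arguments are made up for illustration.
# $1 = directory to start from, $2 = path to prune (e.g. / and /proc).
find_unowned() {
    start=$1
    skip=$2
    # Prune the skipped path first; for everything else, print entries
    # whose numeric owner or group has no passwd/group entry.
    find "$start" -path "$skip" -prune -o \( -nouser -o -nogroup \) -print
}
```

On a real host this would be called as find_unowned / /proc, and the same grouped expression could be dropped into the Cumin command line.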

Next step is to decide what to do with the leftover files and document this as part of the user retirement process.

Change History (4)

comment:1 Changed 7 months ago by anarcat

The result of the run is in alberti.torproject.org:/home/anarcat/unowned-files. I haven't included it here because it's 84MB but also because it might contain sensitive information.

A cleaned up version of the file is in unowned-files-sorted, produced with the following command:

sed -n '/^___/,$p' < unowned-files | cut -d: -f2 | sort -u  > unowned-files-sorted

The idea of the first part is that Cumin produces the output *twice*, once as the regular output and then as a machine-readable output. We select only the latter. Then the cut takes only the actual paths (as opposed to host: path pairs) and finally, sort takes the unique paths across the entire set. The result is still over 40MB and lists ~300,000 files.
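
The pipeline can be tried on a small sample. The sketch below is an illustration under assumptions: the sample text is made up to mimic the layout described above (a human-readable section, a separator line starting with "___", then host: path lines) and may not match real Cumin output exactly. It also differs slightly from the command above: it drops the separator line and strips the whole "host: " prefix with sed, because cut -d: -f2 keeps the space after the colon and truncates any path that itself contains a colon.

```shell
# Sketch only: extract_paths and the sample data are hypothetical.
extract_paths() {
    # 1. keep everything from the "___" separator to the end of input
    # 2. drop the separator line itself
    # 3. strip the leading "host: " prefix, keeping later colons intact
    # 4. de-duplicate paths across hosts
    sed -n '/^___/,$p' | sed '1d' | sed 's/^[^:]*: //' | sort -u
}

sample='===== NODE GROUP =====
(2) host[1-2].example.org
----- OUTPUT -----
/home/olduser/notes.txt
_____FORMATTED_OUTPUT_____
host1.example.org: /home/olduser/notes.txt
host2.example.org: /home/olduser/notes.txt
host2.example.org: /srv/restored/a:b'

printf '%s\n' "$sample" | extract_paths
```

With this sample, the path seen on both hosts is listed once and the path containing a colon survives intact.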

Many files are from removed users, but there are also "restore" runs on brulloi which make up a significant number. Once those are filtered out, the remainder is fairly small:

$ sed -n '/^___/,$p' < unowned-files | grep -v -e /home/ -e /var/lib/sudo -e restore | wc -l 
277

... and mostly consists of random tidbits, which were basically:

  • listera:/lib/firmware: owned by 1000:1000, cleared out by chown'ing to root:root
  • *:/run/xtables.lock: owned by root:115 (previously the munin group), removed
  • brulloi:/root/etc.bak/munin/...: one year old /etc backup, ignored

The vast majority of the rest (~277,000 files) is the restore stuff. It is mostly leftovers on brulloi, but there were also things in /srv/restored on staticiforme. I ignored both, since brulloi is going away and the other seemed harmless as it was readable only by root.

Finally, the remaining ~20,000 files are stuff in /home. This is the part I am not sure what to do with. For now, I'm just ignoring those as well until we make up our mind about what to do with the files left over by retired users.
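
A breakdown like the one in this comment could be tallied in a single pass over the path list. This is a rough sketch, not the method actually used here: the helper name is hypothetical, the patterns are the same three strings passed to grep -v above, and the order in which they are checked (home, then sudo, then restore) is an assumption, so a path matching several patterns is counted under the first.

```shell
# Sketch only: tally_categories is a made-up helper reading paths on stdin.
tally_categories() {
    awk '
        /\/home\//         { n["home"]++;    next }
        /\/var\/lib\/sudo/ { n["sudo"]++;    next }
        /restore/          { n["restore"]++; next }
                           { n["other"]++ }
        END {
            # categories never seen print as 0
            printf "home=%d sudo=%d restore=%d other=%d\n",
                   n["home"], n["sudo"], n["restore"], n["other"]
        }'
}
```

It would be fed the sorted list, for example tally_categories < unowned-files-sorted.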

comment:2 Changed 2 months ago by gaba

> Next step is to decide what to do with the leftover files and document this as part of the user retirement process.

My 2 cents is to remove those files for eternity.

comment:3 Changed 2 months ago by anarcat

i deleted the stuff in /var/lib/sudo as that's clearly leftover crap, using:

cumin -b 5 --force -o txt '*' 'find /var/lib/sudo/ \( -nouser -o -nogroup \) -a -delete'

i also removed some stuff from andrew and nickm on alberti and staticiforme, but yeah, we do need a policy for the rest. i'd be happy to just clear out all that stuff, but we need to think about whether the stuff is useful...

comment:4 Changed 3 weeks ago by arma

I bet we could pare it down a lot more, e.g. by deleting the old git checkouts of random things. That would let us better see what remains.
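
One possible way to spot those checkouts from an existing path list, as a rough sketch: treat any directory that contains a .git entry as a checkout and print it once. The helper name is hypothetical, and it only reads paths on stdin; it does not delete anything.

```shell
# Sketch only: list_git_checkouts is a made-up helper.
list_git_checkouts() {
    # For paths that are, or live under, a ".git" directory, cut the path
    # at "/.git" so only the checkout directory remains, then de-duplicate.
    # Names like ".gitconfig" or ".gitignore" do not match.
    sed -n 's|/\.git\(/.*\)\{0,1\}$||p' | sort -u
}
```

The resulting list could be reviewed by hand before removing anything, which leaves non-checkout files such as the pictures to look at separately.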

A few of the things, like pictures of stickies from past dev meetings, would be a shame to throw out before looking at.
