Opened 6 months ago

Closed 5 months ago

#34115 closed defect (fixed)

review the impact of usrmerge

Reported by: anarcat Owned by: anarcat
Priority: High Milestone:
Component: Internal Services/Tor Sysadmin Team Version:
Severity: Major Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by anarcat)

Debian buster shipped with a "merged /usr", which means that /bin, /lib and /sbin are now symlinks to their counterparts in /usr. There are concerns that this behavior is buggy and triggers problems in all sorts of places. In particular, the dpkg maintainers are quite unhappy about the change and do not support it as a configuration:

https://wiki.debian.org/Teams/Dpkg/MergedUsr

... which is disturbing, considering that dpkg is such a core component of a Debian system.

That wiki page provides a hackish script to "migrate away" from usrmerge but no one, as far as I know, has done that in production. It definitely looks nasty.

We should consider:

  • [ ] whether this is a real problem (probably?)
  • [x] which machines have usrmerge (20 machines or 27%, detailed below)
  • [x] whether new machines should have it (probably not? not having usrmerge is *not* a problem, and having it has risks, so let's not risk it?)
  • [ ] whether we need to fix old machines

There are two ways of fixing the installers:

  • pass --no-merged-usr to debootstrap
  • use mmdebstrap

The latter has the advantage of being faster, at the cost of being possibly less reliable and compatible.

Next steps:

  1. [x] fix cloud installer - fixed in the wiki and tsa-misc
  2. [x] fix robot installer - fixed in the wiki and tsa-misc
  3. [x] fix ganeti installer - reported as bug 959745, mentioned in the wiki, reported in the puppet module

Change History (6)

comment:1 Changed 6 months ago by anarcat

Description: modified (diff)

inventory of servers with a merged-usr, done by running readlink /bin on all machines with cumin:

27.0% (20/74) success ratio (>= 0.0% threshold) for command: 'readlink /bin'.: bacula-director-01.torproject.org,build-arm-10.torproject.org,cache01.torproject.org,cache-02.torproject.org,check-01.torproject.org,chives.torproject.org,fsn-node-[03-05].torproject.org,gettor-01.torproject.org,gitlab-02.torproject.org,loghost01.torproject.org,onionbalance-01.torproject.org,onionoo-backend-01.torproject.org,onionoo-frontend-01.torproject.org,static-master-fsn.torproject.org,submit-01.torproject.org,tbb-nightlies-master.torproject.org,web-fsn-[01-02].torproject.org

those machines do *not* have a merged /usr:

73.0% (54/74) of nodes failed to execute command 'readlink /bin': alberti.torproject.org,archive-01.torproject.org,build-x86-[05-06,08-09].torproject.org,bungei.torproject.org,carinatum.torproject.org,cdn-backend-sunet-01.torproject.org,colchicifolium.torproject.org,corsicum.torproject.org,crm-ext-01.torproject.org,crm-int-01.torproject.org,cupani.torproject.org,eugeni.torproject.org,fallax.torproject.org,forrestii.torproject.org,fsn-node-[01-02].torproject.org,gayi.torproject.org,henryi.torproject.org,hetzner-hel1-[01-03].torproject.org,hetzner-nbg1-[01-02].torproject.org,kvm[4-5].torproject.org,listera.torproject.org,majus.torproject.org,mandos-01.torproject.org,materculae.torproject.org,meronense.torproject.org,moly.torproject.org,neriniflorum.torproject.org,nevii.torproject.org,nutans.torproject.org,omeiense.torproject.org,oo-hetzner-03.torproject.org,orestis.torproject.org,palmeri.torproject.org,pauli.torproject.org,peninsulare.torproject.org,perdulce.torproject.org,polyanthum.torproject.org,rouyi.torproject.org,rude.torproject.org,scw-arm-par-01.torproject.org,staticiforme.torproject.org,subnotabile.torproject.org,troodi.torproject.org,vineale.torproject.org,web-cymru-01.torproject.org,web-hetzner-01.torproject.org
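
The inventory above works because readlink exits non-zero when its argument is not a symlink, which is why cumin reports the non-merged hosts as failures. Here is a minimal sketch of the same check as a shell function. The function name and the optional root argument are assumptions of this sketch, not something used in the actual inventory:

```shell
# Report whether a filesystem tree uses a merged /usr.
# The optional first argument is the root of the tree to inspect
# (defaults to the running system); it is a hypothetical parameter
# added so the check can also be pointed at a chroot.
is_merged_usr() {
    root="${1:-}"
    # readlink exits non-zero when the path is not a symlink
    target=$(readlink "$root/bin") || return 1
    case "$target" in
        usr/bin|/usr/bin) return 0 ;;
        *) return 1 ;;
    esac
}

if is_merged_usr; then
    echo "merged /usr"
else
    echo "split /usr"
fi
```

The same function could presumably be run against a freshly bootstrapped target directory to confirm that an installer produced the expected layout.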

comment:2 Changed 6 months ago by anarcat

Description: modified (diff)

i have filed bug 959745 against ganeti-instance-debootstrap to see if that can be customized. it seems we should be able to override the debootstrap call by defining a function in the variants config file, which remains to be tested.
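
A rough sketch of what such an override could look like, assuming the variants config file is sourced by the shell script before it calls debootstrap (which is exactly the part that remains to be tested). Only the --no-merged-usr flag comes from this ticket; the rest is illustrative:

```shell
# Shadow the debootstrap command with a shell function that always
# adds --no-merged-usr. "command" bypasses function lookup, so the
# real debootstrap binary is invoked rather than this function again.
debootstrap() {
    command debootstrap --no-merged-usr "$@"
}
```

Any later call to debootstrap in the same shell then goes through the function, so the calling script itself does not need to be patched.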

comment:3 Changed 6 months ago by anarcat

Description: modified (diff)
Status: assigned → accepted

fixed the cloud and robot installer, need to consider the ganeti hack next.

comment:4 Changed 6 months ago by anarcat

Description: modified (diff)

tested the "shell function" hack on fsn-node-01 in a test VM: it works. suggested it in an issue in the puppet module, which we'll need to patch:

https://gitlab.com/shared-puppet-modules-group/puppet-ganeti/-/issues/7

waiting for feedback there before pushing any further.

comment:5 Changed 6 months ago by anarcat

one impact of mergedusr is actually when it's *not* enabled. for example, with mergedusr, /usr/sbin/ip is a valid path, but without, it isn't. so someone (mistakenly, perhaps) hardcoding said path will *fail* on a non-merged system (including any stretch system) while working on a merged system.

so that's actually an argument *for* enabling merged-usr...
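
Scripts can sidestep the hardcoded-path problem on both layouts by resolving the tool through PATH. A minimal sketch, where find_tool is a hypothetical helper name and the extra sbin directories are an assumption to cover non-root users whose PATH lacks them:

```shell
# Resolve a tool by name instead of hardcoding /usr/sbin/ip or /sbin/ip.
# Runs in a subshell so the extended PATH does not leak to the caller.
find_tool() {
    (
        PATH="$PATH:/usr/sbin:/sbin"
        command -v "$1"
    )
}

# Usage: works whether ip lives in /sbin or /usr/sbin
# ip_bin=$(find_tool ip) && "$ip_bin" addr show
```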

comment:6 Changed 5 months ago by anarcat

Description: modified (diff)
Resolution: fixed
Status: accepted → closed

fixed debootstrap in ganeti installs to use --no-merged-usr as well.

we can revisit this later for existing installs, but for now this should keep us somewhat safe in the future. worst case, we at least have knobs on how to switch that off everywhere as well. just grep for --no-merged-usr.
