upgrades take up a significant chunk of time every week and distract sysadmins (or at least me) from focusing on other projects.
upgrades should therefore be automated, as much as possible.
see also #31239 (moved) about automated installs. this is part of the wider "ops card questionnaire", where we answered no to a question about this, see #30881 (moved).
checklist:
install needrestart everywhere, in interactive mode
switch needrestart to automatic mode
install unattended-upgrades everywhere
fix major upgrades docs to disable unattended-upgrades during the upgrade run
i set up needrestart everywhere, using a puppet module. it's currently in "interactive" mode, which means it will do nothing during automated upgrades and will prompt during manual ones. my hope is to use needrestart manually for a while to see if it works well and, when it does, switch it to automatic mode everywhere.
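for reference, here is a minimal sketch of what the mode switch amounts to on a single host. it assumes the stock Debian layout, where needrestart reads Perl snippets from `/etc/needrestart/conf.d/*.conf`. the `set_needrestart_mode` helper and the `restart-mode.conf` file name are made up for illustration; in our setup the puppet module is what actually manages this.

```shell
#!/bin/sh
# Hypothetical helper: write a needrestart restart-mode override.
# Modes: l = list only, i = interactive, a = restart automatically.
# The second argument overrides the config directory (useful for dry runs).
set_needrestart_mode() {
    mode="$1"
    conf_dir="${2:-/etc/needrestart/conf.d}"
    case "$mode" in
        l|i|a) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # needrestart config files are Perl, hence the $nrconf hash syntax
    printf "\$nrconf{restart} = '%s';\n" "$mode" > "$conf_dir/restart-mode.conf"
}
```

`set_needrestart_mode a` would then flip a host to automatic restarts. for a one-off run, `needrestart -r a` selects the same mode from the command line without touching the config.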
i also eventually want to run unattended-upgrades everywhere.
between those two tools, we should get rid of 50-75% of the manual work involved here, the remaining being reboots. those could also be automated, if we find a way for the servers to coordinate among themselves.
remove from parent "ops report card" thing, as i want to close that ticket and it will be open forever if it depends on all the tickets generated from it.
i discussed this with hiro as part of our 2020 roadmap work. she volunteered to follow up on this.
i made a checklist in the ticket summary: the next step is to enable needrestart automatically everywhere, which we should look at doing soon. then we deploy unattended-upgrades everywhere, making sure we update the buster major upgrade docs to disable unattended-upgrades while we do major upgrades, on step 4.
so, TL;DR: next step is needrestart auto everywhere.
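as a rough idea of what "disable unattended-upgrades during the upgrade run" could look like, here is a sketch that drops an apt configuration override and removes it afterwards. this is only one possible approach, not what the upgrade docs currently say. the function names and the `99-major-upgrade-pause` file name are hypothetical, and on puppet-managed hosts the agent may need to be paused too so it does not interfere.

```shell
#!/bin/sh
# Hypothetical helpers to pause and resume unattended-upgrades around a
# major upgrade. APT_CONF_DIR is overridable so this can be dry-run.
APT_CONF_DIR="${APT_CONF_DIR:-/etc/apt/apt.conf.d}"
PAUSE_FILE="99-major-upgrade-pause"   # made-up name; sorts after 20auto-upgrades

pause_unattended_upgrades() {
    # files in apt.conf.d are read in sorted order and later values win,
    # so this overrides an earlier "1" for the periodic run
    echo 'APT::Periodic::Unattended-Upgrade "0";' > "$APT_CONF_DIR/$PAUSE_FILE"
}

resume_unattended_upgrades() {
    rm -f "$APT_CONF_DIR/$PAUSE_FILE"
}
```

the idea would be to call `pause_unattended_upgrades` before touching `sources.list` and `resume_unattended_upgrades` once the host has rebooted into the new release. note that this does not stop a run that is already in progress.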
Trac: Owner: anarcat to hiro
switched needrestart to automatic mode in puppet now and forcibly deployed everywhere.
please do provide a review of the upstream pull request. if you think it's good, just say so in the pull request so I can officially merge it upstream. (note that I can merge it upstream without your approval, but i just think it's more transparent that way, plus it gives you some public credits on github and introduces you to the folks paying attention in the org)
i haven't audited the upstream module's source code and will assume you have done due diligence here :)
did you test the deployment somewhere? how do you plan to do the deployment? just dropping it in hiera/common.yaml is a rather... bold move, I would say... ;) i have written instructions on how to do a progressive deployment here: https://help.torproject.org/tsa/howto/puppet/#Progressive_deployment
note that the progressive deployment notes seem a bit dated now. these days I deploy classes as includes in a role instead of directly in hiera, because hiera includes classes in a non-deterministic way, which can be confusing sometimes. see the way profile::jumphost was progressively deployed for an example (commits 8c1d3087 c2439c7f dd3a1d7b c57b446c cdcc8576, etc)
Hi,
I have pushed a new branch addressing all your comments: unattended-upgrades.
Regarding comment 3. I have audited the upstream code to the best of my knowledge.
I was in the process of updating and commenting on https://github.com/voxpupuli/puppet-unattended_upgrades/pull/148 but I see this has already been merged. Nice.
> I have pushed a new branch addressing all your comments: unattended-upgrades.
it seems we now have three branches for this... i think it would have been preferable to force-push to the topic branch instead of creating new ones... please do clean up the old ones to leave only the current one.
after you merge, do remove the good branch as well, of course. :)
now as for the review of the unattended-upgrades branch...
I don't think this is necessary:
{{{
+# a host that is monitored
+class roles::unattended_upgrades {
+  include profile::unattended_upgrades
+}
}}}
we don't need a role at all, we can include the profile in the relevant roles. for example, this:
... could be turned into an include profile::unattended_upgrades inside the roles::ircbox.
that said, that's how the progressive deployment docs look right now, so I can't really blame you for following it. :)
anyways this looks good and I'd say go ahead with it. you are correctly including the functionality only in one node in that way, that's the important part to get right and it looks like you've done it. :)
(if you're curious about why i'm now hesitant in adding roles to hiera there: it's because those classes get added as prometheus labels which creates needless noise in the prometheus time series and confuses grafana...)
Trac: Owner: anarcat to hiro Status: needs_review to assigned
>> I have pushed a new branch addressing all your comments: unattended-upgrades.
> it seems we now have three branches for this... i think it would have been preferable to force-push to the topic branch instead of creating new ones... please do clean up the old ones to leave only the current one.
I thought it was easier to just reapply the patches cleanly. My plan was to delete the old branches after merging, but since you have mentioned it, I have now deleted the other branches.
> after you merge, do remove the good branch as well, of course. :)
Sure
> now as for the review of the unattended-upgrades branch...
> I don't think this is necessary:
> {{{
> +# a host that is monitored
> +class roles::unattended_upgrades {
> +  include profile::unattended_upgrades
> +}
> }}}
> we don't need a role at all, we can include the profile in the relevant roles. for example, this:
> ... could be turned into an include profile::unattended_upgrades inside the roles::ircbox.
> that said, that's how the progressive deployment docs look right now, so I can't really blame you for following it. :)
> anyways this looks good and I'd say go ahead with it. you are correctly including the functionality only in one node in that way, that's the important part to get right and it looks like you've done it. :)
> (if you're curious about why i'm now hesitant in adding roles to hiera there: it's because those classes get added as prometheus labels which creates needless noise in the prometheus time series and confuses grafana...)
Ok I'll try to merge this and see how it goes for chives.
> I thought it was easier to just reapply the patches cleanly. My plan was to delete the old branches after merging, but since you have mentioned it, I have now deleted the other branches.
i think it's fine to overwrite topic branches like this during the review process. right now it means we lose the older version, but that's fine because if I have cloned the repo, that branch is still technically available locally (through the reflog). if we'd use gitlab for review, it would be even visible through the web UI as well...
also, i think you forgot one branch (auto-updates) :)
thanks for the cleanup!
> Ok I'll try to merge this and see how it goes for chives.
excellent! don't forget about --noop ;) (now also patn)
before we close this ticket, we should consider automated reboots as well, either as a new ticket, or by documenting the procedure to automate reboots now.
at least the process is not clear to me at all right now: it's taking a long time and it's error-prone.
move automated reboots to another ticket, #33406 (moved)
hiro deployed unattended upgrades everywhere last week, so this is almost done.
next step here is to fix the docs to mention that we have this running (and deprecate the tor-prepare-upgrades stuff) and fix the major upgrades docs, as documented in the summary.
we should also make sure that unattended-upgrades follows backports upgrades. i installed smartd from backports everywhere in #33684 (moved) (on new servers) and it would be important to have that work.
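one way to check the backports part on a given host is to look at the effective apt configuration. the sketch below assumes that `apt-config dump` prints list entries as `Name:: "value";` lines; the `has_backports_origin` name is made up, and this only tells us whether some configured origin mentions backports, not whether a given package would actually be upgraded.

```shell
#!/bin/sh
# Hypothetical check: does the effective apt config let unattended-upgrades
# pull from backports? Reads `apt-config dump` output on stdin.
has_backports_origin() {
    grep -Eq '^Unattended-Upgrade::(Origins-Pattern|Allowed-Origins):: ".*backports'
}
```

usage would be something like `apt-config dump | has_backports_origin || echo "backports not covered"`. running `unattended-upgrade --dry-run --debug` by hand should then show which packages it would really pick up.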
there was a problem with ganeti during the last buster point release upgrade. it could be with needrestart or unattended-upgrades, but it should be fixed. i documented what i know in #34185 (moved).
looks like needrestart doesn't find php processes that dsa-check-libs does find, needs to be checked:
{{{
root@crm-int-01:~# needrestart
Scanning processes...
Scanning candidates...
Scanning linux images...
Running kernel seems to be up-to-date.
Restarting services...
 systemctl restart cron.service
No containers need to be restarted.
No user sessions are running outdated binaries.
root@crm-int-01:~# dsa-check-libs
The following processes have libs linked that were upgraded:
torcivicrm: php (772, 776)
}}}
same with colchicifolium:
{{{
root@colchicifolium:~# dsa-check-libs
The following processes have libs linked that were upgraded:
collector: C1 CompilerThre (25575), C2 CompilerThre (25575), CollecTor-Sched (25575), Finalizer (25575), Reference Handl (25575), Service Thread (25575), Signal Dispatch (25575), VM Periodic Tas (25575), VM Thread (25575), java (25575), logback-1 (25575), logback-2 (25575), logback-3 (25575), logback-4 (25575), logback-5 (25575), logback-6 (25575), logback-7 (25575), logback-8 (25575), pool-2-thread-1 (25575), pool-2-thread-2 (25575), pool-2-thread-3 (25575), pool-3-thread-1 (25575), pool-3-thread-2 (25575), pool-3-thread-3 (25575)
root@colchicifolium:~# needrestart
Scanning processes...
Scanning linux images...
Running kernel seems to be up-to-date.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
}}}
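to cross-check the two tools by hand, a crude third opinion is to look for processes that still map a shared library which has been deleted from disk. this is only a sketch of the general idea, not how needrestart or dsa-check-libs are actually implemented; the `stale_lib_pids` name is made up, and it needs root to read other users' `maps` files.

```shell
#!/bin/sh
# Print the PIDs of processes that still map a shared object which was
# deleted (typically replaced by an upgrade) on disk.
# The optional argument overrides the proc mount point, for testing.
stale_lib_pids() {
    proc_root="${1:-/proc}"
    for maps in "$proc_root"/[0-9]*/maps; do
        [ -r "$maps" ] || continue
        # the kernel appends " (deleted)" to the path of unlinked mappings
        if grep -Eq '\.so[^ ]* \(deleted\)$' "$maps" 2>/dev/null; then
            pid=${maps%/maps}
            echo "${pid##*/}"
        fi
    done
}
```

if `stale_lib_pids` lists 772 and 776 on crm-int-01, or 25575 on colchicifolium, that would suggest dsa-check-libs is right and needrestart is missing something.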
the colchicifolium case is harder to fix, though: it's not clear to me how that process is started... it's certainly not supervised by systemd directly anyways: