Opened 21 months ago

Last modified 12 months ago

#29387 assigned task

Publish our puppet repository

Reported by: ln5 Owned by: anarcat
Priority: Medium Milestone:
Component: Internal Services/Tor Sysadmin Team Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:


The Puppet repository used for the Tor infrastructure is not public.
We should fix that.


Child Tickets

Ticket | Status | Owner | Summary | Component
#30009 | needs_review | anarcat | consider trocla for secrets management in puppet | Internal Services/Tor Sysadmin Team
#30020 | accepted | anarcat | switch from our custom YAML implementation to Hiera | Internal Services/Tor Sysadmin Team
#30770 | new | tpa | consider alternatives to the puppet mono-repo | Internal Services/Tor Sysadmin Team
#31633 | new | tpa | publish HTML documentation of our puppet source | Internal Services/Tor Sysadmin Team

Change History (8)

comment:1 Changed 20 months ago by ln5

Owner: changed from ln5 to tpa

comment:2 Changed 20 months ago by anarcat

From what I understand, there *are* sensitive parts to the repo (very few, but still), so we'll have to create a new history. If we're going to do that, I'd like to take that opportunity to rearchitect the repository a little bit. There are standards on how to organize Puppet code and, because of its long history, the current repository is slightly out of sync with those.

For example, we store "3rd party modules" in 3rdparty. Those usually go in modules instead. And what we have in modules often looks more like profiles than modules. A more complete introduction to those ideas is documented here:

Similarly, we use a "monorepo" approach for now which makes things much simpler, but really complicates collaboration with external, public modules. I've been bitten by this trying to patch the Prometheus module, for example - it's been quite difficult to have local changes to the 3rd party module as I had to commit the changes in our repo, then copy those changes to an external checkout of the repository. I ended up with a hybrid approach of having the 3rd party module checked out under 3rdparty/modules/prometheus/.git while at the same time added to the parent .git repo, which means I need to commit changes twice for changes to propagate, which isn't much better than copying stuff around.

Most people seem to be using r10k, librarian-puppet and/or Code Manager to handle that problem. There's upstream documentation on how to configure this here:

Improving this setup would also allow us to bootstrap a new puppetmaster more easily which, in turn, might allow us to do continuous integration (CI) on the Puppet setup. That would cut down on the "YOLO" commits we often have to make on the puppet repo because we can't test changes locally.

I know this opens a broader range of things to do, but I figured it was a good opportunity to bring it up. Besides, I doubt the repository in its current form will encourage much collaboration if it's non-standard. If we adopt community practices, we will be able to collaborate much more than with the current approach which is, after all, the objective of sharing that code...

comment:3 Changed 20 months ago by anarcat

oh and this was brought up today again because of all the noise I made in the internal channel. a simpler fix for that would be to move the commit announcements to another channel, should I create a ticket for that?

comment:4 Changed 19 months ago by anarcat

another note here... one improvement we could have in the infrastructure, if we make such a shift, is to have multiple "environments" (prod/stage/dev, prod/test, etc), which requires a way to assign environments to specific nodes. this is a minimal implementation that uses Hiera to do that:
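The implementation linked above is not reproduced in this ticket. As a rough sketch of the node-to-environment idea, here is a different but standard Puppet mechanism: an external node classifier (ENC), a script that receives a node name and prints the environment for it. The hostnames, the `NODE_ENVIRONMENTS` table and the `classify` helper are all hypothetical; in the proposal this data would come from Hiera instead.

```python
#!/usr/bin/env python3
"""Hypothetical ENC sketch: map a node name to a Puppet environment."""
import json
import sys

# Hypothetical node -> environment assignments. In the proposal this data
# would live in the private hiera/ tree rather than in the script itself.
NODE_ENVIRONMENTS = {
    "staging-01.example.org": "stage",
    "dev-01.example.org": "dev",
}
DEFAULT_ENVIRONMENT = "production"


def classify(certname):
    """Return the ENC document (environment plus classes) for one node."""
    return {
        "environment": NODE_ENVIRONMENTS.get(certname, DEFAULT_ENVIRONMENT),
        # Left empty here: this sketch only assigns the environment.
        "classes": {},
    }


def main(argv):
    """Entry point: expects the node name as the only argument."""
    if len(argv) != 2:
        print("usage: enc.py <certname>", file=sys.stderr)
        return 1
    # ENC output is YAML; JSON is emitted here because YAML parsers
    # generally accept it and it needs no third-party library.
    print(json.dumps(classify(argv[1])))
    return 0
```

Puppet would run such a script through the `node_terminus = exec` and `external_nodes` settings, passing the node name as the only argument. An installed script would end with `sys.exit(main(sys.argv))`.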

comment:5 Changed 19 months ago by anarcat

Owner: changed from tpa to anarcat

comment:6 Changed 19 months ago by anarcat

so concretely, the TL;DR of what I am proposing is this:

  1. convert everything to hiera (#30020) - this requires creating roles for each machine (more or less)
  2. move current modules/ into profiles/ and audit for private data
  3. move any private data into hiera/
  4. move 3rdparty modules into modules/
  5. publish everything but hiera/ as a new repository
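As a naive sketch of steps 2, 4 and 5, the path reshuffle can be written as a small mapping. It assumes third-party code lives under 3rdparty/modules/ (as with the Prometheus checkout mentioned earlier) and ignores the audit for private data, which still has to be done by hand. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def plan_move(path):
    """Map a path in the current layout to its place in the proposed one."""
    parts = PurePosixPath(path).parts
    # Step 4: third-party modules become the new modules/ directory.
    # This is checked first so the paths are not mistaken for our own code.
    if parts[:2] == ("3rdparty", "modules"):
        return str(PurePosixPath("modules", *parts[2:]))
    # Step 2: what we currently call modules/ moves to profiles/.
    if parts[:1] == ("modules",):
        return str(PurePosixPath("profiles", *parts[1:]))
    # Anything else (for example hiera/) stays where it is.
    return path


def is_published(new_path):
    """Step 5: everything except hiera/ goes into the public repository."""
    return PurePosixPath(new_path).parts[:1] != ("hiera",)
```

For example, `plan_move("3rdparty/modules/prometheus/manifests/init.pp")` gives `modules/prometheus/manifests/init.pp`, and `is_published` is false for anything under hiera/.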

Final picture

Once this is done, the final picture will look like this in /etc/puppet:

  • hiera/ - private data. machine -> role assignments, secret stuff like the alias file, machine location, price and other similar metadata and details (see also #29816)
  • modules/ - equivalent of the current 3rdparty/ directory: fully public, reusable code that's aimed at collaboration. mostly code from the Puppet forge or our own repository if no equivalent there
  • profiles/ - magic sauce on top of 3rd party modules/. I already created a few under modules/profiles/ for grafana and prometheus; the profiles configure official 3rd party classes with our site-specific criteria
  • roles/ - abstract classes that regroup a few profiles. for example roles::monitoring could currently include profiles::nagiosmaster, profiles::prometheus::server and profiles::grafana as an implementation
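To illustrate the layering only, here is a sketch of how a machine would resolve to a role and then to profiles. The real thing would be Puppet classes and Hiera data, not Python; the role contents come from the roles::monitoring example above, while the hostname, `NODE_ROLES` and the helper functions are hypothetical.

```python
# Role -> profiles composition, taken from the roles::monitoring example.
ROLES = {
    "roles::monitoring": [
        "profiles::nagiosmaster",
        "profiles::prometheus::server",
        "profiles::grafana",
    ],
}

# Hypothetical machine -> role assignment; in the proposal this is the
# private part that would live under hiera/.
NODE_ROLES = {
    "monitor-01.example.org": "roles::monitoring",
}


def profiles_for(role):
    """Return the profiles a role regroups, or fail on an unknown role."""
    try:
        return list(ROLES[role])
    except KeyError:
        raise ValueError(f"unknown role: {role}") from None


def profiles_for_node(node):
    """Resolve a machine to its role, then to that role's profiles."""
    try:
        role = NODE_ROLES[node]
    except KeyError:
        raise ValueError(f"no role assigned to node: {node}") from None
    return profiles_for(role)
```

The point of the split is that `ROLES` could be published as-is, while `NODE_ROLES` stays private.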

This could all be done in the current repository, without creating a new one with a clean history, but it would prepare us for that final step. And that step would simply be to move modules/, profiles/, and roles/ into a public repository, while keeping hiera/ private in its own repository.

Alternative proposal

The alternative approach is simply to create an entirely new repository that is identical to the current one, minus the virtual aliases file. But then I don't know where I would put the alias file, and I think it would be a missed opportunity to follow the industry's best practices I documented earlier in this ticket.

Further discussion

I would love to get feedback on this before I foray any further into this maze. For now I think it's safe to keep going on the Hiera conversion, as I discussed this with weasel and we seem to agree on it. The other ideas here (namely, using this opportunity to reshuffle the repository structure) seem to have less consensus.

Also note that I kept trocla out of the picture for now. We could keep using the current hkdf in this system, but it would be the last function left in the puppetmaster module, from what I can tell, which is another reason why I'm tempted to replace it as well.

comment:7 Changed 17 months ago by anarcat

another aspect here is how to manage sub-repositories. (I moved this comment into a separate child ticket, in #30770)

Last edited 17 months ago by anarcat

comment:8 Changed 12 months ago by anarcat

i really need to get this going, but i'm too busy doing actual things in the repo right now. ;)

as a workaround, i published this for the cache module so that other orgs can see how we did things. not ideal, but at least i could share:
