Opened 3 months ago

Last modified 3 weeks ago

#29387 assigned task

Publish our puppet repository

Reported by: ln5 Owned by: anarcat
Priority: Medium Milestone:
Component: Internal Services/Tor Sysadmin Team Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

The Puppet repository used for the Tor infrastructure is not public.
We should fix that.

cf https://trac.torproject.org/projects/tor/wiki/org/meetings/2019BrusselsAdminTeamMinutes#Makingmoreofourrepositoriespublic

Child Tickets

Ticket | Status | Owner | Summary | Component
#30009 | assigned | anarcat | consider trocla for secrets management in puppet | Internal Services/Tor Sysadmin Team
#30020 | assigned | anarcat | switch from our custom YAML implementation to Hiera | Internal Services/Tor Sysadmin Team

Change History (6)

comment:1 Changed 6 weeks ago by ln5

Owner: changed from ln5 to tpa

comment:2 Changed 6 weeks ago by anarcat

From what I understand, there *are* sensitive parts to the repo (very few, but still), so we'll have to create a new history. If we're going to do that, I'd like to take that opportunity to rearchitect the repository a little bit. There are standards on how to organize Puppet code and, because of its old history, the current repository is slightly out of sync with those.

For example, we store "3rd party modules" in 3rdparty. Those usually go in modules instead. And what we have in modules often looks more like profiles than modules. A more complete introduction to those ideas is documented here:

https://puppet.com/docs/pe/2017.2/r_n_p_intro.html

Similarly, we use a "monorepo" approach for now which makes things much simpler, but really complicates collaboration with external, public modules. I've been bitten by this trying to patch the Prometheus module, for example - it's been quite difficult to have local changes to the 3rd party module as I had to commit the changes in our repo, then copy those changes to an external checkout of the repository. I ended up with a hybrid approach of having the 3rd party module checked out under 3rdparty/modules/prometheus/.git while at the same time added to the parent .git repo, which means I need to commit changes twice for changes to propagate, which isn't much better than copying stuff around.

Most people seem to be using r10k, librarian-puppet and/or Code Manager to handle that problem. There's upstream documentation on how to configure this here:

https://puppet.com/docs/pe/2017.2/cmgmt_managing_code.html

Improving this setup would also allow us to bootstrap a new puppetmaster more easily which, in turn, might allow us to do continuous integration (CI) on the Puppet setup, which would reduce a lot of "YOLO" commits we often have to do on the puppet repo because we can't test changes locally.

I know this opens a broader range of things to do, but I figured it was a good opportunity to bring it up. Besides, I doubt the repository in its current form will encourage much collaboration if it's non-standard. If we adopt community practices, we will be able to collaborate much more than with the current approach which is, after all, the objective of sharing that code...

comment:3 Changed 6 weeks ago by anarcat

oh and this was brought up today again because of all the noise I made in the internal channel. a simpler fix for that would be to move the commit announcements to another channel, should I create a ticket for that?

comment:4 Changed 4 weeks ago by anarcat

another note here... one improvement we could have in the infrastructure, if we make such a shift, is to have multiple "environments" (prod/stage/dev, prod/test, etc), which requires a way to assign environments to specific nodes. this is a minimal implementation that uses Hiera to do that:

https://github.com/Zetten/puppet-hiera-enc

comment:5 Changed 3 weeks ago by anarcat

Owner: changed from tpa to anarcat

comment:6 Changed 3 weeks ago by anarcat

so concretely, the TL;DR of what I am proposing is this:

  1. convert everything to hiera (#30020) - this requires creating roles for each machine (more or less)
  2. move current modules/ into profiles/ and audit for private data
  3. move any private data into hiera/
  4. move 3rdparty modules into modules/
  5. publish everything but hiera/ as a new repository
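
To make step 1 a bit more concrete, here is a minimal sketch of how a node could pick up its role from Hiera. It is only an illustration: the `role` key, the `roles::base` fallback class and the `manifests/site.pp` location are assumptions, not something that exists in the current repository.

```puppet
# manifests/site.pp (hypothetical location)
node default {
  # Look up the hypothetical 'role' key in Hiera for this node.
  # Arguments: key, expected type, merge behavior, default value.
  # 'roles::base' is an assumed fallback for nodes without an explicit role.
  $role = lookup('role', String, 'first', 'roles::base')

  # Declare the role class named by the lookup result.
  include $role
}
```

With something like this, the machine -> role mapping lives entirely in the private Hiera data, and the manifest itself contains nothing node-specific.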

Final picture

Once this is done, the final picture will look like this in /etc/puppet:

  • hiera/ - private data. machine -> role assignments, secret stuff like the alias file, machine location, price and other similar metadata and details (see also #29816)
  • modules/ - equivalent of the current 3rdparty/ directory: fully public, reusable code that's aimed at collaboration. mostly code from the Puppet forge or our own repository if no equivalent there
  • profiles/ - magic sauce on top of 3rd party modules/. I already created a few modules/profiles/ for grafana and prometheus; the profiles configure official 3rd party classes with our site-specific criteria
  • roles/ - abstract classes that regroup a few profiles. for example roles::monitoring could currently include profiles::nagiosmaster, profiles::prometheus::server and profiles::grafana as an implementation
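
As a rough sketch of how the roles/ and profiles/ layers fit together, the roles::monitoring example above could look something like this. The file paths and the body of the profile are assumptions for illustration; the real profiles would carry our site-specific settings.

```puppet
# roles/manifests/monitoring.pp (hypothetical path)
# A role only groups profiles; it contains no configuration of its own.
class roles::monitoring {
  include profiles::nagiosmaster
  include profiles::prometheus::server
  include profiles::grafana
}

# profiles/manifests/grafana.pp (hypothetical path)
# A profile wraps a 3rd party class from modules/. Site-specific parameters
# for that class are assumed to come from Hiera through automatic class
# parameter lookup, so no private values appear in the manifest.
class profiles::grafana {
  include grafana
}
```

The point of the split is that roles and profiles can be published as-is, while anything sensitive stays in the private Hiera data.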

This could all be done in the current repository, without creating a new repository with a clean history, but it would prepare us for that final step. And that step would simply be to move modules/, profiles/, and roles/ into a public repository, while keeping hiera/ private in its own repository.

Alternative proposal

The alternative approach is simply to create an entirely new repository that is identical to the current one, minus the virtual aliases file. But then I don't know where I would put the alias file, and I think it would be a missed opportunity to follow the industry's best practices I documented earlier in this ticket.

Further discussion

I would love to get feedback on this before I foray any further into this maze. For now I think it's safe to keep going on the Hiera conversion, as I discussed this with weasel and there seems to be consensus on it. But the other ideas here (namely to use this opportunity to reshuffle the repository structure) seem to be less consensual.

Also note that I kept trocla out of the picture for now. We could keep using the current hkdf in this system, but it would be the last function left in the puppetmaster module, from what I can tell, which is another reason why I'm tempted to replace it as well.
