Opened 8 months ago

Last modified 6 weeks ago

#32025 new project

Stop using corpsvn and disable it as a service

Reported by: arma
Owned by: tor-gitadm
Priority: Medium
Milestone:
Component: Internal Services/Service - git
Version:
Severity: Normal
Keywords:
Cc: gaba
Actual Points:
Parent ID: #17202
Points:
Reviewer:
Sponsor:

Description

In #17202 we're going to decommission the server that runs our various svn services.

We have a plan for the public svn.tpo service: #15948

and we are making a plan for svninternal: #15949

That leaves corpsvn, which I think is the most actively used still -- for example our accounting folks use it. This ticket is about making and finishing the plan for shutting down the corpsvn service.

Change History (6)

comment:1 Changed 8 months ago by arma

On the plus side, I think we're much closer to being able to do this ticket now, since the main person who puts things into corpsvn these days is Sue, and I suspect she just commits things to have them somewhere for posterity, i.e. she isn't using any of the actual revision control features.

So if we froze the corpsvn repo and gave her new instructions on where to send new things (probably Nextcloud), I bet we'd have a good start here.

She would still need to reference a checkout of corpsvn e.g. for giving documents to the auditors, but she probably has one on her laptop, so maybe "being available to help her if that checkout breaks" is a win over "need to keep corpsvn going as a service".

There might be other folks who still use their copy of corpsvn in a read-only kind of way. Maybe that's Erin. Or more? We need to ask the operations team.

Then we would have a next task which is "triage and sort the current corpsvn files, probably by category, and do something smart with them" but in theory that step can be orthogonal to shutting down the corpsvn service.
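For the freeze itself, one option is a Subversion pre-commit hook that rejects every commit. Subversion runs `hooks/pre-commit` with the repository path and the transaction name, aborts the commit if the hook exits non-zero, and relays the hook's stderr to the client. The sketch below is only an illustration and assumes corpsvn is an ordinary Subversion repository where we can install hooks; the message text and the function name are made up.

```python
#!/usr/bin/env python3
"""Sketch of a Subversion pre-commit hook that rejects every commit."""
import sys

# Hypothetical wording; adjust to point at wherever new files should go.
FROZEN_MESSAGE = "corpsvn is frozen (read-only). New files go elsewhere; see #32025."


def reject_commit(err=sys.stderr) -> int:
    """Explain the freeze to the committer and return a failing exit status."""
    # Subversion relays a failing hook's stderr back to the client.
    err.write(FROZEN_MESSAGE + "\n")
    # Any non-zero exit status makes Subversion abort the transaction.
    return 1


# Subversion calls pre-commit with two arguments: repository path and
# transaction name. Neither is needed to reject everything.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(reject_commit())
```

Saved as `hooks/pre-commit` inside the repository directory and made executable, this would leave checkouts and updates working while refusing new commits.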

comment:2 Changed 7 months ago by anarcat

> Then we would have a next task which is "triage and sort the current corpsvn files, probably by category, and do something smart with them" but in theory that step can be orthogonal to shutting down the corpsvn service.

I've opened #32272 just for that, which is (unexpectedly for me) also a problem with svn-internal.

So we can treat this ticket as just the "shut down corpsvn" task. Worst case, we just move everything over to Nextcloud when it's ready (#32267).

comment:3 Changed 3 months ago by arma

[I wrote this explanatory text for Gaba and Anarcat, and I'm posting it to the ticket too for posterity.]

My rough outline for a way forward would be:

(1) Freeze corpsvn (i.e. make it read-only), and make a full checkout of it somewhere, and have that accessible in case Sue needs to access it.

(2) Give Sue someplace temporary to put her new files. Maybe that's Nextcloud. Do *not* move all the old files there, or at least not by default.

(3) Put together a strike team to look at the frozen corpsvn checkout, plus the frozen internalsvn checkout. Build a list of categories (HR, finance, grantwriting, grant manager, etc), and sort the files into these categories, discarding as many files as possible. Figure out where else people are storing these files currently (granthub? google docs? their hard drive?). Make a comprehensive plan for how files of each category should be stored, and who should have read or write access per category. For example, there's no reason that HR documents should go into the same database, or even the same storage service, as grant proposals.

Step 3 is bigger than just svn, since it has to do with how we should actually properly store our internal files. Anarcat gave a start to that process in #32273.

(4) Get Sue and others to switch over to using the new process we develop in '3'.

We could do (3) before we do (1), and then we would never need to do (2). It depends how eager we are to shut down corpsvn. Any plan where we put off (3) indefinitely is dangerous though. For example, we could be open to GDPR messes in our current state -- plus actual security failures too.
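As a starting point for the sorting in step (3), a first pass over the frozen checkout could be automated and then corrected by hand. The following is a minimal sketch, not a proposal for the real rules: the category names come from the outline above, while the keyword lists, the `uncategorized` bucket, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Category names are from the outline; the keywords are made-up examples.
CATEGORY_KEYWORDS = {
    "HR": ("hr", "payroll", "contract"),
    "finance": ("finance", "invoice", "audit", "budget"),
    "grantwriting": ("grant", "proposal"),
}


def classify(path: str) -> str:
    """Return the first category whose keyword equals a token of the path."""
    # Split on anything that is not a letter or digit, so "audit-letter.pdf"
    # yields the tokens "audit", "letter", "pdf".
    tokens = set(re.split(r"[^a-z0-9]+", path.lower()))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if tokens.intersection(keywords):
            return category
    return "uncategorized"


def triage(paths):
    """Group paths by category so each pile can be reviewed separately."""
    piles = defaultdict(list)
    for path in paths:
        piles[classify(path)].append(path)
    return dict(piles)
```

Whole-token matching is deliberate: a substring match on `hr` would also catch unrelated names. Anything that lands in `uncategorized` still needs a human decision, and this says nothing yet about who gets read or write access per category.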

comment:4 in reply to:  3 Changed 3 months ago by arma

Replying to arma:

> (1) Freeze corpsvn (i.e. make it read-only), and make a full checkout of it somewhere

Oh, and here is what might be a critically useful further observation: I believe we don't need any svn history or previous versions of things. That is, I think that the only things we want to rescue, and then triage, are the current versions of the files in the repo.

So we could do an svn checkout somewhere internally, keep those files there, and then once we have Sue putting her new files somewhere smarter, we could shut down the svn service and keep the svn checkout around for when we do step (3), plus be able to access it in a pinch if Sue somehow loses her own svn checkout.

(We might want to keep the actual svn database around for a while too, since e.g. I bet doing an svn checkout on Windows makes the line endings correct for Windows, and other subtle things that won't be immediately obvious to us at first.)
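To make the "current versions only" idea concrete, here is a rough sketch of copying the files out of an existing working copy while leaving the `.svn` metadata behind, and recording a SHA-256 manifest so the archived copy can be checked later. The function name and the manifest format are assumptions for illustration. Note that the bytes copied are whatever that particular checkout contains, so the line-ending caveat above applies to the snapshot as well.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot_checkout(checkout: Path, dest: Path) -> dict[str, str]:
    """Copy the current files of a working copy to dest, without .svn data.

    Returns a manifest mapping each relative file path to its SHA-256 digest.
    dest must not exist yet; shutil.copytree creates it.
    """
    # ignore_patterns(".svn") skips Subversion's metadata directory,
    # so no history or pristine copies end up in the snapshot.
    shutil.copytree(checkout, dest, ignore=shutil.ignore_patterns(".svn"))

    manifest = {}
    for path in sorted(dest.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(dest).as_posix()] = digest
    return manifest
```

Keeping the manifest next to the snapshot would let anyone confirm later that the archived files are unchanged, which may matter if the copy is ever used to hand documents to the auditors.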

comment:5 Changed 3 months ago by anarcat

So here's the gist of it, from what I understand (thanks for the details! :)

  1. corpsvn readonly
  2. move Sue's files somewhere, only live files without history
  3. archival policy (#32273)
  4. switch to new archival policy

This looks like a great roadmap. Can we do at least (1) and (2) (with Nextcloud) right now?

> We could do (3) before we do (1), and then we would never need to do (2). It depends how eager we are to shut down corpsvn.

I'm pretty eager to shut down the SVN server.

We have the gayi shutdown scheduled for March in the roadmap (#17202). It's part of a train of service migrations from the old KVM/libvirt infrastructure to the new ganeti cluster. gayi was one of the *first* machines I wanted to shut down, because it was on a machine (textile/kvm1) that was almost empty. Alas, it was simpler to just migrate it than to wrangle this bag of knots. :)

The next milestone that's coming is the stretch to buster upgrade. That deadline is roughly "this summer". It would be important to shut down gayi before that, otherwise it's more needless work (like the migration I had to do) that we would need to do here. That work is not currently factored in our roadmap.

But we still have to maintain that box now, and worse, we migrated it to new infra, so it's taking precious room that should be reserved for *other* machines that need to migrate. If we want to run Nextcloud and SVN concurrently, *both* of which we are paying for in one way or another, I would argue that we (TPA) should be provided with the budget to do so. Otherwise SVN should be on its way out.

And if it's not on its way out, TPA should be clearly notified and given the means to handle that change.

> Any plan where we put off (3) indefinitely is dangerous though. For example, we could be open to GDPR messes in our current state -- plus actual security failures too.

I agree with this, but inertia is likely to bring us there. I think the plan of archiving the stuff somewhere and moving only "live" documents into Nextcloud is, in the short term, a good one. But it's true we should not let go of this bug; we should fix it, otherwise it will certainly come and bite us in the bottom in the future.

That said, I really don't like the feeling I have right now that the gayi virtual host is being held for ransom against that work. ;) Someone needs to own up to the archival problem (#32273, specifically) and just do it, and it can be orthogonal to the migration to Nextcloud (or else).

> Make a comprehensive plan for how files of each category should be stored, and who should have read or write access per category. For example, there's no reason that HR documents should go into the same database, or even the same storage service, as grant proposals.

Maybe this discussion belongs in #32273, but I should just note that if we want a different "storage service" per type of document, we're going to end up creating a lot of Nextcloud instances. :) I'm not sure how we would do this otherwise. I keep hearing that we're worried about Nextcloud's access controls and security, but I have yet to hear an actual solution to that problem.

So for now, can we just move ahead on some plan? :)

comment:6 Changed 6 weeks ago by arma

Cc: gaba added

We made a plan some months ago in a vegas team meeting, but I don't remember its details. I guess I assumed somebody had transcribed it here but I don't see any new comments.

I *think* the plan in a nutshell was "keep using svn for now, and wait until Nextcloud has...some feature that I've forgotten...and then we'll again look into switching. And in the meantime, make sure that not that many people have read or write access to corpsvn."

Gaba or Anarcat, do you remember the something that we were waiting for nextcloud to have?
