Opened 3 weeks ago

Last modified 3 weeks ago

#32273 new task

archive private information from SVN

Reported by: anarcat
Owned by:
Priority: Medium
Milestone:
Component: Internal Services/Services Admin Team
Version:
Severity: Normal
Keywords:
Cc: arma
Actual Points:
Parent ID: #17202
Points:
Reviewer:
Sponsor:

Description

a common problem in the internal and corp SVN repository shutdown is "what do we do with all that stuff now?". for example, the internal repository is shut down now (#15949) but there is still information there that may be valuable. we're not sure: we think it is, but maybe some of it should be destroyed.

so we need to answer the following questions:

  1. which data from the repositories should be kept, and which should be destroyed?
  2. where should it be kept?

so far, I worked under the assumption that the answers were:

  1. keep everything
  2. in nextcloud

but it seems this might not be exactly right.

Change History (2)

comment:1 Changed 3 weeks ago by anarcat

Inventory

nick, when he opened #15949, identified the following information as being present in svn-internal:

  • contact_info -- information on contacting various people at Tor
  • forms -- expense forms, timesheets, etc.
  • jobs -- an archive of (some) past job postings
  • manual_bridgedb -- a listing of bridges that we hand out by hand when asked
  • monthly_reports -- some old reports on what we were doing back in 2011/2012
  • newsleters -- some old newsletters we sent out in 2011/2012.
  • notes -- memoranda on some conversations and blog post drafts and whatnot.
  • proposals -- proposals we've made to various organizations
  • roadmaps -- old roadmap documentation
  • supporting_organizations -- people who have helped us out, and what they said
  • tbb-qa -- information about TBB testers, information for TBB testers
  • todo-lists -- schedule info and todo lists for different TP members

Some of this information could become public. Some (like my home address and phone number) could go onto an internal wiki page, if we trust our ability to set that up. Some could go onto internal git repositories instead.

Then there's all the stuff in corp SVN (#32025) which is similar.

We need to do a final inventory of all this stuff, define buckets (i.e. a set of categories such that every document fits into exactly one of them), and then actually sort the documents into the right buckets.
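As a rough starting point for that inventory, here is a minimal sketch that summarizes a local checkout or export of one of the repositories by top-level directory. It assumes the repository content is available on disk; the `inventory` function name and the path passed to it are hypothetical and not part of any existing tooling.

```python
import os
from collections import defaultdict


def inventory(root):
    """Return {top_level_dir: (file_count, total_bytes)} for a local export.

    `root` is assumed to be a local checkout or export of the repository;
    files directly under `root` are reported under the key ".".
    """
    summary = defaultdict(lambda: [0, 0])
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Subversion working-copy metadata, if present.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            summary[top][0] += 1
            # lstat() does not follow symlinks, so broken links do not raise.
            summary[top][1] += os.lstat(path).st_size
    return {key: tuple(value) for key, value in summary.items()}


if __name__ == "__main__":
    import sys

    # Usage: python inventory.py /path/to/local/export
    for top, (count, size) in sorted(inventory(sys.argv[1]).items()):
        print(f"{top}\t{count} files\t{size} bytes")
```

This only counts files and bytes per directory; deciding which bucket each directory belongs to still has to be done by a human.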

So far, I am aware of the following tools for managing files and content:

  • public Trac wiki (public documentation)
  • public TPA wiki (public documentation)
  • public git repositories, gitlab and gitolite (public code, design documents)
  • private git repositories, gitlab and gitolite (private code only?)
  • private gitlab wikis (not used yet)
  • Storm: pads, Kanboard, and who knows what else (meeting minutes, project management)
  • Nextcloud (calendars, contacts, files of all sorts)
  • Google Docs (?)
  • Granthub (grant applications?)
  • SVN internal (see above)
  • SVN corp (financial stuff, grants, etc?)

From what I understand, the resulting "buckets" would be:

  1. public git repositories on gitlab (public documentation and code)
  2. private git repositories on gitlab (private documentation and code)
  3. private wikis on gitlab (private documentation?)
  4. Nextcloud (calendars, contacts, kanboards, files of all sorts)
  5. Granthub (grant applications)
  6. No google docs?

In other words, everything but grant applications would belong to Nextcloud.

Last edited 3 weeks ago by anarcat

comment:2 Changed 3 weeks ago by anarcat

Requirements

one thing we need to clarify here is what the requirements are. it seems we want:

  • permanence - there should be backups and no data loss in the event of an attack or hardware failure
  • archival - old data should eventually be pruned: for example, personal information about past employees should not be kept forever, financial records can be destroyed after some legal limit, etc.
  • privilege separation - some of the stuff must be kept private from the public, or even from tor-internal members. we need to clearly define what those boundaries are and how strong they need to be (e.g. are Nextcloud circles sufficient? can we put stuff on Google Docs? what about share.riseup.net or pad.riseup.net? etc)

I might be missing some things here of course, would be glad to expand on those.
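To make the archival and privilege separation requirements a bit more concrete, here is a minimal sketch of how a per-bucket policy could be written down and checked. Everything in it is an assumption for illustration: the `Policy` record, the `is_expired` helper, the audience labels and the retention periods are hypothetical placeholders, not decisions anyone has made and not legal guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-bucket policy; all values are placeholders."""

    audience: str                   # who may read documents in the bucket
    retention_days: Optional[int]   # None means "keep indefinitely"


def is_expired(policy, last_relevant, today):
    """Return True if a document is past its bucket's retention period.

    `last_relevant` is the date the document stopped being needed
    (for example, the end of an employment or of a fiscal year).
    """
    if policy.retention_days is None:
        return False
    return today - last_relevant > timedelta(days=policy.retention_days)


# Placeholder policies, to be replaced once the real boundaries are defined.
EXAMPLE_POLICIES = {
    "public-docs": Policy(audience="public", retention_days=None),
    "personal-info": Policy(audience="restricted", retention_days=365),
}
```

Writing the policy down as data like this would at least force us to state, for each bucket, who can read it and when its content should be pruned.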

This is of course a wider problem than just SVN, and it should be part of a wider audit that also covers external services (as hinted above in the Google Docs reference). But this ticket is mostly about SVN, because we have been trying to turn off that server for four years now and we have to start with something.

Last edited 3 weeks ago by anarcat