Opened 9 months ago

Closed 6 months ago

#33082 closed task (fixed)

decommission kvm3 AKA macrum, 7 VMs to migrate

Reported by: anarcat
Owned by: hiro
Priority: Medium
Milestone:
Component: Internal Services/Tor Sysadmin Team
Version:
Severity: Normal
Keywords: tpa-roadmap-april
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description (last modified by anarcat)

  • [x] crm-ext-01.torproject.org (part of #32198) done at 116.202.120.186
  • [x] crm-int-01.torproject.org (also part of #32198) done at 116.202.120.190
  • [x] forrestii.torproject.org (fpcentral, #33729) server migrated to 116.202.120.185
  • [x] nevii.torproject.org (DNS master) - #33834
  • [x] rude.torproject.org (RT) - migrated to 116.202.120.187
  • [x] troodi.torproject.org (Trac, #33731) - migrated to 116.202.120.188
  • [x] vineale.torproject.org (gitweb, #33730) - done, new IP is 116.202.120.189

The CRM machines might have already been migrated by the time we start this, see #32198.

Will require a new Ganeti node (#33083).

Child Tickets

Ticket  Status  Owner    Summary                                                   Component
#33729  closed  anarcat  forrestii IP address change planned for Ganeti migration  Applications/Tor Browser
#33730  closed  anarcat  vineale IP address change planned for Ganeti migration    Internal Services/Service - git
#33731  closed  anarcat  troodi IP address change planned for Ganeti migration     Internal Services/Service - trac
#33834  closed  anarcat  nevii IP address change planned for Ganeti migration      Internal Services/Tor Sysadmin Team

Change History (22)

comment:1 Changed 9 months ago by anarcat

Description: modified (diff)

comment:2 Changed 7 months ago by anarcat

Status: changed from new to accepted

this is on! i'll start this next week, since fsn-node-05 is online (#33083)!

comment:3 Changed 7 months ago by anarcat

Summary: changed from "decommission kvm3, 7 VMs to migrate" to "decommission kvm3 AKA macrum, 7 VMs to migrate"

first sync performed with the automated tools, resync takes about 35 minutes.

next step is to perform a sync with suspend, adopt and renumber the VMs for testing.

comment:4 Changed 7 months ago by anarcat

forgot to mention, this is the magic command that was run and that takes 35 minutes on a good day:

./ganeti --verbose -H forrestii.torproject.org,nevii.torproject.org,rude.torproject.org,troodi.torproject.org,vineale.torproject.org,crm-ext-01.torproject.org,crm-int-01.torproject.org --verbose libvirt-import --ganeti-node=fsn-node-05.torproject.org --libvirt-host=macrum.torproject.org
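For reference, the long host list in that command can be assembled instead of typed by hand. This is a minimal sketch in Python that reuses the flags exactly as they appear above. `build_import_command`, `VMS` and `DOMAIN` are names made up for this illustration and are not part of the migration tooling.

```python
import shlex

# Short names of the seven VMs hosted on macrum, in the order used above.
VMS = ["forrestii", "nevii", "rude", "troodi", "vineale", "crm-ext-01", "crm-int-01"]
DOMAIN = "torproject.org"


def build_import_command(vms, ganeti_node, libvirt_host, domain=DOMAIN):
    """Return the import command as an argument list (hypothetical helper)."""
    # -H takes a comma-separated list of fully qualified host names.
    hosts = ",".join(f"{vm}.{domain}" for vm in vms)
    return [
        "./ganeti", "--verbose", "-H", hosts, "--verbose",
        "libvirt-import",
        f"--ganeti-node={ganeti_node}",
        f"--libvirt-host={libvirt_host}",
    ]


command = build_import_command(
    VMS, "fsn-node-05.torproject.org", "macrum.torproject.org"
)
# Print a copy-pasteable shell command line.
print(shlex.join(command))
```

Keeping the command as an argument list means it can also be passed to `subprocess.run` without going through a shell.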

comment:5 Changed 7 months ago by anarcat

Description: modified (diff)

created sub tickets for the notifications, remaining services are internal enough to not warrant coordination.

comment:6 Changed 7 months ago by anarcat

i did an adoption of all the boxes, but it failed at crm-int-01 because we ran out of IPs. i removed nevii and reimported crm-int-01 because the latter seemed more important to migrate first.

i tried to order a new ip block from hetzner, but it seems we can only order IPv6. i opened a ticket to clarify the next steps.

comment:7 Changed 7 months ago by anarcat

Description: modified (diff)

all machines renumbered (renumber-instances) internally and available for testing, except the CRM boxes which are on a different timeline.

documented the IP addresses in the summary.

comment:8 Changed 7 months ago by anarcat

Description: modified (diff)

vineale done.

comment:9 Changed 7 months ago by anarcat

Description: modified (diff)

rude migration in progress.

TTLs lowered at 22:51 UTC.

renumbering:

--- /mnt/etc/network/interfaces.bak	2016-09-01 18:17:33.302001995 +0000
+++ /mnt/etc/network/interfaces	2020-03-31 23:59:06.952570913 +0000
@@ -1,14 +1,16 @@
 # This file describes the network interfaces available on your system
 # and how to activate them. For more information, see interfaces(5).
 
+# The loopback network interface
 auto lo
 iface lo inet loopback
 
-allow-hotplug eth0
+# The primary network interface
+auto eth0
 iface eth0 inet static
-    address 138.201.212.230/28
-    gateway 138.201.212.225
+    address 116.202.120.187/27
+    gateway 116.202.120.161
 iface eth0 inet6 static
     accept_ra 0
-    address 2a01:4f8:172:39ca:0:dad3:6:1/96
-    gateway 2a01:4f8:172:39ca:0:dad3:0:1
+    address 2a01:4f8:fff0:4f:266:37ff:fee0:8604/64
+    gateway 2a01:4f8:fff0:4f::1

IP changes done in LDAP, nagios, and /etc on rude.

it should be done.
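A quick sanity check of the renumbering diff above, using Python's standard `ipaddress` module: each gateway should fall inside the subnet implied by the address and its prefix length. `gateway_on_link` is a hypothetical helper written for this note. The addresses are copied from the diff and from the ticket description.

```python
import ipaddress


def gateway_on_link(address_with_prefix, gateway):
    """True if the gateway lies inside the subnet of the given address/prefix."""
    iface = ipaddress.ip_interface(address_with_prefix)
    return ipaddress.ip_address(gateway) in iface.network


# New settings for rude, taken from the diff above.
new_v4 = gateway_on_link("116.202.120.187/27", "116.202.120.161")
new_v6 = gateway_on_link("2a01:4f8:fff0:4f:266:37ff:fee0:8604/64", "2a01:4f8:fff0:4f::1")
print("IPv4 gateway on link:", new_v4)
print("IPv6 gateway on link:", new_v6)

# The other migrated VMs (addresses from the description) share the same /27.
subnet = ipaddress.ip_interface("116.202.120.187/27").network
migrated = [f"116.202.120.{last}" for last in (185, 186, 187, 188, 189, 190)]
all_in_subnet = all(ipaddress.ip_address(ip) in subnet for ip in migrated)
print("subnet:", subnet, "all migrated IPs inside:", all_in_subnet)
```

This only checks that the addressing is consistent. It does not confirm that the gateway actually answers on the new network.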

comment:10 Changed 7 months ago by anarcat

a new netblock was allocated by hetzner, and was configured in Ganeti with:

root@fsn-node-01:~# gnt-network add --network 49.12.57.128/27 --gateway 49.12.57.129 gnt-fsn13-02
root@fsn-node-01:~# gnt-network connect --nic-parameters=link=br0,vlan=4000,mode=openvswitch gnt-fsn13-02 default
root@fsn-node-01:~# gnt-network info gnt-fsn13-02
Network name: gnt-fsn13-02
UUID: f989bc71-7c0e-41c9-9bf1-5d6020726886
Serial number: 1
  Subnet: 49.12.57.128/27
  Gateway: 49.12.57.129
  IPv6 Subnet: None
  IPv6 Gateway: None
  Mac Prefix: None
  Size: 32
  Free: 29 (90.62%)
  Usage map:
        0 XX.............................X                                 63
         (X) used    (.) free
  externally reserved IPs:
    49.12.57.128, 49.12.57.129, 49.12.57.159
  connected to node groups:
    default (mode:openvswitch link:br0 vlan:4000)
  not used by any instances

hopefully that will "just work"!
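The figures in that `gnt-network info` output can be cross-checked with Python's `ipaddress` module. This sketch assumes, as the output suggests, that the three externally reserved addresses are the network address, the gateway and the broadcast address of the /27.

```python
import ipaddress

# Netblock and gateway as passed to `gnt-network add` above.
network = ipaddress.ip_network("49.12.57.128/27")
gateway = ipaddress.ip_address("49.12.57.129")

# Assumption: only these three addresses are reserved in the pool.
reserved = {network.network_address, gateway, network.broadcast_address}

size = network.num_addresses
free = size - len(reserved)

print("reserved:", ", ".join(str(ip) for ip in sorted(reserved)))
print(f"size: {size}, free: {free} ({free / size:.2%})")
```

A /27 holds 32 addresses, so with three reserved there are 29 left for instances, which matches the `Size` and `Free` lines reported by Ganeti.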

comment:11 Changed 7 months ago by anarcat

Description: modified (diff)

troodi/trac done!

comment:12 Changed 7 months ago by anarcat

Description: modified (diff)

forrestii migrated!

comment:13 Changed 7 months ago by anarcat

Description: modified (diff)

created a ticket for nevii, as it's more complicated than the others

comment:14 Changed 7 months ago by anarcat

Description: modified (diff)

nevii done, only the CRM left, whoohoo!!!!!

comment:15 Changed 6 months ago by anarcat

Keywords: tpa-roadmap-april added; tpa-roadmap-march removed

this wasn't completed in march, so moving it to april

comment:16 Changed 6 months ago by anarcat

Description: modified (diff)

the old CRM boxes were migrated now, next step is to retire all of macrum, which i will start tomorrow.

comment:17 Changed 6 months ago by anarcat

hiro wants to test this procedure. :)

comment:18 Changed 6 months ago by anarcat

Owner: changed from anarcat to hiro
Status: changed from accepted to assigned

comment:19 Changed 6 months ago by hiro

I have followed the retire-a-host procedure https://help.torproject.org/tsa/howto/retire-a-host/ up to point 13 in the list.
We just need to remove it from the wiki and physically from hetzner.

-hiro

comment:20 Changed 6 months ago by hiro

Next step is wiping the disks.

comment:21 Changed 6 months ago by hiro

Disks have been wiped and machine will be cancelled on 10th of May.

comment:22 Changed 6 months ago by hiro

Resolution: fixed
Status: changed from assigned to closed