Opened 9 months ago

Closed 9 months ago

Last modified 9 months ago

#32937 closed project (fixed)

install a new node in the gnt-fsn cluster (fsn-node-03)

Reported by: anarcat
Owned by: anarcat
Priority: Medium
Milestone:
Component: Internal Services/Tor Sysadmin Team
Version:
Severity: Normal
Keywords:
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

we decided to use the part of the emergency budget allocated to a new Ganeti node to create such a node now. if we follow the current order, the new node would be named fsn-node-03.

our new FSN Ganeti cluster (gnt-fsn) is full. when a node reboots, the load goes through the roof, and the cluster can barely keep up with one node missing.

plus we need extra capacity to cover the various decommissioning processes we have under way (#32802, #31686, #29974). it's now essential to have that extra capacity to cover for those retirements.

the retirement of the kvm* boxes, in particular, might give us the extra budget required to pop another node (which would be fsn-node-04, but let's not get ahead of ourselves).

Attachments (1)

daemon.log (68.9 KB) - added by anarcat 9 months ago.

Change History (18)

comment:1 Changed 9 months ago by anarcat

Status: changed from assigned to accepted

created the server at hetzner, awaiting shell.

comment:2 Changed 9 months ago by anarcat

we toyed with the idea of getting an AX line here, which i documented in the new-machine page, but ended up going with the regular PX thing.

the box is now up, with the following pubkeys:

ssh-dss AAAAB3NzaC1kc3MAAACBAMljYhWq1MpFbijeTwP0ymSOPGuneFbjEGIqPWoY/qVd+ZtPl7i0Zb3HmkApX3bekvv2yWP+svZA1d30Omli/FzEtJfMUfssqPbE5gqEot+nGMhPXPvwsxs/t7FO0k0xFVzzFsypTm4+RFRiKqWY4gvwwwDfHv3n8NfyO+jOUga9AAAAFQDDltXnrmUiHcGTqFfUzEyokKvYlwAAAIEArFEoY00Rd4w18/utCLE2Y5MXVDFZXxgcVV+HPq+k9unfGZjK+jRRUKhq+yqSVIMdGoy65ddCB4/YRke+wQWgNr+Q6BGUf1ROWalhv1/rpbqq8Vmpf2D46yRgYEhUmLOljrXVEC7a1cPiX3cQRfL5CV4enmCLF9S/ZHvDIPAv2mYAAACAXabnKEAS+2kfc4wxJ19N/YmKbLZ+LpwEqX4/q6vN7OZ+SwIXAw749A1HvVV3PjUdR47tzsA/9hiT8UkXPLRjmA8wSXiKnyrYVHjl8ouEu4h35On89+ZxFXz406llHlCb52fFrH1YpfFuzEHEzJyjSoCcnn4ZIU1AQw5Y7OdJcbA= 
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFF0WC3IaJuPiiuFYn8nGxbunNA8VbYMXKtmIIzED5DvcFc9Fy6x9M5KKY5gBvjFrZMGwDqnuXszmbGawviR4CU= 
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICXo6tObnMENCT1rQEsi5xPpnRdWxiJ1ubyKxBhsOEnb 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCumRjxbkKGvkfWk/nMFUvXDpCQtDtgBkYrGOPbP2y+5AXdrkBcEavKF6opqldBQoyairBQVLFcn9hYZO/A/+wKO7qgKKHOa2MHgW1FbWcrGd1+w7Cqxp7Gs/Ir8W921uBAM/9mQkwcZIsgnQx3w/x6/eIa3A8Wma1e41IUjRzt+e8Aq5xoSShqlvr/yXqkUJEj+MzIklOqtp4IYZywig/9uxGL0D7UOfCNZMAPNfuMw8+S2B4u+vLCITMm1yHhn/5pPMiODwDJ64LdNQysssZHQkjHwJNTH6cluN1tMiHe0niehWbGThoQFnzeMFuHCA4xVF8L8k0TcVPv+KzxfepZ 

and IP 46.4.22.162. it's in the right datacenter (fsn1-dc13) so we're good on that.
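
The host keys listed above can be pinned before the first login, so that the initial SSH connection does not rely on trust-on-first-use. A minimal sketch, assuming the ed25519 key and the IP quoted in this comment; `known_hosts_line` is a hypothetical helper name, not part of any install procedure.

```shell
# Hypothetical helper: print a known_hosts entry for a host and its public key.
# known_hosts entries have the form "<host> <keytype> <base64-key>".
known_hosts_line() {
    host=$1
    keytype=$2
    key=$3
    printf '%s %s %s\n' "$host" "$keytype" "$key"
}

# Values copied from the comment above.
known_hosts_line 46.4.22.162 ssh-ed25519 \
    AAAAC3NzaC1lZDI1NTE5AAAAICXo6tObnMENCT1rQEsi5xPpnRdWxiJ1ubyKxBhsOEnb
```

The output can be appended to `~/.ssh/known_hosts`, and `ssh-keygen -lf ~/.ssh/known_hosts` then prints fingerprints to compare against what the server presents.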

now onto the install procedure.

comment:3 Changed 9 months ago by anarcat

worked on the setup-storage config, but it fails with this mysterious error:

Cannot determine size of /dev/crypt_format_md1 - scheme unknown

Full log, which includes the config file:

root@rescue ~ # setup-storage -f setup-storage-fsn-node-3 -d -X
disklist: nvme0n1
nvme1n1
sda
sdb
Starting setup-storage 2.2
Using config file: setup-storage-fsn-node-3
Input was:
# open questions
# --align=optimal?
# leave keys in /tmp/fai or specify passphrase?
# use sameas: to set all disk names earlier?
# bios_grub flag?

disk_config nvme0n1 disklabel:gpt bootable:2
# bios grub second stage
primary -       8MiB    -       -
# /boot
primary -       512MiB  -       -
# rest is RAID+LUKS+LVM
primary -       0-      -       -

disk_config nvme1n1 disklabel:gpt bootable:2
# same as above
primary -       8MiB    -       -
primary -       512MiB  -       -
primary -       0-      -       -

disk_config sda disklabel:gpt
primary -       0-      -       -

disk_config sdb disklabel:gpt
primary -       0-      -       -

disk_config raid fstabkey:uuid
raid1   /boot   nvme0n1p2,nvme1n1p2     ext4    rw,noatime,errors=remount-ro
raid1   -       nvme0n1p3,nvme1n1p3     -       -
raid1   -       sda1,sdb1       -       -

# FAI defaults to -c aes-xts-plain64 -s 256
disk_config cryptsetup
luks    -       /dev/md1        -       -
luks    -       /dev/md2        -       -

disk_config lvm fstabkey:uuid
# previous convention was "vg_$hostname"
vg      vg_nvme crypt_format_md1
vg_nvme-root    /       30G     ext4    rw
vg_nvme-swap    swap    1G      swap    sw
vg      vg_hdd  crypt_format_md2

# HDD disks config intentionally left blank
(CMD) parted -s /dev/nvme0n1 unit TiB print 1> /tmp/vg8ajSauDw 2> /tmp/vMMazFsqO7
Executing: parted -s /dev/nvme0n1 unit TiB print
(STDERR) Error: /dev/nvme0n1: unrecognised disk label
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme0n1: 0.87TiB
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) Partition Table: unknown
(STDOUT) Disk Flags: 
Parted could not read a disk label (new disk?)
(CMD) parted -s /dev/nvme0n1 mklabel gpt 1> /tmp/MK287MFjgl 2> /tmp/55bOjMK73A
Executing: parted -s /dev/nvme0n1 mklabel gpt
(CMD) parted -s /dev/nvme0n1 unit TiB print 1> /tmp/NuP83_L6et 2> /tmp/kubLOQr9Wo
Executing: parted -s /dev/nvme0n1 unit TiB print
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme0n1: 0.87TiB
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start  End  Size  File system  Name  Flags
(STDOUT) 
(CMD) parted -s /dev/nvme0n1 unit B print free 1> /tmp/FN2KfViTkm 2> /tmp/y2Pytod_rd
Executing: parted -s /dev/nvme0n1 unit B print free
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme0n1: 960197124096B
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End            Size           File system  Name  Flags
(STDOUT)         17408B  960197107199B  960197089792B  Free Space
(STDOUT) 
(CMD) parted -s /dev/nvme0n1 unit chs print free 1> /tmp/1IjsTMlqdY 2> /tmp/lU63MMaJBC
Executing: parted -s /dev/nvme0n1 unit chs print free
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme0n1: 116737,80,62
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) BIOS cylinder,head,sector geometry: 116737,255,63.  Each cylinder is 8225kB.
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End           File system  Name  Flags
(STDOUT)         0,0,34  116737,80,29  Free Space
(STDOUT) 
(CMD) parted -s /dev/nvme1n1 unit TiB print 1> /tmp/QjjsC5zar2 2> /tmp/7UcTC3hxOf
Executing: parted -s /dev/nvme1n1 unit TiB print
(STDERR) Error: /dev/nvme1n1: unrecognised disk label
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme1n1: 0.87TiB
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) Partition Table: unknown
(STDOUT) Disk Flags: 
Parted could not read a disk label (new disk?)
(CMD) parted -s /dev/nvme1n1 mklabel gpt 1> /tmp/NcxnmAx4H4 2> /tmp/GHflvDI9cH
Executing: parted -s /dev/nvme1n1 mklabel gpt
(CMD) parted -s /dev/nvme1n1 unit TiB print 1> /tmp/F_ioSuepVF 2> /tmp/RGHrXNq3ax
Executing: parted -s /dev/nvme1n1 unit TiB print
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme1n1: 0.87TiB
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start  End  Size  File system  Name  Flags
(STDOUT) 
(CMD) parted -s /dev/nvme1n1 unit B print free 1> /tmp/JMHhWwaRE4 2> /tmp/3yByvQ8pC0
Executing: parted -s /dev/nvme1n1 unit B print free
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme1n1: 960197124096B
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End            Size           File system  Name  Flags
(STDOUT)         17408B  960197107199B  960197089792B  Free Space
(STDOUT) 
(CMD) parted -s /dev/nvme1n1 unit chs print free 1> /tmp/DvwVu2WqiU 2> /tmp/79TMfEZhI3
Executing: parted -s /dev/nvme1n1 unit chs print free
(STDOUT) Model: SAMSUNG MZQLB960HAJR-00007 (nvme)
(STDOUT) Disk /dev/nvme1n1: 116737,80,62
(STDOUT) Sector size (logical/physical): 512B/512B
(STDOUT) BIOS cylinder,head,sector geometry: 116737,255,63.  Each cylinder is 8225kB.
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End           File system  Name  Flags
(STDOUT)         0,0,34  116737,80,29  Free Space
(STDOUT) 
(CMD) parted -s /dev/sda unit TiB print 1> /tmp/JIkqtOkUuP 2> /tmp/3EiOFTm7w9
Executing: parted -s /dev/sda unit TiB print
(STDERR) Error: /dev/sda: unrecognised disk label
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sda: 9.10TiB
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) Partition Table: unknown
(STDOUT) Disk Flags: 
Parted could not read a disk label (new disk?)
(CMD) parted -s /dev/sda mklabel gpt 1> /tmp/P3Ze3gJlhk 2> /tmp/MWTeWjTArm
Executing: parted -s /dev/sda mklabel gpt
(CMD) parted -s /dev/sda unit TiB print 1> /tmp/YUdeL80vX3 2> /tmp/TW2EWrJtu0
Executing: parted -s /dev/sda unit TiB print
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sda: 9.10TiB
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start  End  Size  File system  Name  Flags
(STDOUT) 
(CMD) parted -s /dev/sda unit B print free 1> /tmp/SHw1gM3OgI 2> /tmp/YyrsyIdxOz
Executing: parted -s /dev/sda unit B print free
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sda: 10000831348736B
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End              Size             File system  Name  Flags
(STDOUT)         17408B  10000831331839B  10000831314432B  Free Space
(STDOUT) 
(CMD) parted -s /dev/sda unit chs print free 1> /tmp/7jB9P7heVI 2> /tmp/YA8IORj9eZ
Executing: parted -s /dev/sda unit chs print free
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sda: 1215865,39,45
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) BIOS cylinder,head,sector geometry: 1215865,255,63.  Each cylinder is 8225kB.
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End            File system  Name  Flags
(STDOUT)         0,0,34  1215865,39,12  Free Space
(STDOUT) 
(CMD) parted -s /dev/sdb unit TiB print 1> /tmp/xSDW4y246S 2> /tmp/4uJcTQmrF5
Executing: parted -s /dev/sdb unit TiB print
(STDERR) Error: /dev/sdb: unrecognised disk label
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sdb: 9.10TiB
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) Partition Table: unknown
(STDOUT) Disk Flags: 
Parted could not read a disk label (new disk?)
(CMD) parted -s /dev/sdb mklabel gpt 1> /tmp/sokuCJH_uw 2> /tmp/YpCDmAqg0m
Executing: parted -s /dev/sdb mklabel gpt
(CMD) parted -s /dev/sdb unit TiB print 1> /tmp/chgQkdwPwZ 2> /tmp/L70AY4kmhr
Executing: parted -s /dev/sdb unit TiB print
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sdb: 9.10TiB
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start  End  Size  File system  Name  Flags
(STDOUT) 
(CMD) parted -s /dev/sdb unit B print free 1> /tmp/H6cCqPfSjv 2> /tmp/B6Q2Il5Tgz
Executing: parted -s /dev/sdb unit B print free
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sdb: 10000831348736B
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End              Size             File system  Name  Flags
(STDOUT)         17408B  10000831331839B  10000831314432B  Free Space
(STDOUT) 
(CMD) parted -s /dev/sdb unit chs print free 1> /tmp/v5U6BwCVbR 2> /tmp/G3R5gZBg0D
Executing: parted -s /dev/sdb unit chs print free
(STDOUT) Model: ATA TOSHIBA MG06ACA1 (scsi)
(STDOUT) Disk /dev/sdb: 1215865,39,45
(STDOUT) Sector size (logical/physical): 512B/4096B
(STDOUT) BIOS cylinder,head,sector geometry: 1215865,255,63.  Each cylinder is 8225kB.
(STDOUT) Partition Table: gpt
(STDOUT) Disk Flags: 
(STDOUT) 
(STDOUT) Number  Start   End            File system  Name  Flags
(STDOUT)         0,0,34  1215865,39,12  Free Space
(STDOUT) 
    No volume groups found.
(CMD) mdadm --examine --scan --verbose -c partitions 1> /tmp/hgUrKMw3a3 2> /tmp/HxEzhckhyV
Executing: mdadm --examine --scan --verbose -c partitions
Current disk layout
$VAR1 = {
          '/dev/sdb' => {
                          'disklabel' => 'gpt',
                          'bios_cylinders' => '1215865',
                          'begin_byte' => 0,
                          'bios_sectors_per_track' => '63',
                          'partitions' => {},
                          'bios_heads' => '255',
                          'sector_size' => '512',
                          'end_byte' => '10000831348735',
                          'size' => '10000831348736'
                        },
          '/dev/nvme0n1' => {
                              'end_byte' => '960197124095',
                              'size' => '960197124096',
                              'bios_cylinders' => '116737',
                              'begin_byte' => 0,
                              'disklabel' => 'gpt',
                              'sector_size' => '512',
                              'partitions' => {},
                              'bios_sectors_per_track' => '63',
                              'bios_heads' => '255'
                            },
          '/dev/nvme1n1' => {
                              'end_byte' => '960197124095',
                              'size' => '960197124096',
                              'bios_cylinders' => '116737',
                              'begin_byte' => 0,
                              'disklabel' => 'gpt',
                              'bios_heads' => '255',
                              'bios_sectors_per_track' => '63',
                              'partitions' => {},
                              'sector_size' => '512'
                            },
          '/dev/sda' => {
                          'end_byte' => '10000831348735',
                          'size' => '10000831348736',
                          'disklabel' => 'gpt',
                          'bios_cylinders' => '1215865',
                          'begin_byte' => 0,
                          'bios_sectors_per_track' => '63',
                          'partitions' => {},
                          'bios_heads' => '255',
                          'sector_size' => '512'
                        }
        };
Current LVM layout
$VAR1 = {};
Current RAID layout
$VAR1 = {};
Current device tree
$VAR1 = {
          '/dev/nvme0n1' => undef,
          '/dev/nvme1n1' => undef,
          '/dev/sda' => undef,
          '/dev/sdb' => undef
        };
Cannot determine size of /dev/crypt_format_md1 - scheme unknown

ping'd people in #fai for help - it sure looks like it's not partitioning the disk at all...
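
The error names `/dev/crypt_format_md1`, a string that only appears in the `vg` lines of the config. A small sketch to pull out what each `vg` line references, so it can be compared against device names that actually exist; `list_vg_devices` is a hypothetical helper, and the sample input is just the LVM stanza from the config above. For comparison, the config in comment:4 that works has `md1` and `md2` in that position.

```shell
# Hypothetical helper: read a setup-storage config on stdin and print, for each
# "vg" line, the volume group name and the device it is built from.
list_vg_devices() {
    awk '$1 == "vg" { print $2, $3 }'
}

# Sample input: the LVM stanza from the failing config.
list_vg_devices <<'EOF'
disk_config lvm fstabkey:uuid
vg      vg_nvme crypt_format_md1
vg_nvme-root    /       30G     ext4    rw
vg      vg_hdd  crypt_format_md2
EOF
```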

comment:4 Changed 9 months ago by anarcat

okay, after help from MrFai on IRC, I got this config to work, which is pretty frigging awesome:

# open questions
# --align=optimal?
# leave keys in /tmp/fai or specify passphrase?
# use sameas: to set all disk names earlier?
# bios_grub flag?

disk_config nvme0n1 disklabel:gpt bootable:2
# bios grub second stage
primary -       8MiB    -       -
# /boot
primary -       512MiB  -       -
# rest is RAID+LUKS+LVM
primary -       0-      -       -

disk_config nvme1n1 disklabel:gpt bootable:2
# same as above
primary -       8MiB    -       -
primary -       512MiB  -       -
primary -       0-      -       -

disk_config sda disklabel:gpt
primary -       0-      -       -

disk_config sdb disklabel:gpt
primary -       0-      -       -

disk_config raid fstabkey:uuid
raid1   /boot   nvme0n1p2,nvme1n1p2     ext4    rw,noatime,errors=remount-ro
raid1   -       nvme0n1p3,nvme1n1p3     -       -
raid1   -       sda1,sdb1       -       -

# FAI defaults to -c aes-xts-plain64 -s 256
disk_config cryptsetup
luks    -       /dev/md1        -       -
luks    -       /dev/md2        -       -

disk_config lvm fstabkey:uuid
# previous convention was "vg_$hostname"
vg      vg_nvme md1
vg_nvme-root    /       30G     ext4    rw
vg_nvme-swap    swap    1G      swap    sw

vg      vg_hdd  md2

# HDD disks config intentionally left blank

This gives us the following non-verbose run, which is also pretty awesome:

root@rescue ~ # setup-storage -f setup-storage-fsn-node-3 -X
Starting setup-storage 2.2
Using config file: setup-storage-fsn-node-3
    No volume groups found.
Executing: wipefs -af /dev/nvme0n1p1
Executing: wipefs -af /dev/nvme1n1p1
Executing: mdadm --stop --scan
Executing: mdadm --assemble --scan --config=/tmp/fai/mdadm-from-examine.conf
Executing: mdadm -W --stop /dev/md0
Executing: mdadm -W --stop /dev/md1
Executing: mdadm -W --stop /dev/md2
Executing: head -c 2048 /dev/urandom | od | tee /tmp/fai/crypt_dev_md1
Executing: head -c 2048 /dev/urandom | od | tee /tmp/fai/crypt_dev_md2
Executing: wipefs -af /dev/nvme0n1p2
Executing: wipefs -af /dev/nvme0n1p3
Executing: parted -s /dev/nvme0n1 mklabel gpt
Executing: parted -s /dev/nvme0n1 mkpart primary "" 1048576B 9437183B
Executing: parted -s /dev/nvme0n1 mkpart primary "" 9437184B 546308095B
Executing: parted -s /dev/nvme0n1 set 2 boot on
Executing: parted -s /dev/nvme0n1 mkpart primary "" 546308096B 960197107199B
Executing: wipefs -af /dev/sdb1
Executing: parted -s /dev/sdb mklabel gpt
Executing: parted -s /dev/sdb mkpart primary "" 1048576B 10000831331839B
Executing: wipefs -af /dev/sda1
Executing: parted -s /dev/sda mklabel gpt
Executing: parted -s /dev/sda mkpart primary "" 1048576B 10000831331839B
Executing: wipefs -af /dev/nvme1n1p2
Executing: wipefs -af /dev/nvme1n1p3
Executing: parted -s /dev/nvme1n1 mklabel gpt
Executing: parted -s /dev/nvme1n1 mkpart primary "" 1048576B 9437183B
Executing: parted -s /dev/nvme1n1 mkpart primary "" 9437184B 546308095B
Executing: parted -s /dev/nvme1n1 set 2 boot on
Executing: parted -s /dev/nvme1n1 mkpart primary "" 546308096B 960197107199B
Executing: parted -s /dev/nvme1n1 set 2 raid on
Executing: parted -s /dev/nvme0n1 set 2 raid on
Executing: parted -s /dev/nvme0n1 set 3 raid on
Executing: parted -s /dev/nvme1n1 set 3 raid on
Executing: parted -s /dev/sdb set 1 raid on
Executing: parted -s /dev/sda set 1 raid on
Executing: yes | mdadm --create  /dev/md0 --level=raid1 --force --run --raid-devices=2 /dev/nvme1n1p2 /dev/nvme0n1p2
Executing: mkfs.ext4  /dev/md0
Executing: yes | mdadm --create  /dev/md1 --level=raid1 --force --run --raid-devices=2 /dev/nvme0n1p3 /dev/nvme1n1p3
Executing: yes | mdadm --create  /dev/md2 --level=raid1 --force --run --raid-devices=2 /dev/sdb1 /dev/sda1
Executing: yes YES | cryptsetup luksFormat /dev/md1 /tmp/fai/crypt_dev_md1
Executing: cryptsetup luksOpen /dev/md1 crypt_dev_md1 --key-file /tmp/fai/crypt_dev_md1
Executing: yes YES | cryptsetup luksFormat /dev/md2 /tmp/fai/crypt_dev_md2
Executing: cryptsetup luksOpen /dev/md2 crypt_dev_md2 --key-file /tmp/fai/crypt_dev_md2
Executing: pvcreate -ff -y  /dev/mapper/crypt_dev_md2
Executing: vgcreate  vg_hdd  /dev/mapper/crypt_dev_md2
Executing: vgchange -a y vg_hdd
Executing: pvcreate -ff -y  /dev/mapper/crypt_dev_md1
Executing: vgcreate  vg_nvme  /dev/mapper/crypt_dev_md1
Executing: vgchange -a y vg_nvme
Executing: lvcreate  --yes -n root -L 30720 vg_nvme
Executing: mkfs.ext4  /dev/vg_nvme/root
Executing: lvcreate  --yes -n swap -L 1024 vg_nvme
Executing: mkswap  /dev/vg_nvme/swap
/dev/md0 UUID=4bfcb3a7-c549-4c1b-be3a-ff2f5648525e
/dev/vg_nvme/swap UUID=71656b76-e3c0-46e0-b171-a6ff78fcd5c4
/dev/vg_nvme/root UUID=f96dc710-9044-485a-9120-3075f28aa697

This also leaves configuration files in /tmp/fai, including mdadm.conf, fstab, (broken) crypttab (because it requires keyfiles) and the two luks keyfiles.
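
As a side note on the run above, the byte offsets passed to `parted mkpart` follow directly from the sizes in the config: the first partition starts at 1 MiB, and each end offset is start plus size minus one byte. A sketch of that arithmetic, where `part_range` is a hypothetical helper and not part of setup-storage:

```shell
# Hypothetical helper: given a start offset in bytes and a size in MiB, print
# the inclusive start and end byte offsets, in the form parted was given.
part_range() {
    start=$1
    size_mib=$2
    end=$(( start + size_mib * 1024 * 1024 - 1 ))
    printf '%sB %sB\n' "$start" "$end"
}

part_range 1048576 8      # bios grub partition: 1048576B 9437183B
part_range 9437184 512    # /boot partition:     9437184B 546308095B
```

Both results match the `mkpart` lines in the log, which is a quick way to confirm that the 8MiB and 512MiB partitions came out at the sizes the config asked for.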

i'll start with this and move ahead with the next step of the install process.

comment:5 Changed 9 months ago by anarcat

Rerunning the install:

  1. login
  1. added an explicit step to set the hostname instead of hiding it in the disk partitioning
  1. partitioned the disks with the following configuration file:
# open questions
# --align=optimal?
# leave keys in /tmp/fai or specify passphrase?
# use sameas: to set all disk names earlier?
# bios_grub flag?

disk_config nvme0n1 disklabel:gpt bootable:2 align-at:1M
# bios grub second stage
primary -       8MiB    -       -
# /boot
primary -       512MiB  -       -
# rest is RAID+LUKS+LVM
primary -       0-      -       -

disk_config nvme1n1 disklabel:gpt bootable:2 align-at:1M
# same as above
primary -       8MiB    -       -
primary -       512MiB  -       -
primary -       0-      -       -

disk_config sda disklabel:gpt align-at:1M
primary -       0-      -       -

disk_config sdb disklabel:gpt align-at:1M
primary -       0-      -       -

disk_config raid fstabkey:uuid
raid1   /boot   nvme0n1p2,nvme1n1p2     ext4    rw,noatime,errors=remount-ro
raid1   -       nvme0n1p3,nvme1n1p3     -       -
raid1   -       sda1,sdb1       -       -

# FAI defaults to -c aes-xts-plain64 -s 256
disk_config cryptsetup
luks    -       /dev/md1        -       -
luks    -       /dev/md2        -       -

disk_config lvm fstabkey:uuid
# previous convention was "vg_$hostname"
vg      vg_nvme md1
vg_nvme-root    /       30G     ext4    rw
vg_nvme-swap    swap    1G      swap    sw

vg      vg_hdd  md2

# HDD disks config intentionally left blank
  1. install the system, modified version:
mkdir -p /target && mount /dev/vg_nvme/root /target &&
mkdir -p /target/boot && mount /dev/md0 /target/boot &&
mkdir -p /target/run && mount -t tmpfs tgt-run /target/run &&
mkdir /target/run/udev && mount -o bind /run/udev /target/run/udev &&
bootdisk=/dev/nvme1n1 &&
ROOTPASSWORD=$(tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 30) &&
apt-get install -y grml-debootstrap &&
sed -e 's/postfix//;
        s/vlan//;
        s/bridge-utils//;
        s/ifenslave//;
        s/resolvconf//;
        s/zsh//;
        s/strace//;
        s/os-prober//;
        s/bzip2//;
        s/file//;
        s/lsof//;
        s/most//;
        $adbus
        $acryptsetup-initramfs
        ' /etc/debootstrap/packages > /root/grml-packages &&
grml-debootstrap --grub "$bootdisk" --target /target \
    --hostname "$(hostname)" --release buster \
    --mirror https://mirror.hetzner.de/debian/packages/ \
    --packages /root/grml-packages \
    --password "$ROOTPASSWORD" \
    --remove-configs --defaultinterfaces &&
umount /target/run/udev /target/run

I've also reset the LUKS passphrases with:

LUKS_PASSPHRASE=$(tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 30) &&
echo $LUKS_PASSPHRASE | cryptsetup luksAddKey /dev/md1 --key-file=/tmp/fai/crypt_dev_md1 &&
echo $LUKS_PASSPHRASE | cryptsetup luksAddKey /dev/md2 --key-file=/tmp/fai/crypt_dev_md2 &&
cryptsetup luksRemoveKey /dev/md1 --key-file=/tmp/fai/crypt_dev_md1 &&
cryptsetup luksRemoveKey /dev/md2 --key-file=/tmp/fai/crypt_dev_md2
  1. step 4 is replaced with:
( cat /tmp/fai/fstab ; echo ; echo tmpfs /tmp tmpfs defaults,size=512m 0 0 ) > /target/etc/fstab

that tmpfs stuff could probably be merged into the setup-storage configuration.

  1. this step was step 11 and moved up so we avoid regenerating the initrd for nothing
  1. i rewired the luks-setup script so that it correctly deals with a setup that has multiple PVs, and hardcoded the "discard" option because i think it's fair to assume / is on SSD.
  1. now a noop
  1. done
  1. done, weirdly doesn't match the output of FAI
  1. I had to run this before step 9 to make grub happy:
parted --script /dev/nvme0n1 set 1 bios_grub on
parted --script /dev/nvme1n1 set 1 bios_grub on
  1. network looks good (DHCP)
  1. regen'd, need to figure out how to tell setup-storage to do the bios_grub magic and fix its mdadm.conf so it matches
  1. unmounted everything
  1. documented in tor-passwords
  1. rebooted

and it caaaame back! whoohoo! we have a base system installed with setup-storage!!!
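
The `sed` call in the install step edits the package list by substring: `s/file//` blanks the `file` line, but would also cut `file` out of any longer package name, and the `$a...` lines append `dbus` and `cryptsetup-initramfs` at the end. A stricter variant, sketched here as an alternative and not what was run, drops only exact whole-line matches. `filter_packages` is a hypothetical helper name.

```shell
# Hypothetical helper: read a package list (one name per line) on stdin, drop
# the unwanted packages by exact whole-line match, then append the extra ones.
filter_packages() {
    # -v inverts the match, -x matches whole lines only, -F disables regexes.
    # grep exits non-zero when nothing is left, which is not an error here.
    grep -vxF \
        -e postfix -e vlan -e bridge-utils -e ifenslave -e resolvconf \
        -e zsh -e strace -e os-prober -e bzip2 -e file -e lsof -e most \
        || true
    printf '%s\n' dbus cryptsetup-initramfs
}
```

With that in place, the list could be generated with `filter_packages < /etc/debootstrap/packages > /root/grml-packages`. Unlike the `sed` version, this removes the lines entirely instead of leaving them empty.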

comment:6 Changed 9 months ago by anarcat

rest of the new-machine procedure:

  1. fixed /etc/hosts and /etc/resolv.conf
  1. fixed /etc/network/interfaces to add IPv6 and added the IPs by hand (!) this was required for LDAP generation to work
  1. added to LDAP
  1. made the puppet dance
  1. security upgrades (none)
  1. fixed aliases and hardening
  1. rebooted

still need to:

  1. add to spreadsheet (blocked on #33031)
  2. add to nagios
  3. configure ganeti

comment:7 Changed 9 months ago by anarcat

added to nagios

comment:8 Changed 9 months ago by anarcat

already present in spreadsheet. next up: GANETI!!

(oh, and i also need to set up mandos, argh)

comment:9 Changed 9 months ago by anarcat

i deployed the openvswitch package and config, and rebooted the box and *that* worked. then I ran puppet with the ganeti profile and that had a few errors because of module loading again.

then i rebooted and all hell broke loose. the initrd comes up and i can enter the crypto password (mandos is not set up yet), but the boot doesn't complete for some reason.

rebooting in rescue shows that the machine hasn't been able to reach userland ever since that puppet run. i'll attach the daemon.log of that run here for perusal, but something definitely went wrong here.

i'll try to finish the mandos config to see if that's the problem.

Changed 9 months ago by anarcat

Attachment: daemon.log added

comment:10 Changed 9 months ago by anarcat

i've completed the mandos setup

i disabled the modules_disabled.timer and added a condition to the .service so it doesn't start if a /etc/no_modules_disabled file is present (by hand)

i disabled ipsec (systemctl disable ipsec)

and i'm trying another reboot
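
The condition described above can be expressed as a systemd drop-in. This is a sketch only: the comment says the condition was added by hand and does not show it, so the use of `ConditionPathExists=` and the unit name `modules_disabled.service` (inferred from the timer name) are assumptions, and `write_modules_dropin` is a hypothetical helper.

```shell
# Hypothetical helper: write a drop-in into the given directory. The "!"
# negates the check, so the unit is skipped when /etc/no_modules_disabled
# exists.
write_modules_dropin() {
    dir=$1
    mkdir -p "$dir"
    cat > "$dir/override.conf" <<'EOF'
[Unit]
ConditionPathExists=!/etc/no_modules_disabled
EOF
}
```

On a real host this would be called as `write_modules_dropin /etc/systemd/system/modules_disabled.service.d`, followed by `systemctl daemon-reload` and `touch /etc/no_modules_disabled` to keep the service from starting.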

comment:11 Changed 9 months ago by anarcat

nothing works. i've requested a KVM console, but i'm not sure how long that's going to take.

to access the machine from the rescue system, I do this:

cryptsetup luksOpen /dev/md1 crypt_dev_md1
vgchange -a y 
mount /dev/vg_nvme/root /mnt
mount /dev/md0 /mnt/boot
for dev in dev sys proc run ; do mount -o bind /$dev /mnt/$dev ; done 
hostname fsn-node-03
chroot /mnt

comment:12 Changed 9 months ago by anarcat

i've just tried adding a (new) mandos key to get md2 covered as well, without luck. now i'll try to just purge mandos altogether.

comment:13 Changed 9 months ago by anarcat

didn't need to purge mandos and figured out the issue. the problem was that setup-storage doesn't use the same PV names as the ones defined by the shell script... so i changed the lvm config a little bit to allow for that, and now the machine boots again.

i tried to add /dev/md2 to the crypttab and that triggered another failed reboot. so that's something that should also be figured out.

comment:14 Changed 9 months ago by anarcat

Status: changed from accepted to needs_review

i figured out the crypttab thing: the other fsn nodes use a keyfile, so I did that as well.

i also had to rename the VG, which was... painful, but eventually worked (after update-initramfs -u; update-grub).

now I added the node to the cluster and everything looks good. next up is to actually test VM creation.
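
For reference, a keyfile-based crypttab entry has four fields: target name, source device, key file, and options. A minimal sketch of generating such a line for md2; the keyfile path is an assumption, not the one used on the fsn nodes, and `crypttab_line` is a hypothetical helper.

```shell
# Hypothetical helper: print one crypttab entry.
# crypttab fields are: <target name> <source device> <key file> <options>.
crypttab_line() {
    name=$1
    source_dev=$2
    keyfile=$3
    printf '%s %s %s luks\n' "$name" "$source_dev" "$keyfile"
}

# The keyfile path here is an assumption made for illustration.
crypttab_line crypt_dev_md2 /dev/md2 /etc/luks/crypt_dev_md2.key
```

The keyfile also has to be enrolled in the LUKS header (for example with `cryptsetup luksAddKey`), and the initramfs regenerated with `update-initramfs -u` after crypttab changes.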

comment:15 Changed 9 months ago by anarcat

and installing VMs works!!!! this is awesome. :)

i've updated the install docs and slammed a few TODOs here and there. next step here is to start migrating stuff in there, but we're good.

we should focus on a VM we can empty quickly so we can add another pair to fsn-node-03. because right now the secondaries to fsn-node-03 are the existing fsn nodes and those are kind of crammed.

so maybe next step is to retire textile #31686...

comment:16 Changed 9 months ago by anarcat

Resolution: fixed
Status: changed from needs_review to closed

comment:17 Changed 9 months ago by anarcat

during the textile transfer, a disk problem was detected on sda, which was replaced by hetzner. but shortly after the new disk started being in use again, the problem came back. followup in #33098.
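
One way to keep an eye on a RAID member like sda (not necessarily how the problem was detected here) is to look for arrays with a missing device in `/proc/mdstat`. A sketch, where `degraded_arrays` is a hypothetical helper that reads mdstat-style text on stdin:

```shell
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# names of arrays whose member status (e.g. "[_U]") shows a missing device.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md2 : active raid1 ...".
        /^md[0-9]+ :/ { name = $1 }
        # The status line ends with a field such as "[UU]" or "[_U]".
        /\[[U_]+\]/ {
            if ($NF ~ /_/) print name
        }
    '
}
```

On a live system this would be run as `degraded_arrays < /proc/mdstat`; empty output means no array reports a missing member.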
