
Post-powercut Todo List

A list of things that should be done/checked immediately after a power cut:

  • Ensure the aperture servers have the correct IP addresses:
    • eno1 should have an address in the internal subnet (10.10.0.0/24), reserved by DHCP on mordor
    • eno2 should have no IP address
    • br0 should have an address in the external subnet (136.206.16.0/24), also reserved by DHCP on mordor
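The three interface checks above can be scripted. The following is a minimal sketch, not a tested part of the runbook: it assumes the iproute2 `ip` tool is installed, looks at IPv4 only (so "no IP address" on eno2 is read as "no IPv4 address"), and the helper names `iface_addrs`, `in_slash24` and `classify` are made up for illustration.

```shell
#!/bin/sh
# Sketch: verify the post-powercut interface layout on an aperture server.
# Assumption: only IPv4 is checked; IPv6 link-local addresses are ignored.

# Print the IPv4 addresses (prefix length stripped) currently assigned to interface $1.
iface_addrs() {
    ip -4 -o addr show dev "$1" 2>/dev/null | awk '{ split($4, a, "/"); print a[1] }'
}

# Succeed if address $1 begins with the three-octet prefix $2, i.e. lies in that /24.
in_slash24() {
    case "$1" in
        "$2".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on one interface: $1 = name, $2 = expected /24 prefix or "none", $3 = addresses found.
classify() {
    if [ "$2" = "none" ]; then
        if [ -z "$3" ]; then
            echo "OK $1 has no IPv4 address"
        else
            echo "FAIL $1 has unexpected address $3"
        fi
        return
    fi
    for addr in $3; do
        if in_slash24 "$addr" "$2"; then
            echo "OK $1 has $addr"
            return
        fi
    done
    echo "FAIL $1 has no address in $2.0/24"
}

classify eno1 10.10.0 "$(iface_addrs eno1)"
classify eno2 none "$(iface_addrs eno2)"
classify br0 136.206.16 "$(iface_addrs br0)"
```

This only confirms which subnet each address falls in. Whether the address is the one reserved on mordor still has to be checked against the DHCP configuration there.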
  • If the bastion-vm fails to start, check:
    • /storage is mounted rw on each aperture server
    • br0 is present and configured on each aperture server
    • vm-resources.service.consul is running and http://vm-resources.service.consul:8000/bastion/bastion-vm-latest.qcow2 is accessible
    • If the latest symlink points to a corrupted image, repoint it at an earlier image with ln -sf
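The first three bastion-vm checks can be run in one pass on each aperture server. This is a rough sketch under a few assumptions: `curl` and `ip` are installed, the image server answers a HEAD request the same way it answers a GET, and "br0 is configured" is reduced to "br0 exists". The helpers `is_mounted_rw` and `report` are illustrative names, not existing tooling.

```shell
#!/bin/sh
# Sketch: pre-flight checks to run on an aperture server when bastion-vm will not start.

# Succeed if mount point $1 carries the "rw" option in mounts table $2 (default /proc/mounts).
is_mounted_rw() {
    awk -v mp="$1" '
        $2 == mp { rw = 0; n = split($4, opt, ","); for (i = 1; i <= n; i++) if (opt[i] == "rw") rw = 1 }
        END { exit !rw }
    ' "${2:-/proc/mounts}" 2>/dev/null
}

# Print OK or FAIL for label $1 depending on whether the remaining arguments succeed as a command.
report() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "OK $label"
    else
        echo "FAIL $label"
    fi
}

IMAGE_URL=http://vm-resources.service.consul:8000/bastion/bastion-vm-latest.qcow2

report "/storage mounted rw" is_mounted_rw /storage
report "br0 present" ip link show br0
# HEAD request only, so the image itself is not downloaded.
report "bastion image reachable" curl -fsI --max-time 10 "$IMAGE_URL"
```

A reachable URL does not prove the image is intact, so the corrupted-symlink case still needs a manual look.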
  • All the NixOS boxes rely on DNS for LDAP and NFS, so fix DNS first and then recover each box:
    • Make sure bind is running on paphos
    • Run mount /storage
    • Run systemctl restart on httpd, php-fpm-rbusers-* and ldap
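Because the mount and the restarts only work once DNS answers, it can help to wait for name resolution before running them. The sketch below is one way to do that. The `retry` and `recover_nixos_box` helpers are hypothetical, the hostname to probe is left to the operator, and the unit names are copied from the list above without being verified.

```shell
#!/bin/sh
# Sketch: hold off the NFS mount and service restarts until DNS answers.

# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success and 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Recovery steps for one box. $1 is a hostname the box must be able to resolve;
# which name to probe is an assumption left to the operator.
recover_nixos_box() {
    retry 10 3 getent hosts "$1" >/dev/null || {
        echo "DNS still failing for $1" >&2
        return 1
    }
    mount /storage || return 1
    # systemctl expands the glob against units that are currently loaded.
    systemctl restart httpd ldap 'php-fpm-rbusers-*'
}
```

It would be called as root, for example `recover_nixos_box some-host`, where `some-host` stands for a name served by bind on paphos.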
  • Apache on hardcase sometimes tries to start before networking has finished coming up. To fix it, disable and re-enable it a few times; this usually gets it running.
  • Mailman on hardcase keeps a lock file at /var/lib/mailman/lock/master.lck. If Mailman doesn't shut down cleanly, the stale lock file blocks it from starting. Remove it with:
Bash
rm /var/lib/mailman/lock/master.lck
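A slightly more cautious variant removes the lock only when Mailman is not already running. This is a sketch, and it assumes hardcase manages Mailman through a systemd unit called `mailman`, which is a guess. With the wrong unit name the check reports "not running" and the lock is removed regardless. `remove_stale_lock` is an illustrative helper name.

```shell
#!/bin/sh
# Sketch: delete the Mailman master lock only when the service is not running.

# Remove lock file $1 unless systemd unit $2 is active. Prints what it did.
remove_stale_lock() {
    if [ ! -e "$1" ]; then
        echo "no lock at $1"
        return 0
    fi
    if systemctl is-active --quiet "$2" 2>/dev/null; then
        echo "$2 is running, leaving $1 in place"
        return 1
    fi
    rm -f -- "$1" && echo "removed $1"
}

# The unit name "mailman" is an assumption; substitute whatever hardcase actually uses.
remove_stale_lock /var/lib/mailman/lock/master.lck mailman
```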
  • paphos is old and its clock sometimes drifts out of sync. To make sure its time is accurate, run:
Bash
sudo service ntp restart

Then run date and check that the reported time is correct.
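Checking the time by eye works, but the comparison can also be made explicit. The sketch below compares two Unix timestamps against a tolerance. The helpers `skew_seconds` and `clock_ok` are illustrative, and where the reference timestamp comes from (for example a machine whose clock is trusted) is left as an assumption.

```shell
#!/bin/sh
# Sketch: compare two Unix timestamps and flag a clock that has drifted too far.

# Print the absolute difference in seconds between epoch times $1 and $2.
skew_seconds() {
    d=$(( $1 - $2 ))
    if [ "$d" -lt 0 ]; then
        d=$(( 0 - d ))
    fi
    echo "$d"
}

# Succeed if the two epoch times differ by at most $3 seconds.
clock_ok() {
    [ "$(skew_seconds "$1" "$2")" -le "$3" ]
}
```

For instance, from a machine with a trusted clock and SSH access to paphos, `clock_ok "$(ssh paphos date +%s)" "$(date +%s)" 5 && echo "paphos clock OK"` would accept up to five seconds of skew.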