Clevis and Tang – NBDE

Introduction

What is NBDE? It’s short for Network Bound Disk Encryption. It uses LUKS for the encrypted disks, and clevis and tang to unlock them automatically with a network key server.

I’ve managed to get LUKS working together with clevis and tang on the following OSes.

RHEL 7, 8 and 9, Ubuntu 20.04 and 22.04, and SUSE 15.

There are some differences between the OSes in syntax and package names, and also in their support for unlocking the OS partition with tang servers.

To fix the boot unlock I’ve used dracut on all of them, independent of what the OS used before. There are also differences in which networking the dracut-built initramfs supports, NetworkManager or ifcfg/legacy network. EL changes this between minor releases of EL 8 and 9, so something that worked on EL 8.4 doesn’t work on 8.6 (where, for example, ifcfg legacy networking was deprecated in dracut/initramfs).

 

fstab example from scripts with xfs filesystem

"${ENDCRYPTEDUUID}  ${MOUNTPOINT} xfs defaults,_netdev,relatime,inode64,uquota,gquota,x-systemd.requires=network-online.target,x-systemd.requires=systemd-cryptsetup@${ENCRYPTDEVICE}.service 1 2"
 
crypttab example 
${CRYPTDEVICE} UUID="${UUID}" none _netdev
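
The two entries above can be sketched as small shell helpers. This is only an illustration under my own assumptions: the function names and argument order are made up, and the fstab device field is passed in as-is because the scripts these lines come from are not shown here.

```shell
#!/bin/sh
# Hypothetical helpers; names and argument order are illustrative only.

# Print a crypttab entry: <mapped name> UUID=<LUKS UUID> none _netdev
crypttab_line() {
    name=$1
    luks_uuid=$2
    printf '%s UUID=%s none _netdev\n' "$name" "$luks_uuid"
}

# Print an fstab entry for the XFS filesystem on the unlocked device.
# The mount is tied to the network and to the cryptsetup unit of <name>.
fstab_line() {
    fs_spec=$1
    mountpoint=$2
    name=$3
    opts="defaults,_netdev,relatime,inode64,uquota,gquota"
    opts="$opts,x-systemd.requires=network-online.target"
    opts="$opts,x-systemd.requires=systemd-cryptsetup@${name}.service"
    printf '%s %s xfs %s 1 2\n' "$fs_spec" "$mountpoint" "$opts"
}
```

As root, the LUKS UUID for the first helper could be read with something like `blkid -s UUID -o value /dev/sdb1`, and the output appended to /etc/crypttab and /etc/fstab.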
 
Packages needed for EL:
clevis clevis-luks clevis-dracut clevis-systemd
For Ubuntu:
xfsprogs parted clevis clevis-dracut clevis-luks
For SUSE:
clevis clevis-luks clevis-dracut clevis-systemd
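
In a deployment script the package lists above could be selected from the ID field in /etc/os-release. A minimal sketch; the function name and the exact set of ID values handled are my assumptions, so adjust them to the distributions you actually run.

```shell
#!/bin/sh
# Sketch only: map an os-release ID to the package list from this post.
# The function name and the ID values handled here are assumptions.
clevis_packages() {
    case $1 in
        rhel|centos|rocky|almalinux)
            echo "clevis clevis-luks clevis-dracut clevis-systemd" ;;
        ubuntu)
            echo "xfsprogs parted clevis clevis-dracut clevis-luks" ;;
        sles|opensuse-leap)
            echo "clevis clevis-luks clevis-dracut clevis-systemd" ;;
        *)
            echo "unsupported distribution: $1" >&2
            return 1 ;;
    esac
}

# Example: read ID from /etc/os-release and print the matching list.
#   . /etc/os-release && clevis_packages "$ID"
```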
 
Services that need to be enabled and started
EL / Ubuntu / SUSE
systemctl --now enable remote-cryptsetup.target
 
 
To do this when servers are deployed, I set a static passphrase for slot 0 on the LUKS device, which is later changed with an Ansible playbook and put into a password safe.
 
I went for using the UUID because systemd’s parallel boot doesn’t guarantee that /dev/sdb keeps the same name across reboots. This also causes a problem with fstab, since the systemd unit that gets created has the unlocked device name in its name (for example systemd-cryptsetup@sdb1crypt.service).
 
Early unlock needs some kind of networking running on all of the OSes, so I added these boot arguments to the dracut configuration in /etc/dracut.conf.d/10-ip.conf
 
rd.neednet=1 rd.shell=0 rd.emergency=reboot rd.luks.timeout=60 rd.timeout=60 ip=dhcp
 
rd.neednet=1 forces networking even if ip is defined; sometimes dracut didn’t add network support otherwise.
rd.luks.timeout=60 covers the case where networking doesn’t come up: the OS stops trying after a while, and I configured it to reboot when that happens. If you have teaming/bonding and the switches haven’t come up yet when the servers boot after a power outage, the reboot is needed; it was in my environment.
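
As a sketch of how these arguments can be put into a dracut drop-in: this assumes the file lives in dracut's drop-in directory /etc/dracut.conf.d/ and uses the kernel_cmdline option from dracut.conf. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: write the boot-time network arguments as a dracut drop-in.
# kernel_cmdline is the dracut.conf option for default kernel arguments;
# the function name and the target path argument are hypothetical.
write_dracut_ip_conf() {
    conf=$1
    cat > "$conf" <<'EOF'
kernel_cmdline="rd.neednet=1 rd.shell=0 rd.emergency=reboot rd.luks.timeout=60 rd.timeout=60 ip=dhcp"
EOF
}

# Typical use as root, followed by rebuilding the initramfs:
#   write_dracut_ip_conf /etc/dracut.conf.d/10-ip.conf
#   dracut -f
```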
 
In later versions of the clevis client it’s quite easy to list and manipulate which tang servers a device uses.
 

clevisandtang:~ # clevis luks list -d /dev/sdb1
1: tang '{"url":"http://tang1"}'
2: tang '{"url":"http://tang2"}'
3: tang '{"url":"http://tang3"}'
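
If the list is needed in a script, output in the format shown above can be reduced to slot and URL pairs. A small sketch; the function name is made up and the sed pattern assumes exactly this output format.

```shell
#!/bin/sh
# Sketch only: read `clevis luks list` output on stdin and print
# "<slot> <url>" for every tang binding. The function name is hypothetical
# and the pattern assumes the output format shown in this post.
tang_urls() {
    sed -n 's/^\([0-9][0-9]*\): tang .*"url":"\([^"]*\)".*$/\1 \2/p'
}

# Typical use as root:
#   clevis luks list -d /dev/sdb1 | tang_urls
# A binding can then be removed by slot, for example:
#   clevis luks unbind -d /dev/sdb1 -s 2
```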

When I add the tang servers, I first download their advertisement with something like this, to be able to use it offline later on if needed

curl -sfg http://tang/adv -o /root/$tangserver-adv.jws
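
A saved advertisement can be handed to the tang pin through its "adv" setting when binding. A minimal sketch, assuming the file naming from the curl command above; tang_cfg is a made-up helper, not part of clevis.

```shell
#!/bin/sh
# Sketch only: build the tang pin configuration for one server, pointing
# "adv" at the advertisement file saved earlier. tang_cfg and its
# arguments are hypothetical names, not part of clevis.
tang_cfg() {
    server=$1
    adv_dir=${2:-/root}
    printf '{"url":"http://%s","adv":"%s/%s-adv.jws"}\n' \
        "$server" "$adv_dir" "$server"
}

# Typical use as root (prompts for an existing LUKS passphrase):
#   clevis luks bind -d /dev/sdb1 tang "$(tang_cfg tang1)"
```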

I’ve also made a playbook to use when rotating the tang server keys. It is needed to force the tang clients to update their keys from the tang servers; it would be somewhat bad if the tang server keys got rotated without the clients getting the new keys in time 🙂
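
The playbook itself is not shown here, but the client side of a rotation could look roughly like the sketch below. It only prints one `clevis luks regen` command per tang slot so the result can be reviewed first. The function name is hypothetical, and the -q flag (skip the confirmation prompt) is assumed to exist in your clevis version.

```shell
#!/bin/sh
# Sketch only: read `clevis luks list` output on stdin and print one
# regen command per tang slot for the given device. Nothing is executed.
# regen_commands is a hypothetical name; -q is assumed to be supported.
regen_commands() {
    dev=$1
    while IFS=: read -r slot rest; do
        case $rest in
            " tang "*)
                printf 'clevis luks regen -q -d %s -s %s\n' "$dev" "$slot" ;;
        esac
    done
}

# Typical use as root:
#   clevis luks list -d /dev/sdb1 | regen_commands /dev/sdb1
```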
 
There is a lot of spread-out “documentation” on clevis and tang, but no “one place” to find it all.
Then we have the changes between minor versions in EL 8 and 9: something that worked on 9.0 can suddenly stop working when deploying EL 9.2.
For example, EL 8.6 required autofs to work, so the fstab option “x-systemd.automount” was needed. Later on that option sometimes didn’t work for all disks, so after EL 8.6 I went back to a static mount, as in the options shown above.
 
There are some things to take into account. Passphrase: should we have a passphrase on slot 0? If so, how do we store it securely (and manage it, change it and so on)?
Set up more than one tang server. In the example above I have 3 tang servers, but more doesn’t hurt.
If one is down, it will try the next in the list. Then there is the dependence on DNS: should IP addresses be used instead?
 
