I’ve undone the majority of the new bugs Ubuntu introduced in their Hirsute Hippo (21.04) release. It took quite a while to chase some of them down. I normally avoid short-term releases, but dovecot had some serious problems in 20.04, so I upgraded to Groovy (20.10) to resolve them. Groovy was a clean update and, unlike this one, did not break anything. But since these short-term releases are only supported for nine months, once you are on one you are obligated to keep upgrading until you get back to a stable release. The next stable release will be 22.04.
The most disastrous is a Poettering effect: “systemctl reboot” now hangs the machine hard rather than rebooting. Combined with the fact that NFS has never really worked 100% correctly under Linux, this created a situation where I could not get into the physical host to force a reboot of mail.eskimo.com, which is a virtual machine, so I needed to drive down to the co-location facility to fix it.
Then they opted to replace my systemd scripts for postfix and dovecot. This is problematic because on our server various things are mounted via NFS, among them the encryption certificates. Since it would be a pain to maintain 30 separate certificates for each machine, I have a few wildcard certs for various domains, and I keep them on an NFS partition. That way I only have to update them in one place.
This necessitates delaying the startup of postfix, dovecot, and anything else that needs access to the encryption certificates until AFTER the partition is mounted. This is easily accomplished with systemd scripts, provided upstream operating system providers don’t change them for you and remove these dependencies.
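One way to express that kind of ordering is a systemd drop-in file. The fragment below is only a sketch under stated assumptions: the file name wait-for-certs.conf and the mount path /srv/certs are hypothetical, and the unit name (postfix.service here) may differ between releases, so check it with “systemctl cat” first.

```ini
# /etc/systemd/system/postfix.service.d/wait-for-certs.conf
# Hypothetical drop-in; the unit name and the mount path are assumptions.
[Unit]
# Require, and order this unit after, the mount unit(s) backing this path.
RequiresMountsFor=/srv/certs
# Also order after remote filesystems in general.
After=remote-fs.target
```

A matching file would be needed for dovecot and any other service that reads the certificates, followed by “systemctl daemon-reload” so systemd picks up the change.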
Then they changed the PAM configuration, adding a check for weak passwords. That seemed like a good idea, except the actual module wasn’t there, so the result was PAM failures because it couldn’t find the referenced module.
Lastly, they broke postfwd. This was particularly challenging because postfix, which calls it, gave only the rather generic error “server misconfiguration” rather than reporting a problem with postfwd, yet “postfix check”, which is supposed to check the configuration, reported no errors. I finally found it by turning on peer debugging for my workstation IP address, attempting to send a message, and tracing through the logs.
That revealed that postfwd didn’t start. Postfwd is used here to check for spambots using someone’s account if their password is compromised, and to force a password change if this occurs. Postfwd is a Perl script and I’m not fluent in Perl, so that made it a particularly difficult challenge. It was complaining in the logs of missing Perl modules, but the modules it said were missing were in fact installed. I finally removed the Ubuntu postfwd package and installed postfwd directly from GitHub, which fixed it.
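For anyone chasing a similar problem, the peer debugging used to track this down is controlled by two Postfix parameters in main.cf. This is a minimal sketch, not the exact configuration used here: the address 192.0.2.10 is a placeholder, not the real workstation IP.

```ini
# /etc/postfix/main.cf (fragment)
# 192.0.2.10 is a placeholder address; substitute the client you are testing from.
debug_peer_list = 192.0.2.10
# Extra logging verbosity for matching peers; 2 is the Postfix default.
debug_peer_level = 2
```

After a “postfix reload”, connections from the listed address are logged in much more detail. It is worth removing the setting afterwards, since the extra logging is verbose.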
So after much gnashing of teeth and pulling of hair, I believe our mail system is back to fully operational status again. Since it required many reboots, I checked the NFS status of all servers mounting from it and they all appear to be okay.