Mail Server Repaired

     I’ve undone the majority of the new bugs Ubuntu introduced in its Hirsute Hippo release.  It took quite a while to chase some of them down.  I normally avoid short-term releases, but dovecot had some serious problems in 20.04, so I upgraded to Groovy to resolve them.  Groovy was a clean update and, unlike this one, did not break anything.  But since these short-term releases are only supported for nine months, one is obligated to keep upgrading until reaching a stable release again, and the next stable release will be 22.04.

     The most disastrous is a Poettering effect: “systemctl reboot” now hangs the machine hard rather than rebooting.  This, combined with the fact that NFS has never really worked 100% correctly under Linux, created a situation where I could not get into the physical host to force a reboot of mail.eskimo.com, which is a virtual machine, so I needed to drive down to the co-location facility to fix it.

     Then they opted to replace my systemd scripts for postfix and dovecot.  This is problematic because on our servers various things are mounted via NFS, among them the encryption certificates.  Since it would be a pain to maintain 30 separate certificates for each machine, I have a few wildcard certs for various domains and keep them on an NFS partition.  That way I only have to update them in one place.

     This necessitates delaying the startup of postfix, dovecot, and anything else that needs access to the encryption certificates until AFTER the partition is mounted.  This is easily accomplished with systemd scripts, provided the upstream operating system providers don’t replace them for you and remove these dependencies.
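
     For anyone wanting to do something similar, here is a rough sketch of the idea rather than my actual unit files.  It generates a systemd drop-in that uses the RequiresMountsFor= directive so a service waits for a mount.  The mount point /mnt/certs, the file name wait-for-certs.conf, and the unit names are all assumptions made up for illustration; the unit that actually runs postfix may have a different name on your release.

```python
from pathlib import Path

# Hypothetical mount point for the NFS-hosted certificates (an assumption).
CERT_MOUNT = "/mnt/certs"


def render_dropin(mount_point):
    """Return drop-in text that makes a unit wait for the given mount."""
    return (
        "[Unit]\n"
        # RequiresMountsFor= adds Requires= and After= on the mount units
        # needed to reach this path.
        f"RequiresMountsFor={mount_point}\n"
        "After=remote-fs.target\n"
    )


def dropin_path(service, root="/etc/systemd/system"):
    """Return where the drop-in for a service would live (file name is made up)."""
    return Path(root) / f"{service}.service.d" / "wait-for-certs.conf"


if __name__ == "__main__":
    # Print the files instead of writing them, so nothing is changed by accident.
    for service in ("postfix", "dovecot"):
        print(f"# {dropin_path(service).as_posix()}")
        print(render_dropin(CERT_MOUNT))
```

     A drop-in under /etc/systemd/system is kept separately from the unit file the package ships, so in principle it should survive a package replacing that unit, though I would still check after every release upgrade.  After writing the files, “systemctl daemon-reload” is needed for systemd to pick them up.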

     Then they changed the PAM configuration, adding a check for weak passwords.  That seemed like a good idea, except the actual module wasn’t there, so it resulted in PAM failures because PAM couldn’t find the referenced module.
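
     A check along these lines could catch that kind of breakage before it locks anyone out.  This is only a sketch: it reads the files in /etc/pam.d, pulls out the module named on each line, and reports any that cannot be found.  The module directories listed are assumptions and vary by distribution and architecture, and the parser ignores rarer syntax such as backslash line continuations.

```python
from pathlib import Path

# Assumed module directories; adjust for your distribution and architecture.
MODULE_DIRS = (
    "/lib/x86_64-linux-gnu/security",
    "/usr/lib/x86_64-linux-gnu/security",
    "/lib/security",
)


def referenced_modules(pam_text):
    """Return the modules named in the text of one /etc/pam.d file."""
    modules = []
    for raw in pam_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("@include"):
            continue
        tokens = line.split()
        i = 1  # tokens[0] is the type: auth, account, password or session
        if i < len(tokens) and tokens[i].startswith("["):
            # Bracketed control such as [success=1 default=ignore]
            while i < len(tokens) and not tokens[i].endswith("]"):
                i += 1
        elif i < len(tokens) and tokens[i] in ("include", "substack"):
            continue  # the next field is another config file, not a module
        i += 1
        if i < len(tokens):
            modules.append(tokens[i])
    return modules


def missing_modules(modules, module_dirs=MODULE_DIRS):
    """Return the modules that are not present on disk."""
    missing = []
    for module in modules:
        if module.startswith("/"):
            found = Path(module).is_file()
        else:
            found = any((Path(d) / module).is_file() for d in module_dirs)
        if not found:
            missing.append(module)
    return missing


if __name__ == "__main__":
    for conf in sorted(Path("/etc/pam.d").glob("*")):
        if conf.is_file():
            text = conf.read_text(errors="replace")
            for module in missing_modules(referenced_modules(text)):
                print(f"{conf}: {module} not found")
```

     Running it after an upgrade and before logging out of the last root shell would be the sensible order of operations.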

     Lastly they broke postfwd.  This was particularly challenging because postfix, which calls it, gave only the rather generic error “server misconfiguration” rather than reporting a problem with postfwd, yet “postfix check”, which is supposed to check the configuration, reported no errors.  I finally found it by turning peer debugging on for my workstation IP address, attempting to send a message, and tracing through the logs.

     That revealed that postfwd didn’t start.  Postfwd is used here to check for spambots using someone’s account if their password is compromised, and to force a password change if this occurs.  Postfwd is a Perl script and I’m not fluent in Perl, which made it a particularly difficult challenge.  It was complaining of missing Perl modules in the logs, yet the modules it said were missing were in fact installed.  I finally removed the Ubuntu postfwd package and installed it directly from GitHub, which fixed it.

     So after much gnashing of teeth and pulling of hair, I believe our mail system is back to fully operational status.  Since the repairs required many reboots, I checked the NFS status of all servers mounting from it and they all appear to be okay.

Mail

     I’ve identified the issue with mail, but I’ve managed to hang the physical server in an attempt to fix it, and it is going to require a drive to the co-location facility, so things may be broken for the next 1-2 hours.

Postfix

     Last night’s upgrade broke something in the postfix configuration of the client mail server used for sending mail.  Unfortunately it is giving only an extremely generic error, making it difficult to identify what is wrong.  It claims the server is misconfigured, yet “postfix check” shows no errors.  Argh!

     I am working on it.

Tonight’s Maintenance Completed

     All Debian-based kernels were upgraded to 5.12.4.

     This took an hour and 15 minutes to complete on the physical server hosting the web server because the iomemory drivers did not want to compile under 5.12.4.  After much pulling of hair, I found that it was a libc6 version mismatch between the machine I installed the kernel on and the one I built it on.  The DKMS module for iomemory requires these to match.

     Mail has been upgraded to Ubuntu 21.04.  Most of the servers here are on long-term releases, but dovecot was barely usable on the 20.04 release, prompting a rapid upgrade to a short-term intermediate release.

Tonight’s Upgrades

     The web server will be down longer than normal after tonight’s kernel updates because I have to recompile a driver used with the flash drive for the newer kernel.

     The client mail server, mail.eskimo.com, will be intermittently available for several hours after the kernel update as I will also be performing an operating system update.

Kernel Upgrade Friday 11PM PDT

     We will be doing kernel upgrades on all of our servers.  Expected time frame is 11PM-11:30PM, with perhaps a straggler machine or two.  We’ve had a lot of systemd issues recently that have made startup less than 100% reliable.

     This will affect all of our shell customers and web hosting customers as well as https://friendica.eskimo.com/, https://hubzilla.eskimo.com/, and https://nextcloud.eskimo.com/

Mail from Outlook

     Expect mail issues from outlook.jp and perhaps Outlook in general.  I do not know to what degree they share services, but outlook.jp appears to have been hacked to death recently and we’ve been receiving a lot of phishing scams from it.  I’m blocking scam addresses as I become aware of them, but this sort of thing will eventually cause a lot of crap to back up on their servers, mail errors to result, and ultimately fail2ban to ban the affected servers.  I’ve also sent the scams to their abuse address, but I’ve never received anything other than a bot response and the scams continue, so obviously they are not addressing the problem.

Outgoing Mail Fixed

     Outgoing mail is fixed, except the server is still short on memory.  I meant to reboot last night to increase memory, but issues with Debian kept me preoccupied.  I will reboot mail later tonight to double the memory allocation.

Outgoing Mail

     I made some changes to the client mail server yesterday in an attempt to relieve some memory overload issues without rebooting, and in the process broke mail in a way that is causing it to get stuck in the queue.  I’m working on correcting this.  Your mail is not lost and will be sent on its way out of the queue as soon as I determine what is misconfigured.