Kernel Upgrades / Nextcloud

     I will be doing another kernel upgrade this Friday between 11pm and midnight, which will require reboots of all machines.

     As I feared, bugs were introduced in NFS, in particular an issue with delayed requests in NFSD (the NFS server daemon) that causes the daemon to die.  While I have not encountered this yet, it is a documented problem, so I am going to upgrade to fix it.

     Also, some issues were introduced into KVM/QEMU, which is used for hosting virtual machines.  Again, I haven’t experienced these, but it is a race condition and so just a matter of time.

     With respect to Nextcloud, still no fix from the developers.  I’m going to attempt a re-install while keeping the existing database.

Nextcloud

     I apologize if Nextcloud is not working for you at the moment.  I attempted an upgrade yesterday and it failed, leaving the file integrity check failing for a bunch of files.

     I opened a ticket for this and found out many others are experiencing the same thing for this release, but as of yet there is no fix.

 

Mail and Web Maintenance

     I intend to take the mail and web servers down tonight at 11pm to make backup images.  This is just a way to quickly restore the mail server or web server if something catastrophic happens to their file systems.  The mail server should take about 1/2 hour and the web server probably similar.  I don’t know exactly, as I have not made an image backup since moving it to flash storage.

Reboots Complete

     Tonight’s kernel upgrade went smoother than I had anticipated.  I had anticipated issues based upon the fact that we were moving to a new major release and I knew that the NFS code got somewhat re-worked again.  This has always been a problem in Linux; I’m not sure why they can’t get a 1995 vintage protocol correct, but it’s always been difficult.  But this time NFS worked flawlessly.  No duplicate IDs in fscache either.  NFS properly mounted on all machines.

     NIS bound on all but one.

     A few machines had some startup issues unrelated to the kernel.  In particular, I had removed hddtemp from one because it is a virtual machine and there is no temperature sensor for hddtemp to read, but removing it did not remove the systemd service unit.  Argh.  But anyway, that was quick and easy to correct.
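
     A leftover unit like the hddtemp one can be spotted by checking whether the program named on its ExecStart line still exists.  What follows is only a minimal sketch in Python, not what was actually run here; the function names and the sample unit contents are hypothetical and purely for illustration.

```python
import os
import shlex


def exec_start_binary(unit_text):
    """Return the program path from the first ExecStart= line, or None."""
    for line in unit_text.splitlines():
        line = line.strip()
        if line.startswith("ExecStart="):
            command = line[len("ExecStart="):]
            # systemd permits special prefix characters before the path
            command = command.lstrip("-@+!:")
            parts = shlex.split(command)
            return parts[0] if parts else None
    return None


def is_orphaned(unit_text, exists=os.path.exists):
    """True when the unit names a program that is no longer installed."""
    binary = exec_start_binary(unit_text)
    return binary is not None and not exists(binary)


# Hypothetical unit file contents, for illustration only.
SAMPLE_UNIT = """\
[Unit]
Description=Hard drive temperature monitor daemon

[Service]
ExecStart=/usr/sbin/hddtemp -d
"""

if __name__ == "__main__":
    print(exec_start_binary(SAMPLE_UNIT), is_orphaned(SAMPLE_UNIT))
```

     The `exists` argument is there only so the check can be tried without touching the real file system; on a live machine the default `os.path.exists` would be used.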

     So as far as I know everything appears to be functional.  The only machine that failed was one that I had not put into service yet.  Something in the EFI settings got messed up and it thinks it’s out of space on the boot device, but it is not.  But since it’s not yet in service it is of no consequence.

Ubuntu

     Someone or something halted Ubuntu this evening at just after 8pm.  I do not know what; I could not locate anything in the logs.

     At this point I’ve removed public access to systemctl, removed a number of graphical reboot/shutdown tools, and performed an early upgrade of the kernel on this machine in case it is a kernel bug.

     If you happen to notice the machine goes away right after you do something, please let me know what that something is.  E-mail nanook@eskimo.com.  Thank you.

Kernel Upgrades Friday 4/15 11pm-Midnight

     I’m going to upgrade all of the system kernels to 5.17.3 on Friday April 15th between 11pm and midnight Pacific time.

     I would have stuck with the 5.16 kernel until hell froze over if Linus had chosen to make it a long-term supported kernel, but unfortunately he did not.  5.15 did not perform well and is the previous long-term release.  As near as I can tell, a kernel has to have garbage performance in order to be considered for long-term support.

     Anyway, I expect reboots to be done by 11:30, and the next half an hour will be spent fixing any broken NFS/NIS mounts.  Given that they’ve done major work on NFS, I will probably see some new, unfamiliar NFS issues.  So if this update runs past midnight, that is probably why.
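
     Finding broken NFS mounts after a reboot comes down to comparing the mount points a machine should have against what is actually mounted.  Here is a rough sketch in Python that works on text in the `/proc/mounts` format; the function names, the host name, and the mount points in the sample are all made up for illustration and are not our actual configuration.

```python
def mounted_nfs_paths(mounts_text):
    """Collect mount points whose filesystem type is nfs or nfs4."""
    paths = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            paths.add(fields[1])
    return paths


def missing_nfs_mounts(expected, mounts_text):
    """Return the expected mount points that are not currently NFS-mounted."""
    mounted = mounted_nfs_paths(mounts_text)
    return [path for path in expected if path not in mounted]


def read_proc_mounts(path="/proc/mounts"):
    """Read the live mount table on a Linux machine."""
    with open(path) as handle:
        return handle.read()


# Hypothetical mount table and expected mount points, for illustration only.
SAMPLE_MOUNTS = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "fileserver:/home /home nfs4 rw,relatime 0 0\n"
)
EXPECTED = ["/home", "/var/mail"]

if __name__ == "__main__":
    print(missing_nfs_mounts(EXPECTED, SAMPLE_MOUNTS))
```

     On a real machine, `read_proc_mounts()` would supply the text in place of the sample, and anything returned by `missing_nfs_mounts` would be a mount to fix by hand.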

     This will affect all of Eskimo’s web services, including our main website https://www.eskimo.com/, Nextcloud at https://nextcloud.eskimo.com/, Friendica at https://friendica.eskimo.com/, and Hubzilla at https://hubzilla.eskimo.com/.

Debian / GCC

     Debian is back up, but gcc is back to an older version, so I’m going to attempt to compile 11.2, which I know is a good version as I use it for kernel compiles.

Debian

     Sorry, I broke Debian tonight.  I am going to load backups so I can recover /usr/local, which I destroyed by installing a broken version of gcc that overwrote the working version.  Then I will reboot back into the current version and replace /usr/local with the backed-up version.  Sorry, this is kind of a screwball process, owing to the machine being backed up as a disk image.  It should be operational again by about 1AM.