MxLinux back up

     The shell server, mxlinux.eskimo.com, is again available.

     It has been upgraded from 18.3 to 19.2.  So far 19.2 appears to be MUCH more solid and bug free relative to 18.3.  I have yet to find any actual bugs in the implementation (but a hell of a lot of operator errors).

     I have not installed many user applications yet, so many mailers, news readers, editors, etc., are not there yet but will be soon.  I think you may want to consider this a good alternative to Debian and Ubuntu now.  It is based on the Debian 10 code base.

     The mail system is basically functional, and alpine and mailx are installed, but most other mailers are not yet.  The pan news reader is installed; most others are not.

Iglulik Spontaneously Booted about 3AM

     Iglulik spontaneously rebooted about 3AM.  This is the second time it has done this, and Ice, the server which holds the /mail spool, has also spontaneously rebooted once.  It appears that the 5.7 kernel has some stability issues when NFS file systems are exported.  The machines not exporting file systems have been completely stable.

     I know that the Linux community is doing a lot of work on NFS right now, which is good because NFS in Linux has been buggy forever.  Until now the bugs have not been the kind that crash machines; rather, when a server goes away and comes back, the clients do not always recover properly.  There was a lot of NFS work in the 5.7 kernel and they just came out with a new nfs-kernel package, so hopefully they’ll get this resolved soon.

     So I’ve compiled 5.7.1 now on Iglulik.  If it does not spontaneously boot again before tonight I will be rebooting in the early AM to load the new kernel.

MxLinux Down for Upgrade

     I am taking mxlinux down to upgrade from MxLinux 18.3 to MxLinux 19.2.  Because of a change in the Debian code base upon which MxLinux is based, and because they’ve bastardized Debian beyond the point where normal upgrade procedures can be used, a complete re-install is necessary.  It may be down for a few days.

Mail Issues

     We are having some problems with the mail sub-system that I am still struggling to understand.  It is responding slowly and refusing service to some hosts while permitting it to others, and I have not yet been able to determine why.

Ice Spontaneously booted today

     Ice, a machine which holds the /mail partition as well as mail, mx2, and some private virtual machines, spontaneously rebooted today.

     There would appear to be some stability issues in 5.7 yet.  I suspect these are related to NFS, as both Ice and Iglulik export major NFS partitions used by the rest of the machines here, while the physical hosts and virtual machines which do not export anything have all been stable.

     There were some major changes in the NFS v4.2 code in 5.7 which I hope, when adequately debugged, will result in better reliability.  So I’m not giving up on this kernel just yet, but I will keep up with point releases and try to identify in greater detail what is failing.

     Everything is back in service but I still need to check all the mounts.

Iglulik

     Iglulik, the machine which hosts the /home directories, spontaneously rebooted about 3pm today.  I do not yet know what caused this.  I will need to re-check all the NFS mounts on the other hosts, since invariably some of them fail to remount correctly.

Hosts NFS / NIS Mounts / Binding Verified

     Host NFS mounts and NIS bindings have been checked.  A few hosts failed to come up completely after reboot.  All of these problems have been resolved and all hosts are operational except for OpenSuse.

     OpenSuse has a problem with a library that breaks NIS.  I opened a ticket on this close to half a year ago.  If it is not resolved soon I am going to discontinue this host.

     If anyone has any suggestions for a better Linux distro, please e-mail them to nanook@eskimo.com.

     Thank you.

Reboots Complete – Still Checking Hosts

     The reboots are completed but I am about an hour behind schedule.

     Two things set me back.  First, SOMETHING installed dnsmasq on my stealth master DNS server.  It is a master hidden behind a firewall, so that hackers can’t inject nastiness into it, and it supplies all the secondary servers with zone records.

     Because it has bind, it does not need dnsmasq.  Further, dnsmasq breaks bind IF it starts first, because it uses the same network port (53) as bind and so blocks bind from attaching to that port and functioning.
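
     The port conflict is easy to reproduce in miniature.  The following is only a sketch, not anything running on the server: it uses an unprivileged port picked by the OS instead of port 53, and the function name is made up for illustration.  It shows that a second socket cannot bind an address and port that another socket already holds.

```python
import socket


def second_bind_fails(host="127.0.0.1"):
    """Bind one UDP socket, then try to bind a second to the same address and port.

    Returns True if the second bind is refused. That is the position bind is
    in when dnsmasq has already taken port 53.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the OS for any free unprivileged port (assumption:
        # we avoid the real port 53 so this runs without root).
        first.bind((host, 0))
        port = first.getsockname()[1]
        try:
            second.bind((host, port))
        except OSError:
            # Typically "Address already in use".
            return True
        return False
    finally:
        first.close()
        second.close()


print(second_bind_fails())
```

     Whichever daemon binds first wins, which is why the damage depends on start order.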

     So at some past reboot, about a week ago, dnsmasq evidently started first.  The zone records have only just now expired, and all the secondary servers quit serving them.  When I went to ssh into the server, my workstation couldn’t resolve the names (and neither could any external computer), so it was broken for everyone.  But because I had posted about the reboots, everyone was expecting an outage and nobody called, so I was unaware until I tried to connect, and then it took me a little while to figure out what the hell was going on.
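
     The delay of about a week fits how secondaries treat a zone: they keep answering from their last copy until the SOA expire interval passes without a successful refresh from the master.  A rough sketch of the arithmetic follows.  The expire value of 604800 seconds (one week) is a common setting but an assumption here, and the timestamp is hypothetical.

```python
from datetime import datetime, timedelta


def zone_expiry(last_refresh, expire_seconds=604800):
    """Return when a secondary stops serving a zone it can no longer refresh.

    expire_seconds stands in for the SOA expire field; 604800 (one week)
    is an assumed value, not one taken from the actual zone.
    """
    return last_refresh + timedelta(seconds=expire_seconds)


# Hypothetical time of the last successful refresh before the master went quiet.
last_ok = datetime(2020, 6, 1, 3, 0)
print(zone_expiry(last_ok))
```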

     Once that was resolved, one of the engineers at Canonical (the Ubuntu developers) asked me to try an experiment for them in order to nail down a problem with an AppArmor profile for libvirtd, and that took additional time.

     Everything is rebooted now but I am still checking for proper NFS mounts and NIS binding of hosts to servers.
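
     A mount check of this sort can be scripted.  The following is a minimal sketch, not the procedure actually used here: the function name is hypothetical, and it assumes the standard /proc/mounts layout of device, mount point, file system type, and options on each line.  It reports which expected mount points have no NFS entry.

```python
def missing_nfs_mounts(mounts_text, expected):
    """Return the expected mount points that have no NFS entry in mounts_text.

    mounts_text is text in /proc/mounts format; expected is a list of
    mount points that should be NFS-mounted on this host.
    """
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass.
        # Match "nfs" and "nfs4" exactly so the "nfsd" pseudo file system is skipped.
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            mounted.add(fields[1])
    return [point for point in expected if point not in mounted]


# Made-up sample in which /mail is mounted but /home did not come back.
sample = (
    "ice:/mail /mail nfs4 rw,relatime 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)
print(missing_nfs_mounts(sample, ["/home", "/mail"]))  # prints ['/home']
```

     On a live host the first argument would be the contents of /proc/mounts.  This does not check NIS binding, which needs a separate test.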