Partial Web Outage

     We have a partial web outage now.

     The cause was a misconfiguration of the MariaDB database on the new server, which led it to eat itself.  I am in the process of correcting that configuration issue and restoring the server.  Unfortunately, owing to the size of the database, this will take some time, perhaps an hour or so.

Wednesday Evening / Night

     Sometime Wednesday evening or night I am going to take the new machine, Inuvik.eskimo.com, down to replace the two drives I had intended to use for backup space.  Both have suffered a head crash at some point and have media errors too numerous to map out with the two spare tracks provided.  Because you can’t use the Linux defective i-node to map them out manually when the drives are used in a RAID, they are of no use to me, but they may be useful to someone else, as they are 4TB drives with perhaps 300 defective 512-byte blocks each.  They are seven years old, so they are out of warranty (they were warranted for five years).

     This new machine presently hosts https://friendica.eskimo.com/, the Debian shell server, debian.eskimo.com, and the Manjaro shell server, manjaro.eskimo.com, so these will be out of service while the drives are replaced.

     So if someone wants a couple of 4TB drives with about three hundred bad sectors each that you need to manage manually, let me know.  Preference will be given to Eskimo North customers, and after that it is first come, first served.  Preferably, be close enough to Shoreline, WA to pick them up; otherwise, be willing to pay for shipping.

Database Migration

I am migrating the MariaDB database (you may know it as MySQL) from the old server to the new server tonight.  Please do not make posts or other database changes tonight that you cannot afford to lose.  The database holds 117GB, so this is not something that can happen instantly.
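
To give a rough feel for why 117GB cannot move instantly, here is a minimal sketch of the arithmetic.  The sustained transfer rates are hypothetical values chosen only for illustration; the real rate depends on the disks, the network between the two servers, and how the data is dumped and reloaded, none of which is stated here.

```python
def transfer_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to move size_gb gigabytes at a sustained rate in MB/s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    size_mb = size_gb * 1000  # decimal units: 1 GB = 1000 MB
    return size_mb / rate_mb_per_s / 60


DATABASE_GB = 117  # database size quoted in the announcement

# Hypothetical sustained rates (assumptions, not measurements).
for rate in (50, 100, 200):
    minutes = transfer_minutes(DATABASE_GB, rate)
    print(f"{rate:>4} MB/s -> about {minutes:.1f} minutes")
```

This counts only the raw copy.  Reloading tables and rebuilding indexes on the new server would add to the total.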

Unplanned Outages

     Sorry for the unplanned outages of Debian and Manjaro tonight.

     As I mentioned earlier, I had moved these two to the new machine primarily because I have to build kernels on both: on Debian because Debian signs its kernels and won’t load an unsigned kernel built on a non-Debian machine, and on Manjaro because its kernel environment is unique.  Moving them to this machine reduces the time it takes to build kernels.

     Well, I was installing software to get it ready for its main function, which is web applications.  I already had ufw installed, and I accidentally installed avahi, autoipd, ppp, and firewalld, and somehow this broke networking.

     While I was at the co-lo fixing this stuff, I also changed some configuration to make the system NOT depend upon disk-by-uuid because as useful as this feature is, it is not reliable and results in failed boots.

Debian, Manjaro, Kernel Upgrades

     Manjaro.eskimo.com is now operational again, including Mate Desktop and workspace switcher.

     Debian.eskimo.com is now fully upgraded to Bookworm.

     We will be doing a kernel upgrade Saturday July 8th starting at 11pm.

     This upgrade will require reboots of all servers and hence interruption of all services, paid and free, including mail, web hosting, shell accounts, and virtual private servers, and https://friendica.eskimo.com/, https://hubzilla.eskimo.com/, https://nextcloud.eskimo.com/, and https://yacy.eskimo.com/.

     Most of these services will not be down longer than about ten minutes except for https://yacy.eskimo.com/, because it takes about 45 minutes to rebuild a database.  All services should be back up by midnight.

Debian Upgrade – Still In the Works – And More…

The upgrade from Bullseye to Bookworm is turning out to be a MUCH larger upgrade than the upgrade from Buster to Bullseye was.  More than 8200 packages are being updated, so it is taking much longer than expected.

Then, when it is finished, I am going to try to move this virtual domain off the existing server to the newly built server.  This is one of the hosts I build kernels on, owing to the necessity of them being signed for Debian, and the new machine builds a kernel package almost twice as fast as the one it’s presently on.  So when the upgrade is done there is going to be some downtime while I copy the virtual machine between machines.

No Service Interruption Tonight

     After a bit of experimentation, I found having the CPU voltage follow the CPU frequency resulted in about 40 watts of idle power savings but occasionally caused CPU data errors.  The latter is not acceptable so that part of the plan has been eliminated.  I will be installing the new server tonight though it will take time to migrate functionality to it, but it will not require a service interruption.

Maintenance Work July 1st 10pm-2am

     I am planning on doing some major maintenance work tomorrow.  First will be the installation of our newest server.  This should not disturb existing services but there is always the potential for operator error.

     The second thing I will be doing, and this WILL result in downtime for various services, is to take the various physical servers down for some time to do some BIOS tuning.  The idea is to switch from a fixed CPU voltage to a variable voltage that changes with clock speed so that during times of low load on a given machine, power consumption and heat production will be less.

     CPUs require more voltage to be stable at higher clock frequencies.  All of our modern machines change clock rate with load up to some defined maximum, but presently they use a fixed CPU voltage that is suitable for the highest load.  This change will have them vary their core voltage with clock frequency, as necessary for the frequency they are operating at, at any given time.  This will require some benchmarking and load testing to optimize.

     This will not change the peak capabilities of the machines significantly.  It may allow them to clock slightly higher than normal for brief intervals if hit with a sudden load when idle and cool because of the thermal mass of the CPU cooler, but mostly it will affect only the low load and idle conditions.
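
     The relationship between voltage, clock frequency, and power described above can be sketched with the usual first-order model of CMOS dynamic power, P = C * V^2 * f.  This is only a rough illustration: the capacitance, voltages, and idle frequency below are hypothetical numbers, not measurements from our machines, and the model ignores leakage and everything outside the CPU cores.

```python
def dynamic_power_watts(capacitance_f: float, volts: float, freq_hz: float) -> float:
    """First-order CMOS dynamic power estimate: P = C * V^2 * f."""
    return capacitance_f * volts ** 2 * freq_hz


# All values below are hypothetical, chosen only to show the shape of the effect.
C_EFF = 1.0e-8    # effective switched capacitance, in farads
IDLE_HZ = 1.0e9   # clock frequency while idle
FIXED_V = 1.30    # fixed voltage sized for the highest clock
SCALED_V = 0.90   # lower voltage that might be stable at the idle clock

fixed = dynamic_power_watts(C_EFF, FIXED_V, IDLE_HZ)
scaled = dynamic_power_watts(C_EFF, SCALED_V, IDLE_HZ)

print(f"idle power at fixed voltage:  {fixed:.1f} W")
print(f"idle power at scaled voltage: {scaled:.1f} W")
print(f"estimated saving:             {fixed - scaled:.1f} W")
```

     Because voltage enters the formula squared, lowering it at low clock speeds saves proportionally more than the clock reduction alone, which is why the change mostly affects idle and low-load conditions.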