I’ve replaced the Intel NICs with TP-Link NICs. The first machine took close to two hours because at first I could not get the new card to work. I finally traced it to a lack of patience on my part: these cards take approximately one minute to initialize.
I was not able to use the old cards in a private network as I had planned because as soon as I configured one, localhost became bound to it and broke the other connection. I’m sure this is operator malfunction but I will need to do further research.
I will be taking various machines down tonight for about fifteen minutes each to install new NICs with non-Intel chipsets. From 4.15.0 forward, the Linux kernel has had a bug in the Intel E-1000 drivers that causes the cards to lock up when hardware offloading is used. Usually these lock-ups are transient, resulting in 2-3 second delays in data, but occasionally the cards will lock hard and require a drive to the co-lo facility to physically reset the machine.
Because the servers most affected are those carrying heavy traffic, the NFS server providing the home directories in particular, I will be replacing the NICs on all the NFS servers. This will affect virtually all of our services but will prevent long downtimes like the one we suffered Sunday morning from recurring.
I filed a bug report on this problem in April of this year. Canonical has offered me various kernels to try, but many of them either did not boot at all or were extremely unstable. At this point I feel it’s more cost-effective and less service-affecting just to replace the hardware.
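As a rough sketch of the software side of this problem, offloading can be switched off per interface with ethtool's -K option. The interface name eth0 and the choice of features (tso, gso, gro) are assumptions for illustration, and this is not confirmed to avoid the lock-ups described above.

```python
import subprocess


def offload_off_command(interface, features=("tso", "gso", "gro")):
    """Build the ethtool command that turns the named offload features off."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def disable_offloads(interface):
    # Needs root and the ethtool utility installed.
    # Raises subprocess.CalledProcessError if ethtool reports a failure.
    subprocess.run(offload_off_command(interface), check=True)


if __name__ == "__main__":
    # "eth0" is a hypothetical interface name; only print the command here.
    print(" ".join(offload_off_command("eth0")))
```

The setting does not survive a reboot, so it would have to be reapplied from whatever brings the interface up.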
The ethernet controller on the server that provides /home pages wedged today. Most services depend upon being able to access /home and were unavailable as a result.
As near as I can tell looking at the logs, the ethernet wedged shortly after 5AM but I did not receive any telephone calls until around 11AM. I was unable to fix it from here so I had to drive to the co-location facility.
Everything was back in service at 12:46 (afternoon).
Completed imaging julinux.yellow-snow.net. Give this machine a try; it’s a very nice Ubuntu-derived server. It has the stability and software base of Ubuntu but with nicer artwork.
I am taking julinux down for a few minutes to make an image that captures all the recently installed software.
I am investigating a new location for our user meetings. This would be Amante Pizza on 123rd and Roosevelt Northeast in Seattle.
The pizza is excellent, they have spirits for those who care to imbibe, and they have a big screen TV that in theory can be connected to a computer and used for presentations. Many people seem to want a more structured meeting, but trying to do presentations on paper doesn’t work so well. Being able to fire up a computer live on a big screen would be a huge plus.
They do not know what inputs it has so I have to stop by and determine how to connect.
I had been to Amante before when they were on 196th and just off of 44th in Lynnwood. That restaurant had good food, great decor, but piss poor service. This restaurant has excellent food, excellent service, but marginal decor. However, I think the room there is much better suited to our needs: it is completely walled off from the rest of the restaurant with glass walls, so our noise won’t interfere with other diners and vice versa.
I managed to get Mush to compile on modern Linux machines. It is now available on ALL of our Linux servers.
We have a new server available for your use but this one is in the yellow-snow.net domain. The full server name is julinux.yellow-snow.net. If you use this server, e-mail you send will by default be from firstname.lastname@example.org. E-mail to this address will also come to your INBOX.
This server runs a new Linux distribution called JULinux, but it is only barely a distribution as it is essentially the Mate spin of Ubuntu configured to look like Windows with very nice artwork. The software is all 100% Ubuntu, so it is as stable, secure, and current as Ubuntu.
The KDE implementation is broken inasmuch as the logout does not work, so I would ask that you avoid using KDE with x2go on this server until I can figure out how to get it fixed. All other x2go compatible window managers are working.
If you do not have an existing Mate configuration and connect to this machine with x2go using mate, it will look like Windows. If you do have an existing configuration then it will look the same as all the other Debian / Ubuntu derivatives.
We do not yet have a corresponding yellow-snow.net web appearance but this is in the works.
Received a report that people were unable to log in to Debian this afternoon. I found that ypbind had died but there was no indication in the logs as to why. I restarted it and logins are functioning normally again.
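Since ypbind left nothing in the logs, a periodic check is one way to catch this sooner. The following is a minimal sketch, assuming a Linux /proc filesystem; the function name is_running and the idea of running it from cron are illustrative assumptions, not a description of what is deployed here.

```python
import os


def is_running(name, proc_root="/proc"):
    """Return True if any process under proc_root reports `name` as its command name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories correspond to processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    return True
        except OSError:
            continue  # the process exited between listdir() and open()
    return False


if __name__ == "__main__":
    if not is_running("ypbind"):
        print("ypbind is not running; restart it and check the logs")
```

Run from cron every few minutes, the printed line would arrive as mail only when the daemon is missing.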
Slow service early this morning and the temporary unavailability of mail.eskimo.com were the result of a denial of service attack in which our own name servers were used as amplifiers against us. I had to lower the external view rate limit because of this; hopefully it is still adequate to service legitimate requests.
There are aspects of this attack that I do not understand. They forged an address of 22.214.171.124 from outside (UDP packets, so no three-way handshake) and directed requests at 126.96.36.199, so our name servers would attempt to reply to 188.8.131.52. There was no host on that IP address, so our router didn’t know what to do with the replies and overloaded itself logging what it considered “Martian” packets.
The puzzling aspect of this is that I have a firewall rule that SHOULD block all traffic arriving on an external interface with an internal source address. I was able to mitigate the attack by blackholing 184.108.40.206 at the name servers and rate limiting responses.
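The intent of that firewall rule can be written down as a small check, which may help when comparing it against what the router actually does. This is only a sketch: the prefixes in INTERNAL_NETS are documentation ranges standing in for the real internal networks, and should_drop is a hypothetical name, not the actual rule.

```python
import ipaddress

# Hypothetical internal prefixes; documentation ranges stand in for the real ones.
INTERNAL_NETS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def should_drop(src, arrived_on_external):
    """Drop packets that arrive from outside yet claim an inside source address."""
    addr = ipaddress.ip_address(src)
    is_internal = any(addr in net for net in INTERNAL_NETS)
    return arrived_on_external and is_internal


if __name__ == "__main__":
    # A spoofed packet: internal source address seen on the external interface.
    print(should_drop("192.0.2.53", arrived_on_external=True))
```

If forged packets still get through with a rule like this in place, the source address may fall outside the configured internal prefixes, or the packets may be arriving on an interface the rule does not cover.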