Virtual / Scientific / Debian Maintenance

Late this evening (probably around 11pm), I will be taking virtual.eskimo.com, which hosts the scientific.eskimo.com and debian.eskimo.com shell servers, down for maintenance.

I need to do this to troubleshoot CPU overheating issues on this machine.  It is nearly identical to Iglulik in terms of hardware, but the CPU on Iglulik is barely above room temperature, whereas the CPU on this box is barely below the boiling point of water.  They are both i7-2600 CPUs.

I will suspend the guests before shutting down so that anything running on shutdown will be running again on start-up.

Shellx Restored To Service

Shellx is now restored to service and available for use.  The IP address has changed, so you may need to flush your DNS cache.

If you use ssh (or anything that tunnels through ssh, such as nx or vnc) to connect, you may also need to remove the old shellx line from .ssh/known_hosts, since the IP address is different.
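For anyone curious what "removing the line" amounts to, here is a minimal sketch in Python.  It is only an illustration: the function name `drop_known_host`, the host name shellx.eskimo.com, the 192.0.2.10 address, and the key strings are all made up for the example.  It also only handles plain (unhashed) entries.  In practice, OpenSSH's own `ssh-keygen -R hostname` does the same job, including hashed entries, and is the easier route.

```python
def drop_known_host(lines, hostname):
    """Return the known_hosts lines that do not name `hostname`.

    Sketch only: handles plain (unhashed) entries. Hashed entries
    (those starting with "|1|") never match and are kept as-is.
    """
    kept = []
    for line in lines:
        stripped = line.strip()
        # Keep blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        fields = stripped.split()
        # An entry may start with a marker such as @cert-authority or @revoked;
        # in that case the host list is the second field.
        if fields[0].startswith("@") and len(fields) > 1:
            hosts_field = fields[1]
        else:
            hosts_field = fields[0]
        # The host field is a comma-separated list of names and addresses.
        if hostname in hosts_field.split(","):
            continue  # drop the stale entry
        kept.append(line)
    return kept


# Made-up sample data; the names, address, and keys are placeholders.
sample = [
    "shellx.eskimo.com,192.0.2.10 ssh-rsa AAAAexample1",
    "debian.eskimo.com ssh-rsa AAAAexample2",
    "# comment",
]
result = drop_known_host(sample, "shellx.eskimo.com")
```

After the old entry is gone, ssh will ask you to confirm the host key the next time you connect.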

Shellx Temporarily Out Of Service

It turns out that shellx was the target of the attack that affected the Bellevue Co-Location facility at Isomedia.  They have black-holed the IP address of shellx to protect the rest of their customers and as a result it was necessary to change the IP to bring shellx back into service.

In the process of doing so, I discovered some missing software and configuration issues that potentially affect security, so I am working to correct them.  After that is done I will need to take the machine down for 25 minutes or so to image it.

Denial of Service Attack

The co-location facility where we have our equipment is currently under a denial-of-service attack.  They are working to isolate the source and mitigate the attack.  In the meantime, it is severely impacting the routing to our equipment.

Thanks for the Advice

I want to thank all of you who have clicked on the “Advise Us” link and taken the time to fill out our survey.

The major thing that I’ve come away with is that our website needs to be upgraded in at least four major ways.

1) It needs to be responsive so that it is viewable and operational on smart phones and tablets.  This was not only stated, especially where e-mail is concerned, but also implied by the fact that so few of our customers are connecting via smart phones even though in 2012 they made up over 25% of primary browsers, and probably even more today.  Those are essentially lost customers.

2) It needs a modern interface.  People are asking for features and information that are already present, which is a strong indicator that said features and information are too difficult to find.  People also indicated a desire for better aesthetics.

3) There is a need for information that currently is not available on our website.

4) There is a desire for web applications to do certain things easily such as leaving a vacation message, configuring domain hosting and other services online, etc.

A lot of the complaints centered around problems that have been resolved for the last year or year and a half.  We haven’t had a single crash since we upgraded the infrastructure, and there have been only a few short unintended outages, the result of upgrades gone bad.  Not much I can do about the past.

A lot of people would like us to provide our own high speed access.  Unfortunately, the level of capitalization that this requires works on a scale of millions of customers but not on a scale of hundreds or even thousands, so we are forced to work with telephone companies and wholesale providers.

Virtually all respondents said we needed better documentation.

 

Scientific Key

Somehow the key for scientific.eskimo.com got damaged.  In order to connect with NX, it will be necessary to re-download the key and re-import it into NX-client 3.5 or NX-player 4.0.x.

You can get the key here: scientific.key

Customer Affecting Maintenance Completed

The portion of this morning's maintenance activity that required machines to be out of service has been completed.  There is some additional work that will be done later today, but it involves machines that are duplicated, so it will not impact service.

This Morning's Maintenance Taking Longer

This morning's maintenance is taking longer than expected.  Two virtual machines suspended instead of shutting down, for reasons unknown, and as a result the images were unusable.  I had to take shellx down for a second time to re-image it and haven’t even started on the web server yet.  When shellx comes back up, it will be done for the night; then the web server will go down for about twenty minutes.  After that, all the customer-affecting work will be done, save for some compression jobs which may slow things somewhat.