Saturday 18th February 2017

Apollo outage

Apollo is unresponsive. Investigation points to a deadlock following cancellation of a find command, a bug noted on 3.18 kernels.

Additionally, remote access on Apollo is offline; we assume it went down during the data center power outage on 2/14.

Remote hands are taking care of power cycling the server at this time.

1:30 PM - Server is back up and the remote access firmware has been reflashed. We're also changing how account suspensions work. Until today, a suspended account kept its web server config. If an account is suspended and its IP address is recycled for use elsewhere after 6 months, that IP address is no longer bound to the server when it boots, which results in a critical error.

To resolve such situations, a suspended account will now also have its Apache configuration removed.
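
For illustration only, the rule above can be sketched roughly as follows. This is not the actual platform code: the `VHost` type, the `prune_vhosts` function, and the example addresses are all hypothetical, and the extra check for unbound IP addresses is an assumption added as a safety net rather than something described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VHost:
    """Hypothetical record of one account's web server config entry."""
    account: str
    ip: str


def prune_vhosts(vhosts, suspended, bound_ips):
    """Split config entries into those to keep and those to remove.

    An entry is removed when its account is suspended. As an assumed
    extra guard, it is also removed when its IP address is not
    currently bound to the server.
    """
    keep, drop = [], []
    for vh in vhosts:
        if vh.account in suspended or vh.ip not in bound_ips:
            drop.append(vh)
        else:
            keep.append(vh)
    return keep, drop


# Example with documentation-range addresses (RFC 5737)
vhosts = [VHost("alice", "192.0.2.10"), VHost("bob", "192.0.2.11")]
keep, drop = prune_vhosts(vhosts, suspended={"bob"}, bound_ips={"192.0.2.10"})
```

Here the entry for the suspended account ends up in `drop`, so the web server config no longer refers to an address that may have been recycled by the time the server next boots.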