Not to take away from Customer Appreciation Month, but today’s extended outage did throw some customers for a loop, and the Reason For Outage (RFO) is better laid out in a post than in Twitter/Facebook updates.
Earlier this morning the server praise had a brief outage due to a drive failure in the storage array. The exact response from the datacenter was:
One of RAID disks failed. We are working on its replacing.
Easy peasy: the drive was replaced with only a minor hiccup, as the storage system briefly went into read-only mode. Fast forward to later in the day: to rebuild the redundancy and get the new drive properly into the RAID array, it has to be re-synced into the system. That re-sync is what we experienced this afternoon, and it coincided with extremely high load on websites and a storm of spam hitting the server’s filters. It was a combination of bad timing.
The server went down, and on coming back up it ran an FSCK (for Windows users, think ScanDisk), which checks the disk to make sure everything is working correctly. This can take some time, but with some handy help from the datacenter crew, things leveled out and are back online.
Please be aware that a sync process is still ongoing and it’s syncing a LOT of data, so sluggish response may continue. Please bear with me on this. I wish I could speed it up, but redundancy isn’t something I take lightly.
The server will continue to be closely monitored, and backups are still ongoing, so your data is safely backed up. If you have any questions or concerns about today’s outage, please do not hesitate to contact me via email, ticket, forums or live chat.
Thanks again for your patience and prayers!