FlockHosting Blog

Church Hosting Outage Update

It has certainly been quiet on here, and a post about an outage was not what I had planned for my first post back. I could have posted the FlockUpdate first, but an explanation was due, so let’s get to it.

Thursday afternoon there was an outage that affected Church Hosting and a few other server customers. We reached the data center (DC) immediately, and they began reviewing hardware and services to find the cause.

We then had some radio silence from them, which usually means they have found a problem and are working on it. It turned out the issue affected more customers in the data center than just FlockHosting assets.

When I finally talked with a tech as things were wrapping up, it seems the data center had an HVAC failure that began to endanger customer servers. They had to shut things down to keep the heat down and resolve the issue as quickly as possible.

The crews did an amazing job: there was no data loss and there were no other issues.

The secondary outages were caused by the many folks who had been patiently waiting for your Church sites to come back up. Once the servers were back, they were blasted with activity, which caused 2 additional web outages. To let the servers fully boot and get caught up a little, the web servers were halted briefly. This allowed queued email to start flowing in and the server startup scripts to finish, without the servers being knocked offline by the sheer load of WordPress catching up, for example wp-cron running all at once the backups that were due during the outage.
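For the technically curious, one common way to soften this kind of pile-up is to spread catch-up jobs over a time window rather than letting them all fire at the same moment. The sketch below is only an illustration of that idea, not how the servers here are actually configured; the function name `stagger_jobs`, the site names, and the 10-minute window are all made up for the example.

```python
import random


def stagger_jobs(job_names, window_seconds, seed=None):
    """Give each job a random start delay inside the window.

    Returns a list of (delay_seconds, job_name) tuples sorted by delay,
    so that missed jobs do not all start at the same instant.
    """
    rng = random.Random(seed)  # a seed makes the schedule repeatable
    schedule = [(rng.uniform(0, window_seconds), name) for name in job_names]
    return sorted(schedule)


if __name__ == "__main__":
    # Hypothetical sites whose missed backups need to catch up after a reboot.
    sites = [f"site-{i}" for i in range(5)]
    for delay, name in stagger_jobs(sites, window_seconds=600, seed=1):
        print(f"{name}: start catch-up in {delay:.0f}s")
```

The trade-off is that the last job may wait up to the full window before it starts, but the server sees a steady trickle of work instead of one big spike.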

The DC has made repairs for now, but they may do additional maintenance in the coming weeks to ensure this doesn’t happen again. If there are any additional maintenance windows, I will post them as soon as possible.

I do apologize for the outage. I wish outages could be avoided, but thankfully God is good and we don’t see this type often, if at all. Thank you all for your prayers during the outage and your kind words in tickets and live chat. There are some things coming in July which will continue to expand Church Hosting, so more good things are coming!

RFO: Praise

Not to take away from Customer Appreciation Month, but today’s extended outage did throw some customers for a loop, and the Reason For Outage (RFO) is better laid out in a post than in Twitter/Facebook posts.

Earlier this AM the server praise had a brief outage due to a drive failure in the storage array. The exact response from the datacenter was:

One of RAID disks failed. We are working on its replacing.

Easy peasy: the drive was replaced with only a minor hiccup as the storage system briefly went into read-only mode. Fast forward to later in the day. To rebuild the redundancy and get the new drive properly into the RAID array, it must be re-synced into the system, and that is what we experienced this afternoon. Combined with extremely high load on websites and a storm of spam hitting the server’s filters, it was just a combination of bad timing.

The server went down, and in coming back up it ran an fsck (for Windows users, think ScanDisk), which checks the disk to make sure everything is working correctly. This can take some time, but with some handy help from the datacenter crew things leveled out and are back online.

Now please be aware there is still a sync process ongoing and it’s syncing a LOT of data, so sluggish response may continue. Please bear with me on this. I wish I could speed it up, but redundancy isn’t something I take lightly.
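To give a rough feel for why a re-sync can take a while, here is a back-of-the-envelope estimate: the data still to be copied divided by the sync speed. Every number in it is hypothetical, since the actual array size and sync rate on praise are not stated here, and the function name `estimate_resync_seconds` is just for illustration. It assumes 1 GB = 1000 MB and a steady rate, which a busy server will not really deliver.

```python
def estimate_resync_seconds(total_gb, synced_gb, rate_mb_per_s):
    """Estimate seconds left in a re-sync, assuming a constant sync rate.

    Uses 1 GB = 1000 MB. Real sync rates vary with server load, so treat
    the result as a rough guide only.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    remaining_mb = max(total_gb - synced_gb, 0) * 1000
    return remaining_mb / rate_mb_per_s


if __name__ == "__main__":
    # Hypothetical figures: a 2000 GB array, 500 GB done, syncing at 50 MB/s.
    seconds = estimate_resync_seconds(2000, 500, 50)
    print(f"About {seconds / 3600:.1f} hours remaining")
```

When the server is also busy serving websites and filtering mail, the sync gets less of the disk's attention, so the real figure tends to be longer than an estimate like this.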

The server will continue to be closely monitored and backups are still ongoing, so your data is safely backed up. If you have any questions or concerns about the outage today, please do not hesitate to contact me via Email, Ticket, Forums or Live Chat.

Thanks again for your patience and prayers!