STF Outage - 6/1-6/2

doc
Site Admin
Posts: 3578
Joined: Thu Mar 31, 2005 2:38 pm

Post by doc »

OK, here's my story and I'm sticking to it.

STF and 9,000 other websites have been dark for 2 days because of a fire and explosion at our host ISP in Houston, TX. During that time, we have not been able to do anything but twiddle our thumbs and wait for our connection to be restored. I pray for those who actually run business sites. They must have lost their minds.

Anyway, we seem to be back up and rolling again. For those interested, here is the latest update from our host:

06/01/2008 11:00 PM CDT Update
As previously committed, I would like to provide an update on where we stand following yesterday's explosion in our H1 data center. First, I would like to extend my sincere thanks for your patience during the past 28 hours. We are acutely aware that uptime is critical to your business, and you have my personal commitment that The Planet team will continue to work around the clock to restore your service.

As you have read, we have begun receiving some of the equipment required to start repairs. While no customer servers have been damaged or lost, we have new information that damage to our H1 data center is worse than initially expected. Three walls of the electrical equipment room on the first floor blew several feet from their original position, and the underground cabling that powers the first floor of H1 was destroyed.

There is some good news, however. We have found a way to get power to Phase 2 (upstairs, second floor) of the data center and to restore network connectivity. We will be powering up the air conditioning system and other necessary equipment within the next few hours. Once these systems are tested, we will begin bringing the 6,000 servers online. It will take four to five hours to get them all running.

We have brought in additional support from Dallas to have more hands and eyes on site to help with any servers that may experience problems. The call center has also brought in double staff to handle the increase in tickets we're expecting. Hopefully by sunrise tomorrow Phase 2 will be well on its way to full production.

Let me next address Phase 1 (first floor) of the data center and the affected 3,000 servers. The news is not as good, and we were not as lucky. The damage there was far more extensive, and we have a bigger challenge that will require a two-step process. For the first step, we have designed a temporary method that we believe will bring power back to those servers sometime tomorrow evening, but the solution will be temporary. We will use a generator to supply power through next weekend when the necessary gear will be delivered to permanently restore normal utility power and our battery backup system. During the upcoming week, we will be working with those customers to resolve issues.

We know this may not be a satisfactory solution for you and your business but at this time, it is the best we can do.

We understand that you will be due service credits based on our Service Level Agreement. We will proactively begin providing those following the restoration of service, which is our number one priority, so please bear with us until this has been completed.

I recognize that this is not all good news. I can only assure you we will continue to utilize every means possible to fully restore service.

I plan to have an audio update tomorrow evening.

Until then,

Douglas J. Erwin
Chairman & Chief Executive Officer

Given the circumstances, I am pleased that the host has gotten going as fast as they have. To all those inconvenienced by the outage, my apologies.

Regards to all,

doc
doc
Site Admin
Posts: 3578
Joined: Thu Mar 31, 2005 2:38 pm

Post by doc »

Here's the latest update.




Dear Customer,

Late last night, I told you we hoped to have power to the 6,000 servers in Phase 2 of our H1 data center by midnight, with all servers up by early morning. I am glad to say we came close, just a few hours after sunrise. At this time, 100% of our servers in Phase 2 have power, and our technicians are working with customers on any remaining server issues. We are confident all remaining issues will be resolved shortly.

I also explained the significant challenge we faced in the other phase where the actual explosion occurred. Our team came up with a creative way to restore power quicker than the 4-5 day outage. We decided not to wait for all of the equipment for the electrical room, opting instead for a temporary solution to get power to the 3,000 servers. That solution involves using generator power for the next 10-12 days until all the new equipment arrives to rebuild the electrical room for Phase 1. I explained that we expected to have a temporary solution in place by midnight tonight, with servers powered up tomorrow. The good news is that as you read this letter, the power is restored, and the temporary solution is in effect. Within the next two hours, the remaining 3,000 servers will have power. We have overstaffed our data centers again to help during this initial power up.

This now leaves us facing step two of this process, which requires getting all of the equipment delivered and then rebuilding the electrical room to its original standard. To make the cutover to the rebuilt electrical room, the operations group believed it would take a maintenance outage of 24-48 hours. I have good news on that front. It's not perfect, but at present we now believe the maintenance window will be just 4-6 hours. That's still too long, and we will continue this week to find ways to reduce the time. Given that there will be some outage for the cutover, we will execute this step at midnight on a Saturday, either June 7 or June 14. We want to pick the most appropriate time to minimize impact to you.

I must admit that I am amazed. We are almost 18 hours ahead of schedule with this phase, thanks to our great suppliers and of course the great folks working here at The Planet. This could never have happened without the help of both, and I want to thank all of them.

There is still more work to do, but the progress is terrific. We will continue to work any and all customer issues, and we face the challenge of putting the permanent power fix in place for Phase 1. Nonetheless, there is still good news based on what I told you last night.

As each hour passes, we learn more and more. Please give us the time to continue our planning. We will provide you with information as we have it.

Until tonight's update ...

Douglas J. Erwin
Chairman & Chief Executive Officer

We are in Phase 1, of course... :roll: And, as you can read, we apparently will have another shutdown on one of the next couple of Saturday nights. Hopefully, most of us can find another way to spend that particular Saturday. And I suppose we are not out of trouble yet, given the reliance on "temporary power." After all, gas is getting expensive. I will continue to post the news as it may impact the board and members. Stay tuned.

doc
SiQDiZ
Site Admin
Posts: 266
Joined: Mon Oct 29, 2007 1:13 pm

Post by SiQDiZ »

All right, we're up again, ladies and gentlemen. We had massive data corruption on various SQL tables. Hurray for power outages. :evil: