On October 26, 2005: Data Lost
Here's a quick summary of what occurred on “White Wednesday”:
- While trying to drop a couple of unneeded tables from the database, Jick accidentally dropped the entire database.
- Most of the data had been backed up the night before, but certain tables (such as the Ascension table and Hagnk's item table) hadn't been fully backed up since September 6, 2005, over a month prior.
- The server host refused to attempt to undelete the files, citing a policy that bars its techs from interacting with customer data.
- KoL was taken down and what data remained was restored.
- The Time Arc was created to give a narrative explanation for the oddities throughout the Kingdom.
- All players received a piece of petrified time after the game was restored.
The following is a detailed timeline of the event, quoting each of Jick's announcements in order.
Jick announced via the KoL Forums at 1:00pm his time:
Q: How many clicks does it take for Jick, when he's not paying enough attention, to destroy the contents of the primary database server?
A: Ask Mr. Owl.
O: One, Two, Three, Four. Four, because Jick is the biggest dumbass who has ever traversed this earth by shuffling his dumb, dumb buttocks in a grisly and pitiful parody of walking.
So, yeah. In an effort to drop a couple of tables from the database, I accidentally dropped the entire database. Which meant we had two options.
1) Get the host to try to undelete the files, since it was such that it required physical access to the box to attempt it, or
2) Restore from backups. This option is less desirable.
The first option proved impossible, partly because of technical issues, but mostly because of the host's antagonistic policies. Their techs are not allowed to interact with customer data in any way, due to “liability issues.” We pressed the issue, but to no avail.
So, we have to restore from backup. We do two kinds of backups. Full backups, which are done during the times when rollover lasts a long time, and incremental “core data” backups that are done every night, about five hours after rollover.
The last full backup was done on September 6th. It had been a while, and I was getting nervous. That's why I announced a long rollover yesterday. Luckily, I managed to drop the database about 30 seconds before I was gonna back it up.
Ideally, this would've meant a one-day rollback. We'd have to manually process a day's worth of donations, and everybody would lose a day's progress, but not such a terrible thing, in the grand scheme of things.
In our less than ideal world, though, some stuff went wrong. There was an issue with the core backup that caused a couple of tables to not get backed up, and there were also a couple of other tables that had never been added to the core backup, since they were added since the last time I updated the backup script, which was towards the end of ascension testing. This is going to cause some data to revert to September 6th.
So, basically, here's what's gonna happen. As near as we can tell, inventory, player data, flags, quest status, all the core stuff, that'll swing back to what it was last night, five hours after rollover. Some people lose a day, some people don't. Mall store inventory is still up-to-date, since that's on the secondary server, along with kmail and a bunch of other stuff that wouldn't have been a big deal to lose.
However. This is where the groaning begins. There are a few important things that are gonna revert to September 6th, because they didn't, for one of the two reasons I described above, get saved during the nightly backup run. Notice how I use the passive voice to shift responsibility away from myself. I mean, notice how the passive voice is used by me.
Chief among them: Hagnk's. Yeah. I know. Ouch. Basically, your items in ascension storage will revert back to what they were 6 weeks ago. Which is gonna do, for a lot of people, one of two things: If somebody had stuff there yesterday that wasn't there six weeks ago, it'll be gone. And if somebody had stuff in there six weeks ago that is now in their inventory, it'll basically be duped.
Also, trade offers. This will also cause some duping and/or lost items, but it should be relatively minimal.
Also... and I know this is gonna hurt some people a lot, ascension records. We need to figure out what to do about this. A lot of the staff and dev team are in favor of a total reset on these, since they feel it'd be better than reversion. I'm open to suggestions from the community at large on how to handle it.
Last, but not least, some permanent skill data will be lost. We can reconstruct most of it. The worst case scenario will be the person who had a ton of softcore skills and was in Hardcore yesterday. Everybody else, we can reconstruct most, if not all of what was lost.
Right now (1PM my time on Wednesday,) what's going on is that we're actually restoring the data from last night's backup. It's been running all night, because apparently the scheme we used for backing stuff up is quick to dump it out, but incredibly slow to dump it back in. It's as aggravating for us as it is for you, because we can't really do ANYTHING until this is done.
I suppose this goes without saying, but I take full responsibility for this. It was a dumb mistake, and a series of prior dumb mistakes that screwed up the backup process. It wasn't Mr. Skullhead, it wasn't anybody else on the staff or the dev team, it was all me. And I will try to make it better.
We'll have some kind of special eventy type stuff after we see what kind of shape we're in, and at the very lest get everybody some sort of “I was there” bit of in-game goodies. Riff and I have a lot of time to sit around talking about stuff today while the restore runs, so I'm sure we'll come up with some kind of interesting narrative to provide an explanation for what happened and some nifty new content to soften the blow.
No idea how to end this. Thanks for your patience and support. We'll get through this together, us and y'all. You with your beverage or drug of choice, us with a mixture of equal parts coffee and Pepto-Bismol.
p.s. i love ween0rz
Depending on the URL used, this error appeared:
Warning: mysql_connect(): Can't connect to MySQL server on '10.0.0.1' (111) in /home/jick/htdocs/adventure.php on line 19
Could not connect to database server:0 -
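The gap described in the 1:00pm announcement, where tables created after the backup script was last updated were never included in the nightly run, is the kind of drift a simple coverage check can catch. The following is a minimal Python sketch of such a check, not KoL's actual tooling; the function name and the table names are hypothetical and chosen only for illustration.

```python
def find_unbacked_tables(live_tables, backed_up_tables):
    """Return, in sorted order, the tables that exist in the live database
    but are missing from the backup script's table list."""
    return sorted(set(live_tables) - set(backed_up_tables))


# Hypothetical table names, for illustration only.
live = ["players", "inventory", "ascensions", "storage"]
nightly = ["players", "inventory"]

missing = find_unbacked_tables(live, nightly)
if missing:
    print("Not covered by the nightly backup:", ", ".join(missing))
```

Running a comparison like this before each nightly backup would flag a newly added table on the first night it goes unsaved, rather than weeks later.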
Jick announced via the KoL Forums at 7:00pm his time:
We're still getting the database contents transferred from the sandbox machine here in the office where we reconstructed it to the live database server. This was sort of a dumb plan because of our slow upstream (normal cable,) but we wanted to get going on restoring from the backup early, while there was still a chance of recovering the original data, and that meant not touching the live server's hard drive.
When this is finished, in what I hope will be an hour or two, we'll hop onto the dev server and get everything working, at which point we'll assess the actual damage.
Over the next day, we'll reconstruct as much of the missing data as we can do automatically, and Xeno will write a trouble-ticket system so people can report missing skills and items from Hagnk's. We'll write some tools for Xlyinia so she can restore stuff on a case-by-case basis. It won't be as fast as people would like, but it will happen.
So far, we have plans in place for the following:
- Reconstructing permanent skills. We can see how many times you've ascended (this is in the player table as well as in the ascension records,) so the skills we don't automatically regrant can be given out case-by-case.
- Reconstructing entire inventories lost in Hagnk's. Up to the status of the inventory table on the 6th. This should solve most of the losses for the people who've played a long time and ascended for the first time since the 6th. Those people will get their 6th inventory dumped into Hagnk's.
- Restoring, for the most part, any lost stainless steel or plexiglass items. Again, we've got solid counts on ascensions, so this will be fairly easy to verify.
As we get a better handle on the situation, we'll figure out what else can be dealt with and what can't.
I feel like it's prudent at this point to say that the game will definitely not come back up tonight. We want to take until at least rollover-time tomorrow to make sure as little damage as possible is done.
p.s. i still love ween0rz
Jick announced via the KoL Forums at 10:50pm his time:
Okay. We got the data set back up on the dev server. We've done a once-over testing of the regular game functionality, but it's still gonna take some work.
My plan now is as follows:
Riff and I will work on the “make it up to you” content while Xeno writes the scripts to repair / reconstruct the lost data. There are a handful of relatively mundane things to fix, it's just gonna take some time.
In a couple of hours, I'm gonna let the rest of the dev team back onto the server, to have them help us poke at stuff to make sure nothing else is screwed up. Also to test the new content.
I can say with some certainty now that I believe the game will come back up at some point tomorrow. We're not really going to have a good sense of what individual people lost until they actually log on and report it.
p.s. zomg ween0rz
Jick announced via the KoL Forums on Thursday, October 27, at 8:07pm his time:
Okay, so we're about ready to open back up.
Everything is as restored as we can get it. Riff and I are done with the first part of the “make it up to you” content and it's being tested on dev as I write this.
Xeno is working on a tool to help people recover lost permanent skills. We'll have instructions on how to report losses and whatnot, and we'll do our best to get everything back to everybody over the next few days.
I'm thinking less than an hour, at this point.
p.s. i still love teh cock
In the aftermath, several temporal rifts have opened up.
Judging by the 7:00pm post, White Wednesday does not seem likely to do as much damage as first expected, but it may still affect those who have ascended multiple times.
For those of you who are witnessing this today, please speak your mind in discussion! Just type a line or so and leave some space between each comment...leave your name if you wish, too.
Due to the way events unfolded, and the (perceived) devastating nature of the loss, some people also refer to this event as Hurricane Jick or Hurricane Hagnk, drawing parallels to Hurricane Katrina. Other names suggested for the event, with the player who first mentioned each on the forums, if known: Ragnarok (Blodax Devourer of Souls); All your database are belong to us; Total ReKol; Dia del PWNT (Borax The Clean); and ApoKOLypse (Benedictine Monkey). According to a trivial update on January 15, 2006, the official name for the event is the Great Time Catastrophe.