Author Topic: Unplanned Outage and Maintenance  (Read 12882 times)


Offline Zacam

  • Magnificent Bastard
  • Administrator
  • 211
  • I go Sledge-O-Matic on Spammers
    • Steam
    • Twitter
    • ModDB Feature
Unplanned Outage and Maintenance
Well, with all the fun that has been taking place around here what with script kiddies getting rambunctious and the like, we didn't happen to realize a few things.

Like the fact that the main machine was running out of drive space. And that some of the log files, not rotating properly, were getting MUCH larger than they should have.
Add to that, at some point the system restarted or its services timed out in some fashion. One of those services was MySQL, and since it shut down uncleanly on a full disk, the forum database decided it was corrupted.

And so, by the exhaustive efforts of rev_posix specifically and myself, we've managed to get all the drive space reclaimed and the database -in theory- restored.
If you find that a post that you made at some point can no longer be found, or that an attachment no longer downloads, feel free to let me know this, but be advised ahead of time that there might not be anything that can be done about it.

I'd like to thank everybody for their patience and apologise for the amount of unexpected down time. What made this even hairier for us is that our cable service provider decided to have an outage of their own while we were in the midst of getting things operational. In short, a comedy of Murphian proportions took place.
Report MediaVP issues, now on the MediaVP Mantis! Read all about it Here!
Talk with the community on Discord
"If you can keep a level head in all this confusion, you just don't understand the situation"

¤[D+¬>

[08/01 16:53:11] <sigtau> EveningTea: I have decided that I am a 32-bit registerkin.  Pronouns are eax, ebx, ecx, edx.
[08/01 16:53:31] <EveningTea> dhauidahh
[08/01 16:53:32] <EveningTea> sak
[08/01 16:53:40] * EveningTea froths at the mouth
[08/01 16:53:40] <sigtau> i broke him, boys

 

Offline headdie

  • i don't use punctuation lol
  • 212
  • Lawful Neutral with a Chaotic outook
    • Skype
    • Twitter
    • Headdie on Deviant Art
Re: Unplanned Outage and Maintenance
**** you guys have had a rough time lately.  many thanks for your efforts getting this place back up and running :D
Minister of Interstellar Affairs Sol Union - Retired
quote General Battuta - "FRED is canon!"
Contact me at [email protected]
My Release Thread, Old Release Thread, Celestial Objects Thread, My rubbish attempts at art

 

Offline Rodo

  • Custom tittle
  • 212
  • stargazer
    • Steam
Re: Unplanned Outage and Maintenance
I felt naked there for a few hours.

Thanks for restoring the forums, you make me happy :D
el hombre vicio...

 

Offline Crybertrance

  • 29
  • Conventional warheads only, no funny business
Re: Unplanned Outage and Maintenance
Quote
I felt naked there for a few hours.

Thanks for restoring the forums, you make me happy :D

You weren't the only one...
<21:08:30>   Hartzaden fires a slammer at Cybertrance
<21:09:13>   Crybertrance pops flares, but wonders how Hartzaden acquired aspect lock on a stealth fighter... :\
<21:11:58>   *** The_E joined #bp [email protected]
21:11:58   +++ ChanServ has given op to The_E
<21:12:58>   Hartzaden continues to paint crybertrance and feeding the info to a wing of gunships
<21:14:07>   Crybertrance sends emergency "IM GETING MY ASS KICKED HERE!!!!eleventy NEED HELPZZZZ" to 3rd fleet command
<21:14:50>   Hartzaden jamms the transmission.
<21:14:51>   The_E explodes the sun

 

Offline FireSpawn

  • 29
  • Lives in GenDisc
Re: Unplanned Outage and Maintenance
All hail our benevolent HLP overlords!
If you hit it and it bleeds, you can kill it. If you hit it and it doesn't bleed...You are obviously not hitting hard enough.

Greatest Pirate in all the Beach System.

Peace is a lie, there is only passion.
Through passion, I gain strength.
Through strength, I gain power.
Through power, I gain victory.
Through victory, my chains are broken.
The Force shall free me.

 

Offline rev_posix

  • Administrator
  • 213
  • I have the password to your shell account...
    • Trials and Tribulations
Re: Unplanned Outage and Maintenance
I'm just glad that the DB dump of the forums was pretty current and happened before the corruption took effect.  Otherwise that would have sucked, as the next most recent backup was about a week old.
--
POSIX is fine, as is Rev or RP

"Although generally it is considered a no no to disagree with a mod since it's pretty much equivalent to kicking an unpaid janitor in the nuts while he's busy cleaning up somebody elses vomit and then telling them how bad they are at cleaning it up cause you can smell it down the hall." - Dennis, Home Improvement Moderator @ DSL Reports

"wow, some people are thick and clearly can't think for themselves - the solution is to remove warning labels from poisons."

 

Offline Mongoose

  • Rikki-Tikki-Tavi
  • Global Moderator
  • 212
  • This brain for rent.
    • Steam
    • Something
Re: Unplanned Outage and Maintenance
A few people seem to have stumbled across a couple of longer posts that have been truncated...the weird thing is that these posts were made a good year or two ago, well before any of the actual database issues.  Lorric's post here is one of them, and the other is Battuta's interview with Jason Scott here.  (Seriously, of all the posts to get hit...)  Fortunately in both cases we have backups of the original posts, so it's not exactly urgent, but I figured there might be something useful to figure out on the back-end.

  

Offline karajorma

  • King Louie - Jungle VIP
  • Administrator
  • 214
    • Karajorma's Freespace FAQ
Re: Unplanned Outage and Maintenance
The Diaspora release thread also was truncated. Again I had a backup.
Karajorma's Freespace FAQ. It's almost like asking me yourself.

[ Diaspora ] - [ Seeds Of Rebellion ] - [ Mind Games ]

 

Offline Zacam

  • Magnificent Bastard
  • Administrator
  • 211
  • I go Sledge-O-Matic on Spammers
    • Steam
    • Twitter
    • ModDB Feature
Re: Unplanned Outage and Maintenance
Database corruption doesn't care when the post was made as far as what it will munch. And usually, the longest entries are the most likely to get hit, regardless of when they were made.

That's why it's called corruption.

 

Offline Mongoose

  • Rikki-Tikki-Tavi
  • Global Moderator
  • 212
  • This brain for rent.
    • Steam
    • Something
Re: Unplanned Outage and Maintenance
True, but rev's post made me think that you guys were restoring stuff from a backup before the corruption, which is why I was puzzled.

 

Offline rev_posix

  • Administrator
  • 213
  • I have the password to your shell account...
    • Trials and Tribulations
Re: Unplanned Outage and Maintenance
I did.  However, something might have gone weird during the conversion from SMF 2.0RC to the current, which this sounds like to me, especially if the posts are all chopped at about the same length.

If you have backups, kewl.  If not, anyone with shell could ssh in and pull the missing data from the backup I used to recreate the backend.

 

Offline Goober5000

  • HLP Loremaster
  • 214
    • Goober5000 Productions
Re: Unplanned Outage and Maintenance
Quote
I did.  However, something might have gone weird during the conversion from SMF 2.0RC to the current, which this sounds like to me, especially if the posts are all chopped at about the same length.
Didn't you read that email I sent to you and Zacam? :p  I theorized that the database entries could have been truncated at an apostrophe, and so far, every truncated post I've seen has fit that pattern.  And there are quite a lot of truncated posts -- including some on hosted internals with fairly important project information on them.

Quote
If you have backups, kewl.  If not, anyone with shell could ssh in and pull the missing data from the backup I used to recreate the backend.
The problem is that this requires someone to notice that there are posts with missing data.  And the posts which have been truncated don't appear to fit any pattern, other than that they may have been truncated at apostrophes.  Couldn't we take the forums offline for a few hours and just run a comparison of all posts on the forum against all posts in the backup?
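
The apostrophe theory is easy to test on any single post once both versions are at hand. Here is a rough sketch, not anything actually run against the forum DB: the function name is made up, and the assumption that apostrophes may be stored as the HTML entity `&#039;` (as well as a plain or curly apostrophe) is a guess about how the post bodies are encoded.

```python
# Hypothetical helper: does a live post look like the backup copy
# chopped off right where an apostrophe would have been?

# Assumed encodings of an apostrophe in stored post bodies.
APOSTROPHE_FORMS = ("'", "\u2019", "&#039;")


def truncated_at_apostrophe(live: str, backup: str) -> bool:
    """True if `live` is a proper prefix of `backup` and the first
    missing piece of text starts with an apostrophe."""
    if len(live) >= len(backup) or not backup.startswith(live):
        return False
    missing = backup[len(live):]
    return missing.startswith(APOSTROPHE_FORMS)
```

If every truncated post found so far returns True here, that would back up the theory; a single False would suggest something else is also going on.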

 

Offline mjn.mixael

  • Cutscene Master
  • 212
  • Chopped liver
    • Steam
    • Twitter
Re: Unplanned Outage and Maintenance
BAH! I've lost at least two big posts... one of them a pretty important internal BtA post, the other a tutorial on the Render Boutique. Sadly, I don't have backups. :(
Cutscene Upgrade Project - Mainhall Remakes - Between the Ashes
Youtube Channel - P3D Model Box
Between the Ashes is looking for committed testers, PM me for details.
Freespace Upgrade Project See what's happening.

 

Offline rev_posix

  • Administrator
  • 213
  • I have the password to your shell account...
    • Trials and Tribulations
Re: Unplanned Outage and Maintenance
Quote
The problem is that this requires someone to notice that there are posts with missing data.  And the posts which have been truncated don't appear to fit any pattern, other than that they may have been truncated at apostrophes.  Couldn't we take the forums offline for a few hours and just run a comparison of all posts on the forum against all posts in the backup?
In theory, yes.

The simplest way I could think of to do this would be to disallow all access to the forums (probably shutting down apache to be sure), dump all 2.2 gigs of raw SQL, perform a diff on the backup and the new dump (they are just text files), find the differences between the two, and then look to see if they are posts or something else, like a new timestamp, new post, new user, etc.
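
For illustration only, the "diff, then see if the difference is a post" step might look something like the sketch below. It assumes the dumps are plain text with one INSERT statement per line and that the posts live in a table called `smf_messages`; both are assumptions, and the function name is made up. It also holds both inputs in memory, so for 2.2 gigs a real run would more likely lean on an external diff tool and only use something like this for the filtering.

```python
import difflib
from typing import Iterable, List


def post_differences(backup_lines: Iterable[str],
                     current_lines: Iterable[str],
                     table: str = "smf_messages") -> List[str]:
    """Diff two SQL dumps (lines without trailing newlines) and keep only
    the added/removed lines that are INSERTs into the posts table."""
    marker = f"INSERT INTO `{table}`"  # assumed dump format
    diff = difflib.unified_diff(list(backup_lines), list(current_lines),
                                n=0, lineterm="")
    kept = []
    for line in diff:
        if line.startswith(("---", "+++", "@@")):
            continue  # file headers and hunk markers, not data
        if line.startswith(("-", "+")) and marker in line:
            kept.append(line)
    return kept
```

Lines starting with `-` come from the backup and lines starting with `+` from the new dump, so a pair that differs only in the length of the post body is a candidate truncation; anything touching other tables (timestamps, users and so on) is dropped.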

You volunteering to do so?  :P

 

Offline Goober5000

  • HLP Loremaster
  • 214
    • Goober5000 Productions
Re: Unplanned Outage and Maintenance
NO U. :p

Actually, what I was thinking was loading all 2.2 gigs of raw SQL into a parallel table, then writing a custom program to iterate through every post (and private message) and just compare the lengths.  (Or, if there are character encoding problems, compare the number of alphanumeric characters, while ignoring the special characters.)  But that's the programmer in me thinking, not the sysadmin.

Whatever way we do it, though, it needs to get done.  There are just too many truncated posts, and undoubtedly we haven't found them all.
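
To make the idea concrete, here is a minimal sketch of the comparison. It uses an in-memory SQLite database as a stand-in for MySQL, and the table names `messages_backup` and `messages_live`, the columns `id_msg` and `body`, and the function names are all assumptions for illustration, not the real schema.

```python
import sqlite3
from typing import List


def alnum_count(text: str) -> int:
    """Count letters and digits only, so encoding differences in
    punctuation or special characters don't skew the comparison."""
    return sum(ch.isalnum() for ch in text)


def find_shorter_posts(conn: sqlite3.Connection) -> List[int]:
    """Return ids of posts present in both tables whose live body has
    fewer alphanumeric characters than the backup body."""
    rows = conn.execute(
        "SELECT b.id_msg, b.body, l.body "
        "FROM messages_backup AS b "
        "JOIN messages_live AS l ON l.id_msg = b.id_msg "
        "ORDER BY b.id_msg"
    )
    return [id_msg for id_msg, old, new in rows
            if alnum_count(new) < alnum_count(old)]
```

Posts that were legitimately edited down after the backup was taken would show up in the list too, so the output is a list of candidates to check by hand rather than a definitive list of truncated posts.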

 

Offline Zacam

  • Magnificent Bastard
  • Administrator
  • 211
  • I go Sledge-O-Matic on Spammers
    • Steam
    • Twitter
    • ModDB Feature
Re: Unplanned Outage and Maintenance
Goober: NO U, now shuddup and listen.

1: You can't do a diff because there are conversion changes between SMF 2.0.2 DB and RC5. Fail. And across 2.2 gigs? No, throwing twin DB's into a mastered array and doing a sync would be faster but still computationally expensive as all sin.
1A: Also, the above is only really feasible in Sybase. MySQL, wonderful as it is, just doesn't have what it would take to do something like this elegantly OR swiftly.

2: Easier method: Since these are all mostly older posts: Mount a pre-collapse DB on an iteration of the forum software with a unique URL, then between two tabs, go to the post ID and copy from one to the other.

Not only is number two easier AND guaranteed to get correctly formatted into the newer DB structure, it's also far less hassle to accomplish. All it requires is for meticulous individuals to report truncated post message #'s as they are found on the forum. Or directly link them. Then modify the link in a second tab/browser/whatever to the alternate URL forum and make with the CTRL+C and CTRL+V like a boss.
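
The "modify the link" step is just a host swap. A tiny sketch, assuming the pre-collapse copy is served from a different hostname with the same paths; `backup.example.org` is a placeholder, not the real alternate URL, and the function name is made up:

```python
from urllib.parse import urlsplit, urlunsplit


def to_backup_url(live_url: str,
                  backup_host: str = "backup.example.org") -> str:
    """Point a forum link at the alternate (pre-collapse) copy by
    replacing only the hostname; path, query and anchor are kept."""
    parts = urlsplit(live_url)
    return urlunsplit(parts._replace(netloc=backup_host))
```

Because the topic and message IDs in the query string and anchor are untouched, the rewritten link should land on the same post in the old copy, provided the IDs match between the two databases.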

Granted, it doesn't catch every possible instance for a "full clean" version, but given what it'll take to achieve option 1, it's the significantly more productive "right now" result to run with.

Check the Admin internal for more information.

For everybody else: Check this out and report away; http://www.hard-light.net/forums/index.php?topic=83008.0
« Last Edit: November 30, 2012, 02:20:21 am by Zacam »

 

Offline jr2

  • The Mail Man
  • 212
  • It's prounounced jayartoo 0x6A7232
    • Steam
Re: Unplanned Outage and Maintenance
How about 2A; Do 2), then convert from SMF 2.0.2 to RC5, to get rid of the conversion changes; then do a db compare-whatever-you-call it? 

Granted, I have no clue as to if that would be feasible, but it seems to me to be logically possible.

 

Offline Zacam

  • Magnificent Bastard
  • Administrator
  • 211
  • I go Sledge-O-Matic on Spammers
    • Steam
    • Twitter
    • ModDB Feature
Re: Unplanned Outage and Maintenance
You can't reverse 'upgrade' like that and there is no guarantee that the results (even if you could) wouldn't end up being for the worse.