This is the tale of change-ship-class and how much I
hate love it.
Once upon a time, DefCynodont119 was making a mission involving the change-ship-class SEXP.
Actually, that's too recent. Let's go back in time a bit further.
Once upon a time, Goober5000 found a fascinating function called "change_ship_type()" and decided to make a SEXP that called it.
Actually, that's still too recent. Let's go back a bit further.
Once upon a time, some coder(s) at Volition wrote a function named "change_ship_type()", to be used whenever a ship needed to change types. Fairly straightforward, right? This was for things like FRED (where you can change the class of a ship in the ship editor) and the loadout screen (where you can change the class of your and your wingmates' ships). Somewhat important about these things is that they don't take place while a mission is actively ongoing; that will turn out to bite us in the ass later.
So, fast-forward.
Descent: FreeSpace is released... you know what? Let's fast-forward a
bit further than that. Goober5000 sees "change_ship_type()" and presumably imagines the possibilities exposing such a function to FREDers could create. Anyone who's seen Axem's work probably knows that the possibilities are, to put it succinctly, really cool. So the change-ship-class SEXP is created.
This has made a lot of people very angry and been widely regarded as a bad move. This made a lot of people very happy, in the years to come, but was plagued with problems; in that very commit, Goober apparently noticed that change_ship_type() would reset the ship's shields and hull, which is not particularly desired in most use-cases, although didn't take the time to fix it properly at first (remember that this was January of 2003 and this was revision 278
(!) of a repository that hit over 11,000 revisions before we made the switch to Git, over twelve years later). Over time, lots of other issues would be noticed and fixed (or attempted to be fixed, and then actually fixed later, as is often the case with FSO), like making it handle countermeasure counts, afterburner fuel, maximum speed as adjusted by engine power, and so on and so forth.
One particular issue lurked, however, unfixed all the while.
This takes us back to where we (almost) started: DefCynodont119 was making a mission involving the change-ship-class SEXP. But what's this?
I need someone to help us Identify the cause of a highly complicated CTD Error/Bug:
I am having problems with the Change-Ship-Class SEXP
Spontaneous crashes without an FSO error message! Debug log not showing any helpful information! Enter an apparently-bored coder who clearly had nothing better to do with his time:
Can you provide a test mission to recreate this crash?
And so the debugging process began. It was quite simple at first; the executable crashed almost immediately, and the debugger was quite clear about what the problem was: the player was targeting a subsystem from the version of the ship before change_ship_type() had been called, which was pointing to a submodel number that did not exist on the new model, causing an out-of-bounds array access on the polymodel's submodel list, resulting in FSO attempting to interpret what appeared to be a segment of shader code as though it were the data for a submodel, leading to nonsensical data which ultimately caused a crash when invalid values were sorted (or, to get technical, the executable
attempted to sort them, discovered a non-comparable value, and threw an exception).
"Easily fixed," said I, and proceeded to quickly write some code in change_ship_type() to handle the player targeting a subsystem on the ship changing classes.
I found the problem
Ah hah hah hah hah hah... "the problem," he says. Look at this naïve idiot. Anyway, this would turn out to only be the beginning of the work to be done on change_ship_type(). As it turned out, the next problem to be solved was also fairly straightforward:
OK, so I messed around with a variation of the test mission, only this time with the workarounds and this happened:
Assert: pair2.dist >= 0
File: shiphit.cpp
Line: 1088
You see, sometimes when ships get hit, they trail "spark" particles behind them. These are stored as, basically, a submodel and an offset for the sparks to emit from. When these sparks referred to a submodel that no longer existed... well, let's just say that despite the differences in these two errors (one throwing an exception and one triggering an assertion), they both boiled down to the same issue. At first, I thought I'd have to do some complicated trickery to find out which submodels were referenced by which subsystems, and then try to transfer the spark data... and then it occurred to me that any change in the model renders the offset coordinates more-or-less completely useless even
if you could find equivalent subsystems, so I went with the far simpler solution of simply clearing all spark data.
That was not the end of it. No, not the end of it by far.
You see, the AI can also target subsystems. And that targeting is also handled by pointers, just like the player's targeting (in point of fact, the player's target is actually handled by an AI object, so they're
exactly the same). So the code I'd written to handle the player's target would need to be generalized to handle every AI, or you get the exact same problem from non-player fighters. Right as I finish fixing that issue, I smack my forehead as I realize homing weapons
also have a subsystem pointer, pointing towards the subsystem they're homing in on.
The list of things change_ship_type() needs to iterate over in order to fix out-of-date subsystem references keeps growing longer, and yet the mission doesn't stop crashing. "What's crashing this time? An array of the last subsystem each player targeted on each ship? Agh! Okay, let's fix that, too." Still crashing. "What. EMP missiles now? That doesn't even make sense! The AI's EMP target-scrambling doesn't
target subsystems!" And yet, as intermittent as the crash was, it was still crashing. "Okay," said I, "give me the version of the test mission you're using and I'll do my best to make it crash."
Well, I did. And it did. And, to my considerable annoyance/perplexion... it was crashing due to an issue I thought had already been fixed.
<MageKing17> Okay, this is ****in' weird.
<MageKing17> I went through all that trouble to fix the AI targeting invalid subsystems in change_ship_type(), and then get a crash because... an AI is targeting an invalid subsystem!
It got weirder.
<MageKing17> I'm so confused.
<MageKing17> This ship has no goals to target a subsystem.
<MageKing17> How is it targeting a subsystem?
I scoured the code, looking for every reference I could find to ai_info.targeted_subsys being set; it pretty much only happened in set_targeted_subsys(), which was usually called with NULL (meaning "clear any subsystem target"). One of the very few places it was called with an actual subsystem was in ai_set_attack_subsystem(), which... is called in exactly one place, if the AI has a "destroy subsystem", "disable", or "disarm" goal.
<MageKing17> what the ****
<MageKing17> there should be no reason for a non-turret AI to ever be targeting a subsystem without having a goal to destroy a subsystem (or disable/disarm)
I was baffled. I was tearing my hear out (no, really; it hurt like hell). As far as I could tell, this scenario
could not be happening, and yet it was clearly happening, or my debugger wouldn't be showing me this AI targeting a nonexistant turret.
But just then...
<MageKing17> oh wait
...I finally noticed...
<MageKing17> what's this function over here that looks like the exact problem
...that there was this one other place that called set_targeted_subsys() with a subsystem instead of a null pointer. Specifically, ai_big_strafe_maybe_attack_turret().
* MageKing17 facepalms
"What does ai_big_strafe_maybe_attack_turret() do?" you ask? Well, it's called when an AI either tries to evade a projectile, or gets hit by one while strafing a capital ship. It makes the AI direct its attention towards the turret that fired said projectile (depending somewhat on AI class), on the reasoning that taking out turrets that are shooting at it would help it to accomplish its other tasks without blowing up in the process. "How does it know which turret fired the projectile?" you might be wondering. Well, I'm glad you asked!
As it happens, it's handled by a pointer to the firing subsystem, stored on the projectile.
So, now we need to iterate not just through homing weapons, but
every single weapon in the game looking for both homing subsystems and the turrets that (may have) fired them. However, with that fixed I couldn't get the test mission to crash any more, so
hopefully (fingers crossed) this bug is finally fixed (although the PR is
still waiting on code review as of this posting merged now).
The End?
Oh please oh please be the end of this