Bit of behind the scenes, but we just wanted to thank all of our testers for all the awesome work they do! Just as an example, their invaluable help has allowed us to hopefully reach an end to the bug hunt saga that began in January. First, we got reports of FSO flickering under certain conditions, followed within days by reports of textures getting switched around, the screen sometimes inverting and zooming in, and subsystems not loading properly on ships we know are set up correctly. This deluge occurred in such a short time frame that we first guessed the bugs were all related, but alas none of them were easy to reproduce, and they seemingly happened at random times for different testers. Moreover, not all testers were experiencing the same bugs, but almost everyone was getting some kind of bug. The usual culprits were pursued, and debug logs underwent deep scrutiny. Some debug logs showed hundreds of OpenGL errors. Surely this was the cause? On a whim we discovered those error messages were simply the result of having a third-party screen recorder on, and a search of Discord confirmed the same errors occur even on FS Retail. Perhaps this was the cause all along? Still, to be sure, we developed multiple pull requests to clean up various aspects of the FSO code.
Once the PRs were merged, testing commenced again, yet the bugs still gleefully persisted and mocked our futile attempts to squash them. Efforts to reproduce them were in vain: the bugs would disappear into the darkness when restarting FSO, and the logs continued to turn up no viable leads. It was time for a more robust approach. We would try bisecting to find when these bugs first appeared, which for FotG was not an easy task, as new features were merged continually and the bugs themselves were so random that it would be almost futile to confidently identify when they first started. Still, we embarked on this brute force approach by deploying multiple testing branches built from commits from many moons ago. As expected, this effort was ultimately not useful, as even old builds and timeframes revealed some iterations of the bugs. We had previously fixed long-standing and hard-to-reproduce bugs, but this confluence of random manifestations made these bugs especially hard to pin down. Equally perplexing was that no other mods were reporting these issues. FotG lives on the cutting edge of FSO builds, but these bugs had been going on for months. At this point we had exhausted our conventional options.
Fortunately, during this time Lafiel had been aware of our plight and was able to coordinate with Limbert, who had been experiencing many of the issues. The plan was to catch one of the issues red-handed while running RenderDoc, though that was easier said than done given the elusive nature of the bugs. After much collaborative planning and instruction, a bug was finally captured within the program. Digging into the RenderDoc results, Lafiel gleaned that a root cause was related to FotG's use of real-time texturing (RTT) in cockpits. Yet the RenderDoc capture did not supply the smoking gun we had hoped for. Still, it pointed Lafiel in the right direction, and after many tests he was at least able to reproduce the flickering bug. Not long after, a PR followed (
https://github.com/scp-fs2open/fs2open.github.com/pull/6767)--but this was not the end. Lafiel correctly deduced there were likely multiple, unrelated bugs at play and the following months proved this hypothesis true.
With the flickering bug now gone, the other graphical errors rose to the occasion, most notably the y-inversion of the screen. This bug proved harder to reproduce than the flickering, and Lafiel and the other developers were simply never able to make it occur on their own machines. We were once again at a dead end. RenderDoc suggested it was once again something to do with RTT, but without a reliably reproducible case the chances of fixing it were slim. At this point we had to back up and once again dig through the debug logs and compare systems. Interestingly, Lafiel, who had never managed to hit the y-inversion, had nearly the same system as Kestrellius, who did run into that bug. Ultimately the debug logs did reveal a key difference, not in the mission or load events but in the game resolution. It turns out that everyone who experienced this bug had one thing in common: they were running at a resolution that was not 1920x1080. Out of curiosity, we tested different resolutions, and after a bit of exploration we were finally able to reproduce the issue (it had been dependent on resolution all along). With a reproducible case in hand, we brute forced a bisect and found that the error appeared only after updating to scalable fonts. This was the key: armed with that lead and a way to trigger the bug, Lafiel quickly zeroed in on a cause and a PR to fix it (
https://github.com/scp-fs2open/fs2open.github.com/pull/6794).
At this point we were hesitant. Two bugs had been identified and fixed, but two remained: the incorrect model loading error and the random texture replacement bug. We guessed these were unrelated, but both again proved incredibly difficult to reproduce. Learning from our recent experience, we quickly established that these bugs were occurring even at 1920x1080, so at least that avenue was closed. Another thing we found was that the y-inversion seemed to happen after a string of campaign missions. With this in mind, we concocted skeleton test campaigns and dug through old reports to search again for any similarities. We also brute force printed line after line of model loading into the debug logs. After tedious analysis and testing we finally struck a viable path and found a way to reliably reproduce the bug with a custom campaign. Equipped with this path, we again dug into the model prints in the debug log and found something peculiar: models that were present in the briefing but not in the mission were not being fully loaded. That realization sparked a recollection of a PR from earlier in the year that aimed to optimize model loading of briefing ships. On a hunch we dug into that code and found our culprit, a missing argument, and made a fix PR (
https://github.com/scp-fs2open/fs2open.github.com/pull/6850).
Perhaps this was it; perhaps the texture replacement bug was just an offshoot manifestation of these earlier bugs. We mobilized our testers and waited, and right on cue the texture replacement bug reared back into the forefront. Unlike its compatriots, it had not yet been squashed. Moreover, reports of this bug were the first we had received at the beginning of the year, so it had outlasted all the bugs fixed thus far, with a lifespan of over 7 months. And, true to fashion, this bug was even more difficult to reproduce reliably than the last. Still, our experience fighting the earlier bugs informed our approach. We hypothesized that this bug was caused by some kind of texture slot loading mismatch that built up over the course of loading many campaign missions within one playthrough. Brute force debug log printing commenced once again, this time specifically targeted at texture slots. Finally, on occasion, we were able to reproduce the bug. This case turned out to be primarily a scripting-related texture loading issue, and we pushed a fix and kindly asked our testers to begin running missions in earnest once again. This was today, and fingers crossed, reports will come back positive. It is entirely possible this is not the end, and this cliffhanger will result in more hair pulling. Still, we have pushed this far and have no plans to let the bugs win. If you've read this far you certainly deserve a Wookiee Cookie!
Fun fact: during this time Lafiel was also developing entirely different PRs for new features, cleanup, and optimization, including multithreading for collisions. He's a real wizard, and we can't thank him enough for his work on these, plus all the fantastic features and enhancements he has made to FSO over the years (VR, deferred shading and shadowing in cockpits, a complete animation overhaul, modular POFs, a complete particle overhaul and rework, developing an entirely novel way to do multisampling antialiasing within a deferred pipeline and then writing a paper about it, plus so much more).