Hard Light Productions Forums

Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Topic started by: KeldorKatarn on June 21, 2009, 05:44:01 pm

Title: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 21, 2009, 05:44:01 pm
I'm experiencing several access violation crashes with this build. I think they are somehow related to persona messages (which play correctly, though). I'm not sure, since even the Debug build doesn't report anything beyond what's in the error log, which I attach here.

Any idea what this could be?

[attachment has decomposed]
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: karajorma on June 21, 2009, 05:48:32 pm
You don't get anything when debugging?
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 21, 2009, 05:50:30 pm
The Debug build just crashes like any other. It doesn't give me an option to jump to an IDE for debugging, it just crashes.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 21, 2009, 05:52:36 pm
If you're asking whether I get anything when running from inside the IDE: to be honest, I have no idea how to run FS_Open from inside the IDE with a specific working directory.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: karajorma on June 22, 2009, 02:20:34 am
In that case it makes more sense to explain how.

Simply compile a build yourself and make sure it ends up in your WCS folder. (If you've set the environment variable, that happens automatically. If you don't know how, just ask.) Once you're done, click on either the Freespace or FRED project in the Solution Explorer window and choose Properties. Select Debugging. In Command, select Browse and find the exe. Command Arguments is the command line you want (you'll only need this for FRED, since FS2 loads it from the .cfg file).

Click OK to close the dialog, then select Debug->Start Debugging from the menu. If the debugging toolbar doesn't show up, you might want to add it to the visible ones: being able to set a breakpoint near the critical section and step through the code one line at a time makes it very easy to find the cause of bugs, and it should be something you use often.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 22, 2009, 06:56:25 am
Doesn't really help. It still just freezes. VS also just tells me that there is an access violation. It halts at the render_all function, but I don't see any problem there or how a wrong address could be involved. All that function does is provide a render function pointer and a boolean value. The halt also doesn't really tell me WHERE the access violation happens.

This crash is also nearly impossible to reproduce and seems to happen very much at random... I'm at a loss here how to find out what's causing this.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 22, 2009, 07:19:24 am
The debugger halts exactly where the access violation occurs (if you've got it attached correctly).

I can't find a function called render_all. The VS callstack information is really important if this kind of error is occurring (Debug->Windows->CallStack, or it's present on a tab somewhere in your environment)

If we're going to debug this, we need a lot more information, e.g. a screenshot of VS or how you're causing this crash so that we can reproduce it.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 22, 2009, 08:28:47 am
I'm busy at the moment, I'll try again later.

It does halt somewhere, yes (I'll get the exact function name again), but I don't see how that place could cause an access violation because it doesn't access anything.

As for reproducing it: I'm not causing it at all. As I said, it appears quite randomly.

I'll try to narrow it down a bit more...
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: karajorma on June 22, 2009, 10:44:23 am
When it halts make sure that you turn on the View Call Stack options (it should be a tab at the bottom of the IDE). That will show you the exact chain of events that led to the crash.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 22, 2009, 11:25:58 am
I still can't make heads or tails of this. But here's the data I managed to collect. I can confirm that it always crashes at exactly this position but I have no way to reproduce it easily.
I also have no idea what exactly in this code could cause an Access Violation. It's just a function call, isn't it?

[attachment has decomposed]
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 22, 2009, 11:33:01 am
Are you sure that you're debugging the source against EXACTLY the same executable? A mismatch would cause the errors we're seeing here.
Also, debugging a release version would give you the results you're seeing here.

You really need to cause this problem with a debug version before you can get meaningful results in the debugger.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 22, 2009, 12:17:52 pm
Sigh, forgot to set it to compile the Debug version... back to it later. (I'm debugging the exact same build yes, just compiled Retail instead of Debug, forgot to set that in the IDE)

It doesn't help that Freespace keeps the mouse focus and I have to work with keyboard shortcuts in VS.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 22, 2009, 03:02:55 pm
Hmm, seems to be the shield-explosion rendering code... Some vertex pointer is a problem it seems. I can't find anything wrong with the model's shield however.
Does anyone find an error in here somewhere? Call stack is in the screenshots, the problematic value is marked.

[attachment has decomposed]
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 22, 2009, 11:02:24 pm
You're cutting off more than half of the callstack in your screen shots.
My guess (because I can't confirm without seeing the second half of the callstack - horizontally) is that the second screen shot will show you that 'sv' is NULL.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 03:57:25 am
That may very well be. And sorry for the layout but I cannot change the window layout since I don't get my mouse back. FS keeps the mouse focus so there's very little I can do inside VS during debugging once FS has crashed.

Anyone got an idea how I can get my mouse back? I can only get it for the task bar. As soon as I enter the application area, the mouse is not only invisible but also cannot click on anything (otherwise I could at least guess the mouse position, but as I said it is not there at all).
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 23, 2009, 04:31:24 am
hmmm... I knew there was something fishy - the scrollbar is down the bottom of the collection of images, so it's a real pain to scroll around.

Anyway, I'm still betting that sv is NULL. As for how to debug that, I'm not sure: I debug on a second monitor, and I've not had problems with mouse capture before.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 05:22:39 am
Well, not everyone has a second monitor. I'm also not saying that I cannot debug if I simply pause and go to VS via Alt-Tab, but once the game crashes Alt-Tab doesn't work anymore and I need to go to the main logon screen of Vista to start the Task Manager to even get back to my normal desktop. And there I don't have mouse focus, since it is still captured and not released by the game. This makes it nearly impossible to really debug this crash.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 23, 2009, 05:25:27 am
Stick in if ( sv==NULL ) Int3( );

You need to develop a hypothesis, and then test it.
The Int3( ) will allow you to get your mouse focus.

Unfortunately full screen debugging on a single monitor is not a fun task :P
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: karajorma on June 23, 2009, 05:31:18 am
Quote from: KeldorKatarn
Well, not everyone has a second monitor. I'm also not saying that I cannot debug if I simply pause and go to VS via Alt-Tab, but once the game crashes Alt-Tab doesn't work anymore and I need to go to the main logon screen of Vista to start the Task Manager to even get back to my normal desktop. And there I don't have mouse focus, since it is still captured and not released by the game. This makes it nearly impossible to really debug this crash.

I've seen that one before. Did you have VC6 installed on that PC at any point?
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 05:33:56 am
I doubt it. I had it once, but I'm nearly 100% sure that was before the last complete wipe and OS reinstall. I'm pretty sure I can rule that out as a cause.
I get the mouse back fine as soon as I stop debugging and FS is closed. The problem is that when FS freezes, it locks the mouse until the process is killed.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 05:35:56 am
But aside from that problem: Anyone got an idea yet where that null/dangling pointer could be coming from?
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: karajorma on June 23, 2009, 05:51:25 am
In that case I'm rather stumped as to the cause of your problem. Obviously something is wrong with your setup somewhere, because you shouldn't be able to write code that locks up the debugger in that way.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Wanderer on June 23, 2009, 06:06:16 am
When exactly does that crash appear? Is it related to a specific ship class (ie. errors in the pof file?)
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 06:10:13 am
As I said, I cannot easily reproduce this. It happens in a mission that I used to test a model, yes, but I checked that model in other missions before and I also see no problems in the pof.
I cannot rule out a problem with the POF. However, even if it is the model, it shouldn't cause a null pointer crash, so somewhere in the code something doesn't get checked.

I'd dig further into this if I could, but the mouse lock makes that near impossible.

I always have to wait a while in that mission and run it a few times. Probably a certain shield face has to get hit for this to happen, so it is hard to reproduce. If it is an error in the pof, I can't find it: the shield looks visually OK and the debug build doesn't report any parsing errors.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 07:52:43 am
Why is this commented out in read_model_file()?

Code: [Select]
    for ( j = 0; j < 3; j++ ) {
        pm->shield.tris[i].verts[j] = cfread_int( fp );  // read in the indices into the shield_vertex list
        /*
#ifndef NDEBUG
        if (pm->shield.tris[i].verts[j] >= nverts)
            if (!warning_displayed) {
                warning_displayed = 1;
                Warning(LOCATION, "Ship %s has a bogus shield mesh.\nOnly %i vertices, index %i found.\n", filename, nverts, pm->shield.tris[i].verts[j]);
            }
#endif
        */
    }
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 07:57:37 am
This should possibly be:

Code: [Select]
for ( j = 0; j < 3; j++ ) {
    pm->shield.tris[i].verts[j] = cfread_int( fp ); // read in the indices into the shield_vertex list

    if (pm->shield.tris[i].verts[j] >= pm->shield.nverts) {
        Warning(LOCATION, "Ship %s has a bogus shield mesh.\nOnly %i vertices, index %i found.\n", filename, pm->shield.nverts, pm->shield.tris[i].verts[j]);
    }
}


Warning() already contains a #ifdef DEBUG, so there's no need to repeat the preprocessor directive here.
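
For anyone who wants to try the idea outside the engine, here is a minimal, self-contained sketch of the same range check. The struct and function names are made up for illustration and are not the real read_model_file() data structures. The sketch also rejects negative indices, which the snippet above does not.

```cpp
#include <vector>

// Hypothetical stand-in for one shield triangle: three indices into the
// shield vertex list. This is not the engine's actual struct.
struct ShieldTriSketch {
    int verts[3];
};

// Returns the number of vertex indices that fall outside [0, nverts).
// Zero means every triangle refers only to vertices that exist.
int count_bad_vertex_indices(const std::vector<ShieldTriSketch>& tris, int nverts)
{
    int bad = 0;
    for (const ShieldTriSketch& tri : tris) {
        for (int j = 0; j < 3; j++) {
            const int index = tri.verts[j];
            if (index < 0 || index >= nverts) {
                bad++;
            }
        }
    }
    return bad;
}
```

In the engine the non-zero case would presumably be reported through Warning() or similar rather than returned as a count; the count just makes the sketch easy to test on its own.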
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 08:02:31 am
Ran a debug build with this code in it. It didn't report any shield error, but this warning should stay in anyway.

I'll keep looking where this error might occur.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 23, 2009, 08:10:59 am
Quote from: KeldorKatarn
Warning() already contains a #ifdef DEBUG, so there's no need to repeat the preprocessor directive here.

I'm not familiar with this section of code, but for performance reasons it is best to avoid any debugging code whatsoever in very regularly called code, hence the ifdef.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 08:19:02 am
First of all this is only executed in DEBUG builds since the code of that function contains a #ifdef.

So in retail this code will simply do nothing at all and not even show up in the machine code. But in DEBUG builds it is essential for checking a shield for correctness.

This wasn't the cause for the crash but this should stay in.
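
To illustrate the compile-out argument, here is a rough sketch of how a debug-only warning macro can behave. SKETCH_WARNING and the other names are hypothetical, and this is not the engine's actual Warning() implementation. The macro body vanishes when NDEBUG is defined, but the comparison that guards it is still compiled unless it is placed inside the preprocessor block as well.

```cpp
#include <cstdio>

// Counts how many times the sketch warning actually ran.
static int g_sketch_warning_count = 0;

#ifndef NDEBUG
// Debug build: print the message and count it.
#define SKETCH_WARNING(msg) \
    do { std::fprintf(stderr, "Warning: %s\n", (msg)); g_sketch_warning_count++; } while (0)
constexpr bool kSketchDebugBuild = true;
#else
// Release build: the macro expands to an empty statement.
#define SKETCH_WARNING(msg) do { } while (0)
constexpr bool kSketchDebugBuild = false;
#endif

// Checks one index and reports a problem only in debug builds.
// Returns true when the index lies inside [0, count).
bool check_index_sketch(int index, int count)
{
    const bool ok = (index >= 0 && index < count);
    if (!ok) {
        SKETCH_WARNING("shield index out of range");
    }
    return ok;
}
```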
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 23, 2009, 08:27:05 am
Actually, your revised version has upwards of 7 dereferences, two comparisons and that is all done 3 times over.
In critical path code, that is rather a lot, and has a lot of potential to cause cache misses and going to memory for information.
Granted, it's small, but game code generally must be optimised to hell.

EDIT: Also, it's a warning, not an error.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Wanderer on June 23, 2009, 08:38:23 am
Can you reproduce the problem? If so check what exactly goes wrong in here (ie. values and/or validities of both 'verts' and 'stp').
Code: [Select]
sv = &verts[stp->verts[i]];

And from there trace the problematic spot and try to verify what exactly is wrong.
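
One way to test both suspects at once is to guard the lookup. This is only a sketch with stand-in types (ShieldVertexSketch and lookup_vertex_sketch are invented names, not engine code). If the array is fine but the index is out of range, the resulting pointer is typically not NULL, so a plain NULL test on sv would not fire; checking the index against the vertex count covers that case too.

```cpp
// Hypothetical stand-in for a shield vertex; not the engine's type.
struct ShieldVertexSketch {
    float x, y, z;
};

// Returns a pointer to verts[index], or nullptr when either the array is
// missing or the index lies outside [0, nverts).
const ShieldVertexSketch* lookup_vertex_sketch(const ShieldVertexSketch* verts,
                                               int nverts, int index)
{
    if (verts == nullptr || index < 0 || index >= nverts) {
        return nullptr;
    }
    return &verts[index];
}
```

A breakpoint or Int3() on the nullptr branch then stops exactly where the bad value is first seen, with the offending index still available as a local.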
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Spicious on June 23, 2009, 08:59:20 am
Quote from: portej05
Actually, your revised version has upwards of 7 dereferences, two comparisons and that is all done 3 times over.
In critical path code, that is rather a lot, and has a lot of potential to cause cache misses and going to memory for information.
Granted, it's small, but game code generally must be optimised to hell.

The right side looks like it was stored in a local in the original code to avoid that problem. The left side is written to immediately before so caching shouldn't be a problem. It looks like it would almost get optimised out too.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 11:01:13 am
Quote from: Wanderer
Can you reproduce the problem? If so check what exactly goes wrong in here (ie. values and/or validities of both 'verts' and 'stp').
Code: [Select]
sv = &verts[stp->verts[i]];

And from there trace the problematic spot and try to verify what exactly is wrong.

You have been following the thread, haven't you? I told you that I cannot check anything, since I cannot get the mouse focus back for Visual Studio.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Wanderer on June 23, 2009, 11:10:33 am
Does using windowed mode affect that?
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 11:21:11 am
I'll try that. The problem, as I mentioned, is that it is really hard to reproduce to begin with. It might be that a certain shield triangle must be hit, and I don't even know on what ship yet.
So I kind of have to restart the mission over and over and watch the AI fight until it finally crashes.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Wanderer on June 23, 2009, 11:26:02 am
Yeah... Well, I hope you can get it to work. You should be able to identify the model (for example) by tracing the calls via the stack to the create_shield_explosion() function and accessing the polymodel struct (pm).
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: chief1983 on June 23, 2009, 12:28:48 pm
I was also going to suggest -window, it's a necessity for any serious debugging.  No need for alt-tab even when it's enabled, if a crash occurs you should automagically have full control of your mouse again.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 04:41:23 pm
I couldn't get the game to crash again (simply too random which triangle is hit) but I managed to find out what caused the crash.

The Sabre model of Saga had a buggy shield. Somehow the pof contained one single wrong triangle neighbor index into the shield-triangles array. The value was somewhere around 2 million, while there were only around 650 triangles.

Loading the model in PCS2 and re-saving it seems to have corrected the error. The warning didn't show up anymore and I believe this fixes the crash also.

To keep errors like this from going undetected and leading to an access violation crash in the future, I'd really recommend adding the checks that I put in the code to find them. One of those checks is already in the code but commented out; the neighbor indices, however, are not checked at all.
Such checks are necessary in the debug builds so developers can make sure the models are alright.

I attached a patch file with the suggested added checks (both of them in #ifndef NDEBUG clauses so they don't harm the performance of the retail build).
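
Since the attachment is gone, here is a rough sketch of what a neighbor-index check of this kind could look like. It is not the patch itself: the struct and function names are invented for illustration, and the sketch treats every value outside the triangle range as bad. If the file format allows a sentinel value for "no neighbor", that case would have to be permitted explicitly.

```cpp
#include <vector>

// Hypothetical stand-in: each shield triangle stores the indices of its
// three neighboring triangles. This is not the engine's actual struct.
struct ShieldTriNeighborsSketch {
    int neighbors[3];
};

// Returns the index of the first triangle that has a neighbor index outside
// [0, number of triangles), or -1 when all neighbor indices are in range.
int find_first_bad_neighbor(const std::vector<ShieldTriNeighborsSketch>& tris)
{
    const int ntris = static_cast<int>(tris.size());
    for (int i = 0; i < ntris; i++) {
        for (int j = 0; j < 3; j++) {
            const int n = tris[i].neighbors[j];
            if (n < 0 || n >= ntris) {
                return i;
            }
        }
    }
    return -1;
}
```

With around 650 triangles, a neighbor value in the millions is rejected immediately at load time instead of surfacing later as a crash when that particular triangle gets hit.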

[attachment has decomposed]
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: chief1983 on June 23, 2009, 05:01:24 pm
Is that done during load or realtime in game?  I can't imagine an extra calculation is that big of a deal in debug just during load.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 23, 2009, 05:02:58 pm
It is done on mission load from what I can see. It is the read_model_file() function. Nothing realtime.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Zacam on June 23, 2009, 07:36:32 pm
Quote from: chief1983
I can't imagine an extra calculation is that big of a deal in debug just during load.

Depends on what is being calculated. Full CRC enumeration on the VPs, for example (as a default without having to run -verify_vps), adds quite a lot.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 24, 2009, 06:44:06 am
Well, this isn't a big calculation. It is just dereferencing a few values and comparing them. That's all.

To get back to topic: Can this please be added to trunk? It will greatly improve the usefulness of the debug build for checking a model's correctness.
Since we've recently started to give all our capital ships mesh shields we could really use this :)
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 24, 2009, 07:00:21 am
I'm OK with committing this, as long as:
1) Debug code remains in an ifdef _DEBUG or ifndef NDEBUG (i.e. debug builds are for debugging)
2) Someone can confirm that these warnings are not a nuisance with model data outside of that which KK is using
3) Some check to ensure that warnings are only shown once per model - it would be bloody irritating to see 6 warnings per model.

Patch doesn't look too bad.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 24, 2009, 08:37:18 am
3) probably needs to be fixed, but that will be easy: keep a local bool per model and set it to true once a warning has been shown.

On the other hand I'd consider making this an error instead of a warning. This is not something that will simply lead to strange effects in the game, like a missing texture, but instead will always lead to a crash sooner or later. So a model with faults like this should never be loaded actually.
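
A minimal sketch of the once-per-model idea from point 3, assuming a flat list of indices and a message log standing in for the engine's reporting call. All names here are hypothetical. The flag is a local, so it resets for every model that gets checked.

```cpp
#include <string>
#include <vector>

// Checks every index against [0, nverts) and appends at most one message
// per call (that is, per model) to 'log'. Returns the number of bad indices.
int check_shield_once_sketch(const std::string& filename,
                             const std::vector<int>& indices,
                             int nverts,
                             std::vector<std::string>& log)
{
    bool warning_displayed = false;  // local flag: one report per model
    int bad = 0;
    for (int index : indices) {
        if (index < 0 || index >= nverts) {
            bad++;
            if (!warning_displayed) {
                warning_displayed = true;
                log.push_back(filename + ": bogus shield mesh, only " +
                              std::to_string(nverts) + " vertices, index " +
                              std::to_string(index) + " found");
            }
        }
    }
    return bad;
}
```

If the report is turned into a fatal error, the flag becomes unnecessary, since loading would stop at the first bad index anyway.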
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Tolwyn on June 24, 2009, 10:57:46 am
I tend to agree that the shield checking code is necessary - just imagine the nightmare of hunting down a CTD issue after the release. I used the model in question in quite a few missions, yet I did not experience any crashes. ;)
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: portej05 on June 24, 2009, 11:17:48 am
The checking code has been committed.
I used KK's checking code, but replaced Warning with 'Error'.
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: KeldorKatarn on June 24, 2009, 11:30:33 am
Thanks
Title: Re: A lot of access violation crashes with fs2_open_3_6_11(r/d)-20090526_r5309.exe
Post by: Tolwyn on June 24, 2009, 11:47:23 am
Quote from: portej05
The checking code has been committed.
I used KK's checking code, but replaced Warning with 'Error'.

Thanks a bunch. ;)