Hard Light Productions Forums
Off-Topic Discussion => General Discussion => Topic started by: MP-Ryan on September 24, 2013, 10:56:17 am
-
I'm at a loss here after much thought on the subject, so maybe one of you fellows can figure this out. Bear with the long post.
Lately, I've been playing through the Thief series, and I'm now onto Deadly Shadows (released in 2005). I have installed a texture pack, and an unofficial patch that makes it considerably more stable on modern hardware. For quick reference, my basic computer specs:
- Intel E8400 3.0 GHz
- 4 GB RAM (DDR2)
- BFG GeForce 8800 GTS 512 MB
This system turned 6 this past June.
I've got 14 hours of playtime into Deadly Shadows. Last night, during one specific mission, I heard my GPU fan suddenly spin up to max speed, and then the screen blanked and started flashing analog/digital - the video card shut itself down. Rebooted the system. No anomalies. Temps normal. Shut it down, sprayed the card off with a duster and didn't get much out (I regularly clean this system with the aid of an air compressor in the garage and the last cleaning wasn't long ago).
Boot back up. Relaunch the game. Played for about twenty minutes (same level, different areas). Hear the GPU fan spin to max, so I quickly shut down the game. Opened up the temp diagnostic; the GPU temperature is rapidly running (Celsius) through 3 temps: 56, 131, 259, back down to 56. It changes temp reading every second or so. Very odd. Shut down. Dusted off the card again with an air duster; again got very little out. Rebooted the system again.
This time, fired up the game again (same level, different areas). Played for about 30 minutes with RealTemp running in the background, alt-tabbing every so often to check the temps. Consistently 54-58 degrees. At about the 30-minute mark, during an alt-tab where the temps were stable, sudden spike to 117, fan spins to max, card shuts itself down. Had to reboot again.
At that point, I took the card right out and flushed it with the air compressor, which did remove some dust bunnies from the inside. After popping it back in, I went and set up a custom fan profile that is a little more aggressive about cooling than the nVidia defaults as well, and played the game for about ten minutes with stable temps before heading off to bed.
Given that temperatures remain stable through the majority of gameplay, the game has previously been fine (and this issue hasn't come up in any other games, including a considerable amount of time spent in BioShock Infinite and Bastion lately), and it's only bizarre random spikes that appear to be causing problems, I'm inclined to think that something about this particular section of the game and the patch is making my GPU have a fit when trying to display a certain angle of an area, causing the GPU to shut itself down to avoid melting. Either that, or the temperature sensor/HSF/GPU itself has suddenly gone on the fritz.
I'm going to try some testing tonight in another game and also with this one (as I've pretty much finished that level) and see if the issue continues. But I'm just wondering - any of you fellows have any insight that I haven't thought of? Before you ask, I don't have a spare video card to try swapping out.
-
wow, a four number geforce model, ive been through 2 video cards since i had one of those.
-
Very helpful Nuke :P Since I'd rather not have to drop $200-250 on a new card of comparable performance, any ideas that might be beneficial? :)
-
i'm kinda thinking the temp sensor is crapping out. did you look at the GPU load/power consumption during the spikes? the on-screen display in msi afterburner (http://event.msi.com/vga/afterburner/download.htm) can show that without even alt-tabbing.
-
You know what, let's make this easy. Get Furmark. http://www.ozone3d.net/benchmarks/fur/
That is a pure stress test to see if your GPU crashes and burns in fiery molten lava.
Any PC that is fully functional from PSU to GPU should handle Furmark without any issues. Now, if your GPU has developed problems with the thermal sensor or anything else, you will see it while running Furmark.
If your card shuts down again, well then we at least know it isn't Deadly Shadows that is the problem here. As for your options if this happens, I suppose that depends on your engineering skills. Otherwise, get a new GPU.
On the other hand, if your GPU runs Furmark up to 30 mins without any issues, your GPU is fine. Then the problem is with Deadly Shadows. In that case, check if your drivers are up to date and if the game has any patches. Failing that, check if other people have similar issues with Deadly Shadows. Or you can stop playing Deadly Shadows.
-
Thanks guys; I haven't had to deal with temp issues in years, so I've never tried Afterburner (but I will).
I'll try FurMark when I get home. I've run it before and the system tolerated it fine (2+ years ago), so I guess we'll see. You figure a 30 minute run should be sufficient, Fury?
-
I'll second the temperature sensor being the likely point of failure.
I'm pretty sure if a graphics card hits 250°C+, then something will explode and kill you in short order.
-
ah 250º. That's cute! The pains I had with my laptop reaching 70º, cannot imagine what that would be.
-
I am 100% positive the 250 was a false reading, just so we're clear. The system did not spontaneously catch fire, which seems to be a good empirical test. =)
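For what it's worth, that sanity check can be written down as a rough sketch. The Python below flags readings that are not believable as real temperatures, either because they are absurdly high or because they change faster than a heatsink-cooled chip plausibly could. The function name and both thresholds (125 C and 20 C per second) are my own assumptions for illustration, not values from nVidia or from any monitoring tool.

```python
from typing import List, Tuple

# Assumed limits for illustration only; tune them for your own card.
MAX_PLAUSIBLE_C = 125.0    # readings above this are treated as suspect
MAX_RATE_C_PER_S = 20.0    # largest believable change per second


def flag_suspect_readings(samples: List[Tuple[float, float]]) -> List[int]:
    """Return the indices of samples that look like sensor glitches.

    samples is a list of (time_in_seconds, temperature_in_celsius) pairs,
    sorted by time. A sample is flagged if it exceeds MAX_PLAUSIBLE_C, or
    if it changed faster than MAX_RATE_C_PER_S relative to the most
    recent sample that was not flagged.
    """
    suspect = []
    last_good = None
    for i, (t, temp) in enumerate(samples):
        if temp > MAX_PLAUSIBLE_C:
            suspect.append(i)
            continue
        if last_good is not None:
            dt = t - last_good[0]
            if dt > 0 and abs(temp - last_good[1]) / dt > MAX_RATE_C_PER_S:
                suspect.append(i)
                continue
        last_good = (t, temp)
    return suspect


# The sequence I saw in the temp diagnostic, roughly one reading per second.
readings = [(0, 56), (1, 131), (2, 259), (3, 56), (4, 57)]
print(flag_suspect_readings(readings))
```

A flagged reading only means the number is not believable as a temperature. It does not say whether the sensor, the card, or the monitoring software is at fault.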
-
lol. I really can't be of any help. Hope you find the bug!
-
If you say you haven't frequently had GPU issues, I'm tempted to say your current problem is just some weird confluence of software instructions causing your card to choke. Were such a "choke" a rather benign failure, I'd be even more tempted to just gloss over and push through.
However, it doesn't sound like this is a benign issue, so I will not suggest inaction.
Before you ask, I don't have a spare video card to try swapping out.
Well, it is what it is, but spare hardware would open up a lot more options.
I wish I was computer-savvy enough to offer some more concrete advice, but as things stand, I'll fall in with Fury's suggestion. A pure stress test should help identify or eliminate potential root causes.
-
Is this only occurring with the one game or is it happening with other software running? I've had certain games do that to my video card but for some reason only in full screen (full screen window, no issues). I've also had it happen after DirectX software "updates" that some games install. I say updates because I think some install older versions.
-
I haven't had a chance for robust testing as this just occurred last night, *but* I have been running much newer games in the past few weeks quite regularly without a problem.
Either this is a brand-new universal issue, OR it's a problem just with this one section of this game in particular. FurMark should help me sort that out tonight. It seems odd that it should only start happening out of the blue in one particular section of one game; fortunately, after doing some research it seems I should be able to pick up a reasonable replacement (with slightly better performance) for around $130 should it become necessary.
-
If it does turn out to be a GPU issue - and even if the 250 degrees Celsius is a false reading - it's still possible, I guess, that the thermal interface between GPU and its cooler is no longer in optimal condition.
If the rapid fluctuations in temperature are not a sensor glitch, then there's got to be a reason why the thermal flux away from the GPU die is varying rapidly, and I can only really think of something like... say, an air pocket in the thermal interface. When the GPU is at low temps, the thermal interface would still work well enough to keep temps at nominal values, but when temperature goes up, the air pocket would expand, causing the metal surfaces to detach, which would rapidly increase the GPU temps, which would put it into throttle-down mode, allowing it (and the air pocket) to cool down, letting the metal parts come back into closer contact... and the cycle could continue quite fast.
In such a case, one would hope that removing the GPU cooler, cleaning up the GPU die and cooler from the thermal gunk, and applying new, proper thermal interface material before re-seating the cooler would take care of THAT issue. Plus, if it's a several years old GPU, it may be a solid idea to do this anyway - just take care not to break anything in the retaining mechanism or the card itself.
It could also be something entirely different and it could be a software glitch, in which case physical fixes won't help. What's the sample rate of the sensor graph?
-
So, a 32 minute FurMark run produced a max temperature of 82 degrees, showed my fan scale ran perfectly, and experienced no artifacts, temp spikes, or other silliness.
Looks like this may be a software-specific issue.
-
...and you may all be interested to know that this was NOT a software issue and was in fact a symptom of video card death. The system refused to POST and threw out a beep code (for the memory) the other day; diagnostics actually showed that it was the video card that was responsible. It has been replaced with a cheapish GT 640 which seems to have about comparable performance. I may swap that for a GTX 650 depending on how it manages.
-
A 640 can apparently run BF3 at reasonable settings. RivalLXFactor or some other popular BF3 channel (not sure which anymore) made a lot of videos with it.