Hard Light Productions Forums

Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Test Builds => Topic started by: Zacam on September 13, 2009, 08:29:27 pm

Title: Regarding SSE Builds
Post by: Zacam on September 13, 2009, 08:29:27 pm
This topic is for discussion of the released SSE/SSE2 builds (Release and Debug).

This post is a placeholder.
Title: Re: Regarding SSE Builds
Post by: Aardwolf on September 13, 2009, 08:50:27 pm
K, I'll start things off. What are they?
Title: Re: Regarding SSE Builds
Post by: chief1983 on September 13, 2009, 10:04:48 pm
r u srs?

They are builds with the SSE or SSE2 extensions enabled for faster processing of certain types of arithmetic, optimized at the compiler level.  Beyond that, google it.
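
For anyone wondering what "enabled at the compiler level" looks like in practice: MSVC has the /arch:SSE and /arch:SSE2 switches for 32-bit targets, and GCC has -msse and -msse2. The exact flag set used for these builds is not given in this thread, so what follows is only a minimal sketch of how a program could report which level it was compiled for. The function name compiled_sse_level is made up for illustration; the macros are the usual predefined ones from MSVC and GCC/Clang.

```cpp
// Hypothetical helper (not from the FS2 Open source): reports the SSE level
// the compiler was told to target when this translation unit was built.
// Returns 0 (no SSE code generation), 1 (SSE), or 2 (SSE2 or better).
constexpr int compiled_sse_level()
{
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return 2;   // GCC/Clang -msse2, any x64 MSVC build, or MSVC /arch:SSE2
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP == 1)
    return 1;   // GCC/Clang -msse or MSVC /arch:SSE
#else
    return 0;   // plain build
#endif
}
```

A debug log could print this value at startup so testers know which kind of build produced the log.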
Title: Re: Regarding SSE Builds
Post by: jr2 on September 14, 2009, 03:22:44 pm
Quote from: Aardwolf
K, I'll start things off. What are they?

http://en.wikipedia.org/wiki/SIMD

Now you have a history of SIMD:

http://en.wikipedia.org/wiki/MMX_(instruction_set) (t.y. Intel)
http://en.wikipedia.org/wiki/3dnow (t.y. AMD)
http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions  (plain ol' SSE) (t.y. Intel)
http://en.wikipedia.org/wiki/SSE2 (t.y. Intel)
http://en.wikipedia.org/wiki/SSE3 (t.y. Intel)
http://en.wikipedia.org/wiki/SSSE3 (t.y. Intel)
http://en.wikipedia.org/wiki/SSE4 (t.y. Intel)
http://en.wikipedia.org/wiki/XOP_instruction_set (t.y. AMD)
http://en.wikipedia.org/wiki/FMA_instruction_set (t.y. AMD)
http://en.wikipedia.org/wiki/CVT16_instruction_set (t.y. AMD)
http://en.wikipedia.org/wiki/Advanced_Vector_Extensions (t.y. Intel and AMD)
http://en.wikipedia.org/wiki/SSE5 (t.y. AMD)
Title: Re: Regarding SSE Builds
Post by: Sushi on September 14, 2009, 03:31:28 pm
Yikes, Wall o' Wiki!

Practical upshot is that SSE/SSE2 builds theoretically can improve your FPS.
Title: Re: Regarding SSE Builds
Post by: Topgun on September 14, 2009, 05:27:33 pm
then why doesn't everyone use them?
Title: Re: Regarding SSE Builds
Post by: The E on September 14, 2009, 05:37:53 pm
Because, until very recently, they have not been made regularly.
And they have a noticeable effect only in situations where the CPU is bottlenecking your performance.
Title: Re: Regarding SSE Builds
Post by: Herra Tohtori on September 14, 2009, 07:18:22 pm
Quote from: The E
Because, until very recently, they have not been made regularly.
And they have a noticeable effect only in situations where the CPU is bottlenecking your performance.

...Which, for most users with a semi-recent GPU (GeForce 8 series or newer - even high-end 7 series models tend to do pretty well), is the limiting factor in most cases of slowdowns.

Particles and collision detection, at least, are done on the CPU, so you should see better performance with explosions, beams, weapons that spawn particles, complex models, very numerous models and very numerous weapon blobs flying around. Probably some other things that I have no idea about should also run faster with the newer instruction set support.

Basically it allows the game to use more advanced features of the central processing unit, which should allow faster data processing.
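
To illustrate what the wider instructions buy you, here is a minimal sketch of a particle-style position update that handles four floats per instruction when SSE is available and falls back to a plain loop otherwise. This is not FS2 Open code: advance_positions and the flat float arrays are hypothetical, and the SSE builds discussed here rely on the compiler's code generation rather than hand-written intrinsics like these.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   // SSE intrinsics (__m128, _mm_add_ps, ...)
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical helper: pos[i] += vel[i] * dt for n floats.
void advance_positions(float* pos, const float* vel, float dt, std::size_t n)
{
    std::size_t i = 0;
#if SKETCH_HAS_SSE
    const __m128 vdt = _mm_set1_ps(dt);            // dt copied into all four lanes
    for (; i + 4 <= n; i += 4) {
        __m128 p = _mm_loadu_ps(pos + i);          // load four positions (unaligned is fine)
        __m128 v = _mm_loadu_ps(vel + i);          // load four velocities
        p = _mm_add_ps(p, _mm_mul_ps(v, vdt));     // four multiply-adds at once
        _mm_storeu_ps(pos + i, p);
    }
#endif
    // Scalar loop: handles the leftover elements, or everything without SSE.
    for (; i < n; ++i) {
        pos[i] += vel[i] * dt;
    }
}
```

Whether this kind of change shows up as extra FPS depends on the CPU actually being the bottleneck, as noted above.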


What sort of benchmarks have been done already? Massive Battle with FRAPS benchmarking the FPS? What sort of results do the SSE2 builds actually deliver in numbers (and how do other optimizations affect things)?
Title: Re: Regarding SSE Builds
Post by: Zacam on September 15, 2009, 10:38:26 am
Another thing to note about the potential benefit of SSE vs Regular builds is that, while the FPS gain may not be large (Mura, for example, might see an average gain of 3-4 FPS), the overall _feel_ should be slightly smoother, and its recovery from events that would drop FPS should be faster.

This should also hopefully result in smoother, more acceptable gameplay even if the overall FPS increase is minimal, since FPS is also limited by how capable the GPU is. For myself, because the CPU was able to do more (and I already had a beefy GPU), my FPS increase was fairly significant.
Title: Re: Regarding SSE Builds
Post by: Sushi on September 15, 2009, 11:53:19 am
So when are these optimizations to SSE builds I've been hearing about going to make it into trunk?
Title: Re: Regarding SSE Builds
Post by: chief1983 on September 15, 2009, 01:59:25 pm
I wasn't aware of any optimizations to the SSE builds themselves, just that there have been some SSE-enabled (and therefore optimized) builds made available recently.
Title: Re: Regarding SSE Builds
Post by: Zacam on September 16, 2009, 09:38:38 pm
Actually, SSE as a build type has been available in the MSVC_2008 project for a while. But these builds are based on a private selection of flag and option changes to those projects that do increase SSE/SSE2 performance over what the current public settings produce.

And they will make it into trunk as soon as I am satisfied that no other possible combinations exist to enhance performance, and once enough testing has taken place to prove the stability of these builds.

I'll then submit the .patch and the additional files I have placed into my working dir for consideration. Should nobody find any conflicting settings that the compiler is too stupid to catch and that I'm not aware of, it'll likely make it into a trunk commit, hopefully before too long.
Title: Re: Regardin' SSE Builds
Post by: Aardwolf on September 19, 2009, 04:19:45 pm
I shall try this and see if it even works on me computer.   :p

Edit (stupid pirate script is making me de-piratify my original post):
Code:
Assert: !resize
File: gropengltexture.cpp
Line: 622

<no module>! KiFastSystemCallRet
<no module>! WaitForSingleObject + 18 bytes
<no module>! SCP_DumpStack + 260 bytes
<no module>! WinAssert + 208 bytes
<no module>! opengl_create_texture_sub + 2711 bytes
<no module>! opengl_create_texture + 998 bytes
<no module>! gr_opengl_tcache_set_internal + 217 bytes
<no module>! gr_opengl_tcache_set + 137 bytes
<no module>! opengl_render_pipeline_fixed + 1895 bytes
<no module>! gr_opengl_render_buffer + 232 bytes
<no module>! gr_render_buffer + 58 bytes
<no module>! model_render_buffers + 1947 bytes
<no module>! model_really_render + 2816 bytes
<no module>! model_try_cache_render + 55 bytes
<no module>! model_render + 676 bytes
<no module>! labviewer_render_model + 2253 bytes
<no module>! labviewer_do_render + 147 bytes
<no module>! lab_do_frame + 149 bytes
<no module>! game_do_state + 1453 bytes
<no module>! gameseq_process_events + 237 bytes
<no module>! game_main + 728 bytes
<no module>! WinMain + 330 bytes
<no module>! __tmainCRTStartup + 358 bytes
<no module>! WinMainCRTStartup + 15 bytes
<no module>! RegisterWaitForInputIdle + 73 bytes

Dunno if it's related to the SSE(2) tho. I'll try with a normal build and compare.
Title: Re: Regardin' SSE Builds
Post by: The E on September 19, 2009, 04:31:38 pm
Is there a full debug log?
Title: Re: Regardin' SSE Builds
Post by: Aardwolf on September 19, 2009, 04:33:39 pm
I just mantis'd this, it's issue #1994. I'll upload a debug log there in a minute.
Title: Re: Regardin' SSE Builds
Post by: Zacam on September 19, 2009, 04:34:43 pm
The hell?

That is, uh, rather interesting. Does that happen to both SSE builds (SSE and SSE2)? Is it Debug or Release?
Title: Re: Regardin' SSE Builds
Post by: Aardwolf on September 19, 2009, 04:37:35 pm
I don't think it's actually SSE-related. It seems to just be debug builds in general.

Here's the link to the issue on Mantis: clicky (http://scp.indiegames.us/mantis/view.php?id=1994)
Title: Re: Regarding SSE Builds
Post by: Zacam on September 19, 2009, 05:41:18 pm
Seems to be the CubeMap.

Previous versions of the cubemap were mip-mapped. WoolieWool's cubemap is not.

I resaved the cubemap with mip-mapping, since I think the error is in the !resize check when trying to create the mipmaps dynamically through the -mipmap flag. That is strange, though, because if that were the case it should be barfing on a lot of the other non-mip-mapped effects files.

Try this cubemap: http://www.mediafire.com/file/mqo01mgjlyn/CubeMap.dds

Place it in MediaVPs\data\effects. If that fixes the problem, I'll commit it to the MediaVPs SVN.
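
As background on the mip-mapping point: a complete mip chain keeps halving the texture until the larger dimension reaches 1, and a DDS saved with mipmaps already carries all of those levels, so nothing has to be generated at load time. A minimal sketch of the level count follows. full_mip_levels is a hypothetical helper, not a function from the FS2 Open source, and the sizes are only examples.

```cpp
#include <algorithm>

// Hypothetical helper: number of levels in a full mip chain for a
// width x height texture, counting the base image as level 1.
int full_mip_levels(int width, int height)
{
    int size = std::max(width, height);
    int levels = 1;            // the base image itself
    while (size > 1) {
        size /= 2;             // each level halves the larger dimension
        ++levels;
    }
    return levels;
}
```

For example, full_mip_levels(256, 256) gives 9 (256, 128, 64, 32, 16, 8, 4, 2, 1).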