Hard Light Productions Forums
Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Topic started by: Nuke on August 16, 2013, 05:38:25 pm
-
i was bored the other day so i decided to thumb through the complete instruction set for my cpu. i noticed a set of extensions that i never heard of before, the advanced vector extensions. these are supposed to supersede the sse2 instructions that fso is usually built to support. it adds some cool stuff, like non-destructive operations, wider registers (256 bits, so eight floats per instruction instead of four), etc. i was curious if we would see any kind of performance improvement by supporting it, especially in float happy things like collision detection.
http://en.wikipedia.org/wiki/Advanced_Vector_Extensions
-
http://www.virtualdub.org/blog/pivot/entry.php?id=347
It seems only Intel's compiler has built-in support for multiple code paths that let several SIMD instruction sets co-exist within the same executable, and even that only works on Intel CPUs. It would be a daunting task for SCP to support more than two sets of executables, not to mention that simply switching the instruction set from SSE2 to AVX often won't magically make the code faster. Sufficient manual optimization would have to be done to fully utilize newer instruction sets, especially AVX. Otherwise you may actually see worse performance from AVX if you leave it to the compiler's automatic optimization.
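For reference, multiple code paths can also be rolled by hand. Below is a minimal sketch, assuming GCC or Clang on x86 (it relies on the `target` function attribute and `__builtin_cpu_supports`, which MSVC does not have). The function names and the array-sum kernel are made up for illustration and are not taken from the FSO code base. On any other compiler or CPU the sketch simply falls back to the plain C++ path.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: GCC or Clang targeting x86. Anything else gets the scalar path only.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_AVX_PATH 1
#include <immintrin.h>
#else
#define HAVE_AVX_PATH 0
#endif

// Baseline path: plain C++, runs on any CPU.
static float sum_scalar(const float* v, std::size_t n) {
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += v[i];
    return s;
}

#if HAVE_AVX_PATH
// AVX path: only this one function is compiled with AVX enabled,
// so the rest of the executable stays safe for older CPUs.
__attribute__((target("avx")))
static float sum_avx(const float* v, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)                       // eight floats per step
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(v + i));
    float lanes[8];
    _mm256_storeu_ps(lanes, acc);
    float s = 0.0f;
    for (int k = 0; k < 8; ++k) s += lanes[k];       // horizontal sum of the lanes
    for (; i < n; ++i) s += v[i];                    // leftover elements
    return s;
}
#endif

typedef float (*sum_fn)(const float*, std::size_t);

// Pick an implementation based on what the running CPU reports.
static sum_fn pick_sum() {
#if HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) return sum_avx;
#endif
    return sum_scalar;
}

// Public entry point (hypothetical name): the choice is made once, on first call.
float sum_floats(const float* v, std::size_t n) {
    assert(v != nullptr || n == 0);
    static const sum_fn impl = pick_sum();
    return impl(v, n);
}
```

Callers only ever see `sum_floats`, so the support burden is one executable with two code paths rather than two executables. The catch from above still applies: every hand-written path is another place where a bug can show up on one arch and not another.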
This document may also be of interest: http://www.agner.org/optimize/vectorclass.pdf
As FSO is, and will probably remain, single-threaded for the most part, support for higher SIMD instruction sets would be nice for getting more performance out of our new(er) CPUs. But each instruction set needs at least some degree of manual optimization for optimal performance; it's not quite as simple as throwing /arch:SSE4 or /arch:AVX at the compiler and relying on automatic optimization. And this would probably create a support nightmare where bugs are present in one arch but not another, because the manual optimizations differ from one arch to the next.
I could probably see SCP ditching SSE2 support in favor of SSE3. SSE4 would be appealing, but Intel and AMD both played it stupid by creating the SSE4, SSE4a, SSE4.1 and SSE4.2 instruction sets, and CPU support is spotty at best. I suspect SSE4 would be a support nightmare because of that. If SCP is willing to support three sets of executables, SSE, SSE3 and AVX would be good choices. But again, how much work does that really entail on the manual optimization front?
-
Actually, at least for Visual Studio 2012, it is as simple as putting /arch:AVX in there, since the MS compiler can optimize for AVX automatically.
That still doesn't mean much in terms of performance though, since manual optimisation is still needed.
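To see what a given build was actually compiled for, the predefined macros can be checked. A minimal sketch: MSVC defines `__AVX__` under /arch:AVX (and `__AVX2__` under /arch:AVX2), and GCC/Clang define the same macros under -mavx and -mavx2. The function names here are made up for illustration. Note that this only reports the compiler flags, not what the CPU running the executable supports.

```cpp
#include <cassert>
#include <cstring>

// Returns the widest SIMD level this translation unit was compiled for.
// This reflects compiler flags (e.g. /arch:AVX or -mavx), not the running CPU.
static const char* compiled_simd_level() {
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "SSE2";   // x64 builds always have SSE2 as the baseline
#elif defined(__SSE__) || (defined(_M_IX86_FP) && _M_IX86_FP == 1)
    return "SSE";
#else
    return "none";
#endif
}

// True if the build itself assumes AVX, i.e. it may crash on CPUs without it.
static bool built_with_avx() {
    const char* lvl = compiled_simd_level();
    return std::strcmp(lvl, "AVX") == 0 || std::strcmp(lvl, "AVX2") == 0;
}
```

Something like this could go into a debug log line, so that bug reports state which build of the executable was being run.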
-
Quote: "Actually, at least for Visual Studio 2012, it is as simple as putting /arch:AVX in there, since the MS compiler can optimize for AVX automatically."
I am aware of that, however I was referring to this: http://software.intel.com/sites/default/files/m/d/4/1/d/8/11MC12_Avoiding_2BAVX-SSE_2BTransition_2BPenalties_2Brh_2Bfinal.pdf
I am unsure whether /arch:AVX would be sufficient to properly optimize the code for AVX without manual work involved.
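To illustrate the kind of manual work the Intel paper is about: when hand-written 256-bit AVX code hands control back to code that uses legacy (non-VEX) SSE instructions, the upper halves of the YMM registers should be cleared first, or the CPU pays a costly state transition. A minimal sketch, assuming GCC or Clang on x86, with made-up function names. Compilers usually insert this instruction themselves in code they generate for AVX, so the explicit call is mostly for illustration and matters most around hand-written assembly or libraries built for SSE only.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: GCC or Clang targeting x86. Anything else gets the scalar path only.
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_AVX_PATH 1
#include <immintrin.h>
#else
#define HAVE_AVX_PATH 0
#endif

// Baseline path: plain C++, runs on any CPU.
static void scale_scalar(float* v, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) v[i] *= k;
}

#if HAVE_AVX_PATH
// AVX path: multiplies eight floats at a time.
__attribute__((target("avx")))
static void scale_avx(float* v, std::size_t n, float k) {
    const __m256 factor = _mm256_set1_ps(k);
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)
        _mm256_storeu_ps(v + i, _mm256_mul_ps(_mm256_loadu_ps(v + i), factor));
    for (; i < n; ++i) v[i] *= k;   // leftover elements
    // Emits VZEROUPPER: clears the upper 128 bits of all YMM registers so that
    // legacy SSE code running after this function does not hit the penalty.
    _mm256_zeroupper();
}
#endif

// Hypothetical entry point: uses AVX when the running CPU reports it.
void scale(float* v, std::size_t n, float k) {
    assert(v != nullptr || n == 0);
#if HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) {
        scale_avx(v, n, k);
        return;
    }
#endif
    scale_scalar(v, n, k);
}
```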
Also, with regard to Intel's compiler: http://www.polyhedron.com/web_images//intel/productbriefs/3a_SIMD.pdf
-
The Intel compiler is basically useless to us, since we do not have the 1600 USD that even a single-user license costs (and we'd have to buy it twice, once for Win and once for Linux, and we'd have to make sure MacOS is still supported).
-
Yes, I am aware of that. The stuff related to the Intel compiler was posted for information purposes.