Hard Light Productions Forums

Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Topic started by: kgersen on July 26, 2011, 03:46:46 am

Title: Collision detection performance
Post by: kgersen on July 26, 2011, 03:46:46 am: Hi there,

I noticed in your master sticky post that collision detection has performance issue. Is this still true ?

I managed to download the code and build it. But I don't have much time to actually learn and play the game.

So I was wondering if you have any special mod or special missions (or benchmarks or whatever your call them) to stress test the code and to do profiling because launching the game and doing training mission #1 takes time and yield not much performance data...

Thx
Title: Re: Collision detection performance
Post by: The E on July 26, 2011, 04:49:18 am: Any mission that has a lot of stuff flying around will do. Artemis, Aristeia and Delenda Est (All from Blue Planet 2) for example will show this.
Title: Re: Collision detection performance
Post by: kgersen on July 26, 2011, 11:49:13 am: Artemis, you mean something like Battle of Artemis Station (http://www.moddb.com/mods/blue-planet-war-in-heaven/videos/battle-of-artemis-station) ? (I just Googled to find this).

if so, how can I access this mission directly when launching the game in profiling mode. Any game parameter to directly go there ?

my problem is that I can't and don't have time to play the mod from start to reach that specific mission.
Title: Re: Collision detection performance
Post by: The E on July 26, 2011, 11:50:32 am: Load the game with Blue Planet 2 installed. Select the BP2 campaign in the campaign room. Go to the techroom, mission simulator. Press Ctrl-Shift-S. It will be the first mission.
Title: Re: Collision detection performance
Post by: Dragon on July 26, 2011, 11:52:54 am: Actually, if you start BP2 campaign, you'll immidiately be taken to Artemis cutscene, since it's the opening.
Title: Re: Collision detection performance
Post by: Valathil on July 26, 2011, 12:19:16 pm: Besides we already know whats the problem with the collsion detection. Its a probable memory alignement issue in the bsp_data structure or caching issue combined with a massive cache miss issue in destroying collision pairs when there are lot of objects created and destroyed like many missiles. Swifty's working on something i heard
Title: Re: Collision detection performance
Post by: kgersen on July 26, 2011, 12:43:58 pm: thx a lot everyone, i'll try that.

@Valathil, I'm more here to study FS2 code rather than actually fix or optimize anything. At least for now since I've just downloaded the code yesterday...anything more so soon would be pretentious. But i'll keep you posted if I find anything.

to say it all, I come from the Free Allegiance (http://www.freeallegiance.org)development team (my handle is KGJV or TheCount) and I've been coding and modding its C++ engine for 10 years now. I'm currently fishing for ideas or code to improve some parts of Allegiance or eventually "merge" Allegiance within another similar game. FS2 SCP is one candidate for this. Looking at the collision code is just a convenient way to start my study.
Title: Re: Collision detection performance
Post by: Iss Mneur on July 26, 2011, 03:13:13 pm: Quote from: kgersen on July 26, 2011, 12:43:58 pm
to say it all, I come from the Free Allegiance (http://www.freeallegiance.org)development team (my handle is KGJV or TheCount) and I've been coding and modding its C++ engine for 10 years now. I'm currently fishing for ideas or code to improve some parts of Allegiance or eventually "merge" Allegiance within another similar game. FS2 SCP is one candidate for this. Looking at the collision code is just a convenient way to start my study.
Interesting. I have never heard of that one before, I will have to check it out when I have more time. Is its source available as well?

Either way, keep in mind what the FSO license is.
Title: Re: Collision detection performance
Post by: kgersen on July 26, 2011, 03:44:53 pm: Quote from: Iss Mneur on July 26, 2011, 03:13:13 pm
Quote from: kgersen on July 26, 2011, 12:43:58 pm
to say it all, I come from the Free Allegiance (http://www.freeallegiance.org)development team (my handle is KGJV or TheCount) and I've been coding and modding its C++ engine for 10 years now. I'm currently fishing for ideas or code to improve some parts of Allegiance or eventually "merge" Allegiance within another similar game. FS2 SCP is one candidate for this. Looking at the collision code is just a convenient way to start my study.
Interesting. I have never heard of that one before, I will have to check it out when I have more time. Is its source available as well?

Either way, keep in mind what the FSO license is.

yes the source is fully available (VS 2008 or 2010 C++, Windows only, DirectX 9.0 gfx and DirectPlay network code).
Microsoft released the source a few years ago with a special license (worst than FSO license since everything you do can be reused by M$...).
Check our wiki 1st if you're interested.
Title: Re: Collision detection performance
Post by: Sushi on July 26, 2011, 05:17:18 pm: Quote from: kgersen on July 26, 2011, 12:43:58 pm
to say it all, I come from the Free Allegiance (http://www.freeallegiance.org)development team (my handle is KGJV or TheCount) and I've been coding and modding its C++ engine for 10 years now. I'm currently fishing for ideas or code to improve some parts of Allegiance or eventually "merge" Allegiance within another similar game. FS2 SCP is one candidate for this. Looking at the collision code is just a convenient way to start my study.

KGJV! Fun to see a familiar name. :)
Title: Re: Collision detection performance
Post by: Swifty on July 26, 2011, 05:42:34 pm: I'm currently working on collision detection performance issues ATM. If you have any questions regarding the collision detection code, feel free to post or PM questions.
Title: Re: Collision detection performance
Post by: Parias on July 27, 2011, 03:34:53 pm: Quote from: kgersen on July 26, 2011, 12:43:58 pm
I'm currently fishing for ideas or code to improve some parts of Allegiance or eventually "merge" Allegiance within another similar game. FS2 SCP is one candidate for this. Looking at the collision code is just a convenient way to start my study.

Wow, I was a huge, hardcore Allegiance fan back in the day - something like this would be awesome. I hope you succeed in your efforts.
Title: Re: Collision detection performance
Post by: Sushi on July 27, 2011, 03:38:55 pm: Quote from: Parias on July 27, 2011, 03:34:53 pm
Quote from: kgersen on July 26, 2011, 12:43:58 pm
I'm currently fishing for ideas or code to improve some parts of Allegiance or eventually "merge" Allegiance within another similar game. FS2 SCP is one candidate for this. Looking at the collision code is just a convenient way to start my study.

Wow, I was a huge, hardcore Allegiance fan back in the day - something like this would be awesome. I hope you succeed in your efforts.

It would indeed be cool.

TBH, though, I think you'd have better luck writing a completely new game engine instead of trying to merge two fundamentally different decade-old games (the only thing in common, really, is the space setting).
Title: Re: Collision detection performance
Post by: karajorma on July 29, 2011, 04:44:36 am: Porting the features of Allegiance to FSO until they can run the game on our engine might work though.
Title: Re: Collision detection performance
Post by: Bobboau on July 30, 2011, 08:10:28 am: has anyone profiled the engine lately? do we know where exactly the performance hit is? is it in the collision detection proper (collide point with bsp, bsp with bsp, etc) or is it in the pair generation/collision culling section?
Title: Re: Collision detection performance
Post by: Swifty on July 30, 2011, 04:13:29 pm: I've lately been working on a sort and prune solution using quicksort such that the engine won't have to go through all pairs to brute force collision queries. It's a lot faster than iterating through all n^2 pairs but there are still some bugs with the diff. I'm putting off sharing it until I can clean it up.

We've also seen a significant portion of frametime going into the initial submodel bounding box checks. Submodels redundantly had their orientations and local offsets computed for each collision. A patch I made, which only does it per frame, has been committed into the code base.

Most of the bottleneck is now coming from the actual BSP queries. I guess for smaller models with low poly counts are fine but way bigger models, we end up getting a huge spike. Valathil did some profiling of the BSP queries as did I. Most of the bottleneck from model_collide_sub is not coming from the meat of the process (loading of points, triangle tests, or the bounding box tests) but just from accesses of the p pointer. I also wrote an non-recursive version of model_collide_sub thinking that the performance issues were coming from the function call overhead. I got the same results there.

Valathil's hunch was that it was a memory alignment issue. Based on that, I rewrote model_collide_sub and wrote a parser for bsp_data that would load all the data into well-defined structs to negate that problem. It works and calculates collision 100% correctly. The bad news is that it performs just as badly.

I'm therefore beginning to suspect that it's not a memory alignment issue but a cache issue. I think cache misses that are killing our performance in model_collide_sub, especially if you have models with huge ass triangle counts. It makes sense too. We're always bouncing all around the tree depth first search-style looking for intersecting bounding boxes and triangles.

So I think I might take advantage of the parser I wrote to play around with organizing the chunk data differently to find out which data storage layouts are better cache friendly (while keeping the actual tree structure intact). I might also combine that with different tree traversal strategies. All the while I've been reading some white papers on cache oblivious algorithms and data structures to give me some better understanding on the problem.
Title: Re: Collision detection performance
Post by: Bobboau on July 30, 2011, 05:14:16 pm: well, even if it does not improve performance, you should DEFINITELY include the parsed version of the BSP code, this problem will not get fixed with unreadable gobldygook like the dereferenced pointer dynamic data structure crap. So now you have a nice data structure rather than a massive data buffer, hopefully you have separated the polygons from the sort norms, sortnorms is where most of the movement happens, so if you can condense that, it will help with cache issues. how is the BSP tree constructed now? I'm guessing that you are not using new to allocate new sort norms, but rather allocating a solid block of them and pointing them at each other, I think the best way to organize such a tree would be with a heap like architecture, keep children of the same node together, this would require a breadth first construction, but the depth first reading would end up not moving around as much when it was actually getting run through. do keep in mind the tree depth is log(n) of the poly count, a 50,000 poly model might only have a tree depth of 16, PCS2 (should) be working at keeping this tree balanced so there are the same number of polies on either side of any given branch.

this might be an area where multi-threading could be applied. if you break the list of possible collisions apart into n sublists (where n is the number of CPUs) it might be possible to run each on it's own thread, you might have to completely rewrite the collision code to get this to work properly though.
Title: Re: Collision detection performance
Post by: kgersen on July 31, 2011, 05:02:33 am: I'm still discovering FS architecture & code so I can't comment more on it yet but I can talk about how Allegiance handles collision detection.

Allegiance being a multiplayer only game with a dedicated server, only the server actually computes the collisions. To do this efficiently on 10 years ago hardware and allowing up to 200 players per game, they had to really optimize the collision code. Even more since there is no 'beam' weapon but only 'particle' weapons firing bullets (called projectiles) each been handled as a distinct object in the game logic code (and so in the collision code too).

The 1st thing they did is to not use the 3D models directly but instead use specially computed models made of convex hulls (http://en.wikipedia.org/wiki/Convex_hull). These convex hulls use variable n-polygons instead of triangles and are convex (orly?) which speed up greatly collision detection.
At compile time, the 3D models are transformed using QHull (http://www.qhull.org/) and a CVH (convex hull) file is generated for each 3D model file.
Also instead of using a classical bounding box, the CVH files include a bounding ellipsoid of each object, optimizing even more the initial computation (that same bounding ellipsoid is also used for energy shields).

Then the code uses and updates a K-D tree (http://en.wikipedia.org/wiki/K-d_tree) to store the objects and compute the collisions more efficiently as illustrated here (http://screamyguy.net/collision/index.htm).
The collision code only uses CVH files and not the more complex 3D model files.

so to sum up , current optimizations are:

- bounding ellipsoids instead of bounding boxes
- special pre-computed 'convex hulls' models instead of the 3D gfx models
- k-d tree structure to store and sort the objects, updated every frame

There are more optimizations beyond that but they are more Allegiance specific (like 'sectoring' and separating moving and non moving objects).
Title: Re: Collision detection performance
Post by: Bobboau on July 31, 2011, 10:45:58 am: I considered suggesting the convex hull optimization, but I assumed it would require a major rewrite of the game media, this QHull (library?) intrigues me...

I also started describing something very similar to a k-d tree, but then I realized that addresses pair generation, and that is apparently not currently where most of the slowdown occurs.
Title: Re: Collision detection performance
Post by: Luis Dias on August 09, 2011, 03:44:28 pm: Convex hulls wouldn't work in FS ships. Consider the Collossus

EDIT: You could however, without much media rewrite fuss, use the other simplified meshes from the same ships to calculate impacts, instead of the full mesh (I don't know if FSO already does this)
Title: Re: Collision detection performance
Post by: Nuke on August 09, 2011, 08:02:42 pm: i always did like the collision mesh idea. in fact it is possible to them now with existing techniques and no code changes. but its something that has to be done mod side. could imagine it being done to the mediavps as a whole, though im not sure if it would have any unforeseen side effects. now if pcs2 would set this up automatically given a subobjectname-collision for each peice of ship, that would be awesome. as it stands you have to play with a bunch of subobject properties.
Title: Re: Collision detection performance
Post by: Swifty on August 09, 2011, 08:56:24 pm: I actually have a patch that does exactly that. You would just add a submodel to a POF in PCS2 with a -collision suffix and FSO will do a collision query against that model instead of the real one.

I whipped it up for Diaspora. We tried using it but it was demanded too much work from our artists to get working. I'm sure other people in the community would appreciate it but it's kinda buggy at the moment.
Title: Re: Collision detection performance
Post by: Nuke on August 09, 2011, 10:27:47 pm: it works and i like it, but it seems a little bit like a hack at worst and at best not an optimal solution for the long term. my concern here is that a lot of unnecessary data would be contained in the pof file/model data structure. a collision mesh doesnt need texture and uv mapping info, and a non-collidable mesh does not need to have complete collision data. all subobjects are essentially treated the same and you currently only apply flags to disable features. any data generated would essentially be wasted space.

a proper solution would be a newer optimized pof chunk formats and optimized model data structure. like a unified object class that contains for each subobject, visible geometry (with texture data, uv space(s), ibxes, lod stack, etc), collision geometry (polygonal object, collision trees, etc), the class would also contain a header containing hierarchy info and transform data, radius, bounding box, etc. this class is stored in the pof as a new chunk of course the actual structure would need to be dependent on how the innards of the engine actually work. all data but the header would be optional of course, you may want non-collideable geometry like fighter bay force fields, you may want to omit everything and use the object as a node for animation purposes, so all that you have is the header data.

another idea to consider is perhaps to make the model format its own library and have it shared with pcs2, so that when you make changes to the model format, you just update the library and you just have to rebuild pcs2 for compatibility. for artisitc load i dont see making a colision mesh any more complicated than an lod. just a one minute poly chop job. im sure many modders would just want you to use the second lod as a collision mesh. much of this would be a huge workload, but it might pay off in the long run. the way i see it the current model format is somewhat of a indifference to both rendering and collision detection. we just need to maintain compatibility with the old format for reverse compatibility, and its well suited to retail data and old pre-htl mods. but to run the engine with what it could do now, an optimized model format would improve performance on all fronts.
Title: Re: Collision detection performance
Post by: Bobboau on August 09, 2011, 11:05:36 pm: Quote from: Luis Dias on August 09, 2011, 03:44:28 pm
Convex hulls wouldn't work in FS ships. Consider the Collossus

you use more than one in a composite.
Title: Re: Collision detection performance
Post by: jr2 on August 11, 2011, 03:19:34 pm: @This whole idea: how would you detect collisions on subsystems and turrets then?
Title: Re: Collision detection performance
Post by: Bobboau on August 11, 2011, 09:04:25 pm: same way you do now, those distinctions are not effected by the low level stuff.
Title: Re: Collision detection performance
Post by: Nuke on August 11, 2011, 09:06:24 pm: Quote from: jr2 on August 11, 2011, 03:19:34 pm
@This whole idea: how would you detect collisions on subsystems and turrets then?

collision detection involves check every object in the game with every other object for collisions. it would be slow to do full polygon-accurate tests on such a scale (the maths are mind/cpu boggling) so what you do is start with a simple test. check to see if the raddi of various objects intersect. a radius test is far cheaper than checking otrees or convex hull or whatever system you have in place (iirc freespace uses bsp algorithms?). but what it does is culls out all the objects that really are not likely to collide. you get a much shorter list of objects which need more scrutiny. so you do a more expensive test with those to cull the list further, check for bounding box collisions, and toss further objects from the list. so that when you start doing mathematically expinsive tests, the list of objects is tiny, maybe 2 or 3 objects at the most. thats the theory anyway. freespace handles collisions between certain kinds of objects differently. weapons on weapon collisions only really care about the radius test, where as ship on ship collisions require a lot more work to look right.

we still consider submodels as parts of the ship so they are only tested for collisions after the ship has tested positive for potential collision. way i would do things is to treat every submodel in the game as a potential collision object, with its own radius, bbox and transform, this might seem slower at first because youre making the object list bigger. but what your doing is eliminating a lot of the list in early radius checks. so if a ship has 50 turrets, and your checking them against an incoming bomb. instead of having to recursively test all of those turrets after the ship tests positive, you might get the ship hull, one turret and one turret barrel in the list, and the rest of the turrets can be ignored. so fewer objects to run high precision collision tests on. if you get multiple positives then the closer collision position vector wins.

this kind of mentality could also be applied to rendering, where submodels get depth sorted and loded based on their distance from the camera and not their parent's distance (so when staring at the front end of a 6km capitol ****, the 2k poly tail gun 6klicks away doesnt render in full detail). it would also make subobject animation better because it would allow translating through the extra xform data in the submodel. this is why id like to see new object formats in our pof files. of course im not holding my breath on this. the number of changes to the engine to do something like this is huge. thousands of coder hours. but if i were writing a new game engine, that is how i would do things.
Title: Re: Collision detection performance
Post by: Bobboau on August 11, 2011, 09:12:50 pm: "freespace uses bsp algorithms"

technically it's sort of a BSP/octree hybrid, but yeah basically.

"way i would do things is to treat every submodel in the game as a potential collision object, with its own radius, bbox and transform"
problem with that is you just made the first list of possible collisions not 50 items longer, but 50 TIMES larger, that first test is n^2 complexity, if you have a static position relationship between a collection of objects, you can cull a huge swath of tests by testing the collection rather than every individual object independently, especially when in the vast majority of cases there will be no collision.
Title: Re: Collision detection performance
Post by: Nuke on August 11, 2011, 10:27:46 pm: surely there are other less expensive tests you can do to cull colliding objects from collision lists. like objects that did not collide in the last frame and are moving away from eachother can be eliminated. you can direction eliminate things, ake velocity and make a huge plane prependicular to it, and cull everything behind that plane. if something was heading up from behind it would detect collision with you when its turn to be tested came up. coplexity will always be n^2 so your best bet is to make n smaller with trivial operations. then make the non-trivial operations simpler by removing the need to consider dynamics of child objects.
Title: Re: Collision detection performance
Post by: Bobboau on August 11, 2011, 10:42:13 pm: the test isn't the expensive part it's the 'touching them at all' aspect that is the killer no matter how simple and fast a test is you will still have n^2 complexity, which will overpower any optimization you put into it, unless you can avoid testing every object with every other object, which you can't when they can at any moment turn around and hit something completely different.
Title: Re: Collision detection performance
Post by: Sushi on August 12, 2011, 02:05:34 am: Quote from: Nuke on August 11, 2011, 10:27:46 pm
surely there are other less expensive tests you can do to cull colliding objects from collision lists. like objects that did not collide in the last frame and are moving away from eachother can be eliminated. you can direction eliminate things, ake velocity and make a huge plane prependicular to it, and cull everything behind that plane. if something was heading up from behind it would detect collision with you when its turn to be tested came up. coplexity will always be n^2 so your best bet is to make n smaller with trivial operations. then make the non-trivial operations simpler by removing the need to consider dynamics of child objects.

There actually already are a number of optimizations that cull collision pairs exactly like you describe. They cause some grief of their own, though. :)
Title: Re: Collision detection performance
Post by: Swifty on August 12, 2011, 12:54:54 pm: The best optimization for broad phase culling is probably sort and sweep or object k-d trees. In terms of what's easier to implement, I chose sort and sweep in my current optimization build. As I said previously in the thread, it lessened the broad phase culling overhead but it still has some pretty noticeable bugs prompting me to give it some more time in the oven.

Treating each submodel as it's own object just doesn't make any sense. You already have a basic test to cull out all submodels in a given set and that's the radius test for an object. You're not going to make collision detection any faster from eliminating the broad to narrow hierarchy. That's like saying traversing down an k-d tree is too expensive so let's just test all the triangles.

Besides, the place where we figure out which submodels should be tested by doing quick outs on their bounding boxes, mc_check_subobj, isn't very expensive anymore. I recently committed an optimization to that function that calculates the transformation of each subobject per frame instead of per collision query. Nuke's "optimization" wouldn't do anything but make things more expensive.
Title: Re: Collision detection performance
Post by: Luis Dias on August 12, 2011, 02:59:48 pm: Soo..... do you have any ideas going forward?
Title: Re: Collision detection performance
Post by: The E on August 12, 2011, 03:12:35 pm: He just told you what ideas he has, and what he is working on.
Title: Re: Collision detection performance
Post by: Swifty on August 12, 2011, 03:13:53 pm: Quote from: Luis Dias on August 12, 2011, 02:59:48 pm
Soo..... do you have any ideas going forward?

Did you even read any of my posts in this thread?
Title: Re: Collision detection performance
Post by: Luis Dias on August 12, 2011, 03:19:40 pm: Oh sorry, I thought you meant that you had an almost finished build. I was merely asking the academic question of "what ideas could be placed in the future", but now I see you are still worried about these bugs in the current solution. Sorry again.
Title: Re: Collision detection performance
Post by: Nuke on August 12, 2011, 10:42:32 pm: Quote from: Swifty on August 12, 2011, 12:54:54 pm
The best optimization for broad phase culling is probably sort and sweep or object k-d trees. In terms of what's easier to implement, I chose sort and sweep in my current optimization build. As I said previously in the thread, it lessened the broad phase culling overhead but it still has some pretty noticeable bugs prompting me to give it some more time in the oven.

Treating each submodel as it's own object just doesn't make any sense. You already have a basic test to cull out all submodels in a given set and that's the radius test for an object. You're not going to make collision detection any faster from eliminating the broad to narrow hierarchy. That's like saying traversing down an k-d tree is too expensive so let's just test all the triangles.

Besides, the place where we figure out which submodels should be tested by doing quick outs on their bounding boxes, mc_check_subobj, isn't very expensive anymore. I recently committed an optimization to that function that calculates the transformation of each subobject per frame instead of per collision query. Nuke's "optimization" wouldn't do anything but make things more expensive.

youve got the code figured out better than i do. only collision detection engine i did was 2d and that one was kinda buggy. i was just under the impression that it was faster to do a large number of radius checks than a small number of bbox or octree/bsp checks. but il deffer to your expertise on the matter.