Author Topic: Collision detection  (Read 11314 times)

0 Members and 1 Guest are viewing this topic.

I consider starting to work a little on the collision detection, a few questions:
1.)There's an old and a new system, which should I actually try to improve, which one is commonly used?
2.)As far as I can see both systems use a O(n²) algorithm, is there any specific reason why it wasn't changed to an octtree (which should be O(n log n) )?
3.) Can I use C++11 or should I avoid that?

 

Offline m!m

  • 211
I'll try to answer your questions as good as I can:
  • The old system is only there for troubleshooting purposes, you can ignore that and concentrate on the new system (it was written by Swifty so if you have questions you can ask him)
  • I can't speak for Swifty here but I'd guess that building an octree is not really an option as FS has a highly dynamic scene which would mean that you'd have to practically rebuild the octree very often
  • If you can get it to be compatible with older compilers it's fine (I know it's stupid but for now we need to keep it that way...)

 

Offline The E

  • He's Ebeneezer Goode
  • Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Regarding C++11: Code it as you want it. Then, after it's gotten to a state where it's a definitive improval over the old code, then we can worry about the old compilers.
If I'm just aching this can't go on
I came from chasing dreams to feel alone
There must be changes, miss to feel strong
I really need lifе to touch me
--Evergrey, Where August Mourns

 
Note that weapon blobs and beams will be popping in and out of existence all the time during battles.

I've also been doing some work in the area, but I haven't touched the actual logic for the collisions themselves. I'd be very interested in any performance improvements, particularly if they result in thread-safety.

 
Thanks for the answers so far. That the tree has to be adjusted each frame is normal, though the complexity is still only n log n if you completely rebuild it from scratch (for n elements, use a search - log n - to determine were it fits).
Of course there is some issue with constant factors, it's not only theory, but for a large amount of objects it should be worth it. I'd probably have to test if we ever get such a large amount of objects, but I'd guess yes.
I'm currently reading through the code to better understand how it works and in what way to change it.

 

Offline pecenipicek

  • Roast Chicken
  • 211
  • Powered by copious amounts of coffee and nicotine
    • Minecraft
    • Skype
    • Steam
    • Twitter
    • PeceniPicek's own deviantart page
Thanks for the answers so far. That the tree has to be adjusted each frame is normal, though the complexity is still only n log n if you completely rebuild it from scratch (for n elements, use a search - log n - to determine were it fits).
Of course there is some issue with constant factors, it's not only theory, but for a large amount of objects it should be worth it. I'd probably have to test if we ever get such a large amount of objects, but I'd guess yes.
I'm currently reading through the code to better understand how it works and in what way to change it.
look no further than the massive battle / the blueplanet2 version to see the engine crawl from collision detection among other things...
Skype: vrganjko
Ho, ho, ho, to the bottle I go
to heal my heart and drown my woe!
Rain may fall and wind may blow,
and many miles be still to go,
but under a tall tree I will lie!

The Apocalypse Project needs YOU! - recruiting info thread.

 
Just in case you wondered, I'm still working on this. And I have a  new question: ;)
Where does the actual model factor into the collision detection? The code that I looked at seems to treat an object as an cuboid (specifically float obj_get_collider_endpoint(int obj_num, int axis, bool min) ). I don't want to cut this out by accident (of course I'll notice the different behaviour during testing :> ) So far I'm reusing some code and doing something similar, as far as I can see.

I'm roughly 40% done. The sorting into the subtrees is still missing (that's roughly 10% for me ;) ), though I still got a few hours where I can do something, might be able to finish this and go into testing. Next 40% is testing for performance and bugs, the last 10% is for nice comments and making the code prettier. We will see how this goes and if it was actually useful work. ;)

[edit]
Nevermind my question, I found what I looked for. :)
« Last Edit: March 04, 2014, 10:18:33 am by Uchuujinsan »

 

Offline Rga_Noris

  • 29
  • What?
Awesome. Best of luck. Performance boosts are always applauded!
I think I'll call REAL Mahjong 'Chinese Dominoes', just to make people think I'm an ignorant asshat.

 

Offline Luis Dias

  • 211
no kidding. if Uchuujinsan's efforts are fruitful they could really boost FS2 to a new level of performance.

 

Offline JGZinv

  • 211
  • The Last Dual! Guardian
    • The FringeSpace Conversion Mod
Amen on that, I'll cheer for your success as well.
True power comes not from strength, but from the soul and imagination.
Max to PCS2 to FS2 SCP Guide
The FringeSpace Conversion Mod

 

Offline Swifty

  • 210
  • I reject your fantasy & substitute my own
The broadphase collision detection in place is a sort and prune system. I wrote it so that the engine does a quicksort on all object endpoints for each world aligned axis to find overlapping pairs. That's then passed on to the narrowphase. The broadphase algorithm in place should already be O(n log n)

Exactly what sort of algorithm are you trying to implement for the collision detection system? Is it a broadphase or narrowphase algorithm?

 
I was trying to improve the broadphase detection. It wasn't quite clear to me in the beginning that these were seperate systems, I only noticed that after working on it.
I'm actually not sure if it's O(n log n) average case, while the quicksort is O(n log n) average case, if the following is.. I'm not sure.
Code: [Select]
for ( i = 0; i < (*list).size(); ++i ) {
overlapped = false;

min = obj_get_collider_endpoint((*list)[i], axis, true);
//max = obj_get_collider_endpoint((*list)[i], axis, false);

for ( j = 0; j < overlappers.size(); ) {
//overlap_min = obj_get_collider_endpoint(overlappers[j], axis, true);
overlap_max = obj_get_collider_endpoint(overlappers[j], axis, false);
if ( min <= overlap_max ) {
overlapped = true;

if ( overlappers.size() == 1 && first_not_added ) {
first_not_added = false;
overlap_list_out->push_back(overlappers[j]);
}

if ( collide ) {
obj_collide_pair(&Objects[(*list)[i]], &Objects[overlappers[j]]);
}
} else {
overlappers[j] = overlappers.back();
overlappers.pop_back();
continue;
}

++j;
}

if ( overlappers.size() == 0 ) {
first_not_added = true;
}

if ( overlapped ) {
overlap_list_out->push_back((*list)[i]);
}

overlappers.push_back((*list)[i]);
}
I mean, worst case this clearly runs in O(n²), if all the object are in the same place on one axis then both loops will run n times. But average case? Not sure. I'd say it's something weird like O(n^(4/3))
Note how I commented out some code above (I'm not sure if it's up to date) because it doesn't seem to do anything.
Well, O(n²) or O(n log n) is sometimes only theoretical, because in actual performance constant factors matter. I have to say that the general algorithm you used seems like a pretty nice idea. :)

And thanks for answering. :)

 

Offline Swifty

  • 210
  • I reject your fantasy & substitute my own
I am aware of the worst-case runtime for this algorithm but in general it was a significant improvement over the previous system which was brute force run through all pairs. I profiled it quite extensively to make sure that it offered better performance and it has already been battle tested for a major release (Diaspora). I'm sure your system may be an improvement on the broadphase but I feel like you're trying to assert that my algorithm is major cause of slowdown which it is not based off of the profiling results.

And honestly, the solution to the problem you're describing is to simply find the axis with the most variance and perform the sort and prune on that axis first. Most of the non-overlapping pairs will have been already eliminated after that point. We're definitely not doing that but that seems to be a simple enough modification to add.
« Last Edit: March 05, 2014, 09:17:12 pm by Swifty »

 

Offline Bobboau

  • Just a MODern kinda guy
    Just MODerately cool
    And MODest too
  • 213
and it might be worth investigating plugging different sorting algorithms in it, it might be that quicksort is not the best choice.
also trying multithreaded sorting might possibly help, it is disjoint data so running the different axis sorts in different threads might prove worth while. a merge sort using a linked list structure with a different link for each dimension could be used. you would want to keep the three sorting threads around all the time, but blocked when not needed using semaphores or somesuch. mutlithreading is something the codebase could use a bit more of in terms of performance tuning.
Bobboau, bringing you products that work... in theory
learn to use PCS
creator of the ProXimus Procedural Texture and Effect Generator
My latest build of PCS2, get it while it's hot!
PCS 2.0.3


DEUTERONOMY 22:11
Thou shalt not wear a garment of diverse sorts, [as] of woollen and linen together

 
and it might be worth investigating plugging different sorting algorithms in it, it might be that quicksort is not the best choice.
also trying multithreaded sorting might possibly help, it is disjoint data so running the different axis sorts in different threads might prove worth while. a merge sort using a linked list structure with a different link for each dimension could be used. you would want to keep the three sorting threads around all the time, but blocked when not needed using semaphores or somesuch. mutlithreading is something the codebase could use a bit more of in terms of performance tuning.

I'll have a look at the potential for multithreaded broadphase detection this weekend. All my efforts so far have been on the narrowphase side.

 
@swifty
I'm sorry if I gave the wrong impression. The current algorithm is clearly superior to the old one, no doubt about that. The stuff about O(n²) or O(n log n) or O(n^(4/3)) was purely a theoretical consideration, I actually don't know if an alternative would make it faster in practice. I'm just trying, no guarantees for success.

I wasn't trying to blame something, I only wanted to analyze and see what's going on, to better understand what's happening. I'm new to the code after all. As I said, when I started working on the collision it wasn't even clear to me that it was split into broad/narrow phase detection, that might have led me to focus on the wrong part of the code, a part that might actually not be the main issue, even if improvements are possible.

@Bobboau
I think quicksort is a reasonable choice and afaik not necessarily slower than alternatives, but could be worth a try.
Unfortunately I'm not experienced with threading, but it seems like flaming_sword is. :)

 
Funnily enough, it seems that the best place for multithreading in broadphase is in the quicksort... :nervous:

The trick is to do it in a way that avoids too much overhead (defeating the purpose). I'll try to get some performance figures once I get it working.

 
One the one hand, I got it to work.

On the other hand, it's actually slightly slower with 4 threads and worse with more or fewer. :banghead:

There's potential, at least.

 

Offline Luis Dias

  • 211
thats amazingly cruel. what is the problem, too much overhead destroying any advantage gained?

 
I currently don't have that much time, but did some work on this anyway. Specifically, I started with some performance tests with the vs2012 profiler, in the asteroid mission where you first find the iceni (I didn't find a specific blue planet mission that I should try instead, I'm open to suggestions :) )
While there are not as many objects as in some current mods, I thought it's a good start.

Version: Release (no SSE, because reasons)
Inclusive samples in:
game_do_frame: 13358
obj_move_all: 2012
obj_sort_and_collide: 995
->obj_find_overlap_colliders: 786 (->obj_collide_pair: 452)
->obj_quicksort_colliders: 209

That means sort_and_collide excluding obj_collide_pair (which is the narrow phase) had 543 samples or ~55% of sort_and_collide or ~27% of obj_move_all or ~4% of game_do_frame

Another thing I noticed is that obj_collide_pair has roughly 1/3 to 1/2 of its samples in operator[] of the hashmap (~40% to be more precise)

Well, that was just a very small mission, I probably should do this again with a larger one to see how it scales...

So my current plan when I have time:
Try if caching get_collider_endpoint results will lead to a performance improvement with the current system (might even do this today)
Fix a very weird bug with the alternative system I tried to implement.
Optimize the alternative system
Do some profiling with different missions (I need help here, tell me which one I should choose :> )
Do some profiling with SSE2 builds.

Btw, what do you actually use for profiling? Just curious.