Hard Light Productions Forums
Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Topic started by: Uchuujinsan on February 08, 2014, 03:34:02 am
-
I'm considering doing a little work on the collision detection; a few questions:
1.) There's an old and a new system. Which one should I actually try to improve, i.e. which one is commonly used?
2.) As far as I can see, both systems use an O(n²) algorithm. Is there any specific reason why it wasn't changed to an octree (which should be O(n log n))?
3.) Can I use C++11 or should I avoid that?
-
I'll try to answer your questions as well as I can:
- The old system is only there for troubleshooting purposes, you can ignore that and concentrate on the new system (it was written by Swifty so if you have questions you can ask him)
- I can't speak for Swifty here but I'd guess that building an octree is not really an option as FS has a highly dynamic scene which would mean that you'd have to practically rebuild the octree very often
- If you can get it to be compatible with older compilers it's fine (I know it's stupid but for now we need to keep it that way...)
-
Regarding C++11: Code it as you want it. Then, after it's gotten to a state where it's a definite improvement over the old code, we can worry about the old compilers.
-
Note that weapon blobs and beams will be popping in and out of existence all the time during battles.
I've also been doing some work in the area, but I haven't touched the actual logic for the collisions themselves. I'd be very interested in any performance improvements, particularly if they result in thread-safety.
-
Thanks for the answers so far. That the tree has to be adjusted each frame is normal, though the complexity is still only O(n log n) even if you completely rebuild it from scratch (each of the n elements needs a search, log n, to determine where it fits).
Of course there is some issue with constant factors, it's not only theory, but for a large number of objects it should be worth it. I'd probably have to test whether we ever get such a large number of objects, but I'd guess yes.
I'm currently reading through the code to better understand how it works and in what way to change it.
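To illustrate what I mean (purely a sketch with made-up structures, nothing of this exists in the engine): each insertion descends roughly log n levels, so rebuilding the whole tree from scratch for n objects is about n log n.
#include <array>
#include <memory>
#include <vector>
// Illustrative only: a cubic cell with up to eight children.
struct Vec3 { float x, y, z; };
struct OctreeNode {
    Vec3 center;                                     // centre of this cubic cell
    float half;                                      // half the cell's edge length
    std::vector<int> objects;                        // object indices stored at this node
    std::array<std::unique_ptr<OctreeNode>, 8> child;
};
// Which of the eight children does a point fall into?
static int octant(const OctreeNode& n, const Vec3& p) {
    return (p.x > n.center.x ? 1 : 0) | (p.y > n.center.y ? 2 : 0) | (p.z > n.center.z ? 4 : 0);
}
// Each insertion descends one level per call, i.e. O(log n) for a reasonably
// balanced scene, so rebuilding the tree for n objects costs O(n log n).
void octree_insert(OctreeNode& n, int obj, const Vec3& pos, float radius, int max_depth) {
    if (max_depth == 0 || radius > n.half * 0.5f) {  // object too large for a child cell
        n.objects.push_back(obj);
        return;
    }
    const int o = octant(n, pos);
    if (!n.child[o]) {
        const float q = n.half * 0.5f;
        const Vec3 c = { n.center.x + ((o & 1) ? q : -q),
                         n.center.y + ((o & 2) ? q : -q),
                         n.center.z + ((o & 4) ? q : -q) };
        n.child[o].reset(new OctreeNode{ c, q, {}, {} });
    }
    octree_insert(*n.child[o], obj, pos, radius, max_depth - 1);
}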
-
Look no further than the massive battle mission / the Blue Planet 2 version to see the engine crawl from collision detection, among other things...
-
Just in case you wondered, I'm still working on this. And I have a new question: ;)
Where does the actual model factor into the collision detection? The code that I looked at seems to treat an object as a cuboid (specifically float obj_get_collider_endpoint(int obj_num, int axis, bool min) ). I don't want to cut this out by accident (of course I'll notice the different behaviour during testing :> ). So far I'm reusing some code and doing something similar, as far as I can see.
I'm roughly 40% done. The sorting into the subtrees is still missing (that's roughly 10% for me ;) ), though I still have a few hours where I can do something, so I might be able to finish this and go into testing. The next 40% is testing for performance and bugs, the last 10% is for nice comments and making the code prettier. We'll see how this goes and if it was actually useful work. ;)
[edit]
Nevermind my question, I found what I looked for. :)
-
Awesome. Best of luck. Performance boosts are always applauded!
-
no kidding. if Uchuujinsan's efforts are fruitful they could really boost FS2 to a new level of performance.
-
Amen on that, I'll cheer for your success as well.
-
The broadphase collision detection in place is a sort and prune system. I wrote it so that the engine does a quicksort on all object endpoints for each world-aligned axis to find overlapping pairs. That's then passed on to the narrowphase. The broadphase algorithm in place should already be O(n log n).
Exactly what sort of algorithm are you trying to implement for the collision detection system? Is it a broadphase or narrowphase algorithm?
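In rough code terms, the sort step looks something like this (a sketch with a stand-in struct, not the actual engine code):
#include <algorithm>
#include <vector>
// Stand-in for an object's projected extent on one world axis; the engine
// reads these values through obj_get_collider_endpoint().
struct AxisExtent { int obj_num; float min, max; };
// Sort by minimum endpoint so that a single forward sweep can then collect
// every pair whose extents overlap on this axis.
void sort_endpoints(std::vector<AxisExtent>& extents) {
    std::sort(extents.begin(), extents.end(),
              [](const AxisExtent& a, const AxisExtent& b) { return a.min < b.min; });
}
// After sorting, objects i and j (with i before j in sorted order) can only
// overlap on this axis if extents[j].min <= extents[i].max.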
-
I was trying to improve the broadphase detection. It wasn't quite clear to me in the beginning that these were separate systems; I only noticed that after working on it.
I'm actually not sure it's O(n log n) average case: the quicksort is O(n log n) average case, but I'm not sure the pruning loop that follows is.
for ( i = 0; i < (*list).size(); ++i ) {
    overlapped = false;
    min = obj_get_collider_endpoint((*list)[i], axis, true);
    //max = obj_get_collider_endpoint((*list)[i], axis, false);
    // check the current object against every collider whose interval is still open on this axis
    for ( j = 0; j < overlappers.size(); ) {
        //overlap_min = obj_get_collider_endpoint(overlappers[j], axis, true);
        overlap_max = obj_get_collider_endpoint(overlappers[j], axis, false);
        if ( min <= overlap_max ) {
            // the two intervals overlap on this axis
            overlapped = true;
            // make sure the first object of a new overlap run is added to the output list exactly once
            if ( overlappers.size() == 1 && first_not_added ) {
                first_not_added = false;
                overlap_list_out->push_back(overlappers[j]);
            }
            if ( collide ) {
                obj_collide_pair(&Objects[(*list)[i]], &Objects[overlappers[j]]);
            }
        } else {
            // this collider ends before the current (and every later) object begins;
            // swap-remove it and re-check the element swapped in (j is not advanced)
            overlappers[j] = overlappers.back();
            overlappers.pop_back();
            continue;
        }
        ++j;
    }
    if ( overlappers.size() == 0 ) {
        first_not_added = true;
    }
    if ( overlapped ) {
        overlap_list_out->push_back((*list)[i]);
    }
    overlappers.push_back((*list)[i]);
}
I mean, worst case this clearly runs in O(n²): if all the objects are in the same place on one axis, then both loops will run n times. But average case? Not sure. I'd say it's something weird like O(n^(4/3)).
Note how I commented out some code above (I'm not sure if it's up to date) because it doesn't seem to do anything.
Well, O(n²) or O(n log n) is sometimes only theoretical, because in actual performance constant factors matter. I have to say that the general algorithm you used seems like a pretty nice idea. :)
And thanks for answering. :)
-
I am aware of the worst-case runtime for this algorithm, but in general it was a significant improvement over the previous system, which was a brute-force run through all pairs. I profiled it quite extensively to make sure that it offered better performance, and it has already been battle-tested for a major release (Diaspora). I'm sure your system may be an improvement on the broadphase, but I feel like you're trying to assert that my algorithm is a major cause of slowdown, which it is not, based on the profiling results.
And honestly, the solution to the problem you're describing is to simply find the axis with the most variance and perform the sort and prune on that axis first. Most of the non-overlapping pairs will already have been eliminated after that point. We're definitely not doing that, but that seems to be a simple enough modification to add.
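For illustration, a sketch of that modification (hypothetical helper, assuming access to each collider's centre position): compute the per-axis variance of the centres and run the sort and prune on the axis with the largest variance first.
#include <vector>
struct Vec3 { float a1d[3]; };   // mirrors the engine's per-axis indexing style
// Pick the world axis along which object centres are most spread out, so the
// first sort-and-prune pass rejects as many pairs as possible.
int axis_of_max_variance(const std::vector<Vec3>& centers) {
    double mean[3] = { 0.0, 0.0, 0.0 };
    double var[3]  = { 0.0, 0.0, 0.0 };
    const double n = static_cast<double>(centers.size());
    if (n < 2.0) return 0;
    for (const Vec3& c : centers)
        for (int ax = 0; ax < 3; ++ax)
            mean[ax] += c.a1d[ax];
    for (int ax = 0; ax < 3; ++ax)
        mean[ax] /= n;
    for (const Vec3& c : centers)
        for (int ax = 0; ax < 3; ++ax) {
            const double d = c.a1d[ax] - mean[ax];
            var[ax] += d * d;
        }
    int best = 0;
    for (int ax = 1; ax < 3; ++ax)
        if (var[ax] > var[best]) best = ax;
    return best;
}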
-
And it might be worth investigating plugging different sorting algorithms into it; it might be that quicksort is not the best choice.
Also, trying multithreaded sorting might possibly help; it is disjoint data, so running the different axis sorts in different threads might prove worthwhile. A merge sort using a linked-list structure with a different link for each dimension could be used. You would want to keep the three sorting threads around all the time, but blocked when not needed using semaphores or somesuch. Multithreading is something the codebase could use a bit more of in terms of performance tuning.
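As a rough sketch of the disjoint-data point (plain std::thread here, not the persistent, semaphore-blocked threads described above): each axis gets its own endpoint list and its own sort, so the three sorts can run concurrently without locking.
#include <algorithm>
#include <thread>
#include <vector>
struct AxisExtent { int obj_num; float min, max; };
// Sort the three per-axis endpoint lists concurrently; the data sets are
// disjoint, so no synchronisation is needed beyond joining the threads.
void sort_all_axes(std::vector<AxisExtent> (&axis_list)[3]) {
    std::thread workers[3];
    for (int ax = 0; ax < 3; ++ax) {
        workers[ax] = std::thread([&axis_list, ax]() {
            std::sort(axis_list[ax].begin(), axis_list[ax].end(),
                      [](const AxisExtent& a, const AxisExtent& b) { return a.min < b.min; });
        });
    }
    for (auto& w : workers) w.join();
}
Spawning threads every frame has its own cost, which is presumably why keeping the sorting threads alive between frames and only waking them when needed is suggested above.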
-
I'll have a look at the potential for multithreaded broadphase detection this weekend. All my efforts so far have been on the narrowphase side.
-
@swifty
I'm sorry if I gave the wrong impression. The current algorithm is clearly superior to the old one, no doubt about that. The stuff about O(n²) or O(n log n) or O(n^(4/3)) was purely a theoretical consideration, I actually don't know if an alternative would make it faster in practice. I'm just trying, no guarantees for success.
I wasn't trying to blame something, I only wanted to analyze and see what's going on, to better understand what's happening. I'm new to the code after all. As I said, when I started working on the collision it wasn't even clear to me that it was split into broad/narrow phase detection, that might have led me to focus on the wrong part of the code, a part that might actually not be the main issue, even if improvements are possible.
@Bobboau
I think quicksort is a reasonable choice and afaik not necessarily slower than alternatives, but could be worth a try.
Unfortunately I'm not experienced with threading, but it seems like flaming_sword is. :)
-
Funnily enough, it seems that the best place for multithreading in broadphase is in the quicksort... :nervous:
The trick is to do it in a way that avoids too much overhead (defeating the purpose). I'll try to get some performance figures once I get it working.
-
On the one hand, I got it to work.
On the other hand, it's actually slightly slower with 4 threads and worse with more or fewer. :banghead:
There's potential, at least.
-
That's amazingly cruel. What is the problem, too much overhead destroying any advantage gained?
-
I currently don't have that much time, but I did some work on this anyway. Specifically, I started with some performance tests with the VS2012 profiler, in the asteroid mission where you first find the Iceni (I didn't find a specific Blue Planet mission that I should try instead, I'm open to suggestions :) )
While there are not as many objects as in some current mods, I thought it was a good start.
Version: Release (no SSE, because reasons)
Inclusive samples in:
game_do_frame: 13358
obj_move_all: 2012
obj_sort_and_collide: 995
->obj_find_overlap_colliders: 786 (->obj_collide_pair: 452)
->obj_quicksort_colliders: 209
That means sort_and_collide excluding obj_collide_pair (which is the narrowphase) had 543 samples, or ~55% of sort_and_collide, ~27% of obj_move_all, and ~4% of game_do_frame.
Another thing I noticed is that obj_collide_pair has roughly 1/3 to 1/2 of its samples in operator[] of the hashmap (~40%, to be more precise).
Well, that was just a very small mission, I probably should do this again with a larger one to see how it scales...
So my current plan when I have time:
Try if caching get_collider_endpoint results will lead to a performance improvement with the current system (might even do this today)
Fix a very weird bug with the alternative system I tried to implement.
Optimize the alternative system
Do some profiling with different missions (I need help here, tell me which one I should choose :> )
Do some profiling with SSE2 builds.
Btw, what do you actually use for profiling? Just curious.
-
Play 'Her Finest Hour' in Tenebra.
-
That's amazingly cruel. What is the problem, too much overhead destroying any advantage gained?
It was a first attempt, so the code can likely be improved (I have an alternative method I can try).
Btw, what do you actually use for profiling? Just curious.
http://www.hard-light.net/forums/index.php?topic=70745.0
http://www.hard-light.net/wiki/index.php/Artemis_Station
War in Heaven - First cutscene mission, a few minutes long, has a bit of everything and is fairly consistent. Just sit back and watch the FPS and profiling numbers.
The time spent in the hashmap operator[] is an interesting result. I may be able to do something with this...
-
Oh, damn, the BP massive battle mission is also really great for collision detection. Tons of missiles flying around mean lots of collision pairs.
-
The only problem being that in parts it's such an extreme case that the FPS counter hits the floor and it's hard to tell if your changes have made any difference. :p
-
I'd like to point out that (at least from my highly anecdotal and utterly unrepeated test) your GPU RAM might be a bottleneck for that particular mission.
I've watched the video RAM usage on the card through MSI Afterburner to see just what happens, and the thing happily climbed to 1.4-1.5GB, on an NVIDIA GTX 660 with 2GB.
The performance was smooth-ish and hovering around 6 fps minimum.
If you want some other things that I could test for you, I'd be happy to help.
-
Hey BP guys. I think BP Massive Battle wouldn't be as slow as it is if you didn't put all your cap ships so close to each other.
(http://i.imgur.com/oY38Yri.jpg)
And from looking at these results, trying to optimize the broadphase collision detection isn't going to do much.
-
Not putting them so close together goes against our artistic integrity :P
But your point is well taken, BP Massive Battle is certainly a degenerate edge case, not the norm.
-
I think it might be prudent to compute tighter oriented bounding boxes for ships and use them to decide whether we need a narrowphase check, because the simple radius check we do to prevent unnecessary narrowphase checks is completely failing us here.
-
A bounding box would be a great substitute. The Karuna (for example) is about 1300m long and perhaps just 300m in both height and width. Even at 400m in both directions, the resulting box volume is ~200 x 10^6 m³, while the bounding sphere has a volume somewhere around 115 x 10^7 m³, which is roughly 6 times greater.
Given how most capships in FS are narrow, this order-of-magnitude difference is probably shared by most of them.
Perhaps, if the bounding box check is too expensive to run compared with the sphere, a three-tier approach could do the trick?
If the projectile is within the bounding sphere, THEN run the bounding box check; if it is inside the bounding box, THEN do the narrowphase checks?
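For illustration, the middle tier could look roughly like this (assumed types, not the engine's structures): once the existing radius check passes, test the projectile's position against the ship's oriented box before handing the pair to the full narrowphase.
#include <cmath>
struct Vec3 { float x, y, z; };
// Oriented bounding box: centre, the three local axes as unit vectors in
// world space, and the half-extent along each local axis.
struct Obb { Vec3 center; Vec3 axis[3]; float half[3]; };
static float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
// Middle tier: is a point (e.g. a projectile, padded by its own radius)
// inside the ship's oriented box? Only pairs that pass this would go on to
// the full mesh narrowphase.
bool point_in_obb(const Vec3& p, const Obb& box, float pad) {
    const Vec3 d = { p.x - box.center.x, p.y - box.center.y, p.z - box.center.z };
    for (int i = 0; i < 3; ++i) {
        if (std::fabs(dot(d, box.axis[i])) > box.half[i] + pad)
            return false;   // outside along this local axis
    }
    return true;
}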
-
I've seen talk of using collision meshes rather than the full-detail model for collision detection (possibly just one of the LODs). Is that a practical option?
-
The collisions on the models themselves are fine. It's just we're doing too many unnecessary ones which is why there needs to be a more strict test in-between the broadphase and the narrowphase.
-
This probably sounds delinquent and naive, but couldn't collision checks be predictively pruned?
I'm talking about trying to remove pairs where a good assumption can be made that a collision won't happen with the initial parameters of the objects. The problem of course is that ships can maneuver, and so can some projectiles... however, there's still a certain volume of space that the ship will be inside after a given time.
After some reading it seems "sweeping" collision detection algorithms do what I had in mind, but tend to be more resource-intensive than the discrete, reactive collision detection method in use currently... still, I wonder if the two systems could be combined? Instead of checking against every single projectile in existence, being able to preemptively cull a lot of collision checks could be worth the trade-off:
http://seb.ly/2010/01/predictive-collision-detection-techniques/
-
We already do a lot of preemptive culling. There are some types of collisions that are rejected outright (primary-on-primary, primary-on-beam); others are rejected based on relative movement (a primary weapon blast moving away from a secondary weapon). This is all part of what is called the broadphase rejection pass.
Basically, there is no way around the requirement that each object has to be checked against each other object at least once. You can eliminate subsequent checks, and gain a bit of performance there, and that's what the broadphase pass is for.
The issue here, however, is the narrowphase pass, the one where we get into trying to resolve potential collisions against models and objects that couldn't be ruled out in the broadphase pass.
-
The collisions on the models themselves are fine. It's just we're doing too many unnecessary ones which is why there needs to be a more strict test in-between the broadphase and the narrowphase.
This probably sounds delinquent and naive, but couldn't collision checks be predictively pruned?
I'm talking about trying to remove pairs where a good assumption can be made that collision won't happen with the initial parameters of the objects. The problem of course is that ships can maneuver, so can some projectiles... however there's still a certain volume of space that the ship will be inside after a given time.
After some reading it seems "sweeping" collision detection algorithms do what I had in mind, but tend to be more resource intensive than the discrete reactive collision detection method in use currently... still I wonder if the two systems could be combined? Instead checking against every single projectile in existence, being able to preemptively cull a lot of collisions check could be worth the trade-off:
http://seb.ly/2010/01/predictive-collision-detection-techniques/
It's already being done. If two objects are far away from each other, then the code looks at their maximum speeds (and lifetimes for weapons, I believe) and determines whether they could ever collide or not. If they could, then it sets some kind of timestamp for when they should start checking against each other again, and skips checks between that pair until then.
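A minimal sketch of that kind of scheduling (hypothetical names and fields; the engine's actual pair cache is more involved): if, even closing head-on at their combined maximum speed, two objects cannot cover the gap between their bounding spheres, the pair can be skipped until the earliest possible contact time.
// Hypothetical pair record; the real engine keeps something comparable per
// object pair, including a "don't check again until" timestamp.
struct CollisionPair {
    float next_check_time;   // seconds; skip this pair until then
};
// Even closing head-on at their combined maximum speed, the two objects
// cannot cover the gap between their bounding spheres any sooner than this.
void schedule_pair(CollisionPair& pair, float now, float distance,
                   float radius_a, float radius_b,
                   float max_speed_a, float max_speed_b) {
    const float gap = distance - (radius_a + radius_b);
    const float closing = max_speed_a + max_speed_b;
    if (gap > 0.0f && closing > 0.0f) {
        pair.next_check_time = now + gap / closing;   // earliest possible contact
    } else {
        pair.next_check_time = now;                   // check immediately
    }
}
bool pair_can_be_skipped(const CollisionPair& pair, float now) {
    return now < pair.next_check_time;
}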
-
Basically, there is no way around the requirement that each object has to be checked against each other object at least once. You can eliminate subsequent checks, and gain a bit of performance there, and that's what the broadphase pass is for.
If, however, the broadphase can cull 70-90% more volume per capship than it currently does by using bounding boxes rather than spheres, wouldn't that greatly diminish (say, 5 times?) the number of narrowphase checks you'd be asking the CPU to make?
-
It's already being done. If two objects are far away from each other, then the code looks at their maximum speeds (and lifetimes for weapons, I believe) and determines whether they could ever collide or not. If they could, then it sets some kind of timestamp for when they should start checking against each other again, and skips checks between that pair until then.
The predictive collision detection happens in the narrowphase and the results are cached (and always skipped when the initial check says they'll never, ever collide).
Basically, there is no way around the requirement that each object has to be checked against each other object at least once. You can eliminate subsequent checks, and gain a bit of performance there, and that's what the broadphase pass is for.
If, however, the broadphase can cull more 70-90% of volume per capship than it is currently by doing boundingboxes rather than spheres, wouldn't it diminish by a great lot (say 5 times?) the amount of narrowphase checks you'd be asking the cpu to make?
You have to be careful there. You're moving a chunk of the narrowphase check into the broadphase.
What the broadphase currently does is a quicksort of objects by position (using one side of the spherical boundary) along an axis, then do overlap checking (with the other side). Non-overlapping objects are culled and the process repeated for the other axes.
When there is overlap with all 3 axes, we move to narrowphase, with all that fancy predictive checking, bounding box checking, etc.
If you skip the overlap check, but do the bounding box check many more times, there's a risk that you get worse performance because those bounding box checks could take longer.
Also, quicksort is typically O(n log n) while with just a bounding box check you have O(n^2).
-
Well, you are already checking a dimension on each of the three axes; the problem is you are checking the radius (if I am not mistaken). You could swap in the bounding box dimensions, and I don't see how it could affect performance.
-
Presumably it would be an oriented bounding box, not just a simple AABB, and OBB x OBB collision checks are significantly more costly than Sphere x Sphere.
Edit: but if you computed the AABBs of all the OBBs beforehand, then you could do AABB x AABB, which is in theory cheaper than Sphere x Sphere since there's no need to square the distances on each axis. Although if the AABB of the OBB is bigger than the sphere, idk...
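For reference, the usual way to get the world-aligned box of an oriented box looks like this (assumed types, not engine code): project the local half-extents onto each world axis using the absolute values of the orientation matrix entries.
#include <cmath>
struct Vec3   { float x, y, z; };
struct Matrix { float m[3][3]; };   // rows are the box's local axes in world space
struct Aabb   { Vec3 min, max; };
// World-space AABB of an oriented box: the half-extent along each world axis
// is the sum of the local half-extents weighted by |rotation| entries.
Aabb aabb_of_obb(const Vec3& center, const Matrix& orient, const Vec3& half) {
    const float h[3] = { half.x, half.y, half.z };
    const float c[3] = { center.x, center.y, center.z };
    float lo[3], hi[3];
    for (int world = 0; world < 3; ++world) {
        float extent = 0.0f;
        for (int local = 0; local < 3; ++local)
            extent += std::fabs(orient.m[local][world]) * h[local];
        lo[world] = c[world] - extent;
        hi[world] = c[world] + extent;
    }
    return { { lo[0], lo[1], lo[2] }, { hi[0], hi[1], hi[2] } };
}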
-
Exactly, checking for overlap of bounding boxes is super fast and simple only on aligned/non-oriented boxes. Less so when the boxes can be in arbitrary orientations. I'd expect radius overlap checking to be quite a bit faster.
-
It's already doing a "bounding box" anyway; the radius description is faulty if what really happens is what's described in this thread, which is that the engine looks at the center position of the model and checks all the values from X - radius to X + radius, then Y, then Z. That's a cube. Now you can say "yeah, but look, this cube is amazingly cheap", except for when? Except for the most problematic cases, when it's not cheap at all, which is exactly when it becomes absolutely useless, for instance in many BP cases.
Now what is the point of having an algorithm that works amazingly well when the scenario is simple, the warships are far away from each other, etc., but when we need the engine to get really optimized, "oops, the optimizations just don't work here"? Why bother then? Conversely, so what if the bounding boxes are costlier than the simple cube check the engine does now, if, when things get really complex and gritty, it is able to perform admirably better? Isn't that the whole point of having an optimization algorithm?
The difference is not between having simple radius checks and bounding boxes; the real difference is between having these bounding box calculations and narrowphase full-mesh intersection calculations.
Why do I say this? Simple. A bounding box will have at least 10 times less volume than the simple radius cube. This means that, say, 90% of the time, instead of having a narrowphase check every time a bullet enters the radius of a ship, you'd have only a bounding box calculation. Unless this bounding box calculation is almost as costly as the narrowphase check, you are winning computation time here.
My two cents.
-
Pretty sure it's spheres, not cubes.
-
A sphere is really cheap for collision detection: All you need to do is
1. Take the square of the distance between the centres of the models
2. Take the square of the sum of the model radii
3. Compare the two
This is 3 subtractions, 3 multiplications and 2 additions for (1),
1 addition and 1 multiplication for (2),
1 subtraction and a conditional jump for (3).
If you cache the squared sum of radii for a pair of models, you can shave that addition and multiplication off as well.
Any other check for intersection will be much more expensive. It might be worth it, to cut down on the number of narrow-phase checks, but this is exactly the kind of situation where you have to try it out and profile.
Even if you find a case where a different algorithm is faster, it's quite possible that a different setting (ships less clustered, fewer missiles...) shows a regression with the new algorithm.
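In code, with the operation counts as comments (a sketch, not the engine's implementation):
struct Vec3 { float x, y, z; };
// Sphere-vs-sphere rejection: compare the squared centre distance against the
// squared sum of radii, so no square root is ever needed.
bool spheres_may_collide(const Vec3& pa, float ra, const Vec3& pb, float rb) {
    const float dx = pb.x - pa.x;                    // 3 subtractions
    const float dy = pb.y - pa.y;
    const float dz = pb.z - pa.z;
    const float dist_sq = dx*dx + dy*dy + dz*dz;     // 3 multiplications, 2 additions
    const float r = ra + rb;                         // 1 addition
    return dist_sq <= r * r;                         // 1 multiplication, 1 comparison
}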
-
The problem is that, and this may be surprising, most of the objects in your typical mission are not sphere-shaped. Sphere testing produces a lot of false positives in our case.
-
I see where I went wrong. Sorry.
The problem is that, and this may be surprising, most of the objects in your typical mission are not sphere-shaped. Sphere testing produces lot of false positives in our case.
yeah that was the gist of my ramblings.
-
The problem is that, and this may be surprising, most of the objects in your typical mission are not sphere-shaped. Sphere testing produces lot of false positives in our case.
Sure, but everything smaller than a cruiser is effectively sphere-shaped (at least in vanilla Freespace). Only capships really have concave shapes at the resolution where a smaller object can legitimately pass inside their bounding box, and they are the only thing sufficiently oblong to not be very close to colliding when their bounding sphere intersects another model.
-
The main problem of the collision code is that the actual collision detection cannot be multi-threaded, which would otherwise give us a pretty big performance boost. Swifty already pointed out that the broadphase is not really a big bottleneck, and I can confirm that using data from previous profiling tests. The issue lies within the obj_collide_pair function, which does the collision detection math and the actual handling of the collision in one step, which makes multi-threading impossible.
A solution would be to detect the collision using multiple threads, store that information and then handle it within the main engine thread.
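Very roughly, that split could look like this (hypothetical names and placeholder functions; the real integration would be far more involved): worker threads only evaluate pairs and record hits, and the main thread applies all the side effects afterwards.
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>
// Hypothetical result record: which two objects hit (plus, in practice, hit
// position, normal, subsystem, etc.).
struct CollisionHit { int obj_a, obj_b; };
// Placeholder for the read-only half of the pair test (the expensive math).
static bool evaluate_pair(int /*obj_a*/, int /*obj_b*/, CollisionHit& /*out*/) { return false; }
// Placeholder for the side-effecting half (damage, physics, script hooks),
// which stays on the main thread.
static void resolve_hit(const CollisionHit& /*hit*/) {}
void collide_pairs_threaded(const std::vector<std::pair<int, int>>& pairs, unsigned num_threads) {
    std::vector<CollisionHit> hits;
    std::mutex hits_mutex;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t]() {
            // simple striped split of the pair list across the workers
            for (std::size_t i = t; i < pairs.size(); i += num_threads) {
                CollisionHit hit;
                if (evaluate_pair(pairs[i].first, pairs[i].second, hit)) {
                    std::lock_guard<std::mutex> lock(hits_mutex);
                    hits.push_back(hit);
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    // Handle every detected collision on the main thread, in one place.
    for (const CollisionHit& hit : hits)
        resolve_hit(hit);
}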
-
The way I read the code, it is a bounding box, not a sphere; well, basically a box that includes the calculated sphere, making it a cube with an edge length of 2 x radius:
if ( min ) {
    return pos->a1d[axis] - Objects[obj_num].radius;
} else {
    return pos->a1d[axis] + Objects[obj_num].radius;
}
If performed for each axis, this piece of code from get_collider_endpoint describes a cube/cuboid as the intersection of 6 planes in space (the min and max plane for each axis).
I got a few hours time today, let's see if I can do something productive. ;)
-
The problem is that, and this may be surprising, most of the objects in your typical mission are not sphere-shaped. Sphere testing produces lot of false positives in our case.
Sure, but everything smaller than a cruiser is effectively sphere-shaped (at least in vanilla Freespace). Only capships really have concave shapes at the resolution where a smaller object can legitimately pass inside their bounding box, and they are the only thing sufficiently oblong to not be very close to colliding when their bounding sphere intersects another model.
No, most ships are very definitely not sphere-shaped. Take the Karuna, for instance: it's much, much longer than it is wide or tall.
-
The Karuna is a) not vanilla FS, and b) a capship.
-
retail examples then?
Cruisers: Rakshasa, Cain, Zephyrus, Anuket, Aeolus, Leviathan, Fenris
Corvettes: Deimos, Moloch
Destroyers: Hatshepsut, Typhon (to a lesser extent ), Orion
these are just off the top of my head.
Also, EvidenceOfFault, for the record, these slowdowns do not happen due to there being tons of fighters in close formations. They happen because of tons of everything else in close formations :p
One question about such ships: would it, for example, make sense to keep the spherical check while their radii do not intersect, then switch to bounding-box-based checks if the distance between them < combined radii?
-
Can you actually read? cap-ship
Edit: To say something constructive: I suspect the worst case is currently when lots of small craft and missiles swarm around a large capship (i.e., they are inside its bounding sphere). The problem I see is that many of them will also be inside its bounding box, because fighters tend to fly very close to capships in dogfights, and capships are not exclusively convex shapes. The place where bounding boxes would help most is capship-capship collision detection, which is (hopefully) not limiting the frame rate.
Anyway, I haven't actually measured anything, nor looked at the current code a lot, so I'm just going to recommend actually benchmarking any idea somebody wants to implement.
-
Okay, no. This is not going to be an argument we will be having in this thread. Evidence of Fault, please consider that we actually know what we are talking about. Your point about it being less of an issue to approximate fighters as spheres than it is to do the same for bigger models is correct. But, whether it makes sense to leave the existing behaviour in place for fighters and use a more efficient one for capitals is something we will need to test. Feel free to implement such test cases.
But an argument based on "In retail, this is sufficient" is not going to fly, because we're not in a position where we can only worry about retail anymore.
-
Random idea: approximate non-aligned bounding boxes with several aligned cubes. I'd expect that you can perform quite a few aligned box tests with the price of one non-aligned box test, and while the approximation would be somewhat wasteful, the volume the cubes would occupy would still be a whole lot smaller than the radius sphere in many cases.
[attachment deleted by an evil time traveler]
-
In colloquial usage I'm pretty sure everything cruiser up gets called a capship.
-
I have to repeat that I'm pretty sure that the broadphase uses a cube for sorting, not a sphere.
But some good news! After some fails on my part, I finally managed to do a rather trivial thing: Caching the obj_get_collider_endpoint results, which cut the performance impact of the broadphase by roughly ~40%, given one measurement (due to low sample size, this might have some inaccuracies). Unfortunately, because the broadphase detection isn't actually an issue as noted before, that doesn't really help us THAT much. :/
But at the very least, there's a positive result ;)
I'm not sure if I should commit the code yet, though, because the source file is quite a mess due to some stuff I did there...
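Conceptually the change is just this (a sketch, not the actual patch): fill a per-axis cache of min/max endpoints once, and have the sort and the overlap loop read from that cache instead of calling obj_get_collider_endpoint() in every comparison.
#include <vector>
// Cached per-object extents along one axis, filled once per axis per frame.
struct CachedEndpoints { float min, max; };
float obj_get_collider_endpoint(int obj_num, int axis, bool min);   // existing engine function
std::vector<CachedEndpoints> cache_endpoints(const std::vector<int>& colliders, int axis) {
    std::vector<CachedEndpoints> cache;
    cache.reserve(colliders.size());
    for (int obj_num : colliders) {
        cache.push_back({ obj_get_collider_endpoint(obj_num, axis, true),
                          obj_get_collider_endpoint(obj_num, axis, false) });
    }
    return cache;   // the quicksort and the overlap loop then index into this cache
}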
-
Post a test build then. We'll see what this thing is worth.
-
Like this? It's a pretty simple code change :>
[attachment deleted by an evil time traveler]
-
So previously you were computing x+r, x-r, etc., in every comparison, and now you are computing them before the comparisons? Mmm. How costly would it be to do what I was talking about earlier... computing the AABB of the OBB (or some tighter approximation of the AABB of the oriented model, even), and caching that? Obviously a long ship oriented diagonally would not see much improvement, but maybe it would be worth it?
-
We have to use an AABB for the broadphase detection, otherwise the sorting/searching done by the quicksort won't work. The AABB can probably be made smaller at low additional cost, though, assuming we do some pre-calculations during loading (or even save them along with the model). Currently it's effectively a cube that completely surrounds a sphere given by the object's radius, which is not very precise. Using an intermediate step between the absolutely precise narrowphase and the broadphase imho sounds promising.
Basically: narrow down the AABB in the general broadphase, use an OBB or something similar at the beginning of the narrowphase for further filtering, and only then do the actual precise narrowphase collision.
-
I have to repeat that I'm pretty sure that the broadphases uses a cube for sorting, not a sphere.
So, if I understand this correctly, what the engine does is:
1. Sort objects by one coordinate
2. Detect whether two objects are closer together along that coordinate than the sum of their radii
3. If so, check whether they are also closer together than the sum of their radii in the other two dimensions.
This would be a bounding cube, aligned to the coordinate system.
If so, we can get from the bounding cube to a bounding cylinder by checking whether two objects that overlap in the first dimension (x) also overlap in y and z via deltay² + deltaz² <= (r1 + r2)². This is almost as cheap computationally, and could reduce conditional jumps, if the current algorithm checks both y and z dimensions independently.
On any architecture where the vector-units are big enough to hold and compute all three coordinates at once, checking for the full sphere after checking for overlap in the first dimension would take practically no extra time.
-
Well, not quite. Some details are actually different, but apart from them there is a main difference:
1. Sort objects by one coordinate
2. Detect whether two objects are closer together along that coordinate than the sum of their radii
3. For all objects where this is true, sort by the second coordinate
4. Detect whether two objects are closer together along that coordinate than the sum of their radii
5. For all objects where this is true, sort by the third coordinate
6. Detect whether two objects are closer together along that coordinate than the sum of their radii
7. For all objects where this is true, initiate the narrow phase
If you substitute 1., 3. or 5. (even only one of them) with some sphere approximation, you can no longer sort the objects, making it an n² algorithm in the average case, which isn't an improvement.
[edit]
Note that the broadphase detection currently only uses less than 15% of the run time of the collision system (according to the latest profiling I did)
We don't use an explicit AABB either, it's just an implicit result of the algorithm.
-
Cool.
OK, so in basic terms, how does the narrowphase operate?
-
The way I read the code it is a bounding box, not a sphere - well, basically a box that includes the calculated sphere, making it a cube with an edgelength of 2xradius
if ( min ) {
    return pos->a1d[axis] - Objects[obj_num].radius;
} else {
    return pos->a1d[axis] + Objects[obj_num].radius;
}
If performed for each axis, this piece of code from get_collider_endpoint describes a cube/cuboid as the intersection of 6 planes in space (the min and max plane for each axis).
I got a few hours time today, let's see if I can do something productive. ;)
Thank god, I thought I was going stoopid for not being able to see this :).
BTW, I like the "several bounding cubes" idea, feels really cheap computationally and the volume is definitely better, but I'm not sure it's easy to implement.
-
A solution would be to detect the collision using multiple threads, store that information and then handle it within the main engine thread.
This is what I'm working on. The tricky part is getting it to run faster given the introduced overhead. :nervous:
-
I'm implementing a general multi-threading framework for our engine before I try to convert the code.
What threading-framework are you using?
-
I just took a look at the actual collision detection code and all I can say is: That thing is a mess!
It will take a pretty big effort to get it to work with multiple threads, maybe even as much work as including an external physics engine like Bullet (http://bulletphysics.org/wordpress/) and converting FSO to use that.
Thoughts?
-
Better to leave that for another day, I think.
Don't get me wrong, the possibilities of having an actual physics engine are exciting, but let's get the near-term big things sorted first.
-
Yeah, I wasn't entirely serious but after looking at the collision code I just came to the conclusion that deleting it would be a good idea ;)
-
I'm implementing a general multi-threading framework for our engine before I try to convert the code.
What threading-framework are you using?
SDL2, based on antipodes with updates from trunk.
Mine works, but runs at lower fps. I'm just messing with scheduling and caching at the moment.
-
Are you using a thread pool? What sort of synchronization mechanism are you using, and where?
-
It's currently a bit ad-hoc, just to get the game to run and not crash. :P
The main thread gets the list of collisions to evaluate (object pairs, basically) from the broadphase (the logic of the current predictive evaluation and caching is preserved) and passes it to threads (number passed in via command line option) that aren't busy. When all list items have been passed and no threads are busy, those pairs that have actually collided are processed by the main thread (the part that's more or less impossible to make parallel unless you rewrite big portions of the engine).
There are other gotchas around things like script hooks and rendering calls around fireball creation that will crash the game if executed in parallel.
Oh yeah, there was also a rewrite of the model collision code to make it threadsafe, and a rewrite of the collision functions for thread safety and separation of evaluation vs. processing. :P
My current build uses so much RAM it's not funny, since you now have to store the relevant information for every collision for later processing instead of using it immediately.