Author Topic: My Game Engine is too Lame (Read 60479 times)

z64555 · **Reply #80 on:** October 18, 2012, 06:36:06 pm

That... shouldn't be the case. Derived classes are supposed to inherit every method and member, with the exception of the parent class's constructors/destructors.

If you have MSVC, try seeing what is accessible by using the autocomplete... basically just do the (*iter) -> and wait for MSVC to pop out a window with all possible options.

The compiler might be getting confused with all the DoStuff() functions.

Aardwolf · **Reply #81 on:** October 19, 2012, 03:14:43 pm

No it shouldn't, but apparently it is. The issue is specifically that I can't access those protected members via a pointer or object... unless the pointer/object's type is the same as the "this" object. IIRC in C# and Java it works "right".

And I've been using MSVC this whole time.

Meanwhile, I downloaded the OpenCL SDK and am looking into using it to hardware-accelerate my "SolveConstraintGraph" function. And also collision detection, but I expect that will be straightforward.

Edit: Some pseudocode for the batcher...

Code: [Select]

given a collection of constraints A    // all constraints (or all constraints yet to be added to a batch)

i = 0
while A is not empty

 start an empty collection of nodes U   // used nodes
 start an empty collection of constraints N  // constraints which will be processed by some other batch
 start an empty collection of constraints B[i]  // constraints for this batch

 for each constraint C in A
  if either of the two nodes of C are in U
   add C to N
  else
   add C to B[i]
   add both nodes to U

 A = N

 increment i

This will guarantee that every constraint gets processed exactly once... but what I think I want to be doing (and what I'm doing currently in my single-threaded pure-CPU implementation) is having each one process at least once, at most some number of times defined by a constant, and quitting early if nothing is "woken up". I think I can manage the first two parts, but I'm not sure I can do a "if nothing is woken up" check without having to pass the data back to the CPU after each batch.
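
A minimal C++ sketch of the batching pseudocode above, for reference. The `Constraint` struct and the `MakeBatches` name are made up for illustration (the real constraint type isn't shown in this thread); the only assumption is that each constraint knows the indices of its two nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical constraint record: just the indices of the two nodes
// (rigid bodies) it connects.
struct Constraint {
    int node_a;
    int node_b;
};

// Splits the constraints into batches so that no node appears twice within
// one batch. Each inner vector holds indices into the input vector.
std::vector<std::vector<std::size_t>> MakeBatches(const std::vector<Constraint>& constraints) {
    std::vector<std::size_t> remaining(constraints.size());  // A
    for (std::size_t i = 0; i < remaining.size(); ++i)
        remaining[i] = i;

    std::vector<std::vector<std::size_t>> batches;
    while (!remaining.empty()) {
        std::unordered_set<int> used;       // U: nodes already claimed by this batch
        std::vector<std::size_t> deferred;  // N: constraints left for a later batch
        std::vector<std::size_t> batch;     // B[i]

        for (std::size_t index : remaining) {
            const Constraint& c = constraints[index];
            if (used.count(c.node_a) || used.count(c.node_b)) {
                deferred.push_back(index);
            } else {
                batch.push_back(index);
                used.insert(c.node_a);
                used.insert(c.node_b);
            }
        }

        // The first remaining constraint always fits into an empty batch,
        // so every pass makes progress and the loop terminates.
        assert(!batch.empty());
        batches.push_back(std::move(batch));
        remaining = std::move(deferred);    // A = N
    }
    return batches;
}
```

For four constraints forming a ring of four nodes, (0,1), (1,2), (2,3), (0,3), this yields two batches: constraints 0 and 2, then constraints 1 and 3.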

The exact way "waking up" works is like this: when an edge (constraint) processes, it may or may not apply an impulse to the two nodes (rigid bodies) it connects... if it does, any adjacent edges are "awake" for the next iteration¹... the edge which just processed may be woken up again by another edge, but it doesn't wake itself up.

Actually now that I think about it, the way I have it set up now might be waking edges up unnecessarily... if one edge wakes up its neighbors, but the neighbors process later in the same iteration, the neighbors can go back to sleep the next iteration. Then again, maybe doing more or less processing of constraints is irrelevant, since more often than not the reason it stops is because it hit the max number of iterations, rather than because the entire subgraph is asleep.

¹Iteration as in "process all the constraints, repeatedly", not "process subsequent batches of constraints".
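
A rough CPU-only sketch of the wake-up scheme described above, assuming wake-ups only take effect on the following iteration. `Edge`, `SolveWithWakeFlags`, and the `apply` callback are hypothetical names; `apply` stands in for the real constraint evaluation and returns true when it applied an impulse. It does not address how to do the "nothing woke up" check on the GPU.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical edge record. "neighbors" lists the other edges that share a
// node with this one (never the edge itself).
struct Edge {
    std::vector<std::size_t> neighbors;
    bool awake = true;  // everything starts awake, so each edge runs at least once
};

// Runs at most max_iterations passes over the edges and stops early once a
// pass wakes nothing. Returns the number of passes actually run.
unsigned SolveWithWakeFlags(std::vector<Edge>& edges, unsigned max_iterations,
                            const std::function<bool(std::size_t)>& apply) {
    unsigned passes = 0;
    while (passes < max_iterations) {
        std::vector<bool> awake_next(edges.size(), false);
        for (std::size_t i = 0; i < edges.size(); ++i) {
            if (!edges[i].awake)
                continue;
            if (apply(i)) {
                // An impulse wakes the adjacent edges, but not this edge.
                for (std::size_t n : edges[i].neighbors)
                    awake_next[n] = true;
            }
        }

        bool any_awake = false;
        for (std::size_t i = 0; i < edges.size(); ++i) {
            edges[i].awake = awake_next[i];
            any_awake = any_awake || edges[i].awake;
        }

        ++passes;
        if (!any_awake)
            break;  // the whole subgraph is asleep
    }
    return passes;
}
```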

Aardwolf · **Reply #82 on:** October 24, 2012, 02:09:16 pm

I've multithreaded both the collision detection and the constraint graph solver! Now the physics is down to 20.8 milliseconds!

I'm thinking OpenGL transform feedback would be a very handy tool, and I should learn how to do it. I've had some code sitting in VertexBuffer.cpp for a while now which was based on an example I found somewhere, but it wasn't going anywhere... probably because I was trying to make something general, and I'm not ready yet. I need to do some prototyping and experimenting and stuff... make a specific thing that uses transform feedback, get it working, then adapt it. Then maybe eventually I can generalize from there.

z64555 · **Reply #83 on:** October 24, 2012, 02:18:27 pm

Awesome. What do you mean by the "OpenGL transform feedback?"

Aardwolf · **Reply #84 on:** October 24, 2012, 04:06:35 pm

It's an extension that became part of core GL in version 3.0... Basically it lets me access the output of a vertex shader (before clipping). Also it lets me skip rasterization... so it's like a primitive compute shader.

Will update this post later with a code snippet I'm working on.

Edit: no code snippet, because it's not presentable. But I managed to get access to the outputs of the vertex shader from C++.

z64555 · **Reply #85 on:** November 05, 2012, 08:05:21 am

So now it's the OpenGL code that's giving you trouble? Snippets or it didn't happen!

Aardwolf · **Reply #86 on:** November 10, 2012, 10:29:16 pm

Making progress now... I've got most of the "new" stuff taken care of!

I've got a HardwareAcceleratedComputation class, which I construct with a Shader, a map<string, string> to map the outputs of that shader to named members of a VertexBuffer, and a VertexBuffer which serves as a prototype for the output.

I also changed how I'm doing it, so now instead of trying to get the data off of the GPU after every operation, my VertexBuffer wrapper now has an UpdateDataFromGL method. And I disabled some stuff that used query objects to find out how many primitives the shader output, because I know for a fact it's going to output the same number of primitives as I gave it. So that made it faster, too.

But I still don't actually have the physics running on the GPU. I've got it mostly planned out, though. It looks like I'm going to be using what GLSL calls a samplerBuffer, which seems like it's the last "new" thing for me. Everything else is stuff I already know how to do.

Aardwolf · **Reply #87 on:** November 12, 2012, 02:36:34 pm

Aughhh it's horribly un-working and I don't know why

Edit: now it's slightly less horribly un-working! Apparently some unnecessary-looking code which was part of an example I was working from was necessary after all. If I don't use an OpenGL query object to count how many primitives my vertex shader outputs, and then get the result from the query object, the GPU-side computations get borked somehow. So I put that part back in and now that's working again.

But now I'm having another problem. I'm using integer-typed vertex attributes, and for some reason the values the vertex shader is seeing aren't the same as the values I give it:

Code: [Select]

0x00000000 --> 0x00000000
0x00000001 --> 0x3F800000
0x00000002 --> 0x40000000
0x00000003 --> 0x40400000
0x00000004 --> 0x40800000
0x00000005 --> 0x40A00000
0x00000006 --> 0x40C00000
0x00000007 --> 0x40E00000
0x00000008 --> 0x41000000
0x00000009 --> 0x41100000
0x0000000A --> 0x41200000
0x0000000B --> 0x41300000
0x0000000C --> 0x41400000
0x0000000D --> 0x41500000
0x0000000E --> 0x41600000
0x0000000F --> 0x41700000
0x00000010 --> 0x41800000
0x00000011 --> 0x41880000
0x00000012 --> 0x41900000

Note, to get these values back I converted through float so there may be roundoff, but the numbers definitely are that big.

I can't tell what the pattern is, help

Edit II: I figured out what the pattern is! The numbers are being converted to float, then being read as int! (This is separate from the deliberate conversion I did to get them back out.)

I'm not sure why it's doing it, or how to fix it, but the workaround I came up with before I figured out the pattern is working for now: using float instead of int.
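
The pattern in the table can be reproduced on the CPU with a small sketch like the one below (the function names are made up for illustration). One common cause of this pattern in OpenGL is supplying integer vertex attributes through glVertexAttribPointer, which converts them to floating point, rather than through glVertexAttribIPointer, which keeps them as integers. Whether that is what happened here isn't clear from the thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Converts an integer to float by value, then returns the raw bits of that
// float. This is the "converted to float, then read as int" mix-up.
std::uint32_t BitsOfFloatValue(std::int32_t value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    float as_float = static_cast<float>(value);
    std::uint32_t bits;
    std::memcpy(&bits, &as_float, sizeof(bits));
    return bits;
}

// Undoes the mix-up: reinterprets the bits as a float, then converts the
// float's value back to an integer.
std::int32_t RecoverOriginal(std::uint32_t bits) {
    float as_float;
    std::memcpy(&as_float, &bits, sizeof(as_float));
    assert(as_float == std::floor(as_float));  // only meaningful for whole numbers
    return static_cast<std::int32_t>(as_float);
}
```

With IEEE 754 single-precision floats, `BitsOfFloatValue(1)` gives 0x3F800000 and `BitsOfFloatValue(0x12)` gives 0x41900000, matching the table.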

z64555 · **Reply #88 on:** November 13, 2012, 11:15:56 pm

Which OGL functions are you using? IIRC, many of them use GLfloat's...

Aardwolf · **Reply #89 on:** November 14, 2012, 02:45:39 am

Hoo boy, there's a bunch. Let's see...

glBufferSubData with type GL_INT to create the buffer
glTexBuffer with format GL_R32I to use that buffer as a buffer texture... I think I tried it with GL_RGB32I and/or GL_RGBA32I as well
An isamplerBuffer in the shader to access it in the shader

It's weird that it wasn't working, but I'm working around it and I'm getting closer and closer to doing the constraints on the GPU.

Right now I've got a not-quite-MassInfo struct in the constraint eval shader, which is being loaded from a texture. Also the texture has the correct data in it, and as far as I can tell it's being converted properly by the shader. Unlike my C++ struct which has a mass, a CoM, and a 3x3 matrix to store the MoI, the struct in the shader contains the inverse of the mass, the CoM (in world coords), and the inverse of the MoI matrix. But I called it MassInfo anyway
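
A minimal sketch of converting the C++-side struct into the shader-side one, as described above. The field names and layouts are guesses, since neither struct is shown in this thread, and the world-space CoM is simply passed in by the caller. Re-orienting the MoI into world space, if that's needed, is left out.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical CPU-side layout: mass, CoM, and a row-major 3x3 MoI matrix.
struct MassInfo {
    float mass;
    Vec3 com;
    std::array<float, 9> moi;
};

// Hypothetical shader-side layout: inverse mass, world-space CoM, inverse MoI.
struct ShaderMassInfo {
    float inv_mass;
    Vec3 com;
    std::array<float, 9> inv_moi;
};

// Inverts a row-major 3x3 matrix via its adjugate. Assumes it is invertible.
std::array<float, 9> Invert3x3(const std::array<float, 9>& m) {
    float c00 = m[4] * m[8] - m[5] * m[7];
    float c01 = m[5] * m[6] - m[3] * m[8];
    float c02 = m[3] * m[7] - m[4] * m[6];
    float det = m[0] * c00 + m[1] * c01 + m[2] * c02;
    assert(std::fabs(det) > 0.0f);
    float inv = 1.0f / det;
    std::array<float, 9> result = {{
        c00 * inv, (m[2] * m[7] - m[1] * m[8]) * inv, (m[1] * m[5] - m[2] * m[4]) * inv,
        c01 * inv, (m[0] * m[8] - m[2] * m[6]) * inv, (m[2] * m[3] - m[0] * m[5]) * inv,
        c02 * inv, (m[1] * m[6] - m[0] * m[7]) * inv, (m[0] * m[4] - m[1] * m[3]) * inv
    }};
    return result;
}

// Builds the shader-side record. Assumes a positive, finite mass.
ShaderMassInfo ToShaderMassInfo(const MassInfo& info, const Vec3& world_com) {
    assert(info.mass > 0.0f);
    ShaderMassInfo out;
    out.inv_mass = 1.0f / info.mass;
    out.com = world_com;
    out.inv_moi = Invert3x3(info.moi);
    return out;
}
```

Storing the inverses means the shader can multiply instead of dividing or inverting a matrix for every constraint it evaluates.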

Also I've made structs which contain all the other variables the constraints will need, and functions to load them from a specified starting texel of a buffer texture. But I haven't got the data put into that buffer yet. And then I'll still have to actually port the constraints' DoConstraintAction functions to GLSL.

But I'm close

Uchuujinsan · **Reply #90 on:** November 15, 2012, 09:45:42 am

Quote from: z64555 on October 18, 2012, 06:36:06 pm

That... shouldn't be the case. Derived classes are supposed to inherit every method and member, with the exception of the parent class's constructors/destructors.

If you have MSVC, try seeing what is accessible by using the autocomplete... basically just do the (*iter) -> and wait for MSVC to pop out a window with all possible options.

The compiler might be getting confused with all the DoStuff() functions.

Actually, it should be the case that this doesn't compile.
DoStuff is protected in RigidBody. CollisionGroup doesn't derive from RigidBody, so it doesn't have access to its protected members. If that's different in Java or C#, those two are imho broken.

There is a proper solution:

Code: [Select]

class Foo {
private: //only use protected if it's required to explicitly call this from a base class
   virtual void DoStuff()
   {
      cout<<"foo::DoStuff"<<endl;
   }
public:
   void DoStuffPolymorphic()
   {
      DoStuff();
   }
};

class Bar : public Foo {
protected:
   void DoStuff()
   {
      cout<<"bar::DoStuff"<<endl;
   }
};

class Bar2 : public Foo {
   Foo* b;
public:
   Bar2() {
      b = new Bar();
   }
   void Blubb() {
      b->DoStuffPolymorphic(); //will call Bar::DoStuff
   }
};

Aardwolf · **Reply #91 on:** November 15, 2012, 02:15:36 pm

But... wha

That's even worse, a protected method should not be able to override a private method!

DoStuff is supposed to be protected so that only classes derived from CollisionObject can access it. Making a public DoStuffPolymorphic method wrecks that. And making it protected wouldn't help, because then the derived classes couldn't call it.

Uchuujinsan · **Reply #92 on:** November 21, 2012, 07:08:48 pm

OK, I have to admit I didn't read your original post thoroughly enough. More on that later.

Quote

But... wha That's even worse, a protected method should not be able to override a private method!

public/private/protected is access control. You are not accessing Foo::DoStuff. By declaring it virtual you explicitly allow it to be overridden. Extending access control to overriding virtual functions would make private virtual functions simply impossible.

Quote

DoStuff is supposed to be protected so that only classes derived from CollisionObject can access it. Making a public DoStuffPolymorphic method wrecks that. And making it protected wouldn't help, because then the derived classes couldn't call it.

Uh yeah, that's the part where I didn't read properly.
So, a correction:

Code: [Select]

#include <iostream>

using namespace std;

class Foo {
protected:
   virtual void DoStuff()
   {
      cout<<"foo::DoStuff"<<endl;
   }
protected: //note the protected here
   void DoStuffPolymorphic(Foo* test)
   {
      test->DoStuff();
   }
};

class Bar : public Foo {
private:
   void DoStuff()
   {
      cout<<"bar::DoStuff"<<endl;
   }
};

class Bar2 : public Foo {
   Foo* b;
public:
   Bar2() {
      b = new Bar();
   }
   void Blubb() {
      DoStuffPolymorphic(b); //calls bar::DoStuff()
   }
};

int main()
{
   Bar2 test;
   test.Blubb();
}

[edit]
C-style casts are EVIL btw.

[edit2]
Note that overriding a private method with a public method still doesn't allow the following:

Code: [Select]

class Foo {
   virtual void DoStuff() {}
};

class Bar : public Foo {
public:
   void DoStuff() {}
};

int main() {
Foo* f = new Bar();
//f->DoStuff(); //error C2248: 'Foo::DoStuff' : cannot access private member declared in class 'Foo'
static_cast<Bar*>(f)->DoStuff(); //ok, Bar::DoStuff is public
}

I'm leaving the deletes out to make the code shorter, btw, I'm not forgetting them.

Aardwolf · **Reply #93 on:** November 22, 2012, 02:59:29 pm

@Uchuujinsan: Meh. I'm coming from a C#/Java background, wherein stuff works the way I'd expect. And where private + virtual isn't a thing.

Anyway, gotta make GPU physics go faster

My original scheme had two shader programs: one which evaluated constraints and output the new linear & angular velocity data for the two rigid bodies involved, and a second which updated a master array of rigid bodies' velocity data, copying from the previous state or from the outputs of the first shader according to an index.

My second and current scheme has one shader program which is kind of a combination of the two shader programs in the first scheme. It updates rigid bodies' entries in the master array, using an index to determine which ones should be copied from the previous state and which ones are changed. But instead of copying the changed ones from another array, it does the computation on the spot. The disadvantage of this is that it has to process each constraint twice... but it was still faster than the first scheme.

I've got an idea for a third scheme which I think might be even faster. It would be just a single shader program, with a geometry shader to evaluate the constraints and emit what are basically RGBA32F pixels/texels... to write a specific vec4 to a specific pixel/texel of the master velocity data array. If the constraint doesn't apply an impulse, it wouldn't emit any pixels. If it applies an impulse, it would write the new linear and angular velocities of the two involved rigid bodies to the appropriate places in the array.

One thing that has been an issue with both of the schemes I've tried, and I believe will still be an issue with the third scheme, is that I can't use the same buffer as both an input and an output. So I've been having to switch back and forth between two buffers instead. I don't know for sure, but I somewhat suspect this is contributing to how long it takes to process.
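
The two-buffer back-and-forth can be sketched on the CPU like this. It's only an analogy for the GL setup: `RunPingPong` and the `step` callback are hypothetical, with `step` standing in for one shader pass that reads one buffer and writes the other.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Runs `passes` steps, alternating which buffer is read and which is written.
// `step(in, out)` must read only from `in` and write only to `out`.
// Returns the contents of whichever buffer was written last.
template <typename Step>
std::vector<float> RunPingPong(std::vector<float> initial, unsigned passes, Step step) {
    std::vector<float> other(initial.size());
    std::vector<float>* active = &initial;   // read from this one
    std::vector<float>* inactive = &other;   // write to this one

    for (unsigned i = 0; i < passes; ++i) {
        assert(active->size() == inactive->size());
        step(*active, *inactive);
        // Swap the roles only; no data is copied. This mirrors swapping
        // buffer handles rather than buffer contents.
        std::swap(active, inactive);
    }
    return *active;
}
```

Because only the pointers are swapped, the swap itself costs almost nothing on the CPU. Whether rebinding buffers every batch is what's slow on the GPU side is a separate question this sketch can't answer.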

I have determined that the amount of time it takes is basically proportional to the number of batches processed, i.e. iterations * batches.size() in the snippet below. This seems to be independent of the number of rigid bodies, or the average number of constraints per batch.

Code: [Select]

for(unsigned int i = 0; i < iterations; ++i)
 for(unsigned int j = 0; j < batches.size(); ++j)
 {
  // do constraint shader stuff
  glActiveTexture(GL_TEXTURE1);
  glBindTexture(GL_TEXTURE_BUFFER, active_vtex);
  glTexBufferEXT(GL_TEXTURE_BUFFER, GL_RGBA32F, active_vdata);
  glUniform1i(u_velocity_data, 1);

  GLDEBUG();

  // set up outputs for transform feedback
  glBindBufferRange(GL_TRANSFORM_FEEDBACK_BUFFER, 0, inactive_vdata, 0,          num_rigid_bodies * 4 * sizeof(float));
  glBindBufferRange(GL_TRANSFORM_FEEDBACK_BUFFER, 1, inactive_vdata, num_rigid_bodies * 4 * sizeof(float), num_rigid_bodies * 4 * sizeof(float));

  GLDEBUG();

  glBeginTransformFeedback(GL_POINTS);

   glDrawArrays(GL_POINTS, num_rigid_bodies * j, num_rigid_bodies);
  
  glEndTransformFeedback();

  GLDEBUG();

  glFlush();

  // change which direction the copying is going (back and forth)... can't use one buffer as both input and output or it will be undefined behavior!
  swap(active_vdata, inactive_vdata);
  swap(active_vtex, inactive_vtex);
 }

Little help?

Edit: I was forced to replace glFlush(); with glFinish();, and now it is even slower

Edit II: Apparently I can leave those all as glFlush, and add a glFinish before calling glGetBufferSubData (which is one of the first things I do immediately after the snippet I pasted)... so it's not as bad. But I still need to call glFlush every iteration, and I still need to swap back and forth between two buffers

Aardwolf · **Reply #94 on:** December 12, 2012, 10:58:24 pm

So I've shelved the GPU physics idea, because I couldn't get it to be faster than the pure-CPU implementation.

I was like "I want to work on gameplay again!", and so I did... but then I realized I had a serious character animation/physics problem... because my cheaty character physics conserves linear but not angular momentum, it is possible for a bug to be standing with its center of mass floating off the side of a cliff, supported by only one leg, upright, without falling. Which is bad.

Also there are problems with the large artillery bugs... they almost always float with only one or two legs in contact with the ground, and when they turn around the physics causes ridiculous bouncing. Which is also bad.

So I've determined that I need to go back and do physics-based movement... for the bugs, at least.

Here's my plan:

1. Dood limb with controllable joint torques specified in "parent" bone's coordinate system
2. Achieve a desired joint orientation
   - for chained joints
   - Account for varying load (PID?)
3. Load feedback of some sort (see above)
4. Be able to do arbitrary animated behaviors, like stepping (lift), stepping (push), stab/slash, holding and aiming a gun... all presumably with the base of the limb fixed in place
5. Multiple limbs combined on a single Dood
   - Balancing (on a flat surface)
   - Turning & walking
   - Jumping & landing (as far as when to do this, this item really could just go anywhere)
   - Dynamically selecting where to step, OR attempting to step and reacting to not being able to step
   - Sloped surfaces
   - Curved/nonplanar surfaces
   - Confined spaces

Does that seem like a reasonable way to approach this? Anyone knowledgeable about IK / physics based movement here able to weigh in?

z64555 · **Reply #95 on:** December 14, 2012, 09:00:22 pm

You seem to be going back to your original plan just to go through the madness all over again. I'd say try sticking with the traditional animation sequences that just move the limbs, and have the engine move the whole body system relative to the rate the animation is played.

Essentially, make everything behave like a box that slides across the ground. The box's normal vector is an interpolation of the intersections of the box's floor edges, so if one edge of the box is off a cliff, and the other edge is on something solid, you can calculate torques and rotations on the box's center of mass to make it tilt and then fall off the cliff.

You'll still come across some funny situations, but it'll be a start...

Aardwolf · **Reply #96 on:** December 14, 2012, 10:38:20 pm

But I've been working on per-bone physics all year... I don't want to ditch it to go back to abstracted blobs, or boxes.

And it's not just because of how much time I spent working on it, it's also the fact that it works and it's awesome.

...but the cheaty character animation system I put on top of it isn't cutting it for these new, larger bugs. That said, I can't imagine how treating them as boxes would be much better. The goal isn't to make them fall down when they try to stand off to the side of a cliff... that just happens to be one of the most obvious manifestations of the problems my current system has. Rather, the goal is to make them walk properly on varied terrain.

Bleh.

Anyway, I'm going ahead with the plan I posted, and I've finished item #1. So now I've got a severed bug leg with the hip joint frozen in place in mid air, and I'm trying to get it to do a pose. Basically item #2, including the "for chained joints" part. The implementation I came up with almost does it, but it'll be constantly flipping back and forth between two nearly-correct states, instead of smoothly reaching and maintaining the desired pose. And if I give it a shove, it wobbles and takes a while to come to a stop.

Aardwolf · **Reply #97 on:** December 22, 2012, 12:58:21 am

I've determined that rather than try to design a data format capable of representing a generic pose/animation/behavior for a limb, I should do it the OOP way. I'm thinking I'll make a LimbAction class, with subclasses for the different behaviors a limb might have to do.

I've got a setup where I can tell this "robot arm" what orientation I want for each of its bones (in world coords, but I could probably change that), and it computes the torques necessary to do that, starting from the claw end and summing up the torques at each joint... because the torques are applied equally and oppositely to the two constrained bones, every torque I apply at one joint has to be undone (or redone?) at the next joint up in the chain. The setup works, mostly. If I give it a violent enough shove, it will flail wildly and take several seconds to get back to behaving nicely. But I reckon I could just detect for that, and switch to another LimbAction when that happens.
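
The "sum the torques from the claw end" step might look something like the sketch below. It assumes each joint applies +T to its child bone and -T to its parent, so the net torque on a bone is its own joint's torque minus the next joint's. The names are made up for illustration, and computing the desired per-bone torques in the first place is not shown.

```cpp
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

inline Vec3 operator+(const Vec3& a, const Vec3& b) {
    return Vec3{a.x + b.x, a.y + b.y, a.z + b.z};
}

// desired_bone_torques[i] is the net torque wanted on bone i, ordered from the
// bone nearest the shoulder (index 0) out to the claw (last index).
// Joint i connects bone i to its parent. Returns the torque for each joint.
std::vector<Vec3> JointTorquesFromClaw(const std::vector<Vec3>& desired_bone_torques) {
    std::vector<Vec3> joint_torques(desired_bone_torques.size());
    Vec3 carried{0.0f, 0.0f, 0.0f};  // torque of the next joint toward the claw
    for (std::size_t i = desired_bone_torques.size(); i-- > 0;) {
        // The joint toward the claw pushes back on this bone with -carried,
        // so this joint has to supply the desired torque plus that amount.
        carried = desired_bone_torques[i] + carried;
        joint_torques[i] = carried;
    }
    return joint_torques;
}
```

The reaction to `joint_torques[0]` lands on whatever the limb is attached to.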

I don't know how I'm going to do "load feedback", but I know I will eventually need something to tell which legs are supporting the weight of a dood.

Edit:

I've revised the setup so that the desired orientations are specified on a per-joint basis instead of in world coords. Also, I'm using PID controllers! There was a lot of frustration because of transformations being done wrong, but now I've got it so it can get to an arbitrary pose or simple animation, and doesn't flail anymore. So now I can move on to making a LimbAction for stepping, and figuring out load feedback.
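
For anyone following along, a textbook single-axis PID controller looks roughly like this. It's a generic sketch, not the code from the engine: the class name is made up, and the gains (`kp`, `ki`, `kd`) are whatever gets tuned per motor axis.

```cpp
#include <cassert>

// Generic single-axis PID controller. The error would be something like
// (desired joint angle - actual joint angle) on one axis, and the output
// would be used as the torque to apply on that axis.
class PidController {
public:
    PidController(float kp, float ki, float kd)
        : kp_(kp), ki_(ki), kd_(kd),
          integral_(0.0f), previous_error_(0.0f), has_previous_(false) {}

    // Advances the controller by one tick of length dt and returns its output.
    float Update(float error, float dt) {
        assert(dt > 0.0f);
        integral_ += error * dt;
        // No derivative term on the first tick, since there is no previous error.
        float derivative = has_previous_ ? (error - previous_error_) / dt : 0.0f;
        previous_error_ = error;
        has_previous_ = true;
        return kp_ * error + ki_ * integral_ + kd_ * derivative;
    }

private:
    float kp_, ki_, kd_;
    float integral_;
    float previous_error_;
    bool has_previous_;
};
```

Each axis of each joint would get its own instance, since the integral and previous error are per-axis state.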

Aardwolf · **Reply #98 on:** January 19, 2013, 05:49:56 pm

I'm considering ditching PID-based motor control

I tried making a crab bug use 6 of these PID-based Limb things, and it failed miserably. The pair of limbs which the RobotArm was based on were the closest to working, but they weren't working acceptably either. The other two pairs of limbs were flailing wildly, while the "good" pair of limbs was unable to hold the correct pose.

Maybe it's just a matter of tuning the PID coefficients, but there are 3 coefficients for each of 3 (or 5, depending how I treat them) motors for each of 3 unique limbs (3 because paired limbs can safely just use the same coefficients)... and I haven't the slightest idea how to tune them.

A history of the motor control schemes I've tried:

0. Completely non-physical dood animations. Doods were approximated as capsules.

1. As part of the joint constraint, the relative angular velocity of the two bones in a joint is constrained in order to get the relative orientation to a certain value by the next tick.

Problems with this:

- I couldn't figure out how to get the player to stand upright. Maybe this doesn't matter anymore
- "Joint fighting": a torque applied at one joint to 'set' its relative angular velocity means that the adjacent joints' relative angular velocities will no longer be what they were set to. Slow convergence with a potential for positive feedback.

2. Cheaty system where each bone has a desired position and orientation, and sets its linear and angular velocity in order to get there, while properly conserving linear momentum.

Problems with this:

- It didn't conserve angular momentum. That meant that bugs could hang off to the side of a cliff with only one limb supporting them.
- With the larger bugs, the non-physical way the bugs turned to face a certain direction resulted in a violent physical collision with the ground, making them go bounding across the landscape.

3. Current system where I'm using PID controllers to set the torques on each axis of each joint (in a convenient coordinate system).

Problems with this:

It's not working.

So now I'm frustrated again, and as I said at the beginning of this post, I'm thinking about ditching this PID stuff. But I don't know what to do instead.

Maybe something like scheme 1, but adding per-axis limits on how much torque can be applied? That might fix the issue I had where (I think) positive feedback was resulting in absurdly large impulses.

But I don't know how to fix the "joint fighting" to make it converge faster. The technique I've been using in scheme 3 works by summing the applied torques from the claw end of the limb up to the shoulder... it worked when all I had to deal with was the limb itself, and I could just dump extra torque into the shoulder joint, but I don't think it will work here. The carapace the real limbs attach to is not a torque sink like the shoulder in the RobotArm prototype.

What do?

z64555 · **Reply #99 on:** January 20, 2013, 10:27:25 pm

Try re-working the desired position/velocity controllers, while applying physical limits as to what the servos can accomplish. If the bug's body exceeds any of the maximums, saturate the torques, velocities, and/or positions to the respective maximums those servos can achieve before feeding them into the physics evaluator.

In systems engineering terms, after you've gotten the output signal of your controllers, pass it through several saturation blocks that reflect the limitations of the servos before passing the signal through the physical model block (which ultimately gives you the "actual output" signal).
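
A saturation block is just a symmetric clamp. A minimal sketch, with made-up struct names and limits that would have to be chosen per servo:

```cpp
#include <algorithm>
#include <cassert>

// Clamps value to the range [-limit, +limit].
float Saturate(float value, float limit) {
    assert(limit >= 0.0f);
    return std::max(-limit, std::min(value, limit));
}

// Hypothetical per-servo limits and controller output.
struct ServoLimits {
    float max_torque;
    float max_speed;
};

struct ServoCommand {
    float torque;
    float speed;
};

// Passes the raw controller output through one saturation block per quantity
// before it is handed to the physics step.
ServoCommand ApplyServoLimits(const ServoCommand& raw, const ServoLimits& limits) {
    ServoCommand limited;
    limited.torque = Saturate(raw.torque, limits.max_torque);
    limited.speed = Saturate(raw.speed, limits.max_speed);
    return limited;
}
```

Values inside the limits pass through unchanged, so the controller behaves as before until it asks for more than the servo could deliver.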
