Author Topic: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!  (Read 13485 times)

0 Members and 1 Guest are viewing this topic.

Offline Swifty

  • 210
  • I reject your fantasy & substitute my own
Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
While investigating a bug involving BX caches in the prelude to Wings of Dawn's release, I decided to take a stab at the BX file generation code in case I accidentally did something to irrversibly change the format. Fearing the possibility of everyone having to rebuild their BX caches, I wanted to make it less painful and looked into optimizing it. Here is the fruit of my Saturday labor:

Code: [Select]
Index: code/graphics/2d.cpp
===================================================================
--- code/graphics/2d.cpp (revision 11296)
+++ code/graphics/2d.cpp (working copy)
@@ -32,9 +32,8 @@
 #include "gamesequence/gamesequence.h" //WMC - for scripting hooks in gr_flip()
 #include "io/keycontrol.h" // m!m
 #include "debugconsole/console.h"
-#include "debugconsole/console.h"
+#include "io/timer.h"
 
-
 #if defined(SCP_UNIX) && !defined(__APPLE__)
 #if ( SDL_VERSION_ATLEAST(1, 2, 7) )
 #include "SDL_cpuinfo.h"
@@ -1454,6 +1453,20 @@
  return idx;
 }
 
+int poly_list::find_first_vertex_fast(int idx)
+{
+ uint* first_idx = std::lower_bound(sorted_indices, sorted_indices + n_verts, idx, finder(this, NULL, NULL));
+
+ if ( first_idx == sorted_indices + n_verts ) {
+ // if this happens then idx was never in the index list to begin with which is not good
+ mprintf(("Sorted index list missing index %d!", idx));
+ Int3();
+ return idx;
+ }
+
+ return *first_idx;
+}
+
 /**
  * Given a list (plist) find the index within the indexed list that the vert at position idx within list is at
  */
@@ -1479,7 +1492,18 @@
  return -1;
 }
 
+int poly_list::find_index_fast(poly_list *plist, int idx)
+{
+ // searching for an out of bounds index using the finder means we're trying to find the vert and norm we're passing into the finder instance
+ uint* first_idx = std::lower_bound(sorted_indices, sorted_indices + n_verts, n_verts, finder(this, &plist->vert[idx], &plist->norm[idx]));
 
+ if (first_idx == sorted_indices + n_verts) {
+ return -1;
+ }
+
+ return *first_idx;
+}
+
 void poly_list::allocate(int _verts)
 {
  if (_verts <= currently_allocated)
@@ -1500,6 +1524,11 @@
  tsb = NULL;
  }
 
+ if ( sorted_indices != NULL ) {
+ vm_free(sorted_indices);
+ sorted_indices = NULL;
+ }
+
  if (_verts) {
  vert = (vertex*)vm_malloc(sizeof(vertex) * _verts);
  norm = (vec3d*)vm_malloc(sizeof(vec3d) * _verts);
@@ -1507,6 +1536,8 @@
  if (Cmdline_normal) {
  tsb = (tsb_t*)vm_malloc(sizeof(tsb_t) * _verts);
  }
+
+ sorted_indices = (uint*)vm_malloc(sizeof(uint) * _verts);
  }
 
  n_verts = 0;
@@ -1529,6 +1560,11 @@
  vm_free(tsb);
  tsb = NULL;
  }
+
+ if (sorted_indices != NULL) {
+ vm_free(sorted_indices);
+ sorted_indices = NULL;
+ }
 }
 
 void poly_list::calculate_tangent()
@@ -1626,6 +1662,9 @@
  int j, z = 0;
  ubyte *nverts_good = NULL;
 
+ uint t0, t1;
+ t0 = timer_get_milliseconds();
+
  // calculate tangent space data (must be done early)
  calculate_tangent();
 
@@ -1640,8 +1679,10 @@
 
  vertex_list.reserve(n_verts);
 
+ generate_sorted_index_list();
+
  for (j = 0; j < n_verts; j++) {
- if (find_first_vertex(j) == j) {
+ if (find_first_vertex_fast(j) == j) {
  nverts++;
  nverts_good[j] = 1;
  vertex_list.push_back(j);
@@ -1648,6 +1689,10 @@
  }
  }
 
+ t1 = timer_get_milliseconds();
+
+ mprintf(("Index Buffer created in %d milliseconds\n", t1-t0));
+
  // if there is nothig to change then bail
  if (n_verts == nverts) {
  if (nverts_good != NULL) {
@@ -1682,6 +1727,8 @@
  vm_free(nverts_good);
  }
 
+ buffer_list_internal.generate_sorted_index_list();
+
  (*this) = buffer_list_internal;
 }
 
@@ -1696,11 +1743,94 @@
  memcpy(tsb, other_list.tsb, sizeof(tsb_t) * other_list.n_verts);
  }
 
+ memcpy(sorted_indices, other_list.sorted_indices, sizeof(uint) * other_list.n_verts);
+
  n_verts = other_list.n_verts;
 
  return *this;
 }
 
+void poly_list::generate_sorted_index_list()
+{
+ for ( uint j = 0; j < n_verts; ++j) {
+ sorted_indices[j] = j;
+ }
+
+ std::sort(sorted_indices, sorted_indices + n_verts, finder(this));
+}
+
+bool poly_list::finder::operator()(const uint a, const uint b)
+{
+ vertex *vert_a;
+ vertex *vert_b;
+ vec3d *norm_a;
+ vec3d *norm_b;
+
+ Assert(search_list != NULL);
+
+ if ( a == search_list->n_verts ) {
+ Assert(vert_to_find != NULL);
+ Assert(norm_to_find != NULL);
+ Assert(a != b);
+
+ vert_a = vert_to_find;
+ norm_a = norm_to_find;
+ } else {
+ vert_a = &search_list->vert[a];
+ norm_a = &search_list->norm[a];
+ }
+
+ if ( b == search_list->n_verts ) {
+ Assert(vert_to_find != NULL);
+ Assert(norm_to_find != NULL);
+ Assert(a != b);
+
+ vert_b = vert_to_find;
+ norm_b = norm_to_find;
+ } else {
+ vert_b = &search_list->vert[b];
+ norm_b = &search_list->norm[b];
+ }
+
+ if (norm_a->xyz.x != norm_b->xyz.x) {
+ return norm_a->xyz.x < norm_b->xyz.x;
+ }
+
+ if (norm_a->xyz.y != norm_b->xyz.y) {
+ return norm_a->xyz.y < norm_b->xyz.y;
+ }
+
+ if (norm_a->xyz.z != norm_b->xyz.z) {
+ return norm_a->xyz.z < norm_b->xyz.z;
+ }
+
+ if (vert_a->world.xyz.x != vert_b->world.xyz.x) {
+ return vert_a->world.xyz.x < vert_b->world.xyz.x;
+ }
+
+ if (vert_a->world.xyz.y != vert_b->world.xyz.y) {
+ return vert_a->world.xyz.y < vert_b->world.xyz.y;
+ }
+
+ if (vert_a->world.xyz.z != vert_b->world.xyz.z) {
+ return vert_a->world.xyz.z < vert_b->world.xyz.z;
+ }
+
+ if (vert_a->texture_position.u != vert_b->texture_position.u) {
+ return vert_a->texture_position.u < vert_b->texture_position.u;
+ }
+
+ if ( vert_a->texture_position.v != vert_b->texture_position.v ) {
+ return vert_a->texture_position.v < vert_b->texture_position.v;
+ }
+
+ if ( !compare_indices ) {
+ return vert_a->texture_position.v < vert_b->texture_position.v;
+ } else {
+ return a < b;
+ }
+}
+
 void gr_shield_icon(coord2d coords[6], int resize_mode)
 {
  if (gr_screen.mode == GR_STUB) {
Index: code/graphics/2d.h
===================================================================
--- code/graphics/2d.h (revision 11296)
+++ code/graphics/2d.h (working copy)
@@ -100,10 +100,23 @@
  * This should be basicly just like it is in the VB
  * a list of triangles and their associated normals
  */
-class poly_list
-{
+class poly_list {
+ // helper function struct that let's us sort the indices.
+ // an instance is fed into std::sort and std::lower_bound.
+ // overloaded operator() is used for the comparison function.
+ struct finder {
+ poly_list* search_list;
+ bool compare_indices;
+ vertex* vert_to_find;
+ vec3d* norm_to_find;
+
+ finder(poly_list* _search_list): search_list(_search_list), vert_to_find(NULL), norm_to_find(NULL), compare_indices(true) {}
+ finder(poly_list* _search_list, vertex* _vert, vec3d* _norm): search_list(_search_list), vert_to_find(_vert), norm_to_find(_norm), compare_indices(false) {}
+
+ bool operator()(const uint a, const uint b);
+ };
 public:
- poly_list(): n_verts(0), vert(NULL), norm(NULL), tsb(NULL), currently_allocated(0) {}
+ poly_list(): n_verts(0), vert(NULL), norm(NULL), tsb(NULL), currently_allocated(0), sorted_indices(NULL) {}
  ~poly_list();
  poly_list& operator = (poly_list&);
 
@@ -115,11 +128,15 @@
  vec3d *norm;
  tsb_t *tsb;
 
+ uint *sorted_indices;
+
  int find_index(poly_list *plist, int idx);
-
+ int find_index_fast(poly_list *plist, int idx);
 private:
  int currently_allocated;
  int find_first_vertex(int idx);
+ int find_first_vertex_fast(int idx);
+ void generate_sorted_index_list();
 };
 
 
Index: code/model/modelinterp.cpp
===================================================================
--- code/model/modelinterp.cpp (revision 11296)
+++ code/model/modelinterp.cpp (working copy)
@@ -4381,7 +4381,7 @@
 
  new_buffer.assign(j, first_index);
  } else {
- first_index = model_list->find_index(&polygon_list[i], j);
+ first_index = model_list->find_index_fast(&polygon_list[i], j);
  Assert(first_index != -1);
 
  new_buffer.assign(j, first_index);

Speeds up creation of BX files by a crazy amount. I tried it on the Diaspora Sobek Battlestar which is about a 100 Meg POF file. Takes about 10 seconds in a Release configuration which is pretty damn good; it takes 6 seconds to load the model with the BX file. In Debug, the Sobek takes about 30 seconds to load with a BX file. Loading it without the BX file with this patch only took 40 seconds.

How did i do it? I used std::sort and std::lower_bound with a custom comparison function. Generating index buffers is currently done brute force but this patch is a lot smarter about finding common verts when building an index buffer. I first sort the verts spatially while keeping track of their indices and then I use a lower_bound to perform a binary search.

Try it yourself. Make a build with this patch, delete a BX file in your cache folder and view the model in the tech room or lab. Heck, maybe with this we can get rid of BX files altogether.

 

Offline Dragon

  • Citation needed
  • 212
  • The sky is the limit.
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
10 versus 6 seconds still sounds significant, there will be more models like that as time goes on. I'd say, let's not ditch the cache system just yet. But I really love this feature, should speed up loading by a tremendous amount. Can't wait to see it implemented. :)

 

Offline The E

  • He's Ebeneezer Goode
  • Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Yeah, this actually looks like it would make .bx files unnecessary.
If I'm just aching this can't go on
I came from chasing dreams to feel alone
There must be changes, miss to feel strong
I really need lifе to touch me
--Evergrey, Where August Mourns

 

Offline mjn.mixael

  • Cutscene Master
  • 212
  • Chopped liver
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Yeah, this actually looks like it would make .bx files unnecessary.

When looking at one ship, perhaps.. but a 15-35% increase for all ships across the board when loading missions would be a problem. (Am I understanding bx correctly, that it uses those to speed up mission load times?)
Cutscene Upgrade Project - Mainhall Remakes - Between the Ashes
Youtube Channel - P3D Model Box
Between the Ashes is looking for committed testers, PM me for details.
Freespace Upgrade Project See what's happening.

 

Offline The E

  • He's Ebeneezer Goode
  • Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Not exactly. This only increases load times the first time a given model is loaded; I'm pretty sure that overall load times for a given mission would not be increased significantly.
If I'm just aching this can't go on
I came from chasing dreams to feel alone
There must be changes, miss to feel strong
I really need lifе to touch me
--Evergrey, Where August Mourns

 

Offline mjn.mixael

  • Cutscene Master
  • 212
  • Chopped liver
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Hmm OK. Well I'd love to not have to deal with bx files anymore. But a generic increase in model load times (arguably the most important asset that keeps getting more complex in modding) of 15-35% has to go somewhere.

I guess all I'm saying is that removing a feature that is mostly just annoying but would increase load times by that amount just raises a little red flag for me. But I trust you guys.
Cutscene Upgrade Project - Mainhall Remakes - Between the Ashes
Youtube Channel - P3D Model Box
Between the Ashes is looking for committed testers, PM me for details.
Freespace Upgrade Project See what's happening.

 

Offline The E

  • He's Ebeneezer Goode
  • Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
The thing is, this slowdown is only noticeable for really big models, like the Sobek. I'm willing to bet that most campaigns will not feature these beasts in every single mission (and even then, the slowdown should only apply the first time that model is loaded).
If I'm just aching this can't go on
I came from chasing dreams to feel alone
There must be changes, miss to feel strong
I really need lifе to touch me
--Evergrey, Where August Mourns

 

Offline Dragon

  • Citation needed
  • 212
  • The sky is the limit.
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Well, a Diaspora campaign could as well use a Sobek in every mission. The player being based on the thing would probably require its presence. Also, as times goes on, models in general become more and more detailed. Some day we might have missions with multiple Sobek-level ship classes flying around. As I said, I wouldn't throw BX files out just yet. They do decrease loading times, which are good to keep as low as possible. However, this would certainly mean that bundling those files with mods would not be important anymore.

 

Offline Swifty

  • 210
  • I reject your fantasy & substitute my own
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
As E said, models don't get unloaded between missions. Once it's resident, it's there forever until the player closes the game.

 

Offline Dragon

  • Citation needed
  • 212
  • The sky is the limit.
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
That still means one slow load per game startup, not "one in a lifetime" slow load. Especially testing may be affected, because the usual workflow (at least for me) was "test-quit-adjust-test again", meaning I'll quit the game a lot. I'm not sure how many missions people play in one sitting on average, but it seems to me that the gain from having BX files is significant for it to matter in at least some cases.

 
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Yeah, I'm with Dragon on this one. Keeping startup times low is more important than saving a bit of disk space.
The good Christian should beware of mathematicians, and all those who make empty prophecies. The danger already exists that the mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell.

 

Offline The E

  • He's Ebeneezer Goode
  • Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
I think you guys are severely overestimating the amount of time lost here. Seriously, you're basing this on data from one of the largest models ever made for FSO; for normal models, the differences in load times are not really noticeable.
If I'm just aching this can't go on
I came from chasing dreams to feel alone
There must be changes, miss to feel strong
I really need lifе to touch me
--Evergrey, Where August Mourns

 
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
I'm open to seeing more representative data, but keeping load times low is pretty important and I'm not really seeing how this benefits the majority of users.
The good Christian should beware of mathematicians, and all those who make empty prophecies. The danger already exists that the mathematicians have made a covenant with the devil to darken the spirit and to confine man in the bonds of Hell.

 

Offline Swifty

  • 210
  • I reject your fantasy & substitute my own
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
*whistles quietly since the collision detection optimizations I did 3 years back increased load times by about 20% and no one ever complained*

Watch, we're going to sneak this update in as well and delete all BX files and you guys won't ever notice. :P

 

Offline Dragon

  • Citation needed
  • 212
  • The sky is the limit.
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Would that be possible to make a launcher flag for this? If that doesn't work (or is too much effort), the best idea would be to compile a test build and stress-test the system on a particularly poly-heavy mod. Again, keep in mind rising model polycounts in FS mods (just look at the recent FSU models and at what looked like a perfectly good HTL model before).

Also, nobody complained before, because at the cost of loading times we got a significant boost to in-game performance. Here, we decrease loading times without any additional drawbacks. The question is whether to go all-out in that regard or save some HD space at the cost of slightly lesser loading time reduction. Ultimately, I think that HD space is cheap enough to be less of a concern than minimizing loading times. But it's not like the other option wouldn't still be a great improvement over what we have now.

 

Offline The E

  • He's Ebeneezer Goode
  • Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Here's some quick, non-scientific testing. Loading Artemis with .bx files on debug takes 3 minutes, 13 seconds. Without, 3 Minutes, 12 seconds.

In release builds, loading with .bx files took 36 seconds. Without, 22 seconds.

Yeah. I am not going to think of this as a significant slowdown.
If I'm just aching this can't go on
I came from chasing dreams to feel alone
There must be changes, miss to feel strong
I really need lifе to touch me
--Evergrey, Where August Mourns

 

Offline Swifty

  • 210
  • I reject your fantasy & substitute my own
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Ha ha, oh ****. So it's even faster without the BX files! I guess that would make sense if you think about it, reading from disk is never going to be as fast as reading from cache or memory. And index buffer generation has a disk writing cost too so if we didn't have to write BX files, loading might be even faster...

 

Offline AdmiralRalwood

  • 211
  • The Cthulhu programmer himself!
    • Skype
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
In release builds, loading with .bx files took 36 seconds. Without, 22 seconds.
:eek2: ...Wow.
Ph'nglui mglw'nafh Codethulhu GitHub wgah'nagl fhtagn.

schrödinbug (noun) - a bug that manifests itself in running software after a programmer notices that the code should never have worked in the first place.

When you gaze long into BMPMAN, BMPMAN also gazes into you.

"I am one of the best FREDders on Earth" -General Battuta

<Aesaar> literary criticism is vladimir putin

<MageKing17> "There's probably a reason the code is the way it is" is a very dangerous line of thought. :P
<MageKing17> Because the "reason" often turns out to be "nobody noticed it was wrong".
(the very next day)
<MageKing17> this ****ing code did it to me again
<MageKing17> "That doesn't really make sense to me, but I'll assume it was being done for a reason."
<MageKing17> **** ME
<MageKing17> THE REASON IS PEOPLE ARE STUPID
<MageKing17> ESPECIALLY ME

<MageKing17> God damn, I do not understand how this is breaking.
<MageKing17> Everything points to "this should work fine", and yet it's clearly not working.
<MjnMixael> 2 hours later... "God damn, how did this ever work at all?!"
(...)
<MageKing17> so
<MageKing17> more than two hours
<MageKing17> but once again we have reached the inevitable conclusion
<MageKing17> How did this code ever work in the first place!?

<@The_E> Welcome to OpenGL, where standards compliance is optional, and error reporting inconsistent

<MageKing17> It was all working perfectly until I actually tried it on an actual mission.

<IronWorks> I am useful for FSO stuff again. This is a red-letter day!
* z64555 erases "Thursday" and rewrites it in red ink

<MageKing17> TIL the entire homing code is held up by shoestrings and duct tape, basically.

 

Offline mjn.mixael

  • Cutscene Master
  • 212
  • Chopped liver
    • Steam
    • Twitter
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
Would that be possible to make a launcher flag for this? If that doesn't work (or is too much effort), the best idea would be to compile a test build and stress-test the system on a particularly poly-heavy mod. Again, keep in mind rising model polycounts in FS mods (just look at the recent FSU models and at what looked like a perfectly good HTL model before).

Also, nobody complained before, because at the cost of loading times we got a significant boost to in-game performance. Here, we decrease loading times without any additional drawbacks. The question is whether to go all-out in that regard or save some HD space at the cost of slightly lesser loading time reduction. Ultimately, I think that HD space is cheap enough to be less of a concern than minimizing loading times. But it's not like the other option wouldn't still be a great improvement over what we have now.

Oh come on... A launcher flag for this? Please. All I said was, hey this raises a red flag, let's take a closer look (aka, I trust scp). If there's anything FSO does not need, it's more launcher flags for stupid reasons.
Cutscene Upgrade Project - Mainhall Remakes - Between the Ashes
Youtube Channel - P3D Model Box
Between the Ashes is looking for committed testers, PM me for details.
Freespace Upgrade Project See what's happening.

 

Offline Goober5000

  • HLP Loremaster
  • Moderator
  • 214
    • Goober5000 Productions
Re: Speed Up Index Buffer Creation Using This One Simple Trick! BX Files Hate It!
*whistles quietly since the collision detection optimizations I did 3 years back increased load times by about 20% and no one ever complained*

I noticed that load times had gotten longer in general but didn't know what had caused it.  Now I know to blame you. :p