Hard Light Productions Forums

Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Topic started by: ShivanSpS on November 20, 2019, 03:30:28 pm

Title: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 20, 2019, 03:30:28 pm
Hi,
I'm doing a little detective work trying to figure this out. Here is the issue: with only OpenGL 2.1 and no S3TC, a Raspberry Pi can only run 3.7.2 with uncompressed textures.

But in some projects, like Diaspora and Wing Commander Saga, attempts to load most models in the tech room end in a crash without further log info; in fact I only get a "bus error" if I run from a terminal. This is not due to textures, as removing the textures still causes the game to crash.

In Diaspora, for example, the Vipers load, but the Raptor crashes. In Wing Commander Saga all fighters crash, but two or three of the cruisers do load. In contrast, all retail FS2 models load.

As I understand it, there were huge changes in 3.8.0 and mods adapted to it. Is there something in the .pof files that could cause this?
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: mjn.mixael on November 20, 2019, 04:03:58 pm
A list of changes can be found at the bottom of this page. (https://www.hard-light.net/forums/index.php?topic=94988.0)

But I have a suspicion.. what build are you running and/or what date of build are you building from?
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: ShivanSpS on November 20, 2019, 05:52:04 pm
Just a little correction: I never tried to run the MediaVPs; I really don't think the RPi can handle them, although I'm going to try with 3.7.2, probably this weekend.

This is the build I'm running:
https://www.hard-light.net/forums/index.php?topic=89597.0

Compiled from the source code export provided there, directly on an RPi 4. It's kind of a mess because autogen does not detect the ARM CPU and generates the makefiles with x86 flags, so I had to change that manually.

As I said, the RPi 4 is limited to that build, at least until the day both FSO and the RPi 4 support Vulkan, so it will be a while. There is SOMETHING in the .pofs of Diaspora and WCS (and I guess most other mods) that crashes that 3.7.2 build with a "bus error", and I know those mods were updated to use FSO 3.8.0.

Other than that I'm lost, since the retail game can be finished on the RPi.
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: mjn.mixael on November 20, 2019, 06:03:17 pm
I take it this is a Source Code Project question then, and it's unrelated to the MediaVPs.
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: ShivanSpS on November 20, 2019, 07:03:57 pm
I don't think it is a code question; I already know that a lot of changes have been made from FSO 3.8.0 onwards, especially in the graphics department.

If FSO 3.8.0 allowed new stuff to be added to the .pofs that could cause a crash on 3.7.2, the people who should know about it are the ones who work with the models themselves. And the people doing the MediaVPs are the ones who have done the most work on models ever.

There is something in those .pofs that causes the game to crash. If nothing special was added to the pofs between 3.7.2 and 3.8.0 to take advantage of some new feature, then it could be something else. And I have to admit that getting a "bus error" like this is strange.
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: taylor on November 20, 2019, 10:46:03 pm
Sounds like a memory alignment issue to me, which arm is more finicky about than x86. Either an unaligned data read or accessing a struct that isn't aligned/padded properly. Try running it in gdb and see if that helps to locate the problem area.
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: The E on November 20, 2019, 11:14:11 pm
There have been no changes at all in the pof file format or how it is interpreted between those versions as far as I can recall.
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: Nightmare on November 21, 2019, 05:24:15 am
Don't know if it helps, but I noticed that you can't open pof files that have been saved with PCS 2.1 in 2.0.3 for some reason.
Title: Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Post by: ShivanSpS on November 21, 2019, 05:38:01 pm
Quote from: taylor
Sounds like a memory alignment issue to me, which arm is more finicky about than x86. Either an unaligned data read or accessing a struct that isn't aligned/padded properly. Try running it in gdb and see if that helps to locate the problem area.

You're right, one would think this should be a non-issue today, but nope...

Code: [Select]
Thread 1 "fs2_open_3.7.2_" received signal SIGBUS, Bus error.
point_in_octant (pm=0x24a1770, oct=0x24a33c4, vert=0xac768c16)
    at model/modeloctant.cpp:28
28 if ( vert->xyz.x < oct->min.xyz.x ) return 0;

Time to dust off my C skills and see if I can figure this one out; .NET makes you soft.

Quick test: just commenting out all the ifs in that function and making it always return 0 already fixes the issue. The models load, and the tech room and the mission load as well. I'm sure I'm disabling something by doing this, so I'll try to make a proper fix. The good thing is that the issue seems to be isolated to that function and those pointers.

Update: the SIGBUS happens on these two calls in that file:
Code: [Select]
if ( point_in_octant( pm, oct, vp(p+20) ) )
at lines 174 and 235. The problem seems to be "vp(p+20)".

I spoke too soon, there are more. Shield impact:
Code: [Select]
Thread 1 "fs2_open_3.7.2_" received signal SIGBUS, Bus error.
fvi_ray_boundingbox (min=0x345db0d, max=0x345db19, p0=0xabc4b4 <Mc_p0>,
    pdir=0xabc4d0 <Mc_direction>, hitpt=0xbeffeb0c) at math/fvi.cpp:335
335 if (p0->a1d[i] < min->a1d[i]) {

I fail to understand why none of this happens with the retail game files...
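
As a side note on reading those traces: a 4-byte float load needs an address that is a multiple of 4, and the pointers gdb printed can be checked by hand. A minimal sketch (the helper name is_aligned is made up for illustration; it is not an FSO function):

```cpp
#include <cassert>
#include <cstdint>

// True when `address` is a multiple of `alignment`.
// `alignment` must be a power of two (4 for a 32-bit float or int).
constexpr bool is_aligned(std::uintptr_t address, std::uintptr_t alignment) {
    return (address & (alignment - 1)) == 0;
}
```

With the values from the traces, vert=0xac768c16, min=0x345db0d and max=0x345db19 all fail the check, while oct, p0 and pdir pass it.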
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 21, 2019, 10:26:02 pm
As far as the point_in_octant error, I found a potential lead.  The call stack leads up to this bit of code:

Code: [Select]
case OP_SORTNORM: {
int frontlist = w(p+36);
int backlist = w(p+40);
int prelist = w(p+44);
int postlist = w(p+48);
int onlist = w(p+52);

if (prelist) model_octant_find_faces_sub(pm,oct,p+prelist,just_count);
if (backlist) model_octant_find_faces_sub(pm,oct,p+backlist,just_count);
if (onlist) model_octant_find_faces_sub(pm,oct,p+onlist,just_count);
if (frontlist) model_octant_find_faces_sub(pm,oct,p+frontlist,just_count);
if (postlist) model_octant_find_faces_sub(pm,oct,p+postlist,just_count);

See the documentation on bsp_tree here, particularly on sortnorms:
https://wiki.hard-light.net/index.php/BSP_data_structure

Note that these five numbers, which are read from the model file itself, specify byte offsets.  In order to be properly aligned, these offsets must be multiples of 4.  But there is no enforcement of this requirement in the code, and obviously no enforcement in whichever program exported the model.  @ShivanSpS, try adding some logging statements to catch non-compliant offsets.

I suspect a similar problem is happening with the shield mesh.
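
The logging suggested above might look roughly like this. It is only a sketch: printf stands in for mprintf, read_i32 and count_unaligned_sortnorm_offsets are hypothetical names, and the offsets 36 to 52 are taken from the snippet above. The ints are read in the host's byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Read a 32-bit int from any address; memcpy has no alignment requirement.
static std::int32_t read_i32(const unsigned char *p) {
    std::int32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Log and count the SORTNORM child offsets that are not multiples of 4.
// `p` points at the start of a SORTNORM chunk; bytes 36..55 hold
// frontlist, backlist, prelist, postlist and onlist, in that order.
int count_unaligned_sortnorm_offsets(const unsigned char *p) {
    static const char *const names[5] = {
        "frontlist", "backlist", "prelist", "postlist", "onlist"};
    int bad = 0;
    for (int i = 0; i < 5; ++i) {
        const std::int32_t offset = read_i32(p + 36 + 4 * i);
        if (offset % 4 != 0) {
            std::printf("unaligned %s offset: %d\n", names[i], static_cast<int>(offset));
            ++bad;
        }
    }
    return bad;
}
```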
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 22, 2019, 05:29:46 am
Hmm, if that's the case, this is a far more extensive problem than I believed. I'll keep looking into it this weekend. Also, I think the reason this only happens with mods is that the kernel is set to fix up unaligned accesses: with the retail files it can, and with mods it can't for some reason. Maybe I should try to disable that just to see what happens; ARMv8-A should be able to deal with unaligned access.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 22, 2019, 11:10:31 am
I think the simple reason is that the retail model files all have offsets in multiples of 4, and some of the newer models do not have offsets in multiples of 4.  The logging statements would confirm this.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 23, 2019, 09:47:33 am
So i did this ugly thing
Code: [Select]
int model_octant_find_faces_sub(polymodel * pm, model_octant * oct, void *model_ptr, int just_count )
{
ubyte *p = (ubyte *)model_ptr;
int chunk_type, chunk_size;

chunk_type = w(p);
chunk_size = w(p+4);

mprintf(( "First chunk_type %d\n", chunk_type ));
mprintf(( "First chunk_size %d\n", chunk_size ));
while (chunk_type != OP_EOF) {

switch (chunk_type) {
case OP_DEFPOINTS:
moff_defpoints(p, just_count);
break;
case OP_FLATPOLY: moff_flatpoly(p, pm, oct, just_count ); break;
case OP_TMAPPOLY: moff_tmappoly(p, pm, oct, just_count ); break;
case OP_SORTNORM: {
int frontlist = w(p+36);
int backlist = w(p+40);
int prelist = w(p+44);
int postlist = w(p+48);
int onlist = w(p+52);

if (prelist) model_octant_find_faces_sub(pm,oct,p+prelist,just_count);
if (backlist) model_octant_find_faces_sub(pm,oct,p+backlist,just_count);
if (onlist) model_octant_find_faces_sub(pm,oct,p+onlist,just_count);
if (frontlist) model_octant_find_faces_sub(pm,oct,p+frontlist,just_count);
if (postlist) model_octant_find_faces_sub(pm,oct,p+postlist,just_count);
}
break;
case OP_BOUNDBOX: break;
default:
mprintf(( "Bad chunk type %d, len=%d in model_octant_find_faces_sub\n", chunk_type, chunk_size ));
Int3(); // Bad chunk type!
return 0;
}
p += chunk_size;
chunk_type = w(p);
chunk_size = w(p+4);
mprintf(( "End chunk_type %d\n", chunk_type ));
mprintf(( "End chunk_size %d\n", chunk_size ));
}
return 1;
}

void model_octant_find_faces( polymodel * pm, model_octant * oct )
{
ubyte *p;
int submodel_num = pm->detail[0];

p = pm->submodel[submodel_num].bsp_data;
mprintf(( "chunk_type OP_DEFPOINTS %d\n", OP_DEFPOINTS ));
mprintf(( "chunk_type OP_FLATPOLY %d\n", OP_FLATPOLY ));
mprintf(( "chunk_type OP_TMAPPOLY %d\n", OP_TMAPPOLY ));
mprintf(( "chunk_type OP_SORTNORM %d\n", OP_SORTNORM ));
mprintf(( "chunk_type OP_BOUNDBOX %d\n", OP_BOUNDBOX ));

oct->nverts = 0;
model_octant_find_faces_sub(pm, oct, p, 1 );
    mprintf(( "%s\n", "pass" ));
if ( oct->nverts < 1 ) {
oct->nverts = 0;
oct->verts = NULL;
return;
}

oct->verts = (vec3d **)vm_malloc( sizeof(vec3d *) * oct->nverts );
Assert(oct->verts!=NULL);

oct->nverts = 0;
model_octant_find_faces_sub(pm, oct, p, 0 );

// mprintf(( "Octant has %d faces\n", oct->nfaces ));
}

Results in this
Code: [Select]
chunk_type OP_DEFPOINTS 1
chunk_type OP_FLATPOLY 2
chunk_type OP_TMAPPOLY 3
chunk_type OP_SORTNORM 4
chunk_type OP_BOUNDBOX 5
First chunk_type 1
First chunk_size 74590
End chunk_type 4
End chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 5
First chunk_size 32
End chunk_type 3
End chunk_size 80
<crash here>

74590 is not a multiple of 4, and it is that p += chunk_size; that makes the pointer land on unaligned memory. So I went on to check that first chunk size on retail models:
Code: [Select]
First chunk_size 16072 | 4018.000000
First chunk_size 15268 | 3817.000000
First chunk_size 9716 | 2429.000000

No point in writing them all out; they are all multiples of 4.
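
The arithmetic can be written down as a small sketch (the helper names are made up for illustration):

```cpp
#include <cassert>

// Bytes needed to round `size` up to the next multiple of 4 (0 if already aligned).
constexpr int padding_to_4(int size) {
    return (4 - size % 4) % 4;
}

// `size` rounded up to the next multiple of 4.
constexpr int round_up_to_4(int size) {
    return size + padding_to_4(size);
}

static_assert(74590 % 4 == 2, "the mod chunk ends 2 bytes past a 4-byte boundary");
static_assert(round_up_to_4(74590) == 74592, "two padding bytes would align it");
static_assert(round_up_to_4(16072) == 16072, "the retail sizes are already aligned");
```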

So Goober, you are right.
Now, does this happen while reading the .pof into memory or when reading the data in memory? Because if it happens while reading the file... well, I'm not sure it can be fixed, although I don't know the code well enough to think of a workaround; maybe memcpy the entire thing and work from there. Where exactly is this done in the code?
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 23, 2019, 01:16:40 pm
The chunk size is specified as a number within the POF file itself.  So the first thing to do is to find out what that chunk represents and why it was given that particular size.  It is very possible that whatever program was used to create the POF file wrote incorrect data.

The best solution is to re-export the POF with the correct settings.  If that is not possible, then a fallback option is to memcpy the bytes, manually align them, and then read the data from the copy.  But that would be much more complicated.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 23, 2019, 02:55:30 pm
I would like to come up with a fix that works in general rather than having to edit every pof for every mod out there.

There are two things that may work:

1) Use the "__packed" attribute, which should allow unaligned access to data. I did this with the pointer, but my attempts so far have been unsuccessful. I have never done this before, so I need to read about it a bit more; I may not be doing it right, or not on the right data.

2) memcpy and fix does not sound that hard, but I need to know exactly what is wrong. It's the BSP_Data that is not correctly aligned here, right?

This data?
Code: [Select]
p = pm->submodel[submodel_num].bsp_data;
If so, I think I can come up with code that memcpys it and, if the chunk size is wrong, fixes it.

Anyway, in both cases it will take a while, as I need to read and understand how the bsp_data works; for example, I do not understand why model_octant_find_faces_sub is recursive.
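
For option 2, the usual portable way to read a value from an address that may be unaligned is to memcpy it into a properly aligned local variable first. A minimal sketch, where load_unaligned is a hypothetical helper and not something that exists in FSO:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Copy a T out of memory at any address, aligned or not.
// The local `value` is correctly aligned; memcpy itself has no
// alignment requirement on its source pointer.
template <typename T>
T load_unaligned(const void *src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T value;
    std::memcpy(&value, src, sizeof(T));
    return value;
}
```

Something like load_unaligned<vec3d>(p+20) could then stand in for dereferencing vp(p+20), assuming vec3d is a plain, trivially copyable struct of floats.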
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 23, 2019, 03:05:15 pm
More specifically it is the chunks in the BSP data tree that are not aligned.  All of the chunks are concatenated together, but if some of them are not multiples of 4, they will not be on the proper boundaries.

I will reiterate what I said before:
Quote
So the first thing to do is to find out what that chunk represents and why it was given that particular size.  It is very possible that whatever program was used to create the POF file wrote incorrect data.

We don't know why the chunk did not have a proper size.  It is possible that there is a bug in the POF program, which means that it is important to understand the bug before attempting a fix.  There may be a deeper problem hidden beneath the superficial problem.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 23, 2019, 03:32:35 pm
Well yes, the V models are all aligned; that can't be a coincidence. Actually it makes sense, since back in the 90s unaligned access had performance costs; today that's a non-issue. Except for these ARM processors, which for some reason can't deal with it, even though they should.

OK, taking a look at the pof tools first: that would be PCS2 and PCS1, right?

And if someone else also wants to check for unaligned models, adding this
Code: [Select]
if((chunk_size % 4) != 0)
{
mprintf(( "Warning: Unaligned memory access: Chunk Type %d, chunk_size %d in model_octant_find_faces_sub\n", chunk_type, chunk_size ));
}

before
Code: [Select]
p += chunk_size;
chunk_type = w(p);
chunk_size = w(p+4);
in model_octant_find_faces_sub in modeloctant.cpp is an easy way to detect it.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 23, 2019, 05:25:44 pm
That looks good, but you should also check to see if the frontlist, etc. variables are divisible by 4.  That could cause the same issue.

PCS2 is the main program used for editing POF data.  However I don't think it's the program that's used for converting the original 3D model file to POF in the first place.  I recommend asking a modeler about that.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 23, 2019, 08:19:41 pm
The first was ModelView32, I think.

I checked ModelView32: there is no reference to anything, and I can't find the source code, so I can't look into it. Opening a pof, going to edit and saving it does not fix it.

PCS2 is the same. I took a look at the source code: https://github.com/scp-fs2open/PCS2/blob/master/src/BSPDataStructs.cpp
int BSP_DefPoints::Write(char *buffer)
There is nothing there enforcing alignment.

In the end I think it is a 20-year-old oversight: the V tools produced properly aligned models, ModelView and PCS do not, and no one realised because it just works on x86.

I think PCS2 can be fixed, so I'm going to open an issue on git. Then I'll start thinking about a workaround in the FS2 code for these cases. That doesn't mean I can do it, but I'll try.

WAIT, hang on, I found this in the ModelView32 readme.
Quote
Note:
~~~~~
- If you save a POF in the editor that you just extracted from the original
  FreeSpace 1/2 .VP files, the file will be a bit smaller than the original
  file, even though nothing changed. This is not a bug, there is no
  information lost. However the original Volition editors wasted some bytes
  containing nothing but null or CRLF bytes at the end of strings, which are
  however totally useless. MODELVIEW in contrast cuts strings to their actual
  content.
Those extra bytes that ModelView is cutting were the alignment bytes; I have no other explanation. This carried over to PCS2.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 23, 2019, 09:17:46 pm
OOF

Yet another example of the idiotic human fallacy, "I don't understand it, therefore it must be wrong."
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: taylor on November 24, 2019, 04:09:09 am
Quote from: Goober5000
That looks good, but you should also check to see if the frontlist, etc. variables are divisible by 4.  That could cause the same issue.
I'm not looking at the code, but those are just offsets of p, right? In which case chunk_size alignment will determine alignment of the rest.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 24, 2019, 12:00:24 pm
OK, I think I'm on the right track, but I need to be sure I understand exactly how this works. Does this look good?

https://i.imgur.com/MInETWl.png
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 24, 2019, 01:33:35 pm
OK, some progress, and I think this is as far as I'll get today.
I added this at line 1560 in modelread.cpp:
Code: [Select]
ubyte *absp, *abspp, *p = pm->submodel[n].bsp_data;
// allocate a zeroed copy with some slack for the padding bytes
absp = (ubyte*)vm_malloc(pm->submodel[n].bsp_data_size+(100*sizeof(ubyte*)));
memset(absp, 0, pm->submodel[n].bsp_data_size+(100 * sizeof(ubyte*)));
abspp = absp;
int chunk_type, chunk_size, copied=0;
chunk_type = w(p);
chunk_size = w(p + 4);

if ((chunk_size % 4) != 0)
{
	// round the first chunk up to the next multiple of 4
	int newsize = chunk_size + 4 - (chunk_size % 4);
	mprintf(("Warning: Unaligned memory access Defpoints: Chunk Type %d, chunk_size %d in modelread.cpp:1572\n", chunk_type, chunk_size));
	mprintf(("Fixing Defpoints: Chunk Type %d, old chunk_size %d, new chunk_size %d\n", chunk_type, chunk_size, newsize));
	memcpy(absp, p, chunk_size);
	copied += chunk_size;
	// store the whole int; *(absp + 4) = newsize would only write its low byte
	memcpy(absp + 4, &newsize, sizeof(int));
	absp += newsize;
}
else
{
	memcpy(absp, p, chunk_size);
	absp += chunk_size;
	copied += chunk_size;
}
// copy the rest of the BSP data after the (possibly padded) first chunk
memcpy(absp, p+chunk_size, pm->submodel[n].bsp_data_size-copied);
vm_free(pm->submodel[n].bsp_data);
pm->submodel[n].bsp_data = abspp;

I tried that to align the first defpoints chunk, which at least in WCS seems to be the unaligned part. Sometimes it works and the tech room loads most models, but other times it won't load the model; it doesn't crash the game either, it just gets stuck there.
I'll keep looking into it this week or next weekend.

edit:updated.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 24, 2019, 01:59:52 pm
Quote from: taylor
Quote from: Goober5000
That looks good, but you should also check to see if the frontlist, etc. variables are divisible by 4.  That could cause the same issue.
I'm not looking at the code, but those are just offsets of p, right? In which case chunk_size alignment will determine alignment of the rest.

If I understand the code correctly, they are byte offsets of p.  The model_octant_find_faces_sub function is passed a pointer which is p plus a number, and p is of type ubyte.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 25, 2019, 08:22:42 am
The problem seems to be that if I replace the defpoints chunk_size with the new one, something crashes or loops (probably loops) silently somewhere else in the code for some models. I'm no longer trying on the Pi itself; I'm doing it on x86 with Visual Studio.

Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 25, 2019, 12:29:15 pm
Err, don't replace the BSP data with the aligned version.  Try just copying the bad chunk, aligning it, and passing it to the next function.  Then when the function returns, free the copy.  Don't touch the underlying bsp tree.

This implies the function will create and maintain recursive copies of sub-trees if the sub-trees are also unaligned, but as long as the copies aren't too large, this should be fine.  The tree is finite.
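
The copy-then-free approach described above could be sketched like this. It is an illustration under assumptions, not FSO code: copy_chunk_aligned is a hypothetical helper, and a std::vector is used so that the copy is freed automatically when it goes out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `size` bytes starting at `chunk` into freshly allocated storage.
// The vector holds 32-bit words, so data() is at least 4-byte aligned.
// The word count is rounded up, and any trailing padding bytes stay zero.
std::vector<std::uint32_t> copy_chunk_aligned(const unsigned char *chunk, std::size_t size) {
    std::vector<std::uint32_t> copy((size + 3) / 4, 0u);
    if (size != 0) {
        std::memcpy(copy.data(), chunk, size);
    }
    return copy;
}
```

Note that this only aligns the start of the copied chunk. Anything inside it that sits at an offset which is not a multiple of 4 is still unaligned, which is why sub-trees may need their own copies.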
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 25, 2019, 04:19:10 pm
OK, I came up with... this is not a fix, it is more of a hack to make it work. There are unaligned accesses all over the place even after doing this, but the CPU is dealing with those and fixing them up by itself. This may not be true for EVERY ARM CPU out there or for every model.

This "hack" involves modifying 2 functions.
point_in_octant in modeloctant.cpp:25
Code: [Select]
// returns 1 if a point is in an octant.
int point_in_octant( polymodel * pm, model_octant * oct, vec3d *vert )
{
vec3d *avert;
// work on an aligned copy of the (possibly unaligned) vertex
avert= (vec3d*)vm_malloc(sizeof(vec3d));
memcpy(avert, vert, sizeof(vec3d));

// braces are required here, otherwise "return 0" runs unconditionally
if ( avert->xyz.x < oct->min.xyz.x ) { vm_free(avert); return 0; }
if ( avert->xyz.x > oct->max.xyz.x ) { vm_free(avert); return 0; }

if ( avert->xyz.y < oct->min.xyz.y ) { vm_free(avert); return 0; }
if ( avert->xyz.y > oct->max.xyz.y ) { vm_free(avert); return 0; }

if ( avert->xyz.z < oct->min.xyz.z) { vm_free(avert); return 0; }
if ( avert->xyz.z > oct->max.xyz.z) { vm_free(avert); return 0; }
vm_free(avert);
return 1;
}

fvi_ray_boundingbox in fvi.cpp:326
Code: [Select]
int fvi_ray_boundingbox( vec3d *min, vec3d *max, vec3d * p0, vec3d *pdir, vec3d *hitpt )
{
int middle = ((1<<0) | (1<<1) | (1<<2));
int i;
int which_plane;
float maxt[3];
float candidate_plane[3];

vec3d *ap0, *amin, *amax;
ap0= (vec3d*)vm_malloc(sizeof(vec3d));
memcpy(ap0, p0, sizeof(vec3d));
amin= (vec3d*)vm_malloc(sizeof(vec3d));
memcpy(amin, min, sizeof(vec3d));
amax= (vec3d*)vm_malloc(sizeof(vec3d));
memcpy(amax, max, sizeof(vec3d));

for (i = 0; i < 3; i++) {
if (ap0->a1d[i] < amin->a1d[i]) {
candidate_plane[i] = amin->a1d[i];
middle &= ~(1<<i);
} else if (ap0->a1d[i] > amax->a1d[i]) {
candidate_plane[i] = amax->a1d[i];
middle &= ~(1<<i);
}
}

// ray origin inside bounding box?
// (are all three bits still set?)
if (middle == ((1<<0) | (1<<1) | (1<<2))) {
*hitpt = *ap0;
vm_free(ap0);
vm_free(amin);
vm_free(amax);
return 1;
}

// calculate T distances to candidate plane
for (i = 0; i < 3; i++) {
if ( (middle & (1<<i)) || (pdir->a1d[i] == 0.0f) ) {
maxt[i] = -1.0f;
} else {
maxt[i] = (candidate_plane[i] - ap0->a1d[i]) / pdir->a1d[i];
}
}

// Get largest of the maxt's for final choice of intersection
which_plane = 0;
for (i = 1; i < 3; i++) {
if (maxt[which_plane] < maxt[i]) {
which_plane = i;
}
}

// check final candidate actually inside box
if (maxt[which_plane] < 0.0f) {
vm_free(ap0);
vm_free(amin);
vm_free(amax);
return 0;
}

for (i = 0; i < 3; i++) {
if (which_plane == i) {
hitpt->a1d[i] = candidate_plane[i];
} else {
hitpt->a1d[i] = (maxt[which_plane] * pdir->a1d[i]) + ap0->a1d[i];

if ( (hitpt->a1d[i] < amin->a1d[i]) || (hitpt->a1d[i] > amax->a1d[i]) ) {
vm_free(ap0);
vm_free(amin);
vm_free(amax);
return 0;
}
}
}
vm_free(ap0);
vm_free(amin);
vm_free(amax);
return 1;
}

I keep testing this; I have already finished the first 2 missions of Wing Commander Saga on the Raspberry Pi with it. I also tested Diaspora and it seems to work as well.

EDIT: Unfortunately, this only works with the Debug build.
Code: [Select]
Thread 1 "fs2_open_3.7.2" received signal SIGBUS, Bus error.
0x0023f7f0 in model_collide_parse_bsp(bsp_collision_tree*, void*, int)
modelcollide.cpp:729

BTW, is 3.7.2 not on git? I see 3.7.0, but not 3.7.2. I'd like to fork it and add changes on top of 3.7.2 for Raspberry Pi users.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 25, 2019, 07:15:33 pm
It would be better to implement a proper fix, rather than a hack.  The reason Release builds still aren't working is probably because of the remaining unaligned calls.

I've asked chief1983 to tag the appropriate revision for 3.7.2.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 25, 2019, 08:26:14 pm
There are only 2 possible proper fixes. One is fixing the models themselves; for that, PCS2 needs to be updated, and I'm not sure if there are other programs for editing/building pofs. ModelView32 has to be discontinued; I don't think anyone has the source code to fix it, and I'm pretty sure no one will be interested in doing it.

The other is taking the BSP data and copying and aligning it when the model is loaded, but as you can see, all of my attempts to mess around with the bsp_data end in issues somewhere else. Anything else will look similar to what I'm doing there: copy the data from the unaligned address to a new place and use it.
The problem is massive; these unaligned models end up with a lot of "vec3d" data in unaligned positions. The CPU fixes 90%+ of them, but if you disable that you can see it.
So before even attempting to fix the bsp data on the fly, I need to understand perfectly how BSP and the pof data structure itself work, and probably write a .NET C# program first (I'm more in my element there) that reads the POFs and fixes them, and after that try to apply the same here. So it will take some time.

Also, I want to try other things: software DXT1/3/5 decompression using a lib, which maybe I can implement in ddsutils.cpp, and backporting the 3.8.0 FS1/FS2 Demo support to 3.7.2. So much stuff and so little time...
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: mjn.mixael on November 25, 2019, 11:39:07 pm
Wtf. Is this thread serious? Amazing.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Chromatix on November 26, 2019, 08:53:35 pm
Broadly speaking, x86 CPUs only incur a significant performance penalty for unaligned accesses when they straddle a cacheline boundary.  That's because software using unaligned accesses is so common in the x86 world that the hardware has been optimised to accommodate it.  This wasn't quite so true when Freespace was first released, but at worst you would incur a one-cycle delay per unaligned access.

Conversely, ARM CPUs are mostly still designed to be efficient first and fast second, on the grounds that an efficiently designed CPU might run fast anyway if the software is written properly.  The circuitry to handle unaligned accesses is therefore considered an unnecessary complexity, given that well-written software should avoid it; unaligned executable code is outright forbidden (as ARM instructions are all 4 bytes long).  This goes so far that unaligned accesses are not merely unoptimised, but totally unsupported in hardware.  To work around this when an unaligned access happens anyway, it is possible to handle the exception in software by replacing the single access with multiple byte-wide accesses, but of course this is slower and requires an extra temporary register; a single 32-bit load becomes a 7-instruction sequence (load byte 0, load byte 1, combine bytes, load byte 2…) that is difficult to run in parallel.  And that's ignoring the major overhead of taking the exception in the first place.
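
The byte-wise replacement sequence described above corresponds roughly to the following C++, shown here only as a sketch. load_u32_le is a hypothetical helper name, and little-endian byte order is assumed:

```cpp
#include <cassert>
#include <cstdint>

// Assemble a little-endian 32-bit value from four single-byte loads.
// Byte loads have no alignment requirement, so `p` may point anywhere.
inline std::uint32_t load_u32_le(const unsigned char *p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```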

I think it would be wise to at least process all the mods in Knossos to ensure they have proper alignment, as well as patching the tools to make them produce aligned output in the first place.  But to allow the apparently large corpus of misaligned mods to run on ARM, an alignment workaround does need to be added to the loader.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: chief1983 on November 28, 2019, 11:37:23 pm
My god what have we done

As for the tag for 3.7.2, it doesn't exist because we weren't even on Git yet.  The quote from the release post concerns me though:  "It is an export of 3.7.2 branch r11329".  I don't see a 3.7.2 branch in our git even though most of the older branches were created.  Someone got too aggressive with branch pruning I think and deleted it.  Either way, I found the git commit was still on my fork and pushed a branch and tag for 3.7.2 up to the main repository.  Enjoy.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 30, 2019, 12:06:25 am
I got the remaining bus errors in the release config trapped in these bits of code in modelcollide.cpp

line 786
Code: [Select]
case OP_SORTNORM:
    if ( version >= 2000 ) {
        /*min = vp(p+56);
        max = vp(p+68);

        node_buffer[i].min = *min;
        node_buffer[i].max = *max;*/
        ubyte *ac;
        ac = (ubyte*)vm_malloc(4);
        memcpy(ac,p+56,4);
        min = vp(ac);
        node_buffer[i].min = *min;
        memcpy(ac,p+68,4);
        max = vp(ac);
        node_buffer[i].max = *max;
        vm_free(ac);
    }
and
Code: [Select]
case OP_BOUNDBOX:
    /*min = vp(p+8);
    max = vp(p+20);

    node_buffer[i].min = *min;
    node_buffer[i].max = *max;*/
    ubyte *ac;
    ac = (ubyte*)vm_malloc(4);
    memcpy(ac,p+8,4);
    min = vp(ac);
    node_buffer[i].min = *min;
    memcpy(ac,p+20,4);
    max = vp(ac);
    node_buffer[i].max = *max;
    vm_free(ac);
but now ships have (mostly) no collision, and I get a segfault when killing another fighter with a missile. I'm not sure what I did wrong; that should work. Maybe it's late and I can't see it.

Once again it is the same problem, a vec3d being dereferenced; it has been this same issue every time. At least the debug build works, and it actually works very well so far.
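
One possible reason for the missing collision, offered only as a guess: the scratch buffer holds 4 bytes, while a vec3d is three floats (12 bytes), so dereferencing it as a vec3d reads past what was copied. A minimal sketch of copying the whole vector instead, assuming a simplified stand-in type `vec3d_sketch` and a hypothetical helper `read_vec3d` (neither is FSO code):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Simplified stand-in for FSO's vec3d: three floats, 12 bytes.
struct vec3d_sketch {
    float x, y, z;
};

static_assert(sizeof(vec3d_sketch) == 12, "sketch assumes three packed 4-byte floats");

// Hypothetical helper: copy a whole vector out of a byte buffer at any offset.
// memcpy has no alignment requirement on its arguments, so the source may be misaligned.
inline vec3d_sketch read_vec3d(const unsigned char *p, std::size_t offset)
{
    assert(p != nullptr);
    vec3d_sketch v;
    std::memcpy(&v, p + offset, sizeof(v)); // all 12 bytes, not just the first 4
    return v;
}
```

Under those assumptions, `node_buffer[i].min = read_vec3d(p, 56);` would take the place of the malloc/memcpy/free sequence, with no heap allocation per node.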
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on November 30, 2019, 12:39:01 pm
OK, now that I'm making progress in parsing POFs, I'm going to point out some things that I believe should be corrected in the wiki.

https://wiki.hard-light.net/index.php/POF_data_structure
Quote
The rest of the file is a bunch of chunks. Each chunk is:
char[4] chunk_id  // see below for available chunk types
int length        // length of the chunk.

While this is correct, it is worth mentioning that length and size (in the case of the BSP_data structure, https://wiki.hard-light.net/index.php/BSP_data_structure) do not mean the same thing for chunks that are part of the POF data structure and chunks that are part of BSP_data.
In the POF data structure the length does not include the 8 bytes that hold the chunk type and chunk length, meaning that if the pointer is at the beginning of the chunk and you want to go to the next one, it is p+=length+8. This is important to point out because the BSP_data chunks are the exact opposite: there the chunk_size already takes the 8 bytes into account.
Like this:
(https://i.imgur.com/Q1TiUXm.png)
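
The two conventions can be put side by side in a small sketch. The helper names `read_i32`, `next_pof_chunk` and `next_bsp_chunk` are hypothetical, and the code assumes a little-endian host so that a plain memcpy yields the value stored in the file:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read a 32-bit int from any offset without assuming alignment.
// Assumption: host byte order matches the file (little-endian).
inline std::int32_t read_i32(const unsigned char *p)
{
    std::int32_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// POF chunk: [char id[4]][int length][payload]; length excludes the 8 header bytes.
inline std::size_t next_pof_chunk(const unsigned char *file, std::size_t offset)
{
    std::int32_t length = read_i32(file + offset + 4);
    assert(length >= 0);
    return offset + 8 + static_cast<std::size_t>(length);
}

// BSP_data chunk: [int id][int size][payload]; size already includes the 8 header bytes.
inline std::size_t next_bsp_chunk(const unsigned char *bsp, std::size_t offset)
{
    std::int32_t size = read_i32(bsp + offset + 4);
    assert(size >= 8);
    return offset + static_cast<std::size_t>(size);
}
```

For a chunk that occupies 16 bytes in total, the POF header would store 8 and the BSP header would store 16; both helpers then land on the same next offset.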


Now in "OBJ2" type chunk, in the wiki it says:
Quote
int submodel_number  // What submodel number this is.

#ifdef version2116orhigher
 // FreeSpace 2
 float radius        // radius of this subobject
 int submodel_parent // What submodel is this model's parent. Equal to -1 if none.
 vector offset       // Offset to from parent object <- Added 09/10/98
#else
 // FreeSpace 1
 int submodel_parent // What submodel is this model's parent. Equal to -1 if none.
 vector offset       // Offset to from parent object <- Added 09/10/98
 float radius        // radius of this subobject
#endif

vector geometric_center
vector bounding_box_min_point
vector bounding_box_max_point

string submodel_name
string properites

int movement_type
int movement_axis

int reserved         // must be 0
int bsp_data_size    // number of bytes now following
char[bsp_data_size] bsp_data  // contains actual polygons, etc.

This is wrong: there is an int before each string indicating the length of the string, which makes sense because otherwise I would have no idea how to parse it. It is interesting to note that if the properties string is empty, the length is 4 and there are actually 4 NULs in there. The same goes for the names: if there are 9 chars, the length is 12, with extra NULs (V models).
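
A minimal sketch of reading such a length-prefixed string, assuming the layout described above (a 4-byte int followed by that many bytes, possibly NUL-padded). The names `pof_string_sketch` and `read_pof_string` are made up for illustration, and a little-endian host is assumed:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

struct pof_string_sketch {
    std::string text;      // characters up to the first NUL (padding dropped)
    std::size_t consumed;  // bytes used in the file: 4 + stored length
};

// Hypothetical helper: parse [int stored_len][stored_len bytes] at p.
inline pof_string_sketch read_pof_string(const unsigned char *p)
{
    std::int32_t stored_len;
    std::memcpy(&stored_len, p, sizeof(stored_len)); // safe at any alignment
    assert(stored_len >= 0);

    const char *chars = reinterpret_cast<const char *>(p + 4);
    std::size_t n = 0;
    while (n < static_cast<std::size_t>(stored_len) && chars[n] != '\0') {
        ++n;
    }
    return { std::string(chars, n), 4 + static_cast<std::size_t>(stored_len) };
}
```

With a 9-character name stored with length 12, `text` holds the 9 characters and `consumed` is 16, so whatever follows the string starts on a 4-byte boundary again.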

Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on November 30, 2019, 02:45:19 pm
By all means, update the wiki where it needs updating.  I have created an account for you using your HLP email.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 01, 2019, 04:38:11 pm
Thanks Goober, I will update the wiki once I'm done with this.

Question: does someone know what information is used in modelcollide.cpp, void model_collide_parse_bsp(bsp_collision_tree *tree, void *model_ptr, int version)? I managed to fix the POFs enough to make them load on 3.7.2 Debug without any code modifications, but release still crashes in that function with a bus error, most likely on those vec3d I mentioned earlier.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 01, 2019, 06:34:48 pm
You can browse the code on GitHub:
https://github.com/scp-fs2open/fs2open.github.com/blob/master/code/model/modelcollide.cpp#L733

EDIT: Actually you would have had to have the code to make modifications.  What do you mean, does anyone know what information is used?
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 01, 2019, 07:50:26 pm
Quote
You can browse the code on GitHub:
https://github.com/scp-fs2open/fs2open.github.com/blob/master/code/model/modelcollide.cpp#L733

EDIT: Actually you would have had to have the code to make modifications.  What do you mean, does anyone know what information is used?

I was asking because I don't fully understand how the system works as a whole, so maybe I was missing something. From what I see in that function, the likely suspects are the sortnorm chunk's min and max bounding box points and the boundbox chunk's min and max points. Both could be at highly unaligned positions, due to the possibility of "defpoints" having an incorrect size and string alignment not being enforced in the OBJ2 chunk. Meaning I'm still missing something.

What's strange about all this is that I was spot on in trying to replace the entire BSP_Data with one with aligned defpoints, copied to a new memory location in modelread.cpp. Defpoints looks like the only type of chunk with a non-standard data type in bsp_data; the others are just a bunch of ints and floats, which cannot become unaligned. Strange that it did not work (for all models) when I tried that.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 01, 2019, 10:12:36 pm
¯\_(ツ)_/¯

Honestly, I think you are the person who understands this the most - certainly the person who has studied this to this level of detail in the past 10 years.

All I can say is to break down the problem into manageable tasks, take things one at a time, take baby steps, and try to squash the problems one by one.  Feel free to keep posting your progress in this thread though; if we have any insight to offer, we will certainly chime in.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 01, 2019, 10:18:09 pm
I just made a terrible discovery. As I said, I'm working on a tool to align the .pofs. It is still not ready, as it has some issues here and there, but it is enough to align the data so well that with 3.7.2 Debug I can already see a 3 to 4 FPS gain, because I'm using aligned data and the kernel does not have to intervene so much. But I'm afraid it's not only the models, it's the FSO code as well, probably where FSO stores this specific data. There is one issue when loading these models with unmodified 3.7.2 Debug after I align them with the tool: it crashes with a bus error on a shield impact, at fvi.cpp:326. It is that vec3d data again. BUT, and this is the bad part, if I remove the "SLDC" chunk from the .pofs this does not happen, and SLDC is FSO-only, so the retail models don't have it.
The chunk chain leading up to the SLDC data is aligned. There may be some strings inside other chunks that aren't aligned yet, but the chain offset itself is, and the SLDC chunk is just a bunch of ints and floats; it is not unaligned, and it never will be. It's the code.

Release 3.7.2 still crashes in void model_collide_parse_bsp(bsp_collision_tree *tree, void *model_ptr, int version) (and those *min, *max are the problem). My best guess is that it is the same issue; the debug build must have something that kicks that data into alignment.

So I need to stop working on the tool for a while and look into the FSO code again.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: chief1983 on December 01, 2019, 10:35:14 pm
Well the good news is the performance gain alone may be enough incentive for a mass effort to fix all the old models floating around.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 01, 2019, 10:42:15 pm
That is on ARM; I'm afraid I don't think x86 will see any performance gain. But I will check just to be sure.

At any rate, ARM is becoming more and more powerful and common.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 02, 2019, 10:43:07 am
I was just looking at the FSO code and I found this in modelread, in the SLDC case.

Code: [Select]
void swap_sldc_data(ubyte *buffer)
{
#if BYTE_ORDER == BIG_ENDIAN
    char *type_p = (char *)(buffer);
    int *size_p = (int *)(buffer+1);
    *size_p = INTEL_INT(*size_p);

    // split and polygons
    vec3d *minbox_p = (vec3d*)(buffer+5);
    vec3d *maxbox_p = (vec3d*)(buffer+17);

    minbox_p->xyz.x = INTEL_FLOAT(&minbox_p->xyz.x);
    minbox_p->xyz.y = INTEL_FLOAT(&minbox_p->xyz.y);
    minbox_p->xyz.z = INTEL_FLOAT(&minbox_p->xyz.z);

    maxbox_p->xyz.x = INTEL_FLOAT(&maxbox_p->xyz.x);
    maxbox_p->xyz.y = INTEL_FLOAT(&maxbox_p->xyz.y);
    maxbox_p->xyz.z = INTEL_FLOAT(&maxbox_p->xyz.z);


    // split
    unsigned int *front_offset_p = (unsigned int*)(buffer+29);
    unsigned int *back_offset_p = (unsigned int*)(buffer+33);

    // polygons
    unsigned int *num_polygons_p = (unsigned int*)(buffer+29);

    unsigned int *shld_polys = (unsigned int*)(buffer+33);

    if (*type_p == 0) // SPLIT
    {
        *front_offset_p = INTEL_INT(*front_offset_p);
        *back_offset_p = INTEL_INT(*back_offset_p);
    }
    else
    {
        *num_polygons_p = INTEL_INT(*num_polygons_p);
        for (unsigned int i = 0; i < *num_polygons_p; i++)
        {
            shld_polys[i] = INTEL_INT(shld_polys[i]);
        }
    }
#endif
}

PLEASE don't tell me that there is a 1-byte char at the beginning of the ubyte* making all the pointers unaligned :(
memcpy can definitely help here, but that itself is considered an unaligned access, although one that, with a little luck, the kernel can handle. Maybe it is better to just disable SLDC on ARM with a command line switch.

I'm not sure what INTEL_INT and INTEL_FLOAT do here. They are also used at boundbox, which may be causing issues on the release build. My guess is that they just copy data like memcpy would, but provide the correct size for the arch.
In that case a memcpy with a sizeof(float/int) should work the same way.
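
As a rough illustration of the memcpy route: the offsets used in swap_sldc_data (1, 5, 17, 29, 33) are all one past a multiple of 4, so on a 4-byte-aligned node none of the fields after the type byte is aligned. A sketch that decodes the node header into ordinary variables, where `sldc_node_sketch` and `decode_sldc_header` are hypothetical names and a little-endian host is assumed:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decoded copy of the fixed part of an SLDC node; this struct lives in normal,
// compiler-aligned memory, unlike the raw bytes it is filled from.
struct sldc_node_sketch {
    std::uint8_t type;   // 0 = split, anything else = polygon list (as in swap_sldc_data)
    std::int32_t size;
    float        min[3];
    float        max[3];
};

// Hypothetical helper: offsets 0, 1, 5 and 17 are the ones used in swap_sldc_data.
inline sldc_node_sketch decode_sldc_header(const unsigned char *node)
{
    assert(node != nullptr);
    sldc_node_sketch out;
    out.type = node[0];
    std::memcpy(&out.size, node + 1, sizeof(out.size));
    std::memcpy(out.min, node + 5, sizeof(out.min));
    std::memcpy(out.max, node + 17, sizeof(out.max));
    return out;
}
```

Because the source pointer is a plain byte pointer, the compiler cannot assume any alignment for it and has to emit code that is valid at any address.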
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: taylor on December 02, 2019, 11:28:52 am
ARM is little endian so that code is unused in this case. But yeah, that 1 byte at the start is :sigh:. Even if that swapping code is unused, that 1 byte means the shield collision data is going to be unaligned.

INTEL_INT/INTEL_FLOAT are for byte swapping.
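
For anyone unfamiliar with the idea, byte swapping of a 32-bit value looks roughly like the sketch below. This is not the definition of FSO's macros, just an illustration; `swap32` and `read_le32` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Reverse the byte order of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

// Assemble a little-endian 32-bit value one byte at a time.
// This works on any host byte order and at any alignment.
inline std::uint32_t read_le32(const unsigned char *p)
{
    return  static_cast<std::uint32_t>(p[0])        |
           (static_cast<std::uint32_t>(p[1]) << 8)  |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

The second helper sidesteps both problems in this thread at once, since it never performs a multi-byte load from the buffer.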
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 02, 2019, 11:50:35 am
Then disabling SLDC data loading on archs that are not x86 is the way to go. The workaround I came up with in fvi.cpp, with all those memcpys done at every shield hit, is in no way faster, and fixing SLDC for non-x86 archs would mean having to do it all over again, changing that char to an int, which makes models incompatible.

Maybe there is some other way, but none comes to mind right now. Dealing with it would require an unaligned access at some point, and you can never be sure whether the kernel is going to be able to fix it.

Now if only I could figure out what bit of data is causing the release build to crash at model load in bsp_collide_tree...
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 02, 2019, 04:51:26 pm
Why would the model alignment affect FPS?  Aren't models completely loaded into memory before the mission starts?  So during the mission they should be referenced from their FSO data structure, not from the POF file.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 02, 2019, 05:58:12 pm
OK, now I got the release build into the game in WCS, with shields, models and collision working, but now I get a

Code: [Select]
Thread 1 "fs2_open_3.7.2" received signal SIGSEGV, Segmentation fault.
0x00448d24 in l_Shipclass_Model_f(lua_State*) ()

when a fighter is destroyed :( This never ends; again, the debug build works perfectly. The worst thing is that I'm not even sure where that function is.
There has to be some unaligned data still. Retail game data works perfectly.

Quote
Why would the model alignment affect FPS?  Aren't models completely loaded into memory before the mission starts?  So during the mission they should be referenced from their FSO data structure, not from the POF file.

My best guess is that some data is still being accessed through a reference to BSP_Data. bsp_data is stored in memory as it is for the model and submodels, and I think some code just uses a reference instead of copying that data somewhere else. That is the best I can think of. I have two folders, one with WCS with the default models and another with aligned models. While sitting in the hangar waiting for launch I get 17-18 FPS with unaligned models and 19-21 with aligned models, using the same client, which has been modified to work with unaligned models.

It's either that or a side effect of something else.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Col. Fishguts on December 03, 2019, 05:10:17 am
Quote
Maybe there is some other way, but none comes to mind right now. Dealing with it would require an unaligned access at some point, and you can never be sure whether the kernel is going to be able to fix it.

This thread triggers PTSD flashbacks from when I was trying to get a Doom sourceport working on a Sparc workstation (sparc64 doesn't take a performance hit with unaligned memory access, it just goes lolnope SIGBUS).

Unaligned POF data needs to be fixed externally, but the unaligned structs in FSO code you should be able to fix by telling gcc to enforce alignment during compile time:

https://gcc.gnu.org/onlinedocs/gcc-3.1/gcc/Type-Attributes.html

Sprinkle in "__attribute__ ((aligned (4)))"
(or 8) where needed.

I know other compilers (like the Sun CC) have command line flags to enforce that globally during compilation, but gcc seems to be missing such a feature?
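
A small sketch of what the attribute does, assuming GCC or Clang (the attribute syntax is compiler-specific) and made-up type names. Note that it governs objects the compiler lays out itself; it cannot change where a field sits inside a byte buffer read from a file:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Plain struct: aligned like its widest member (4 bytes for float on common targets).
struct vec3d_default {
    float x, y, z;
};

// Same members, but every object of this type is forced onto an 8-byte boundary.
struct __attribute__((aligned(8))) vec3d_aligned8 {
    float x, y, z;
};

static_assert(alignof(vec3d_aligned8) == 8, "attribute raises the alignment");
static_assert(sizeof(vec3d_aligned8) == 16, "size is padded up to a multiple of 8");

// Hypothetical helper: check whether an address is a multiple of `alignment`.
inline bool is_aligned(const void *p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A pointer such as `buffer + 1` cast to a struct pointer stays misaligned regardless of the attribute, which is why the POF data itself still has to be fixed or copied.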
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 03, 2019, 07:33:28 am
I'm really not worried about SLDC data; it's optional. I can comment out the part where it is loaded and that's it, which is good enough for now.

After looking at the code, what needs to be done about SLDC is to increase the POF version, change those chars to ints, make the code process the data accordingly, and save the new data in a new POF chunk type so it does not break compatibility with older FSO versions. I can also make my alignment tool update the SLDC data to the new format and chunk type.

None of that is hard to do; it just needs time and some testing. But it's just not my priority right now.

Release now runs into a segfault when a non-retail ship is destroyed, which most likely means a remaining issue with BSP_Data. It also runs into another segfault in the VC4 driver when entering the tech room, even though I can get in-game.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 03, 2019, 07:57:16 pm
OK, I have given up on trying to work around the release 3.7.2 build to make it work with unaligned models, a shame because I was really close. If someone wants to use unaligned models on ARM, they need to use the 3.7.2 Debug build without SLDC (by commenting out SLDC loading in modelread, although I'm going to add a new command line option to disable SLDC data loading) and the "hack" memcpy method in modeloctant point_in_octant that I mentioned on page 2. That, plus removing the "forced on" statistics display on the game screen in the debug build (when I find where that is), is the way I'm going to go when I upload the first version of the modified 3.7.2. I feel like I failed here, but it is just not possible to properly work around the unaligned models. It is a miracle that it works on the debug build, for whatever reason.

On that note, I don't think it is possible to fully fix the game either. If I disable the kernel alignment fixup, the game (with retail game data) loads and you can get in-game, but it will crash with a bus error in scripting.cpp when, for example, hitting a ship. So playing with retail game data still yields alignment errors somewhere in the code. And this is 3.7.2; no idea what new surprises could be hiding in 3.8.0+.

My objective right now is to make my alignment tool good enough that models can load with the kernel fixup disabled, like the retail ones do, and to check whether that fixes the crashes on the release build.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 04, 2019, 07:14:45 pm
OK, today I took a major step forward: models are now aligned enough to load in-game like the retail ones do on an unmodified 3.7.2 (other than ignoring SLDC data), without a bus error with the kernel fixup disabled, and with everything working: collision, shields, etc. The unaligned defpoints vertex offset was causing a lot of trouble; that is now fixed.

But once again I ran into the same segfault as before when a ship is destroyed.

Code: [Select]
Thread 1 "fs2_open_3.7.2" received signal SIGSEGV, Segmentation fault.
0x00448d24 in l_Shipclass_Model_f(lua_State*) ()

I still don't know what that is. Maybe a script? Are debug builds disabling scripts or something?

The thing is... I'm running out of things to align in the POFs. BSP_Data is now properly aligned, and so is the OBJ2 chunk. The other chunks are soft-aligned (meaning the size offset is corrected and the chain adjusted for it, but the strings inside are not).
FUEL, DOCK, SPCL, PATH and TXTR all have strings inside that could cause issues, so I need to take a look and fix them, but that's all that remains to be done. And I'm not sure any of that has anything to do with the ship breakup that's causing the segfault. Maybe special points (SPCL)?
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 04, 2019, 09:10:25 pm
This is part of the scripting system.  There have been several fixes to that over the years; it's entirely possible that this is related.  See these PRs for example:
https://github.com/scp-fs2open/fs2open.github.com/pull/2119
https://github.com/scp-fs2open/fs2open.github.com/pull/1960

You may be stuck with coding up an alignment-fixer-loader, unless you want to be restricted to only a subset of mods.  There are literally thousands of POFs that have been released for FreeSpace -- perhaps tens of thousands -- and it would be impractical to fix them all.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: chief1983 on December 04, 2019, 09:36:59 pm
Perhaps, but starting with ones released on Knossos would be a big win and a much smaller subset.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 04, 2019, 10:00:50 pm
My idea was also to integrate the aligner into FSO: verify the .pofs the game can see on startup, update them as necessary and place them somewhere that has higher priority than the .vps. Or align each chunk as it is being loaded, which is easier. But before any of this the tool must be in perfect working condition and not cause a bus error itself; I haven't tried running it directly on ARM yet.

Well, it looks like I'll have to fix every chunk that has a string inside, since they are all problematic. For example, according to the wiki TXTR is:
'TXTR'
int num_textures
for each texture,i
 string tex_filename    // texture filename

But it is really:

'TXTR'
int num_textures
for each texture,i
 int filename_length
 string tex_filename    // texture filename

This means it is really easy to end up at unaligned positions, as you need the length of the current texture name to move the pointer and parse the next one. SPCL is much, much worse, as it has vector data after two strings. So there is no way around it.
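
To make the arithmetic concrete, here is a minimal sketch of rounding each stored string length up to a multiple of 4 and computing the resulting TXTR payload size. `pad4` and `aligned_txtr_payload_size` are hypothetical helpers, and the rounding rule is an assumption rather than a description of what pof_aligner actually does:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of 4 (values already on a boundary are unchanged).
inline std::size_t pad4(std::size_t n)
{
    return (n + 3) & ~static_cast<std::size_t>(3);
}

// Payload size of a TXTR chunk once every string is padded:
// [int num_textures] then, per texture, [int length][padded string bytes].
inline std::size_t aligned_txtr_payload_size(const std::vector<std::size_t> &stored_lengths)
{
    std::size_t total = 4; // int num_textures
    for (std::size_t len : stored_lengths) {
        total += 4 + pad4(len);
    }
    return total;
}
```

With names of 9 and 5 bytes the unpadded payload is 4 + 13 + 9 = 26 bytes, which leaves the next chunk misaligned; padded, it becomes 4 + 16 + 12 = 32.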
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 05, 2019, 09:13:25 pm
OK, TXTR and SPCL are done; only FUEL, DOCK and PATH remain now.

I've also made an interesting discovery about the segfault: it indeed seems to be a bug in FSO 3.7.2 and not related to anything else. The strange thing is that it only happens on the 3.7.2 Linux release config; it doesn't happen on debug or on Windows. Whatever it is, it doesn't happen on 3.7.4, but 3.7.4 cannot be used because it is definitely running on the software 3D renderer. If this is true, it also means I was successful in quickly working around FSO to make it work with unaligned models without actually having to align them, and this is something that could easily be switched on/off with a command line argument.

Another explanation could be that it doesn't happen on 3.7.4 because of the software 3D rendering and/or the low FPS.

Goober, I did check those issues on git, but they are already based on 3.8.0, meaning there are files there that don't exist in 3.7.2. I'm going to compare the 3.7.2 and 3.7.4 scripting system files now.

EDIT: there have been some extensive changes there from 3.7.2 to 3.7.4. There is no way I can just drop in the files, it's not going to work, so finding exactly what was wrong and how to fix it is going to take some trial and error.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 08, 2019, 08:23:22 pm
Alignment tool released here https://www.hard-light.net/forums/index.php?topic=96108.msg1890470#msg1890470

Keep in mind I only did basic testing; there could be bad data being produced by this process.

It should be simple to use the alignment logic in the FS2 code in modelread.cpp, as long as it does not cause a bus error itself... I did my best to exploit the memcpys that I know work in FSO to the max. That's the next move after fixing the script segfaults.

In the end, the models are as aligned as they will ever be, but the script error on the release config is still happening on 3.7.2, and 3.7.4 is fine, just running on the software 3D renderer. I'm comparing the script files from 3.7.4 and 3.7.2, but there are so many changes that I'm not sure where to start.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: chief1983 on December 08, 2019, 08:25:48 pm
I wonder if we could integrate this conversion into knossos/nebula if it ever gets stable enough...
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 08, 2019, 08:52:42 pm
I don't see why not. Once it is fully tested and the VP writing code is working (I've been smashing my head against the wall the whole day), it is possible to either integrate the code or use it as an external tool, picking up the VP with the models and replacing it with the new version. That's why I went for direct VP read/write support.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 09, 2019, 05:45:08 pm
OK, the very SLOW performance on 3.7.4 is caused by shaders; with "Disable GLSL Support" the game is fully playable, though it is a little bit slower than 3.7.2. So yeah, 3.7.4 works fine, with no crashes from the scripting system like on 3.7.2. Good to know that those crashes are unrelated to alignment, since the models are now aligned and I don't see anything wrong with them.
Now I don't know what to do: try to fix the 3.7.2 scripting system, or move on and recommend 3.7.4 with GLSL disabled for the RPI3/4...
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: chief1983 on December 09, 2019, 05:47:35 pm
Just compile it default disabled on ARM maybe.  Just like SLDC.  Solved!
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 09, 2019, 06:05:57 pm
Maybe. I'm not convinced about performance, so I'm going to try to fix 3.7.2.

Just in case, can you add 3.7.4 to git as well? Fixing 3.7.2 could take a really long time, and WCS with 3.7.4 is playable right now (or at least that's what it looks like), for example.

But I'm still not sure; maybe it is better to just use the 3.7.2 debug build after all. I'm going to analyse the performance differences between 3.7.2 and 3.7.4 more closely.

EDIT: I'm going to pick 3.7.2 Debug, don't bother adding 3.7.4 to git.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 11, 2019, 07:23:09 am
OK, I'll be uploading the changes here
https://github.com/Shivansps/fs2open.github.com

Nothing great just yet, just enough workarounds to get Wing Commander Saga working with the debug build.

Next I'll try to get the SLDC system working, and after that implement the POF aligner directly on model load.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 11, 2019, 10:45:54 pm
I have a possible solution for the SLDC system (and alignment overall). I hope this doesn't cause issues, but I think the POF version has to be increased by 1 to 2118. Version 2118 will signal two things: first, the presence of a new SLDC chunk type, so the older one must be ignored (but kept in the POF for compatibility); second, that the model is aligned, so no alignment must be done at model load.
POF version 2117 or lower means that the model must be aligned and the SLDC data converted to the new format at model load.

I could easily make these changes to the POF aligner, and PCS2 needs to be updated anyway.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 12, 2019, 12:41:40 am
Sounds reasonable.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 13, 2019, 10:02:30 pm
OK, SLDC is now working. I called the new chunk "SLC2". It is the same thing but with an int instead of a char.

I've added to FSO the ability to convert SLDC to SLC2 on model load for POF versions 2117 or lower. I've already tested this on ARM and it works fine: 2117 models load, SLDC data is converted to the SLC2 format, and shield collision works fine without a bus error crash.
It's all in the testing branch here
https://github.com/Shivansps/fs2open.github.com/tree/fs2_open_3_7_2_rpi_test
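
The effect of widening the type field can be checked with a little arithmetic. The sketch below derives the field offsets within a node from the width of the type field, using the field order seen in swap_sldc_data (type, size, min, max, then front/back offsets); `sldc_layout_sketch`, `layout_for` and `all_word_aligned` are hypothetical names, and the SLC2 layout is assumed from the description above rather than taken from the branch:

```cpp
#include <cassert>
#include <cstddef>

// Byte offsets of each field, relative to the start of a node.
struct sldc_layout_sketch {
    std::size_t size;   // int size
    std::size_t min;    // vec3d min (12 bytes)
    std::size_t max;    // vec3d max (12 bytes)
    std::size_t front;  // first 4-byte field after the box
    std::size_t back;   // second 4-byte field after the box
};

// type_width is 1 for SLDC (char) and, by assumption, 4 for SLC2 (int).
constexpr sldc_layout_sketch layout_for(std::size_t type_width)
{
    return { type_width, type_width + 4, type_width + 16,
             type_width + 28, type_width + 32 };
}

// True when every field would be 4-byte aligned in a 4-byte-aligned node.
constexpr bool all_word_aligned(const sldc_layout_sketch &l)
{
    return l.size % 4 == 0 && l.min % 4 == 0 && l.max % 4 == 0 &&
           l.front % 4 == 0 && l.back % 4 == 0;
}
```

A 1-byte type gives offsets 1, 5, 17, 29, 33, matching the casts in swap_sldc_data; a 4-byte type gives 4, 8, 20, 32, 36, all multiples of 4.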

This was crucial: since this works, it means I can implement all the other chunk alignment code from the POF aligner the same way, directly in FSO as I wanted to do.

I will upload the new POF aligner version, probably tomorrow, which outputs POF version 2118 with the SLC2 chunk already in. All of that is already done, but I want to get the VP writing code working first.

If all goes well I hope to have FSO aligning the POFs on model load very soon.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 14, 2019, 09:19:16 am
pof_aligner updated, this version outputs pofs with version 2118 and the SLC2 chunk already converted.
https://www.hard-light.net/forums/index.php?topic=96108.0

It also supports VP to VP alignment for simple use. So if someone is planning to integrate it into Knossos as an external tool or directly in code, that's the way to go.

I also tested these 2118 POFs on 3.8.1; there seem to be no issues caused by increasing the version or by the unknown SLC2 chunk.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 14, 2019, 02:17:57 pm
Er, this VP writer isn't modifying VP files in-place, is it?
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 14, 2019, 02:26:47 pm
Quote
Er, this VP writer isn't modifying VP files in-place, is it?
No, it creates a new vp, you need to specify both the input name and the output name.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 15, 2019, 11:13:27 am
It's done. I've added BSP_data alignment to 3.7.2 FSO for POF versions 2117 or lower. I've already merged it into the main branch.
https://github.com/Shivansps/fs2open.github.com

As for the other chunks (FUEL, GLOW, SPCL, TXTR, DOCK, PATH, SLDC, OBJ2, SOBJ), due to the way FSO parses those POF chunks there is no need to align them, and there is no way to do it at model load either without rewriting the entire process. This is because the chunk is not loaded into memory and then used; it is read from the file and copied into memory, just like one would do with memcpys but with file reads.
That's not the case for BSP_DATA or SLDC, which are read and stored in memory as they are and then used in place, and that was the problem.
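
The difference can be illustrated with standard streams standing in for FSO's file layer. In this sketch (`read_int` and `read_len_string` are hypothetical helpers, and a little-endian host is assumed) each value is read into its own properly aligned variable, so the offset it had in the file does not matter:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Read one 32-bit int from the stream into an aligned local variable.
inline std::int32_t read_int(std::istream &in)
{
    std::int32_t v = 0;
    in.read(reinterpret_cast<char *>(&v), sizeof(v));
    return v;
}

// Read [int length][length bytes] into a std::string.
inline std::string read_len_string(std::istream &in)
{
    std::int32_t len = read_int(in);
    assert(len >= 0);
    std::string s(static_cast<std::size_t>(len), '\0');
    if (len > 0) {
        in.read(&s[0], len);
    }
    return s;
}
```

An int that follows a 3-byte string sits at file offset 7, yet reading it this way never performs a misaligned load; the problem only appears when the raw bytes are kept and cast in place.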

These changes allow 3.7.2 to load and use unmodified POF files without any silly workarounds or disabled SLDC, and 2118 files without any extra work done on model load. So unless some weird issue is found with this or with the POF aligner, I'll consider this finished.
The scripting segfault on 3.7.2 release is unrelated to this issue, or at least to models. But since it works on the debug build, that is good enough.

The only thing I don't know is whether you guys are going to accept POF version number 2118 (and what it means) and SLC2. I'm waiting on that to add the information to the wiki and create an issue on the PCS2 git for future support.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Cyborg17 on December 15, 2019, 04:24:46 pm
Would it be possible to add this to trunk for any arm processors that are in computers that have better graphics cards or better open gl support?
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 16, 2019, 12:14:18 pm
I think it is a matter of testing; if there are no issues, it may be good enough to be implemented in main.

I'm curious to test whether having the shield mesh collision data aligned (SLC2) has any effect on FPS. That is the worst part of this, because for every hit the code has to process the unaligned SLDC tree. There may be some small difference if there are enough ships hitting each other. I'll have to add it to 3.8.1 and try.
-A similar issue may happen with collision detection, as unaligned BSP defpoints were breaking that part on ARM, but I believe the data is being copied there-

Also, I've learned that AVX may have the same problems with unaligned access as ARM has, so AVX builds may have a performance penalty on SLDC, but I don't know if AVX is used for that here. At any rate, another reason to test.




Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: Goober5000 on December 16, 2019, 05:06:39 pm
Quote
The only thing I don't know is whether you guys are going to accept POF version number 2118 (and what it means) and SLC2. I'm waiting on that to add the information to the wiki and create an issue on the PCS2 git for future support.

I don't see why not.  It's certainly a valid reason to bump the POF version.  Just make sure that the code can load both < and >= 2118.


And by all means, submit a pull request to the main FSO repository with the align-on-load fix.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: ShivanSpS on December 20, 2019, 12:58:07 pm
What is happening on Git with the PR is unbelievable. I try to run FSO on a Raspberry Pi and I find a 20-year-old issue with models; I try to fix it and I run into what is probably another long-term problem with the modelsinc header.

I have worse luck than those Medusas in the FS2 intro.
Title: Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Post by: chief1983 on December 20, 2019, 01:16:48 pm
I've been following the emails from github on the updates, it's like watching a daytime soap lol