Author Topic: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)  (Read 1745 times)


Offline ShivanSpS

  • 210
[3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Hi,
I'm doing a little detective work trying to figure this out. The issue is this: because it only supports OpenGL 2.1 and has no S3TC, a Raspberry Pi can only run 3.7.2 with uncompressed textures.

But in some projects, like Diaspora and Wing Commander Saga, attempts to load most models in the tech room end in a crash without further log info. In fact, I only get a "bus error" if I run from a terminal. This is not due to textures, as the game still crashes after removing the textures.

In Diaspora, for example, the Vipers load, but the Raptor crashes. In Wing Commander Saga all fighters crash, but two or three of the cruisers do load. In contrast, all retail FS2 models load.

As I understand it, there were huge changes in 3.8.0 and mods adapted to it. Is there something in the .pof files that could cause this?
« Last Edit: November 21, 2019, 07:14:47 pm by ShivanSpS »

 

Offline mjn.mixael

  • Cutscene Master
  • 212
  • Anims: 420, Cutscenes: 10, Mainhalls: 7, Logos: 52
    • Steam
    • Twitter
    • Mix-Hai Productions
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
A list of changes can be found at the bottom of this page.

But I have a suspicion.. what build are you running and/or what date of build are you building from?
Cutscene Upgrade Project - Mainhall Remakes - MixaelANITools - Between the Ashes - MjnMixael's Render Boutique - Mix-Hai Productions
Youtube Channel - P3D Model Box - Photobucket Albums - Model Releases - Downloads
Between the Ashes is looking for committed testers, PM me for details.
Report MediaVP issues, now on the MediaVP Mantis! Read all about it Here!

 

Offline ShivanSpS

  • 210
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Just a little correction: I never tried to run the MediaVPs. I really don't think the RPI can handle them, although I'm going to try 3.7.2, probably this weekend.

This is the build I'm running:
https://www.hard-light.net/forums/index.php?topic=89597.0

Compiled directly on an RPI4 from the source code export provided there. It's kind of a mess because autogen does not detect the ARM CPU and generates the makefiles with x86 flags, so I had to change that manually.

As I said, the RPI4 is limited to that build, at least until the day both FSO and the RPI4 support Vulkan, so it will be a while. There is SOMETHING in the .pofs of Diaspora and WCS (and I guess most other mods) that crashes that 3.7.2 build with a "bus error", and I know those mods were updated to use FSO 3.8.0.

Other than that I'm lost, since the retail game can be finished on the RPI.

 

Offline mjn.mixael

  • Cutscene Master
  • 212
  • Anims: 420, Cutscenes: 10, Mainhalls: 7, Logos: 52
    • Steam
    • Twitter
    • Mix-Hai Productions
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
I take it this is a Source Code Project question then, and it's unrelated to the MediaVPs.

 

Offline ShivanSpS

  • 210
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
I don't think it is a code question. I already know that a lot of changes have been made from FSO 3.8.0 onwards, especially in the graphics department.

If FSO 3.8.0 allowed new stuff to be added to the .pofs that could cause a crash on 3.7.2, the people who should know about it are the ones who work with the models themselves. And the people doing the MediaVPs are the ones who have done the most work on models ever.

There is something in those .pofs that causes the game to crash. If nothing special was added to the pofs between 3.7.2 and 3.8.0 to take advantage of some new feature, then it could be something else. And I have to admit that getting a "bus error" like this is strange.
« Last Edit: November 20, 2019, 07:14:02 pm by ShivanSpS »

 

Offline taylor

  • Super SCP/Linux Guru
  • Moderator
  • 212
    • http://www.icculus.org/~taylor
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Sounds like a memory alignment issue to me, which arm is more finicky about than x86. Either an unaligned data read or accessing a struct that isn't aligned/padded properly. Try running it in gdb and see if that helps to locate the problem area.
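
To illustrate the kind of access described above, here is a minimal sketch. It is not taken from the FSO source, and `is_aligned`, `read_unaligned` and `roundtrip_at_offset` are hypothetical helper names. Casting a byte pointer to `float *` and dereferencing it is only safe when the address is a multiple of the type's alignment, whereas copying the bytes out with `std::memcpy` works at any address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when `p` is suitably aligned for type T.
template <typename T>
bool is_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: read a T from any address, aligned or not.
// memcpy has no alignment requirement, so this avoids the fault that
// a direct `*(const T*)p` can raise on a strict-alignment CPU.
template <typename T>
T read_unaligned(const void* p) {
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Store a float at byte offset `offset` of `buf`, then read it back
// through the alignment-agnostic helper.
float roundtrip_at_offset(unsigned char* buf, std::size_t offset, float f) {
    assert(buf != nullptr);
    std::memcpy(buf + offset, &f, sizeof f);
    return read_unaligned<float>(buf + offset);
}
```

With a 4-byte-aligned buffer `buf`, `is_aligned<float>(buf + 2)` is false, yet `roundtrip_at_offset(buf, 2, 1.5f)` still returns the stored value because no `float *` is ever dereferenced.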

 

Offline The E

  • He's Ebeneezer Goode
  • Global Moderator
  • 213
  • Nothing personal, just tech support.
    • Steam
    • Twitter
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
There have been no changes at all in the pof file format or how it is interpreted between those versions as far as I can recall.
Let there be light
Let there be moon
Let there be stars and let there be you
Let there be monsters and let there be pain
Let us begin to feel again
--Devin Townsend, Genesis

 
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Don't know if it helps, but I noticed that you can't open pof files that have been saved with PCS 2.1 in 2.0.3 for some reason.

 

Offline ShivanSpS

  • 210
Re: Question, what has been changed to models from 3.7.2 to 3.8.0?
Quote
Sounds like a memory alignment issue to me, which arm is more finicky about than x86. Either an unaligned data read or accessing a struct that isn't aligned/padded properly. Try running it in gdb and see if that helps to locate the problem area.

You're right. One would think this should be a non-issue today, but nope...

Code: [Select]
Thread 1 "fs2_open_3.7.2_" received signal SIGBUS, Bus error.
point_in_octant (pm=0x24a1770, oct=0x24a33c4, vert=0xac768c16)
    at model/modeloctant.cpp:28
28 if ( vert->xyz.x < oct->min.xyz.x ) return 0;

Time to dust off my C skills to see if I can figure this one out. .NET makes you soft.

Quick test: just commenting out all the ifs in that function and making it always return 0 already fixes the issue. The models load, and the tech room and the mission load as well. I'm sure I'm disabling something by doing this, so I'll try to make a proper fix. Good thing the issue seems to be isolated to that function and those pointers.

Update: the SIGBUS happens on these two calls in that file:
Code: [Select]
if ( point_in_octant( pm, oct, vp(p+20) ) )

at lines 174 and 235. The problem seems to be "vp(p+20)".

I spoke too soon, there are more. Shield impact:
Code: [Select]
Thread 1 "fs2_open_3.7.2_" received signal SIGBUS, Bus error.
fvi_ray_boundingbox (min=0x345db0d, max=0x345db19, p0=0xabc4b4 <Mc_p0>,
    pdir=0xabc4d0 <Mc_direction>, hitpt=0xbeffeb0c) at math/fvi.cpp:335
335 if (p0->a1d[i] < min->a1d[i]) {

I fail to understand why none of this happens with the retail game files...
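
A quick check of the pointer values in the two gdb traces above, as a sketch. It assumes the vectors hold 4-byte floats and therefore want 4-byte alignment; `misalignment` and the constant names are made up for this example.

```cpp
#include <cstdint>

// Hypothetical helper: how many bytes an address sits past the
// previous 4-byte boundary. 0 means the address is 4-byte aligned.
constexpr unsigned misalignment(std::uint64_t address) {
    return static_cast<unsigned>(address % 4);
}

// Addresses copied from the gdb output above.
constexpr std::uint64_t kVert = 0xac768c16;  // point_in_octant: vert
constexpr std::uint64_t kOct  = 0x24a33c4;   // point_in_octant: oct
constexpr std::uint64_t kMin  = 0x345db0d;   // fvi_ray_boundingbox: min
constexpr std::uint64_t kMax  = 0x345db19;   // fvi_ray_boundingbox: max
constexpr std::uint64_t kP0   = 0xabc4b4;    // fvi_ray_boundingbox: p0

static_assert(misalignment(kVert) == 2, "vert is 2 bytes past a boundary");
static_assert(misalignment(kOct)  == 0, "oct is aligned");
static_assert(misalignment(kMin)  == 1, "min is 1 byte past a boundary");
static_assert(misalignment(kMax)  == 1, "max is 1 byte past a boundary");
static_assert(misalignment(kP0)   == 0, "p0 is aligned");
```

Under that assumption, `vert`, `min` and `max` are the misaligned pointers, and the faulting lines dereference `vert` (line 28) and `min` (line 335).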
« Last Edit: November 21, 2019, 08:20:56 pm by ShivanSpS »

 

Offline Goober5000

  • HLP Loremaster
  • Administrator
  • 214
    • Goober5000 Productions
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
As far as the point_in_octant error, I found a potential lead.  The call stack leads up to this bit of code:

Code: [Select]
case OP_SORTNORM: {
    int frontlist = w(p+36);
    int backlist = w(p+40);
    int prelist = w(p+44);
    int postlist = w(p+48);
    int onlist = w(p+52);

    if (prelist) model_octant_find_faces_sub(pm,oct,p+prelist,just_count);
    if (backlist) model_octant_find_faces_sub(pm,oct,p+backlist,just_count);
    if (onlist) model_octant_find_faces_sub(pm,oct,p+onlist,just_count);
    if (frontlist) model_octant_find_faces_sub(pm,oct,p+frontlist,just_count);
    if (postlist) model_octant_find_faces_sub(pm,oct,p+postlist,just_count);

See the documentation on bsp_tree here, particularly on sortnorms:
https://wiki.hard-light.net/index.php/BSP_data_structure

Note that these five numbers, which are read from the model file itself, specify byte offsets.  In order to be properly aligned, these offsets must be multiples of 4.  But there is no enforcement of this requirement in the code, and obviously no enforcement in whichever program exported the model.  @ShivanSpS, try adding some logging statements to catch non-compliant offsets.

I suspect a similar problem is happening with the shield mesh.
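
The logging suggested above might look like the following sketch. It assumes the sortnorm layout shown in the snippet (five 32-bit offsets at bytes 36 to 52 of the chunk). `read_i32` and `count_unaligned_sortnorm_offsets` are hypothetical names, and `std::printf` stands in for the engine's `mprintf`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Read a 32-bit int from any address without requiring alignment.
static std::int32_t read_i32(const unsigned char* p) {
    std::int32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Inspect the five child offsets of one OP_SORTNORM chunk starting at `p`
// and report every non-zero offset that is not a multiple of 4.
// Returns the number of offending offsets.
int count_unaligned_sortnorm_offsets(const unsigned char* p) {
    assert(p != nullptr);
    static const struct { const char* name; int pos; } fields[] = {
        {"frontlist", 36}, {"backlist", 40}, {"prelist", 44},
        {"postlist", 48},  {"onlist", 52},
    };
    int bad = 0;
    for (const auto& f : fields) {
        const std::int32_t offset = read_i32(p + f.pos);
        if (offset != 0 && offset % 4 != 0) {
            std::printf("Warning: sortnorm %s offset %d is not a multiple of 4\n",
                        f.name, static_cast<int>(offset));
            ++bad;
        }
    }
    return bad;
}
```

The offsets are read with `memcpy` on purpose, so the check itself cannot fault when it runs on a chunk that is already misaligned.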

 

Offline ShivanSpS

  • 210
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Hmm, if that's the case this is a far more extensive problem than I believed. I'll keep looking into it this weekend. Also, I think the reason this happens is that the kernel is set to fix up unaligned accesses, and it can do so with the retail files but for some reason not with the mods. Maybe I should try disabling that just to see what happens; ARMv8-A should be able to deal with unaligned access.

 

Offline Goober5000

  • HLP Loremaster
  • Administrator
  • 214
    • Goober5000 Productions
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
I think the simple reason is that the retail model files all have offsets in multiples of 4, and some of the newer models do not have offsets in multiples of 4.  The logging statements would confirm this.

 

Offline ShivanSpS

  • 210
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
So I did this ugly thing:
Code: [Select]
int model_octant_find_faces_sub(polymodel * pm, model_octant * oct, void *model_ptr, int just_count )
{
    ubyte *p = (ubyte *)model_ptr;
    int chunk_type, chunk_size;

    chunk_type = w(p);
    chunk_size = w(p+4);

    mprintf(( "First chunk_type %d\n", chunk_type ));
    mprintf(( "First chunk_size %d\n", chunk_size ));
    while (chunk_type != OP_EOF) {

        switch (chunk_type) {
        case OP_DEFPOINTS:
            moff_defpoints(p, just_count);
            break;
        case OP_FLATPOLY: moff_flatpoly(p, pm, oct, just_count ); break;
        case OP_TMAPPOLY: moff_tmappoly(p, pm, oct, just_count ); break;
        case OP_SORTNORM: {
                int frontlist = w(p+36);
                int backlist = w(p+40);
                int prelist = w(p+44);
                int postlist = w(p+48);
                int onlist = w(p+52);

                if (prelist) model_octant_find_faces_sub(pm,oct,p+prelist,just_count);
                if (backlist) model_octant_find_faces_sub(pm,oct,p+backlist,just_count);
                if (onlist) model_octant_find_faces_sub(pm,oct,p+onlist,just_count);
                if (frontlist) model_octant_find_faces_sub(pm,oct,p+frontlist,just_count);
                if (postlist) model_octant_find_faces_sub(pm,oct,p+postlist,just_count);
            }
            break;
        case OP_BOUNDBOX: break;
        default:
            mprintf(( "Bad chunk type %d, len=%d in model_octant_find_faces_sub\n", chunk_type, chunk_size ));
            Int3(); // Bad chunk type!
            return 0;
        }
        p += chunk_size;
        chunk_type = w(p);
        chunk_size = w(p+4);
        mprintf(( "End chunk_type %d\n", chunk_type ));
        mprintf(( "End chunk_size %d\n", chunk_size ));
    }
    return 1;
}

void model_octant_find_faces( polymodel * pm, model_octant * oct )
{
    ubyte *p;
    int submodel_num = pm->detail[0];

    p = pm->submodel[submodel_num].bsp_data;
    mprintf(( "chunk_type OP_DEFPOINTS %d\n", OP_DEFPOINTS ));
    mprintf(( "chunk_type OP_FLATPOLY %d\n", OP_FLATPOLY ));
    mprintf(( "chunk_type OP_TMAPPOLY %d\n", OP_TMAPPOLY ));
    mprintf(( "chunk_type OP_SORTNORM %d\n", OP_SORTNORM ));
    mprintf(( "chunk_type OP_BOUNDBOX %d\n", OP_BOUNDBOX ));

    oct->nverts = 0;
    model_octant_find_faces_sub(pm, oct, p, 1 );
    mprintf(( "%s\n", "pass" ));
    if ( oct->nverts < 1 ) {
        oct->nverts = 0;
        oct->verts = NULL;
        return;
    }

    oct->verts = (vec3d **)vm_malloc( sizeof(vec3d *) * oct->nverts );
    Assert(oct->verts!=NULL);

    oct->nverts = 0;
    model_octant_find_faces_sub(pm, oct, p, 0 );

    // mprintf(( "Octant has %d faces\n", oct->nfaces ));
}

Results in this
Code: [Select]
chunk_type OP_DEFPOINTS 1
chunk_type OP_FLATPOLY 2
chunk_type OP_TMAPPOLY 3
chunk_type OP_SORTNORM 4
chunk_type OP_BOUNDBOX 5
First chunk_type 1
First chunk_size 74590
End chunk_type 4
End chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 4
First chunk_size 80
First chunk_type 0
First chunk_size 0
First chunk_type 5
First chunk_size 32
End chunk_type 3
End chunk_size 80
<crash here>

74590 is not a multiple of 4, so that p += chunk_size; makes the pointer land on unaligned memory. So I went on to check that first chunk size on retail models:
Code: [Select]
First chunk_size 16072 | 4018.000000
First chunk_size 15268 | 3817.000000
First chunk_size 9716 | 2429.000000

No point in writing them all out; they are all multiples of 4.
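
The arithmetic behind that observation, as a small sketch using the first chunk sizes from the logs above. `padding_to_4` is a made-up helper, not something from the FSO code.

```cpp
#include <cstdint>

// Bytes that would have to be appended to `size` to reach the next
// multiple of 4. 0 means the size is already a multiple of 4.
constexpr std::uint32_t padding_to_4(std::uint32_t size) {
    return (4 - size % 4) % 4;
}

// Mod model: 74590 = 4 * 18647 + 2, so the next chunk starts 2 bytes
// past a 4-byte boundary.
static_assert(74590 % 4 == 2, "mod model chunk size leaves a remainder of 2");
static_assert(padding_to_4(74590) == 2, "two padding bytes would realign it");

// Retail models: every logged size divides evenly by 4.
static_assert(padding_to_4(16072) == 0, "16072 = 4 * 4018");
static_assert(padding_to_4(15268) == 0, "15268 = 4 * 3817");
static_assert(padding_to_4(9716)  == 0, "9716 = 4 * 2429");
```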

So Goober, you are right.
Now, does this happen while reading the .pof into memory or when reading the data in memory? Because if it happens while reading the file... well, I'm not sure it can be fixed, although I don't know the code well enough to think of a workaround. Maybe memcpy the entire thing and work from there. Where exactly is this done in the code?
« Last Edit: November 23, 2019, 09:56:17 am by ShivanSpS »

 

Offline Goober5000

  • HLP Loremaster
  • Administrator
  • 214
    • Goober5000 Productions
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
The chunk size is specified as a number within the POF file itself.  So the first thing to do is to find out what that chunk represents and why it was given that particular size.  It is very possible that whatever program was used to create the POF file wrote incorrect data.

The best solution is to re-export the POF with the correct settings.  If that is not possible, then a fallback option is to memcpy the bytes, manually align them, and then read the data from the copy.  But that would be much more complicated.
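
A minimal sketch of the fallback idea, under stated assumptions: `aligned_copy` is a hypothetical helper, and the chunk is treated as a plain run of bytes whose first two 32-bit fields are the chunk type and chunk size, as in the code quoted earlier in the thread. The bytes are copied into word-aligned storage and read from the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `size` bytes starting at `src` (any alignment) into storage made
// of 32-bit words, which is aligned for 4-byte reads. Unused bytes in
// the last word are left as zero.
std::vector<std::uint32_t> aligned_copy(const unsigned char* src, std::size_t size) {
    assert(src != nullptr || size == 0);
    std::vector<std::uint32_t> words((size + 3) / 4, 0u);
    if (size != 0) {
        std::memcpy(words.data(), src, size);
    }
    return words;
}
```

After `auto copy = aligned_copy(p, chunk_size);`, `copy[0]` and `copy[1]` hold the chunk type and size regardless of where `p` pointed. This only realigns the start of the copied bytes. Offsets stored inside the data, such as the sortnorm lists, would still refer to the original layout, which is part of what makes the full fix more complicated.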

 

Offline ShivanSpS

  • 210
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
I would like to come up with a fix that works in general rather than editing every pof for every mod out there.

There are two things that may work:

1) Use the "__packed" attribute, which should allow unaligned access to data. I did this with the pointer, but my attempts so far have been unsuccessful. I have never done this before, so I need to read about it a bit more; I may not be doing it right, or not on the right data.

2) memcpy and fix does not sound that hard, but I need to know exactly what is wrong. It's the BSP_Data that is not correctly aligned here, right?

This data?
Code: [Select]
p = pm->submodel[submodel_num].bsp_data;
If so, I think I can come up with code that memcpys it and, if the chunk size is wrong, fixes it.

Anyway, in both cases it will take a while, as I need to read and understand how the bsp_data works. For example, I do not understand why model_octant_find_faces_sub is recursive.
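
For option 1, a sketch of what a packed type can look like, assuming GCC or Clang, where the spelling is `__attribute__((packed))` rather than `__packed`. `packed_vec3` and `read_x` are hypothetical names, and whether this approach is practical for the engine's own `vec3d` is not something established here.

```cpp
#include <cassert>
#include <cstring>

// Assumption: GCC or Clang. The packed attribute gives the struct an
// alignment of 1, so the compiler generates member accesses that are
// safe at any address.
struct __attribute__((packed)) packed_vec3 {
    float x, y, z;
};

static_assert(alignof(packed_vec3) == 1, "packed struct has alignment 1");
static_assert(sizeof(packed_vec3) == 12, "three 4-byte floats, no padding");

// Read the x component of a vertex stored at an arbitrary byte address.
float read_x(const unsigned char* p) {
    assert(p != nullptr);
    return reinterpret_cast<const packed_vec3*>(p)->x;
}
```

The attribute has to be on the type that is actually dereferenced. Applying it elsewhere while the code still reads through a pointer to an ordinary, unpacked struct leaves the alignment assumption in place.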
« Last Edit: November 23, 2019, 02:58:46 pm by ShivanSpS »

 

Offline Goober5000

  • HLP Loremaster
  • Administrator
  • 214
    • Goober5000 Productions
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
More specifically it is the chunks in the BSP data tree that are not aligned.  All of the chunks are concatenated together, but if some of them are not multiples of 4, they will not be on the proper boundaries.
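
A sketch of how one odd-sized chunk shifts everything after it. It assumes the buffer itself starts on a 4-byte boundary and that chunks are laid out back to back; `first_misaligned_chunk` is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given chunk sizes in storage order, return the index of the first
// chunk whose start offset is not a multiple of 4, or -1 if every
// chunk starts on a 4-byte boundary.
int first_misaligned_chunk(const std::vector<std::size_t>& sizes) {
    std::size_t start = 0;  // offset of the current chunk from the buffer start
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        if (start % 4 != 0) {
            return static_cast<int>(i);
        }
        start += sizes[i];
    }
    return -1;
}
```

With sizes `{74590, 80, 32}` the function returns 1: the first chunk starts at offset 0, but the second starts at 74590, which is 2 bytes past a boundary, and every later chunk inherits that shift. With `{16072, 80, 32}` it returns -1.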

I will reiterate what I said before:
Quote
So the first thing to do is to find out what that chunk represents and why it was given that particular size.  It is very possible that whatever program was used to create the POF file wrote incorrect data.

We don't know why the chunk did not have a proper size.  It is possible that there is a bug in the POF program, which means that it is important to understand the bug before attempting a fix.  There may be a deeper problem hidden beneath the superficial problem.

 

Offline ShivanSpS

  • 210
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
Well yes, the V models are all aligned; that can't be a coincidence. Actually it makes sense, since back in the 90s unaligned access would have had performance costs; today that's a non-issue. Except for these ARM processors that for some reason can't deal with it, even though they should.

OK, taking a look at the pof programs first: that would be PCS2 and PCS1, right?

And if someone else also wants to check for unaligned models, adding this
Code: [Select]
if((chunk_size % 4) != 0)
{
mprintf(( "Warning: Unaligned memory access: Chunk Type %d, chunk_size %d in model_octant_find_faces_sub\n", chunk_type, chunk_size ));
}

before
Code: [Select]
p += chunk_size;
chunk_type = w(p);
chunk_size = w(p+4);
in model_octant_find_faces_sub in modeloctant.cpp is an easy way to detect it.
« Last Edit: November 23, 2019, 03:44:06 pm by ShivanSpS »

 

Offline Goober5000

  • HLP Loremaster
  • Administrator
  • 214
    • Goober5000 Productions
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
That looks good, but you should also check to see if the frontlist, etc. variables are divisible by 4.  That could cause the same issue.

PCS2 is the main program used for editing POF data.  However I don't think it's the program that's used for converting the original 3D model file to POF in the first place.  I recommend asking a modeler about that.

 

Offline ShivanSpS

  • 210
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
The first was ModelView32, I think.

I checked ModelView32; there is no reference to anything, and I can't find the source code, so I can't look into it. Opening a pof, going to edit and saving it does not fix it.

PCS2 is the same. I took a look at the source code: https://github.com/scp-fs2open/PCS2/blob/master/src/BSPDataStructs.cpp
int BSP_DefPoints::Write(char *buffer)
There is nothing there enforcing alignment.

In the end I think it is a 20-year-old oversight: the V tools produced properly aligned models, ModelView and PCS do not, and no one realised because it just works on x86.

I think PCS2 can be fixed, so I'm going to open an issue on git. Then I'll start thinking about a workaround in the fs2 code for these cases. That doesn't mean I can do it, but I'll try.

WAIT, hang on, I found this in the ModelView32 readme.
Quote
Note:
~~~~~
- If you save a POF in the editor that you just extracted from the original
  FreeSpace 1/2 .VP files, the file will be a bit smaller than the original
  file, even though nothing changed. This is not a bug, there is no
  information lost. However the original Volition editors wasted some bytes
  containing nothing but null or CRLF bytes at the end of strings, which are
  however totally useless. MODELVIEW in contrast cuts strings to their actual
  content.
Those extra bytes that ModelView is cutting were the alignment bytes; I have no other explanation. This carried over to PCS2.
« Last Edit: November 23, 2019, 08:23:54 pm by ShivanSpS »

 

Offline Goober5000

  • HLP Loremaster
  • Administrator
  • 214
    • Goober5000 Productions
Re: [3.7.2-ARM] Unaligned Memory Access at modeloctant.cpp:28 (title edited)
OOF

Yet another example of the idiotic human fallacy, "I don't understand it, therefore it must be wrong."