Author Topic: FS2 on arm  (Read 5603 times)

Offline Steel01

  • 25
Since the last ARM discussion is pretty old, I figured I should open a new topic. So, I have an Nvidia Shield Tablet: a pretty fancy Android tablet with full OpenGL 4.5 support. Recently, someone got the Ubuntu distro made for the Jetson dev board booting on the tablet with full video acceleration. Just about the first thing I did was try to get d2x-xl to run, but I didn't get very far (video driver crash while loading a mission). Next attempt was fs2_open. I had to make a slight tweak to configure.ac, but after that it compiled fine. With stock data files, it runs well, as in 60-80 fps in the first real mission. Then I loaded up the 2014 mediavps and things started going downhill. Loading a briefing causes a SIGBUS. Dmesg gives me:
Code: [Select]
[ 1451.767551] Alignment trap: not handling instruction ed937a00 at [<005f3704>]
[ 1451.774786] Unhandled fault: alignment exception (0x221) at 0x025d604f

Gdb gives:
Code: [Select]
Program received signal SIGBUS, Bus error.
0x005f3708 in point_in_octant (pm=0x23fffd0, oct=0x2401c24, vert=0x25d604f) at model/modeloctant.cpp:28
28              if ( vert->xyz.x < oct->min.xyz.x ) return 0;

Some googling indicates a misaligned pointer. The compiler flag -mno-unaligned-access is supposed to fix that, but it had no effect. I'm not entirely convinced the compiler isn't at fault, though. I'm posting to see if anyone here has an idea before I go much further. This is all with trunk as of yesterday. I've also attached a debug log from my last run.

Steel01

Edit: Doing some more poking around. I get the same thing when using the model viewer room too. It's always crashing in modeloctant line 28. Starting to poke at the code, it appears the modeloctant functions do some pointer arithmetic that my ARM Cortex doesn't appreciate, specifically on the vert pointer. It appears to be all aligned on 4's, though. I don't know what the processor expects. Continuing to debug. Side note: all this goes away if I remove the effects and assets vps. Something in those files kicks the program down a different code path that causes the SIGBUS.

Edit 2: Wow, -Wcast-align gives 260 hits (well, grep -c align does on the build log). This could take a bit of work. I'll see if I can't get it running. I'm rather surprised now that it runs at all, even with just stock assets.

[attachment kidnapped by pirates]
« Last Edit: January 12, 2015, 07:20:28 pm by Steel01 »
Snips from Hackers Defend Liberty (Definitions for hacker):
But I like rescuing good words from sad bad fates, and so I cling to Eric Raymond's definition:
    "1. A person who enjoys exploring the details of programmable systems and how to stretch their capabilities, as opposed to most users, who prefer to learn only the minimum necessary...
    "7. One who enjoys the intellectual challenge of creatively overcoming or circumventing limitations.

I also like tinkerer, as defined by Freedom to Tinker:
    "Your freedom to understand, discuss, repair, and modify the technological devices you own."

 

Offline chief1983

  • Still lacks a custom title
  • Moderator
  • 212
  • ⬇️⬆️⬅️⬅️🅰➡️⬇️
    • Minecraft
    • Skype
    • Steam
    • Twitter
    • Fate of the Galaxy
I wonder if this is something AddressSanitizer could help with.
Fate of the Galaxy - Now Hiring!  Apply within | Diaspora | SCP Home | Collada Importer for PCS2
Karajorma's 'How to report bugs' | Mantis
#freespace | #scp-swc | #diaspora | #SCP | #hard-light on EsperNet

"You may not sell or otherwise commercially exploit the source or things you created based on the source." -- Excerpt from FSO license, for reference

Nuclear1:  Jesus Christ zack you're a little too hamyurger for HLP right now...
iamzack:  i dont have hamynerge i just want ptatoc hips D:
redsniper:  Platonic hips?!
iamzack:  lays

 

Offline Steel01

  • 25
ASan doesn't like this code at all. On my ARM box, it dies with what appears to be a null pointer dereference and even the stack trace gets lost. On my x86_64 Fedora 21 machine, it dies in short order with a free on a non-malloced address, but at least that one is traceable.

I've been trying to do some research on memory alignment. Some interesting reads out there. Seems x86 lets programmers get away with murder and just covers for them by doing multiple memory reads and shifting to return what should have been a single-instruction memory read. ARM, among other platforms, throws a SIGBUS on such attempts. So, I'm starting to get an idea of what the problem is. Next problem is to figure out what the affected code is supposed to be doing so I can work out a patch. Memcopying the bad data to new, aligned regions would probably work, but is an ugly, slow hack. Though I might try that in a couple places just to see if that stops the crash.
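For anyone following along, the memcpy idea would look roughly like this. It's only a sketch: Vec3 and load_vec3 are made-up stand-ins, not the engine's actual vec3d or any existing FSO function, and I'm assuming the vertex data is just three packed floats sitting at an arbitrary byte offset.

```cpp
#include <cstring>

// Stand-in for the engine's vector type (assumption: three packed floats).
struct Vec3 {
    float x, y, z;
};

// Copy a Vec3 out of a byte buffer at any address, aligned or not.
// memcpy has no alignment requirement on its arguments, so the float
// loads happen on the properly aligned local copy instead of on `src`.
Vec3 load_vec3(const unsigned char* src) {
    Vec3 out;
    std::memcpy(&out, src, sizeof out);
    return out;
}
```

A caller would then compare against the copy (say `Vec3 v = load_vec3(raw_bytes);`) rather than dereferencing a cast pointer. The cost is an extra 12-byte copy per vertex, which is the "slow" part of the hack.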

For reference, I'm attaching the stderr of an ARM build with -Wcast-align.

Steel01

[attachment kidnapped by pirates]

 

Offline chief1983

  • Still lacks a custom title
  • Moderator
  • 212
  • ⬇️⬆️⬅️⬅️🅰➡️⬇️
Hmm, ASan has worked reasonably well for me in the past, not sure why you'd have such rough luck with it on x86 unless we've introduced a major bug recently.  Could be possible.  As for ARM, yeah no idea how it would behave there.  Interested to hear about your findings though.

 

Offline Steel01

  • 25
It's something to do with audio. Without ASan, it runs fine, so it *shouldn't* be a system problem, but it does trace through openal...

Code: [Select]
=================================================================
==3276==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x000003b9b010 in thread T0
    #0 0x7f92d9c2753f in __interceptor_free (/lib64/libasan.so.1+0x5753f)
    #1 0x360ac2abea in alcCloseDevice (/lib64/libopenal.so.1+0x360ac2abea)
    #2 0xe24f39 in openal_init_device(std::basic_string<char, std::char_traits<char>, SCP_vm_allocator<char> >*, std::basic_string<char, std::char_traits<char>, SCP_vm_allocator<char> >*) sound/openal.cpp:392
    #3 0xe1687f in ds_init() sound/ds.cpp:1061
    #4 0xe29a9b in snd_init() sound/sound.cpp:147
    #5 0x413987 in game_init() freespace2/freespace.cpp:1853
    #6 0x429b26 in game_main(char*) freespace2/freespace.cpp:7096
    #7 0x42a33b in main freespace2/freespace.cpp:7288
    #8 0x35efc1ffdf in __libc_start_main (/lib64/libc.so.6+0x35efc1ffdf)
    #9 0x40cba4 (/usr/local/games/fs2_open/fs2_open_3.7.1_DEBUG+0x40cba4)

AddressSanitizer can not describe address in more detail (wild memory access suspected).
SUMMARY: AddressSanitizer: bad-free ??:0 __interceptor_free
==3276==ABORTING

My spare time comes and goes, so I haven't had time to dig into the code for the alignment stuff yet. Should have some time in the next three or four days, though.

Steel01

Edit:  I suppose the above is a bad argument. But I haven't had problems with openal. 'Course I haven't tried to use ASan on anything before either.
« Last Edit: January 14, 2015, 11:26:55 am by Steel01 »

 

Offline chief1983

  • Still lacks a custom title
  • Moderator
  • 212
  • ⬇️⬆️⬅️⬅️🅰➡️⬇️
Is this in a most recent trunk revision?  RC4 and trunk up until recently had an apparently really bad sound bug in the beam panning code that was causing seg faults on OpenBSD.  Or we could be doing something else similarly silly there causing this issue.  We should be able to get that issue addressed if it's not already.

Edit:  On IRC we discussed that it looks like the error happens with data malloc'd and free'd in the OpenAL code itself.  Are you running the latest OpenAL Soft for Fedora 21?  Seems to be 1.16.0-3.fc21 from my research.
« Last Edit: January 14, 2015, 12:14:40 pm by chief1983 »

 

Offline Steel01

  • 25
Everything I've done in this thread has been from trunk. Easier for me to work out of a version control tree and all. This last trace was from a debug build built from a tree updated earlier today. I haven't played much on my desktop lately (read: a few months), so I can't say right now if I'm affected by that bug. Next time I have a few minutes at that machine (and not just remoting in), I can check.

Steel01

 

Offline chief1983

  • Still lacks a custom title
  • Moderator
  • 212
  • ⬇️⬆️⬅️⬅️🅰➡️⬇️
Dang, you replied before I could save my edit.  Please read the last addendum to my previous post :)

 

Offline Steel01

  • 25
Heh, just that quick. ;)

Yes, that's correct. No fancy hardware to get hardware acceleration here.

Steel01

 

Offline Echelon9

  • 210
That ASan trace looks like the bug is within a system library (/lib64/libopenal.so.1). If so, not something we can directly fix in the FS2Open code, unless we deliberately hack around that feature for you.

 

Offline AdmiralRalwood

  • 211
  • The Cthulhu programmer himself!
    • Skype
    • Steam
    • Twitter
Might be worth trying to compile a newer version of OpenAL Soft?
Ph'nglui mglw'nafh Codethulhu GitHub wgah'nagl fhtagn.

schrödinbug (noun) - a bug that manifests itself in running software after a programmer notices that the code should never have worked in the first place.

When you gaze long into BMPMAN, BMPMAN also gazes into you.

"I am one of the best FREDders on Earth" -General Battuta

<Aesaar> literary criticism is vladimir putin

<MageKing17> "There's probably a reason the code is the way it is" is a very dangerous line of thought. :P
<MageKing17> Because the "reason" often turns out to be "nobody noticed it was wrong".
(the very next day)
<MageKing17> this ****ing code did it to me again
<MageKing17> "That doesn't really make sense to me, but I'll assume it was being done for a reason."
<MageKing17> **** ME
<MageKing17> THE REASON IS PEOPLE ARE STUPID
<MageKing17> ESPECIALLY ME

<MageKing17> God damn, I do not understand how this is breaking.
<MageKing17> Everything points to "this should work fine", and yet it's clearly not working.
<MjnMixael> 2 hours later... "God damn, how did this ever work at all?!"
(...)
<MageKing17> so
<MageKing17> more than two hours
<MageKing17> but once again we have reached the inevitable conclusion
<MageKing17> How did this code ever work in the first place!?

<@The_E> Welcome to OpenGL, where standards compliance is optional, and error reporting inconsistent

<MageKing17> It was all working perfectly until I actually tried it on an actual mission.

<IronWorks> I am useful for FSO stuff again. This is a red-letter day!
* z64555 erases "Thursday" and rewrites it in red ink

<MageKing17> TIL the entire homing code is held up by shoestrings and duct tape, basically.

 

Offline Steel01

  • 25
Well, it doesn't break anything in-game for me, so no big deal. Just makes ASan impossible to use. Maybe I'll try again after I update to Fedora 22 when it releases.

A not-so-update on the real thread topic. I haven't yet taken time to do more research on this. In a couple months, Nvidia is releasing the successor to my tablet as a console. With USB ports, Bluetooth, gig Ethernet, and all, this could be a pretty sweet TV setup for games like fs2. I'm hoping someone can get Linux booting on it as easily as it was on the tablet. Then I'll be looking a lot more seriously at making this run. In between porting cyanogen and multirom and the works...

Steel01

 

Offline rsaxvc

  • 27
    • rsaxvc
What you're running into is an alignment fault. This happens if you try to access memory through a pointer whose address modulo that type's alignment requirement is not zero. IIRC, C/C++ call this undefined behaviour, Intel patched it in hardware forever ago, ARM setups commonly patched it in the fault handler (but yours doesn't), and newer ARMs can do it in hardware.

For portability and performance we should fix it wherever it is. Ran into the same thing on Solaris/SPARC.

That compiler flag tells the compiler not to generate any unaligned accesses as optimizations, but devs can goof up with casting still.
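To make the modulo rule concrete, here's a small sketch that applies it to the three pointer values from the gdb trace in the first post. is_aligned is a hypothetical helper, and I'm assuming the float members being compared need 4-byte alignment, which is typical but not something I've checked against that exact toolchain.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `addr` is a multiple of `alignment`.
constexpr bool is_aligned(std::uintptr_t addr, std::size_t alignment) {
    return addr % alignment == 0;
}

// Pointer values copied from the gdb trace (pm, oct, vert).
constexpr std::uintptr_t kPm   = 0x23fffd0;
constexpr std::uintptr_t kOct  = 0x2401c24;
constexpr std::uintptr_t kVert = 0x25d604f;

// Assumption: float requires 4-byte alignment on this target.
constexpr std::size_t kFloatAlign = 4;

static_assert(is_aligned(kPm, kFloatAlign), "pm sits on a 4-byte boundary");
static_assert(is_aligned(kOct, kFloatAlign), "oct sits on a 4-byte boundary");
static_assert(!is_aligned(kVert, kFloatAlign), "vert does not");
static_assert(kVert % kFloatAlign == 3, "vert is 3 bytes past a boundary");
```

Going only by that trace, vert is the odd one out. That would fit a pointer produced by casting from a byte offset into loaded model data, which is the kind of goof the flag can't undo, though I haven't confirmed that against the modeloctant code.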