My little amdgpu bughunt

themaddin:
Where to begin. I'm a little surprised no one else seems to have run across this issue.
I'm playing on an up-to-date Arch Linux install with an Intel CPU and an AMD RX560 graphics card,
using the stock Mesa drivers, with no AMDGPU-PRO or other fancy stuff.

On newer nightly builds and RC19, my system would regularly freeze a few seconds into any mission. With SMAA enabled it froze almost instantly.
The FSO logs didn't show anything wrong, sound still worked, and I could pause/unpause, but I couldn't even
switch to a tty to see what was wrong - my display was just completely frozen!
A hard reboot later I looked into journalctl and found a mass of red lines about a gpu fault 146 or 147,
pretty similar to these bug reports: https://bugs.freedesktop.org/show_bug.cgi?id=107152

Now by running builds one after another from working to not working,
I've found everything started with 3.8.1-20181107 and this commit:
https://github.com/scp-fs2open/fs2open.github.com/commit/bb6c00a5164b89d3d51275992b25ea5313669282#diff-3a82fccc233b03c40a01981f092bc2a0
I cloned the current master branch, commented out the two lines, and built it.
With that build I was able to play through BtA: Operation Templar Mission 2 with SMAA enabled (FXAA disabled), which had previously run for 10 seconds tops.
So I'm assuming this "fixed" it for me.
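
For anyone wanting to repeat the "run builds one after another" step with fewer test runs, here is a minimal sketch of a binary search over an ordered list of builds. It assumes the oldest build in the list is known good, the newest is known bad, and the breakage never goes away once it appears. Apart from 3.8.1-20181107, the build names are made up for illustration, and the `is_bad` check stands in for actually launching a mission and seeing whether the display freezes.

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds are sorted oldest to newest, builds[0] is known good,
    builds[-1] is known bad, and every build after the first bad one is
    also bad (a single regression point).
    """
    if len(builds) < 2:
        raise ValueError("need at least one good and one bad build")
    good, bad = 0, len(builds) - 1  # builds[good] is good, builds[bad] is bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return builds[bad]


# Hypothetical nightly names; only 3.8.1-20181107 is taken from this thread.
nightlies = [
    "3.8.1-20181101",
    "3.8.1-20181104",
    "3.8.1-20181107",
    "3.8.1-20181110",
    "3.8.1-20181113",
]

# Stand-in for the manual test "does this build freeze in a mission?".
freezes = lambda build: build >= "3.8.1-20181107"

print(first_bad_build(nightlies, freezes))
```

With N builds this needs roughly log2(N) test runs instead of N, which matters when every failed run ends in a hard reboot.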

Now to my questions: The commit message is pretty cryptic ("Update gropengltnl.cpp"). What was this change meant to accomplish?
Has it caused problems anywhere else? What intended improvements am I forgoing? Is there another known workaround?

I mean, I'm happy everything seems to be working now, and I know AMDGPU isn't all that stable,
but I'd like to know WHY it works now, and understanding OpenGL is a little beyond a small-time hobbyist coder.

Thanks for any help!

niffiwan:
good detective work!

The pull request that included that commit has more info; basically, it was meant to remove errors when creating a shadow framebuffer. It seems it was tested on Nvidia and Intel but not AMD!  :nervous: When you removed the lines, did you see the other errors listed in the pull request? If not, then maybe some sort of conditional statement is needed for AMD vs. Nvidia/Intel.
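
To make the "conditional statement" idea concrete, here is a rough sketch of the decision only, not of anything in the FSO code. In the engine this would be C++ working on the strings returned by `glGetString(GL_VENDOR)`, `glGetString(GL_RENDERER)` and `glGetString(GL_VERSION)`; Python is used here just to show the logic. The function name, the substring heuristic and the sample strings are all assumptions for illustration, and real drivers may report different text.

```python
def is_amd_on_mesa(vendor: str, renderer: str, version: str) -> bool:
    """Heuristic guess at 'AMD GPU running on the open-source Mesa driver'.

    The arguments are meant to be the GL_VENDOR, GL_RENDERER and GL_VERSION
    strings. Matching on substrings is an assumption, not a documented API.
    """
    text = f"{vendor} {renderer}".lower()
    looks_amd = "amd" in text or "radeon" in text
    looks_mesa = "mesa" in version.lower()
    return looks_amd and looks_mesa


# Illustrative strings only; check what your own driver actually reports.
samples = [
    ("X.Org", "Radeon RX 560 Series", "4.5 (Core Profile) Mesa 19.2.4"),
    ("NVIDIA Corporation", "GeForce GTX 1060", "4.6.0 NVIDIA 440.31"),
    ("Intel Open Source Technology Center", "Mesa DRI Intel(R) HD Graphics 630",
     "4.5 (Core Profile) Mesa 19.2.4"),
]

for vendor, renderer, version in samples:
    print(renderer, "->", is_amd_on_mesa(vendor, renderer, version))
```

A check like this would only be a stopgap: if the underlying problem is a driver bug, keying on vendor strings hides it rather than fixes it.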

themaddin:
You mean in the fs debug log, right?
Haven't created one since everything worked, but I'll do so soon.

themaddin:
https://fsnebula.org/log/5dd10b02cb0d3322ec684742
Here it is. Error present on line 144.
I'm getting a bad feeling about this...
Trying out something.

EDIT:
OK, that's what I was afraid of.
RC 19 works if I disable shadows.

From what I was able to gather from the graphics card board, RX560 should have more than enough power
to draw shadows. Does anyone have experience with this card, maybe on Windows or with another driver?
Might AMDGPU-PRO help here?

Anyway, the problem doesn't seem to lie with FSO, but my system.
Detective work on a case that doesn't exist, it appears.

next EDIT:
Official builds work with AMDGPU-PRO. GNOME on Wayland doesn't - Xwayland segfaults. So I'm using X11.
FSO on any X11 DE has an issue I experienced some time ago:
Everything is dark. From the start screen to the missions, everything looks like it's behind a black layer of about 50% transparency.
I can work around that with -fullscreen-window. Is this behaviour known? Is there a better fix?
Otherwise, since this setup seems to be working and I'm getting generally good framerates, I'll be sticking with it.

m!m:
Great work tracking this down! The changes were introduced in this pull request: https://github.com/scp-fs2open/fs2open.github.com/pull/1924/files
They are correct as far as I can tell, so I would guess that this is a Mesa bug. I have an RX480 that I also use with the Mesa drivers, and I had some GPU hang issues some time ago but not anymore. However, I remember disabling shadows at some point and never having issues after that, so maybe this is the same bug.

Could you record an OpenGL trace with apitrace of an instance of this bug? Maybe that can be sent to the Mesa devs so they can resolve this.
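
In case it helps, here is a small sketch of how the trace could be recorded, written as a Python helper that only builds the `apitrace trace` command line. The binary path `./fs2_open` and the trace file name are placeholders (assumptions), so substitute whatever your build is actually called and add your usual launch options.

```python
import shlex
from typing import List, Sequence


def apitrace_command(game: str, game_args: Sequence[str], trace_file: str) -> List[str]:
    """Build an 'apitrace trace' invocation that records OpenGL calls.

    game and trace_file are placeholders supplied by the caller; nothing
    here is specific to FSO.
    """
    return [
        "apitrace", "trace",
        "--api=gl",                # trace the OpenGL API
        f"--output={trace_file}",  # where the trace is written
        game, *game_args,
    ]


cmd = apitrace_command("./fs2_open", [], "fso-hang.trace")
print(shlex.join(cmd))

# To actually run it (not executed here):
#   import subprocess
#   subprocess.run(cmd, check=False)
```

The resulting file can be replayed with `apitrace replay fso-hang.trace`, which would show whether the hang reproduces outside the game.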
