Well, I did find that the Pentium 3 laptop I tried was very much limited by its GPU - a Rage128. Actually, it seemed to be mostly limited by the relative lack of VRAM for the textures. This is with retail binaries and data - the Rage128 drivers for WinXP didn't seem to support OpenGL, so I couldn't run FSO at all.
The ARM11's FPU is relatively powerful compared to the rest of the CPU - it can be used very much like a Cray-1, firing off vectors of operations to complete in sequence and in parallel - and seems to be on par with the Pentium 3 at the same clock speed (assuming an ordinary compiler rather than hand optimisation). Indeed, if the compiler doesn't use the SSE versions of the FP instructions, the P3 takes a significant penalty in multiply and add throughput, because x87 instructions always run at extended precision. The ARM11 can pipeline independent single-precision multiply-adds at one per cycle, which is on par with many relatively modern CPUs (Bulldozer and PPC970 can do better without SIMD, Sandy Bridge can't). It seems that ARM decided that if you were bothering to fit an FPU, you probably really needed it!
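To make the "independent multiply-adds" point concrete, here is a rough C sketch of my own (not anything taken from FSO). The function names and the choice of four accumulators are assumptions for illustration; the right number depends on the FPU's multiply-add latency, which I haven't measured. The first version chains every multiply-add through one accumulator, so each one has to wait for the previous result. The second keeps four separate sums, which gives a pipelined FPU independent work to issue back to back.

```c
#include <stddef.h>

/* One accumulator: each multiply-add depends on the one before it,
   so the pipeline can only start a new one once the last has finished. */
float dot_serial(const float *a, const float *b, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Four accumulators (an arbitrary, assumed count): the four multiply-adds
   in each iteration are independent of one another, so a pipelined FPU
   can overlap them instead of stalling on a single dependency chain. */
float dot_unrolled(const float *a, const float *b, size_t n)
{
    float acc0 = 0.0f, acc1 = 0.0f, acc2 = 0.0f, acc3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        acc0 += a[i]     * b[i];
        acc1 += a[i + 1] * b[i + 1];
        acc2 += a[i + 2] * b[i + 2];
        acc3 += a[i + 3] * b[i + 3];
    }
    /* Handle the leftover elements when n is not a multiple of four. */
    for (; i < n; i++)
        acc0 += a[i] * b[i];

    return (acc0 + acc1) + (acc2 + acc3);
}
```

Note that the two versions add things up in a different order, so the results can differ in the last bits for general data, and a compiler won't normally make this transformation for you unless you relax its floating-point rules.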
The ARM11 is also more closely coupled to the memory than a Pentium 3. Remember that in those days, Intel still attached the memory via a separate northbridge chip, which introduced a lot of latency compared to the on-die memory controllers used today. The northbridge in the laptop is the ubiquitous i440BX, which has fairly poor latency. The ARM11 has to share memory access with the GPU, but its latency should still generally be lower than that of a northbridge design.
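If anyone wants to check the latency claim rather than take my word for it, the usual trick is a pointer chase: every load depends on the result of the previous one, so the time per step approximates the load latency once the array is much bigger than the caches. This is only a sketch under my own assumptions - the function names are made up, `rand()` is a crude random source, and `clock()` is a coarse timer, so it needs a large step count to give a meaningful figure.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Fill next[] with one random cycle through all n slots (Sattolo's
   algorithm), so the chase visits every slot in an unpredictable order. */
void build_cycle(size_t *next, size_t n)
{
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n; i-- > 1; ) {          /* i runs from n-1 down to 1 */
        size_t j = (size_t)rand() % i;       /* 0 <= j < i */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads and return the final index. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    for (size_t s = 0; s < steps; s++)
        p = next[p];
    return p;
}

/* Rough average nanoseconds per dependent load, measured in CPU time. */
double ns_per_load(const size_t *next, size_t steps)
{
    clock_t t0 = clock();
    volatile size_t sink = chase(next, 0, steps);  /* keep the loop alive */
    clock_t t1 = clock();
    (void)sink;
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

To use it, allocate an array several times larger than the last-level cache, call `build_cycle` on it, then call `ns_per_load` with a few million steps. Running it at a range of array sizes shows where each cache level gives out.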
So the main handicaps that the ARM11 has compared to the P3 are its decode/issue bandwidth of one instruction per cycle, versus up to three for Intel, and the lack of wide SIMD support. It remains to be seen how important those are in practice - I suspect they really aren't very important.