This is rubbish! Absolute rubbish! Multi-core CPUs are the wave of the future!
Intel demoed an 80-core CPU running on an FPGA years ago - even though it was running on an FPGA, it could run XP! It didn't run very fast, but because it was a prototype board all the signals had to go through a bunch of gates. Once they get a design that uses real transistors instead of gates it will be way faster.
These CPUs are coming out really soon. I even bet tomorrow they'll be announcing the
Core 3 32-o with 32 cores. Only this time it won't be a prototype - oh no, it will be a monster 4 GHz, 32-core computing machine. Should be really easy to do once they get their 32 nanometer process up and running. With the power savings I bet it will run with less power than
one of their 90 nanometer octo-cores.
The FS2SCP cannot ignore this trend. We need to be able to take advantage of these CPUs NOW, before we get beaten out of the gate by all the other space simulator projects, open- and closed-source. I hear the
DXX-Rebirth team has a huge update coming that will take Descent's ~15-year-old engine and make it look and run ten times better than anything currently on the market, or that will soon be on the market. And you know how they're going to do that? By taking advantage of multiple cores, and the special SSE5.1 instructions built into these upcoming Intel CPUs. Damn, I hear even
Orbiter is adding space combat. It's going to be awesome.
We have to make the FS2 engine more
parallel if we want to keep up. Our target should be to get it running perfectly on one of these soon-to-be-released Intel CPUs. I know your instinct is going to be to make use of
that awesome feature that takes all 32 cores and reconfigures them into one giant core to rule them all. But that's wrong! That's what all the other teams want you to try! What happens when you have that many instructions in flight at that kind of speed (we're talking terahertz here) is that you strangle even the best I/O bus money can buy. That's right, the
FSB,
HyperTransport, even
Intel's new QuickPath can't handle the raw amount of data put out by a CPU in this mode. Sure, there are some
narrow scientific uses nobody cares about that keep all the traffic on-die and require no access to the memory controller or the graphics controller, but FS2 is not one of those applications. That CPU is generating thousands of teraflops per second, and it needs to send those teraflops to the graphics card as fast as it can. The problem is that graphics cards have to process those teraflops really fast to put them on the screen, right? And ATi only
just came out with a video card that can handle only two teraflops per minute. So clearly it can't keep up with these Intel CPUs. The other teams know that, and they're already looking for another way to harness the power of these new CPUs.
Fortunately, I was recently fired by a team working on a
very promising space project which I'm not allowed to name. I know the method they're using to put more parallelism in their code! The trick isn't to try to get all of the cores on the CPU doing the same thing at the same time to make it faster, because each core will just end up waiting its turn for the
L2 cache, no matter how big it is. No, the trick here is like this: say you have a 32-core CPU. You load your executable into memory starting at address 0x00000000. Yeah, I know, that's really reserved kernel memory or something like that, but for the sake of argument let's pretend that the OS uses an addressing scheme that loads the executable at 0x00000000. Pretend that the OS realized I had this awesome idea and made all this space for me by throwing the kernel on
the stack (which usually starts at 0xFFFFFFFF and grows down). You know what I mean here.
So anyway, you have core 0 start executing at 0x00000000. Then you have core 1 start executing at s/32, where s is the size (in the address space) of your program. Core 2 starts at 2s/32, core 3 at 3s/32, and so on and so forth. All of the code in the FS2 executable will be executed
much faster. How much faster you ask? Try 32 times faster! That's right, we've finally found a way to get linear speedup out of multiple cores!
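The offset arithmetic behind this scheme can be sketched in a few lines. This is purely illustrative (the function name is invented, and real executables cannot actually be partitioned and executed this way); it just computes where each core would supposedly begin, given a program size and core count:

```python
# Hypothetical sketch of the per-core start-offset scheme described above.
# Assumes a program of size s bytes loaded at base 0x00000000 and n cores;
# core k would begin executing at base + k * (s // n). Illustrative only.
def core_start_offsets(program_size, num_cores=32, base=0x00000000):
    slice_size = program_size // num_cores
    return [base + k * slice_size for k in range(num_cores)]

offsets = core_start_offsets(0x00200000)  # a 2 MiB executable
# core 0 starts at 0x00000000, core 1 at 0x00010000, core 2 at 0x00020000, ...
```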
But how is this possible? Every computer science paper you've read, even the
most optimistic marketing you've seen, has shown less than linear speedup as you add cores. What's the secret? The secret is the
IOMMU, or I/O Memory Management Unit. It was initially developed because it would be useful for virtualization. The problem is that even with these massively powerful processors, virtual reality is years away yet. At least two. So in the meantime, we've found another use for this really fast memory management device. When it is implemented on the same die as the CPU, it can be used to pass messages between all 32 cores. The device literally stands on its head (okay, not literally, it would fall out of its socket) to pass just-in-time messages from one CPU to another, supplying it with information just as it needs it.
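Setting aside whether an IOMMU would ever be used this way, the "just-in-time message" idea itself can be sketched with an ordinary shared queue standing in for the hypothetical on-die mailbox. Everything here (the `mailbox` name, the fake frame data) is invented for illustration; one thread supplies data exactly as the other needs it:

```python
import queue
import threading

# Minimal sketch of just-in-time message passing between two "cores",
# using a shared FIFO queue as a stand-in for the hypothetical IOMMU mailbox.
mailbox = queue.Queue()

def producer():
    for frame in range(3):
        mailbox.put(("frame_data", frame))  # supply data just as it's needed

def consumer(results):
    for _ in range(3):
        tag, frame = mailbox.get()  # blocks until the producer delivers
        results.append(frame)

results = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()
# results is now [0, 1, 2], in the order the producer sent them
```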
So, I realize I kind of rambled on a bit here, but this new technology gets me excited. In short, here are the steps I'm proposing the SCP team take:
1. Design an engine which dynamically takes different slices of the game code and passes them to idle CPUs as needed.
2. Hack together an IOMMU driver to make the performance of the above solution acceptable. The goal should be no less than linear CPU scaling as CPUs are added. I think the best we can hope for is 2× linear, if we find a way to take advantage of
reverse-hyperthreading.
3. Make a complete switch to ray-tracing because rasterized graphics do so poorly when you spread the load out over more processing cores (just look at the size of GT200, and how it's beat in some cases by the smaller RV770!).
4. Take advantage of the SSE5.1 instructions, which are the most important part of what I just explained.
5. Beat the DXX-Rebirth team to market, because they're going to use all their CPU power to make sure their game has an awesome story and lots of plot twists.
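Step 1 above, minus the address-space magic, is basically a work pool: carve per-frame work into independent slices and hand each slice to whichever worker is idle. Here's a minimal sketch under those assumptions; `update_slice` and `run_frame` are invented stand-ins for chunks of game logic, not anything in the FS2 codebase:

```python
from concurrent.futures import ThreadPoolExecutor

# Invented stand-in for one independent slice of per-frame game work.
def update_slice(slice_id):
    return slice_id * 2  # pretend computation

# Dispatch all slices of a frame across a pool of idle workers.
# map() returns results in slice order, regardless of completion order.
def run_frame(num_slices=8, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update_slice, range(num_slices)))
```

Note this gives at best linear scaling in the number of truly independent slices; nothing here (or anywhere) gets you the 2× linear of step 2.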