Author Topic: Multicore stuff  (Read 4975 times)

0 Members and 1 Guest are viewing this topic.

Offline Mika

  • 28
I recently read some articles whose message was: programs written for a single core don't cut it anymore. The whole programming industry needs a new way of thinking if the performance of multicore processors is ever to be delivered. That got me thinking: is it really possible that there are no established methods for developing programs that fully utilize multiple processors? Some serious game developers talk about needing more PhDs in this area specifically to make full use of multicore hardware.

For my own part, I know that ray tracing has been multithreaded for a long time; it is the kind of process that parallelizes easily, and the performance gains from multiple cores are well verified, though the scaling curve flattens out because the processors need to communicate at least some amount of data. But what about other programs? We have had Masters of Science and Doctors of Philosophy writing scientific and engineering software for years, but what about "normal" programs like games and such?

Can I expect Windows 7 to handle multiple processors any better than Windows XP? Would this, along with a significant increase in memory capacity, be a reason to buy the next-generation killer computer that should withstand the test of time for the next 5 or 10 years?

I know there are quite skilled programmers in here: what are your thoughts about this stuff?

Mika (approaching 1 promille levels fast for a Friday night!)
Relaxed movement is always more effective than forced movement.

 

Offline blackhole

  • Still not over the rainbow
  • 29
  • Destiny can suck it
    • Black Sphere Studios
Windows 7 handles multiple cores differently somehow: FL Studio 8 only uses one core under Vista but uses both of my cores under Windows 7. It appears to have some sort of hyperthreading going on.

However, if an application doesn't need multiple threads, it doesn't need multiple threads. A text editor, for example.
« Last Edit: February 15, 2009, 11:19:51 pm by blackhole »

 

Offline kode

  • The Swedish Chef
  • 28
  • The Swede
    • http://theswe.de
Well, it's more or less true. The idea that processor speeds double every 24 months or so is no longer valid. We're entering "the great computing depression". It's already possible to put more transistors on a chip than we can power at once, error rates are high and climbing as dies get smaller, a fetch from RAM takes around 200 cycles, and there's a limit to how much we can gain from instruction-level parallelism. It's not going to cut it anymore.

So what happens now? CRAM IN MORE CORES! Enter parallel computing. Single-threaded applications scale at the rate of single-threaded CPUs, and that rate of increase is diminishing. Whether we like it or not, parallelism is the future.

Pray, v. To ask that the laws of the universe be annulled in behalf of a single petitioner confessedly unworthy.
- Ambrose Bierce
<Redfang> You're almost like Stryke 9 or an0n
"Facts do not cease to exist because they are ignored."
- Aldous Huxley
WAR IS PEACE
FREEDOM IS SLAVERY
IGNORANCE IS STRENGTH

 

Offline lx

  • 22
http://channel9.msdn.com/shows/Going+Deep/Mark-Russinovich-Inside-Windows-7/

This is an interview with one of Windows' kernel developers.
Maybe it will help answer your question about how Windows 7 scales with more processors
compared to Vista and XP.

 
Offline lx

  • 22
I thought I'd throw in some work info.
I've been working on an application at work that has always been single-threaded (it's written in Salford FTN95 and is very definitely (please!) going to that starry place in the sky sometime soon, along with its codebase of nightmares).
It caused a few folks at work some surprises when they left behind their single-core 3.6 GHz Xeons for a dual quad-core (8 cores!) 2.4 GHz machine. The application was almost literally 2x slower.
We're currently going through the motions of redesigning all the data structures and algorithms to make it multithreaded/parallelizable.
It is not an easy task!

I do agree with kode, parallel computing will be it in 5 years - but I reckon you'll still be able to buy very fast single/dual-core processors, because there will be a market both for applications that scale well to many, many cores and for those that don't scale and still use only 1 or 2 cores. (my $0.02)
STRONGTEA. Why can't the x86 be sane?

 

Offline Mika

  • 28
How do you see programming practice changing? I've realised parallelization is far from well understood. They say high-level languages are needed to fully exploit parallelization; otherwise the programmer is likely to get stuck in a single-core programming mode. For example, in all the numerical computing material I have read and work I have done, not a single case or study touched on parallelization, even though it is pretty important in numerics.

Mika
Relaxed movement is always more effective than forced movement.

 

Offline blackhole

  • Still not over the rainbow
  • 29
  • Destiny can suck it
    • Black Sphere Studios
Quote
How do you see programming practice changing? I've realised parallelization is far from well understood. They say high-level languages are needed to fully exploit parallelization; otherwise the programmer is likely to get stuck in a single-core programming mode. For example, in all the numerical computing material I have read and work I have done, not a single case or study touched on parallelization, even though it is pretty important in numerics.

Mika

The only reason you need high-level languages to support parallelization is that most programmers are stupid.

 
Offline lx

  • 22
Quote
How do you see programming practice changing? I've realised parallelization is far from well understood. They say high-level languages are needed to fully exploit parallelization; otherwise the programmer is likely to get stuck in a single-core programming mode. For example, in all the numerical computing material I have read and work I have done, not a single case or study touched on parallelization, even though it is pretty important in numerics.

Mika

In terms of a change in practice, it really does mean thinking well in advance of writing any code. Coding cowboys will run into a lot of trouble trying to write threaded code, while those who write detailed documents of exactly what they're trying to do will spot errors (such as not locking shared variables before use) and algorithmic inefficiencies (such as contention for locked variables) far in advance, and will be able to rewrite or mitigate them before writing code.
STRONGTEA. Why can't the x86 be sane?

 

Offline blackhole

  • Still not over the rainbow
  • 29
  • Destiny can suck it
    • Black Sphere Studios
Quote
How do you see programming practice changing? I've realised parallelization is far from well understood. They say high-level languages are needed to fully exploit parallelization; otherwise the programmer is likely to get stuck in a single-core programming mode. For example, in all the numerical computing material I have read and work I have done, not a single case or study touched on parallelization, even though it is pretty important in numerics.

Mika

In terms of a change in practice, it really does mean thinking well in advance of writing any code. Coding cowboys will run into a lot of trouble trying to write threaded code, while those who write detailed documents of exactly what they're trying to do will spot errors (such as not locking shared variables before use) and algorithmic inefficiencies (such as contention for locked variables) far in advance, and will be able to rewrite or mitigate them before writing code.

When I was writing my engine, all I knew in advance was that I'd need to lock the physics thread somehow, and it worked out fine. Personally I think it depends on the project and the kind of multithreading involved, because not every project requires a 500-page document detailing how to structure a version number, how many times you're allowed to pick your nose, and how you're not allowed to exceed 4294967295 characters in a string.

But then again, maybe I'm a multi-threaded coding cowboy :P

 
Offline lx

  • 22
Quote
When I was writing my engine, all I knew in advance was that I'd need to lock the physics thread somehow, and it worked out fine. Personally I think it depends on the project and the kind of multithreading involved, because not every project requires a 500-page document detailing how to structure a version number, how many times you're allowed to pick your nose, and how you're not allowed to exceed 4294967295 characters in a string.

But then again, maybe I'm a multi-threaded coding cowboy :P

I didn't quite mean it like that!
I s'pose a better way of putting it would have been: have a clue what you're doing before you start, and know the scope of what you're trying to do.
If you're not the one who will be maintaining the code, you do need good documentation, since the person reading it won't know your code like you do, and figuring out race and deadlock conditions from code is notoriously tricky.
STRONGTEA. Why can't the x86 be sane?

 

Offline blackhole

  • Still not over the rainbow
  • 29
  • Destiny can suck it
    • Black Sphere Studios
Quote
When I was writing my engine, all I knew in advance was that I'd need to lock the physics thread somehow, and it worked out fine. Personally I think it depends on the project and the kind of multithreading involved, because not every project requires a 500-page document detailing how to structure a version number, how many times you're allowed to pick your nose, and how you're not allowed to exceed 4294967295 characters in a string.

But then again, maybe I'm a multi-threaded coding cowboy :P

I didn't quite mean it like that!
I s'pose a better way of putting it would have been: have a clue what you're doing before you start, and know the scope of what you're trying to do.
If you're not the one who will be maintaining the code, you do need good documentation, since the person reading it won't know your code like you do, and figuring out race and deadlock conditions from code is notoriously tricky.

But if no one else is going to be touching the code, you needn't worry about such things :p

 

Offline kode

  • The Swedish Chef
  • 28
  • The Swede
    • http://theswe.de
Quote
When I was writing my engine, all I knew in advance was that I'd need to lock the physics thread somehow, and it worked out fine. Personally I think it depends on the project and the kind of multithreading involved, because not every project requires a 500-page document detailing how to structure a version number, how many times you're allowed to pick your nose, and how you're not allowed to exceed 4294967295 characters in a string.

But then again, maybe I'm a multi-threaded coding cowboy :P

I didn't quite mean it like that!
I s'pose a better way of putting it would have been: have a clue what you're doing before you start, and know the scope of what you're trying to do.
If you're not the one who will be maintaining the code, you do need good documentation, since the person reading it won't know your code like you do, and figuring out race and deadlock conditions from code is notoriously tricky.

But if no one else is going to be touching the code, you needn't worry about such things :p

I dissent. You in six months may not have a clue what present-day you was thinking. Even rudimentary documentation is vital for all non-throwaway code.
Pray, v. To ask that the laws of the universe be annulled in behalf of a single petitioner confessedly unworthy.
- Ambrose Bierce
<Redfang> You're almost like Stryke 9 or an0n
"Facts do not cease to exist because they are ignored."
- Aldous Huxley
WAR IS PEACE
FREEDOM IS SLAVERY
IGNORANCE IS STRENGTH

 

Offline blackhole

  • Still not over the rainbow
  • 29
  • Destiny can suck it
    • Black Sphere Studios
Quote
When I was writing my engine, all I knew in advance was that I'd need to lock the physics thread somehow, and it worked out fine. Personally I think it depends on the project and the kind of multithreading involved, because not every project requires a 500-page document detailing how to structure a version number, how many times you're allowed to pick your nose, and how you're not allowed to exceed 4294967295 characters in a string.

But then again, maybe I'm a multi-threaded coding cowboy :P

I didn't quite mean it like that!
I s'pose a better way of putting it would have been: have a clue what you're doing before you start, and know the scope of what you're trying to do.
If you're not the one who will be maintaining the code, you do need good documentation, since the person reading it won't know your code like you do, and figuring out race and deadlock conditions from code is notoriously tricky.

But if no one else is going to be touching the code, you needn't worry about such things :p

I dissent. You in six months may not have a clue what present-day you was thinking. Even rudimentary documentation is vital for all non-throwaway code.

That's what the occasional comment is for.

 

Offline Mika

  • 28
Another question, then: how often do you think in terms of processor cycles while writing multithreaded code? Or have compilers made that sort of thinking obsolete? What I mean is: if all processors are running parallelized threads and one finishes before the others, how well does the required cycle count for each operation hold up when the operating system is also running?

I'm really starting to consider writing some parts of the code (the numerics, actually) in assembly. Though it seems a little distant, since the last processor I did that on was the 80186. I know that in general compilers should be able to optimize code more efficiently than I can, but a recent look with a disassembler at the code my compiler put out suggests that, for my choice of compiler, it actually isn't so.

Mika
Relaxed movement is always more effective than forced movement.

 
Offline lx

  • 22
While I wouldn't say that kind of thinking is obsolete, I'd suggest it's no longer worth writing anything in assembly.
Readability and maintainability need to outweigh performance now.
The difference between a good optimising compiler (with well-written code) and hand-tuned assembly shouldn't be huge (and in fact, sometimes the compiler will pick up on an optimisation you miss!).

Quote
What I mean is: if all processors are running parallelized threads and one finishes before the others, how well does the required cycle count for each operation hold up when the operating system is also running?

Depends on what you want it to do!
STRONGTEA. Why can't the x86 be sane?

 

Offline Mika

  • 28
Quote
While I wouldn't say that kind of thinking is obsolete, I'd suggest it's no longer worth writing anything in assembly.
Readability and maintainability need to outweigh performance now.
The difference between a good optimising compiler (with well-written code) and hand-tuned assembly shouldn't be huge (and in fact, sometimes the compiler will pick up on an optimisation you miss!).

I'm not sure about that. The guys at the ASM Community say otherwise. Of course, there's no point in writing the whole thing in assembler, only the most time-consuming inner loop. I've read they have gained remarkable speedups with hand optimization, but that's coming from people who I guess have been doing this stuff for decades and know the internals of current processors well.

Also, the little cynic inside me says it's a better career move to leave an obscure, totally undocumented assembly code part inside your program that is only maintainable by you.  :D
But then again, I'm a bit of an old-fashioned programmer.

I did try searching the ASM Community forum for information on multithreading in assembly, though. I didn't get many hits, and the forum is strangely quiet about it. Maybe I should take that as a hint?

Mika
Relaxed movement is always more effective than forced movement.

 

Offline blackhole

  • Still not over the rainbow
  • 29
  • Destiny can suck it
    • Black Sphere Studios
The optimizations gained by coding in assembly are not worth the absurd amount of time wasted on them. Of course you can optimize stuff in assembly. But is recoding your networking engine in assembly to gain 1 frame per second worth it when a bottleneck in your rendering pipeline is causing a 30% reduction? Noooooooo.

Only when you are absolutely, totally positive that you have optimized everything that can possibly be optimized in your C++ program, and that there is no faster way of doing it throughout your entire program (or all classes that interact with the section of code in question), should you actually start thinking about writing it in assembly. Assembly optimization works well only for things so ridiculously low-level that you should either be finding source code that has already done it for you, or be ignoring it altogether.

Quote
Also, the little cynic inside me says it's a better career move to leave an obscure, totally undocumented assembly code part inside your program that is only maintainable by you.
That's not a cynic, that's a mentally retarded inner child that has been deprived of nutrition for weeks on end.

 
Offline lx

  • 22
I've been reading about pointer aliasing in C/C++ recently (specifically, where and when to use the restrict keyword).
Maybe this is one of the major reasons for the big difference.
I'm still learning!
STRONGTEA. Why can't the x86 be sane?

 

Offline castor

  • 29
    • http://www.ffighters.co.uk./home/
I agree that ASM optimizations are a waste of time, most of the time...
But if I've followed this correctly, Mika is talking about scientific calculations involving heavy number crunching (to be run on a supercomputer?). So most of the time will probably be spent in just a few tight loops. ASM optimization could make sense in that case.

 

Offline Mika

  • 28
Quote
I agree that ASM optimizations are a waste of time, most of the time...
But if I've followed this correctly, Mika is talking about scientific calculations involving heavy number crunching (to be run on a supercomputer?). So most of the time will probably be spent in just a few tight loops. ASM optimization could make sense in that case.

Correct, except for the supercomputer part. Yeah, most of my stuff is CPU-bound. Sometimes I need RAM, but it's not as important as raw computation power. I don't know if the loops are that tight, though. Most of them are about inverting a matrix, computing spline coordinates, or plain Newton iteration.

Mika
Relaxed movement is always more effective than forced movement.