Unfortunately, I have come to the conclusion that current multicore programming is not what it is supposed to be, at least for my purposes. Something seems to be fundamentally wrong with current PC processor technology that keeps it from taking full advantage of the cores. From what I have gathered, the culprit is the memory shared between the processors, which makes things more difficult. I don't claim to be an expert on this issue, but I have seen lots of benchmarks showing little performance improvement for dual-core or quad-core processors, although some of them do show an improvement.
However, I have found that CUDA is actually quite interesting from a parallel computing point of view. My current understanding is that this GPU implementation works around the memory issue by reserving separate memory spaces for each processor. It also seems to have a fairly good track record of scientifically demonstrated performance increases. Right now I'm actually considering buying a CUDA-capable card. But since I'm running an Athlon XP, I guess that would mean a total system revamp, and I don't think this computer has fully served its time yet.
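
To make the separate-memory-space idea a bit more concrete, here is a rough sketch of how I understand it, not a definitive implementation. In CUDA, each block of threads can declare a small `__shared__` buffer that only that block sees, so most of the work happens without touching the memory all blocks share. The kernel name `blockSum`, the constant `kThreads`, and the array size are my own choices for illustration, and I am assuming a power-of-two block size.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Threads per block. Assumed value; the reduction below needs a power of two.
constexpr int kThreads = 256;

// Each block sums its own slice of `in` inside a buffer that is private to
// that block, then writes one partial sum to out[blockIdx.x].
__global__ void blockSum(const float* in, float* out, int n) {
    __shared__ float scratch[kThreads];  // visible only to this block

    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    scratch[tid] = (i < n) ? in[i] : 0.0f;  // one read from global memory
    __syncthreads();

    // Tree reduction carried out entirely in the block's own memory.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) scratch[tid] += scratch[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = scratch[0];  // one write to global memory
}

int main() {
    const int n = 1 << 20;
    const int blocks = (n + kThreads - 1) / kThreads;

    float* hIn = (float*)malloc(n * sizeof(float));
    float* hOut = (float*)malloc(blocks * sizeof(float));
    for (int i = 0; i < n; ++i) hIn[i] = 1.0f;

    float* dIn = nullptr;
    float* dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, blocks * sizeof(float));
    cudaMemcpy(dIn, hIn, n * sizeof(float), cudaMemcpyHostToDevice);

    blockSum<<<blocks, kThreads>>>(dIn, dOut, n);

    // This copy waits for the kernel and reports any earlier failure.
    cudaError_t err = cudaMemcpy(hOut, dOut, blocks * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Combine the per-block partial sums on the host.
    double total = 0.0;
    for (int b = 0; b < blocks; ++b) total += hOut[b];
    printf("sum = %.0f, expected %d\n", total, n);

    cudaFree(dIn);
    cudaFree(dOut);
    free(hIn);
    free(hOut);
    return 0;
}
```

Note that the input and output arrays still live in global memory that every block can reach, so the private buffers reduce the traffic to shared memory rather than removing it.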