The compiler should optimize it to some extent for 64-bit, but getting much more than that would require hand-written asm, and that isn't likely to ever happen. Win64 and Linux64 optimizations are also different, since Microsoft went with LLP64 (long stays 32-bit) instead of LP64 (long becomes 64-bit) like most other 64-bit (*nix) platforms. Optimizations that would work best for Linux64 wouldn't be much good for Win64, and vice versa. I have made some general 64-bit optimizations, but only a very few of them. Until the various 64-bit platforms can each be properly tested for every change (and until I get the type situation a little more sane with the platform upgrade for 3.7), it's just not a good idea to do too much in the way of optimization yet.
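To make the LLP64 vs. LP64 point concrete, here is a rough sketch of how the same source sees different type sizes on the two platforms, and why fixed-width types help with the "type situation". The helper names (`data_model`, `narrow_to_i32`) are made up for illustration and are not taken from the actual code base.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Hypothetical helper: names the data model this build was compiled for.
inline std::string data_model()
{
    if (sizeof(void*) == 8 && sizeof(long) == 4) return "LLP64"; // Win64
    if (sizeof(void*) == 8 && sizeof(long) == 8) return "LP64";  // Linux64 and most *nix
    if (sizeof(void*) == 4 && sizeof(long) == 4) return "ILP32"; // typical 32-bit build
    return "other";
}

// Fixed-width types are the same size under every data model,
// unlike long, which is 4 bytes on Win64 and 8 bytes on Linux64.
static_assert(sizeof(std::int32_t) == 4, "int32_t must be 4 bytes");
static_assert(sizeof(std::int64_t) == 8, "int64_t must be 8 bytes");
static_assert(sizeof(std::intptr_t) >= sizeof(void*), "intptr_t must hold a pointer");

// Hypothetical helper: narrows a 64-bit value to 32 bits and, in debug builds,
// asserts that nothing is lost. Code that assumed long == int needs this kind
// of check once long grows to 64 bits on LP64.
inline std::int32_t narrow_to_i32(std::int64_t v)
{
    assert(v >= std::numeric_limits<std::int32_t>::min() &&
           v <= std::numeric_limits<std::int32_t>::max());
    return static_cast<std::int32_t>(v);
}
```

Calling `data_model()` on a Win64 build should give "LLP64" and on a Linux64 build "LP64", which is exactly why a size assumption tuned for one can be wrong for the other.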
It's only multi-core optimized to an extent. Events/input, music, and various bits of sound handling operate in separate threads, and so get the benefit of the extra core(s). At some point I think that AI and/or physics and/or collision stuff will become multi-threaded. Currently, though, too much of the code just isn't thread-safe.
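As a small illustration of what "thread-safe" means here, the sketch below guards a piece of shared state with a mutex so several threads can update it at once. It uses standard C++11 `std::thread` and `std::mutex` purely as an example, which is not necessarily what the engine itself uses, and `SafeCounter` and `run_workers` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared state touched by several threads.
// Every access takes the mutex, so concurrent updates cannot be lost.
class SafeCounter {
public:
    void add(int n)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        value_ += n;
    }

    int get() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return value_;
    }

private:
    mutable std::mutex mutex_;
    int value_ = 0;
};

// Starts `workers` threads that each call add(1) `iterations` times,
// waits for all of them, and returns the final total.
inline int run_workers(int workers, int iterations)
{
    assert(workers >= 0 && iterations >= 0);

    SafeCounter counter;
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i)
                counter.add(1);
        });
    }
    for (auto& t : pool)
        t.join(); // every thread must be joined before the vector is destroyed

    return counter.get();
}
```

With the lock in place, `run_workers(4, 10000)` always totals 40000. Without it, the plain `value_ += n` would be a data race, and that is the kind of problem code like AI, physics, or collision would have to be audited for before it could be moved off the main thread.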