Well, I've settled on multitasking, because even if I were to do a multiprocessing OS, I'd still have to do a context swap whenever a process was paused/canceled.
So, a bit of news on context swapping:
Variables: I might not actually have to save the state of variables, since for the most part they're stored on the stack until they're no longer needed. However, I might have to worry about them whenever the stack size is different from what it was when the thread was exited, because as far as I know, the compiler tries to stick in as many constants as it can.
Interrupts:
There's no doubt about it: I have to use interrupts in order to get at the program counter. The good thing is that interrupts on the MCU I'm working with automatically save the registers in addition to the program counter, so it's just a matter of using some pointer arithmetic to stash the registers into a per-thread structure. I have two ASM instructions that I can use to trigger a software-driven interrupt: SoftWare Interrupt (SWI) and the unimplemented opcode trap (TRAP).
SWI:
It does as advertised: it saves the program counter and the registers onto the stack before jumping to the SWI interrupt service routine. An RTI (Return from Interrupt) instruction restores the registers and program counter, thereby returning to wherever it jumped from.
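To make the stacking order concrete, here is a rough host-side model in C (it is not code for the MCU itself). The cpu_model struct, the mem array, and the model_swi/model_rti helpers are all made up for illustration, and the byte order follows my reading of the CPU12 stacking order (return address first, then Y, X, B:A, and CCR last), so treat it as a sketch to check against the reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the CPU registers; nothing here touches real hardware. */
typedef struct {
    uint16_t pc, y, x, sp;
    uint8_t a, b, ccr;
} cpu_model;

static uint8_t mem[64];            /* toy memory; the stack grows downward */

static void push8(cpu_model *c, uint8_t v)
{
    assert(c->sp > 0);
    mem[--c->sp] = v;
}

/* 16-bit values end up big-endian: high byte at the lower address. */
static void push16(cpu_model *c, uint16_t v)
{
    push8(c, (uint8_t)(v & 0xFF));
    push8(c, (uint8_t)(v >> 8));
}

static uint8_t pull8(cpu_model *c)
{
    assert(c->sp < sizeof mem);
    return mem[c->sp++];
}

static uint16_t pull16(cpu_model *c)
{
    uint16_t hi = pull8(c);
    return (uint16_t)((hi << 8) | pull8(c));
}

/* Model of SWI: stack everything, mask interrupts, jump to the ISR.
   c->pc is assumed to already point at the instruction after the SWI. */
static void model_swi(cpu_model *c, uint16_t isr_addr)
{
    push16(c, c->pc);              /* return address */
    push16(c, c->y);
    push16(c, c->x);
    push8(c, c->a);                /* A lands at the higher address... */
    push8(c, c->b);                /* ...and B just below it (B:A) */
    push8(c, c->ccr);
    c->ccr |= 0x10;                /* set the I mask bit */
    c->pc = isr_addr;
}

/* Model of RTI: pull everything back in the reverse order. */
static void model_rti(cpu_model *c)
{
    c->ccr = pull8(c);
    c->b = pull8(c);
    c->a = pull8(c);
    c->x = pull16(c);
    c->y = pull16(c);
    c->pc = pull16(c);
}
```

After model_swi the stack pointer has moved down by nine bytes, with CCR at the lowest address and the return address at the highest. Those nine bytes are what a context swap would need to stash per thread.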
TRAP trapnum:
Exactly the same as SWI, but it is triggered automatically whenever the CPU happens upon an unimplemented opcode. Some CPUs may have an individual ISR per TRAP, but it's more likely they have just one for all of them. In any case, the trap ISR should try to recover the CPU from the errant thread. In simpler programs that don't have an OS, this can mean forcing a software reset or immediately stopping execution (if it is in a debug mode).
For programs that do have an OS, it may be possible to terminate the thread/process that caused execution to wander and to signal that an error has occurred over communications lines and/or output devices. In a debug mode, it may also be possible to pull out the stack contents of that errant thread and shoot them over to the monitoring PC or device.
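As a rough idea of what the OS case could look like, here is a sketch in plain C of the bookkeeping a trap ISR might do. Everything in it is hypothetical: the thread_info record, the threads table, current_thread, fault_count, and handle_trap are invented names, and the frame layout (CCR at offset 0, return address at offsets 7 and 8) is my assumption about how the stacked bytes sit in memory.

```c
#include <assert.h>
#include <stdint.h>

enum thread_state { THREAD_READY, THREAD_PAUSED, THREAD_KILLED };

/* Hypothetical per-thread bookkeeping kept by the kernel. */
struct thread_info {
    enum thread_state state;
    uint16_t fault_pc;             /* return address stacked by the trap */
};

#define MAX_THREADS 4
static struct thread_info threads[MAX_THREADS];
static unsigned current_thread;    /* index of the thread that was running */
static unsigned fault_count;       /* stand-in for "report the error somewhere" */

/* Body of a hypothetical trap ISR. 'frame' points at the nine stacked bytes
   (assumed layout: CCR at frame[0], return address at frame[7..8]). */
static void handle_trap(const uint8_t *frame)
{
    assert(current_thread < MAX_THREADS);
    struct thread_info *t = &threads[current_thread];

    t->state = THREAD_KILLED;      /* never schedule the errant thread again */
    t->fault_pc = (uint16_t)((frame[7] << 8) | frame[8]);
    fault_count++;                 /* a real kernel might send this over serial */
}
```

The recorded fault_pc is the sort of thing that could be shipped to the monitoring PC in a debug build.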
OK, with those points in mind, I plan on using the SWI instruction to kick execution out of whatever thread it is in. This will be used for functions that manipulate thread/process behavior, such as pause_thread, stop_thread, etc., and will most notably be used when a thread needs to wait on hardware for data, e.g.:
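void pause_thread( void )
{
    // Some other stuff to tell the kernel to only pause this thread, such as a semaphore or flag of some sort
    asm( "swi" ); // exact inline-assembly syntax depends on the compiler
}
void Thread_Foo( void )
{
    start_atd_conversion();
    // atd_is_done is expected to be set by the ATD hardware/ISR, so it should be declared volatile
    while ( !atd_is_done )
        pause_thread();
}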
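The kernel side of pause_thread is not worked out yet, but one simple possibility is a round-robin pick inside the SWI ISR. The sketch below is plain C that runs on a host; NUM_THREADS, the stopped flags, current, and pick_next_thread are all hypothetical names, and it assumes a paused thread simply gets another turn later so it can re-check its hardware flag.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_THREADS 3

static bool stopped[NUM_THREADS];  /* hypothetical: set by something like stop_thread */
static unsigned current;           /* index of the thread that executed SWI */

/* What the SWI ISR might call: walk round-robin from the thread after the
   current one and take the first that is not stopped. If every other thread
   is stopped, the current one keeps running. */
static unsigned pick_next_thread(void)
{
    assert(current < NUM_THREADS);
    for (unsigned step = 1; step <= NUM_THREADS; step++) {
        unsigned candidate = (current + step) % NUM_THREADS;
        if (!stopped[candidate]) {
            current = candidate;
            break;
        }
    }
    return current;
}
```

With three threads and thread 1 stopped, a pause in thread 0 would hand control to thread 2, and the next pause would hand it back to thread 0.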
As for the unimplemented opcode trap, that's been put on the "To Do" list for now.

[Update]
OK, here's an example structure of the CPU context. (The order, from top to bottom, is the order in which the MC9S12 pushes the context when an interrupt occurs. Since the stack grows downward, the bytes sit in memory in the reverse order, with CCR at the lowest address.)
struct Thread_context
{
    uint16 PC;
    uint16 Y;
    uint16 X;
    uint16 D;   // stacked as B:A; the A and B accumulators are 8 bits each
    uint8 CCR;
};
I don't know at this moment whether I should include the Stack Pointer or not, and I'm also not sure whether this should be in C or in ASM... but I'll think about it for some time.
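On the Stack Pointer question, one option worth sketching: if every thread gets its own stack, the interrupt has already parked the whole context there, so the per-thread record could shrink to just the saved SP. The code below is a host-side sketch of that idea, not a decision. stacked_frame, thread_slot, frame_at, frame_pc, and swap_sp are hypothetical names, and the frame is written in memory order (lowest address first) using only byte-sized members so that a host compiler adds no padding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The stacked registers in memory order, starting where SP points. */
struct stacked_frame {
    uint8_t ccr;
    uint8_t b;
    uint8_t a;
    uint8_t x_hi, x_lo;
    uint8_t y_hi, y_lo;
    uint8_t pc_hi, pc_lo;
};

/* Hypothetical per-thread record: only the stack pointer is kept. */
struct thread_slot {
    uint16_t saved_sp;
};

/* Copy out the frame that starts at address sp in a toy memory image. */
static struct stacked_frame frame_at(const uint8_t *memory, size_t memory_size,
                                     uint16_t sp)
{
    struct stacked_frame f;
    assert((size_t)sp + sizeof f <= memory_size);
    memcpy(&f, memory + sp, sizeof f);
    return f;
}

/* Reassemble the big-endian return address from the frame. */
static uint16_t frame_pc(const struct stacked_frame *f)
{
    return (uint16_t)((f->pc_hi << 8) | f->pc_lo);
}

/* Model of the swap an ISR would do before RTI: remember the outgoing
   thread's SP and hand back the incoming thread's SP. */
static uint16_t swap_sp(struct thread_slot *from, const struct thread_slot *to,
                        uint16_t current_sp)
{
    from->saved_sp = current_sp;
    return to->saved_sp;
}
```

On the real part, loading and storing SP itself can't be expressed in portable C, so at least that step would presumably need a few lines of ASM.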