I was trained on the MC9S12, an ol' Motorola chip that was designed as a central MCU for automobiles. As a result, it has a ton of pins (112 to be exact) that so far I haven't completely used up.
Although it's not specifically stated anywhere, the chip's pretty much an intermediate architecture that still allows you to put variables wherever you want, and has up to 256 KB of paged flash... which again I haven't completely used up (yet).
Nuke, I remember coming across a "GameDuo" shield over at Sparkfun.com that allowed an Arduino to hook up to a TV using the old RCA plugs. That sounds like what you're working on, but with a couple of improvements. I'm dangerously interested, lol.
As a side note, JG, and everyone else reading this, I wouldn't recommend learning assembler for an x86 chip because 99.999999....9% of programs made for them today are written in high-level languages. Unless you were looking for a job as a compiler programmer, it's just no use to learn a CISC's asm when you're just beginning.
P.S.: Out of curiosity, Nuke, is your system going to be mainly parallel data transfers or will there be a few serial lines?
There are a few ways to get video support on Arduino. There's the Gameduino, which uses an FPGA; it has a full set of features (backgrounds, sprites, collision detection) and outputs to VGA. There's the TV out library, which bitbangs 1-bit video over composite. I've played with it a little, but it uses up so much CPU time that you really can't use it for anything but simple games. There was another video board that implemented the same kind of graphics as the TV out library and had text and graphic primitives. I've seen a few DIY methods on Hack a Day that I'm loosely basing mine off of. Many of those used a single SRAM, with a latch to multiplex address and data lines, and some kind of DAC.
I intend to use a brute-force mentality and use several chips. Everything is parallel. The address bus is 15 bits and the data bus is 8. The memory also requires 2 control lines. On top of that I will have an additional 2 lines to control/indicate bus switching status: one is an input to request a bank flip, and the other indicates to the external processor that the bus flip has occurred (a transition indicates memory ready, while the level indicates which bank is currently attached). For performance I intend that both SRAM chips will operate simultaneously. You will be able to draw to one while the other is used for signal output, so that you don't have to wait for blanking periods to run code. You will draw, request a flip, and when the chip controlling the signal output hits a blanking period (could be each line or each frame, which may be a configuration setting) it reads the control line, and if it's high, flips the bus and sets a bit informing the CPU that the memory is ready to be written to again.
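To make the flip handshake above a bit more concrete, here is a rough Python model of it (a simulation only, not firmware). The class and method names are made up for illustration. It also assumes the video controller clears the request when it flips, whereas on real hardware the external CPU would probably drop the request line itself after seeing the bank line change.

```python
class DoubleBufferedVideo:
    """Toy model of two SRAM banks that swap roles during blanking."""

    def __init__(self, size=1 << 15):       # 15 address bits -> 32768 bytes per bank
        self.banks = [bytearray(size), bytearray(size)]
        self.cpu_bank = 0                   # level of the bank indicator line
        self.flip_request = False           # request line driven by the external CPU

    def write(self, addr, value):
        """External CPU draws into the bank currently attached to it."""
        if self.flip_request:
            raise RuntimeError("flip pending: wait for the bank line to toggle")
        self.banks[self.cpu_bank][addr] = value

    def request_flip(self):
        """External CPU has finished drawing and asks for a bank swap."""
        self.flip_request = True

    def on_blanking(self):
        """Video controller polls the request line at a blanking period."""
        if not self.flip_request:
            return False
        self.cpu_bank ^= 1                  # the transition tells the CPU memory is ready
        self.flip_request = False           # assumption: controller clears the request
        return True

    def display_bank(self):
        """Bank currently being scanned out to the display."""
        return self.banks[self.cpu_bank ^ 1]
```

The point of the model is that a write never touches the bank being scanned out, so drawing does not have to wait for a blanking period; only the swap itself does.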
I intend to use four 3-state octal bus transceivers to switch the I/O bus. Each SRAM will connect to a pair of transceivers: one will go to the port for an external CPU, and the other will go to the internal bus. The pairs would be cross-connected so that the internal bus can be connected to one SRAM bank while the other SRAM bank is routed to the external port. The internal bus connects to the internal MCU and, through a buffer or two, to the DACs (to support both greyscale on composite and 332 RGB on VGA I will need one for each, though I haven't really thought this part out very well).
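The cross-connection can be written down as a small truth table. The sketch below is only an illustration: the names `A_ext`, `A_int`, `B_ext` and `B_int` are hypothetical labels for the four transceivers (SRAM A or B, facing the external port or the internal bus), and `True` just means "enabled", whatever polarity the real enable pins use.

```python
def transceiver_enables(cpu_bank):
    """Return which of the four data-bus transceivers are enabled.

    cpu_bank is the bank indicator level: 0 means SRAM A faces the
    external CPU, 1 means SRAM B does.
    """
    if cpu_bank not in (0, 1):
        raise ValueError("cpu_bank must be 0 or 1")
    a_to_cpu = cpu_bank == 0
    return {
        "A_ext": a_to_cpu,        # SRAM A <-> external port
        "A_int": not a_to_cpu,    # SRAM A <-> internal bus
        "B_ext": not a_to_cpu,    # SRAM B <-> external port
        "B_int": a_to_cpu,        # SRAM B <-> internal bus
    }
```

For example, `transceiver_enables(0)` enables `A_ext` and `B_int` only. Since the four enables are just the bank level and its complement, one control line plus an inverter would be enough to drive them in this model.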
Address bus lines are one-way, so only buffers are required to switch them. Unfortunately, at 15 lines apiece it will take about eight 3-state octal buffers to do this. I'm thinking of using a latch to multiplex the address lines with the data lines to reduce the number of bus lines. This would require only 4 transceivers, 4 buffers and 2 latches. I may also need some inverters to reduce the number of onboard control lines. Hopefully I can control the bus with one of the ATmega328s I have on hand, which has 20 available pins: 15 for the I/O/address bus, 2 for the WE/OE pins on the SRAM, 1 for the latch, and 2 control lines to the external processor. Bus switching would probably be done with the bank indicator line. I'm going to have to scan over the datasheets to make sure I can do this with 20 pins or less, otherwise I'm gonna need a chip with more pins.
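The chip and pin counts above are easy to sanity-check with a few lines of Python. This is just arithmetic on the figures in the post. The assumption behind the buffer count is that each of the two SRAM banks needs its 15 address lines switchable from two sources (external port and internal bus), 8 lines per octal buffer; the dictionary keys are descriptive labels, not real pin names.

```python
import math

ADDRESS_LINES = 15
AVAILABLE_PINS = 20     # usable pins on the ATmega328, per the post


def octal_buffers_needed(lines, banks=2, sources=2, width=8):
    """Octal buffers needed to switch `lines` signals for every bank/source pair."""
    return math.ceil(lines * banks * sources / width)


# Pin budget for the bus-controller MCU with the latch-multiplexed bus.
pin_budget = {
    "shared io/address bus": 15,
    "sram we/oe": 2,
    "latch enable": 1,
    "control lines to external cpu": 2,
}

pins_used = sum(pin_budget.values())
spare_pins = AVAILABLE_PINS - pins_used

print(octal_buffers_needed(ADDRESS_LINES), pins_used, spare_pins)
```

The non-multiplexed address switching works out to 60 lines, or 7.5 octal buffers, which rounds up to the 8 mentioned above. The pin budget adds up to exactly 20, so it fits only if nothing else on the chip needs a pin.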
Everything would be running at 20 MHz. The RAM is asynchronous, so I don't think there is any need for the external processor to match clock speed or to be in sync. Everything will be in a DIP package, since I'm not set up for surface mount. I have fresh PCBs and a recipe for homemade etchant. I don't have a laser printer to print toner transfer sheets, so I may just end up drawing the PCB patterns freehand. The external processor will be an ATmega1284, which by itself has a good amount of memory: 128K flash, 16K RAM, 4K EEPROM, and it can be run at 20 MHz. Will be a fun little project.