In May of this year (2010) I wanted to create a USB HDR camera controller that would let me shoot more than three bracketed shots quickly and in one set (most DSLRs only allow you to bracket up to three photos in one set). There is already an open source project that does this, the Open Camera Controller (OCC). It uses a Nintendo DS or Nintendo DS Lite and a simple Arduino interface cartridge to fire the remote shutter release on your camera. The project has a really nice software interface; the only limitation is that most cameras do not allow remote shutter release exposure times faster than 1/20 sec in bulb mode. You also have to account for the frames per second rating of the camera and the memory card speed. So I wanted to switch to PTP/MTP USB control of the camera, which can use the camera to the full extent of its capabilities but is much more complicated to implement. I figured the best way to do this would be to try to add USB host capabilities to the Nintendo DS through an expansion card, which would open up the possibility of adding USB control to the OCC project.
The first step was to figure out how the Nintendo DS (NDS) talks to the Slot-2 Game Boy Advance (GBA) cartridge slot. In the OCC project the NDS talks to the Arduino by toggling one data line (pin-3 WR#) on the GBA SLOT-2 that is used to fire off the rumble motor in a rumble cartridge. This allows the NDS to tell the OCC Arduino to start and stop capture, but it will not work for the more complex data communication needed for USB control. I started by learning how to compile code to run on the DS, then figured out the code needed to talk to the GBA slot, reverse engineered the DMA the NDS uses to talk to the GBA slot, and finally wrote logic for a CPLD to convert the GBA SLOT-2 DMA interface into the FTDI PFIFO interface used on the FTDI Vinculum USB host controller.
Nintendo DS Software:
Once you have a flash cartridge and a proper development environment set up to compile code for the NDS, writing and reading data to and from the GBA slot using DMA in software is easy. Building the hardware to use the interface for purposes it was not designed for is the harder part.
The first step is to give the ARM9 processor in the NDS control of the GBA Slot-2. This is done with the following code.
#define WAIT_CR_REG (*(vu16*)0x04000204)
...
// Place in main function. Sets slowest write possible
WAIT_CR_REG |= (1<<14)|(0<<6)|(1<<5)|(0<<4)|(1<<3)|(1<<2)|(1<<1)|(1<<0);
WAITCNT \ WAIT_CR_REG 0x04000204
4000204h - NDS9 - EXMEMCNT - 16bit - External Memory Control (R/W)
4000204h - NDS7 - EXMEMSTAT - 16bit - External Memory Status (R/W..R)
  0-1   32-pin GBA Slot SRAM Access Time (0-3 = 10, 8, 6, 18 cycles)
  2-3   32-pin GBA Slot ROM 1st Access Time (0-3 = 10, 8, 6, 18 cycles)
  4     32-pin GBA Slot ROM 2nd Access Time (0-1 = 6, 4 cycles)
  5-6   32-pin GBA Slot PHI-pin out (0-3 = Low, 4.19MHz, 8.38MHz, 16.76MHz)
  7     32-pin GBA Slot Access Rights (0=ARM9, 1=ARM7)
  8-10  Not used (always zero)
  11    17-pin NDS Slot Access Rights (0=ARM9, 1=ARM7)
  12    Not used (always zero)
  13    Not used (always set ?)
  14    Main Memory Interface Mode Switch (0=Async/GBA/Reserved, 1=Synchronous)
  15    Main Memory Access Priority (0=ARM9 Priority, 1=ARM7 Priority)
Bit 0-6 can be changed by both NDS9 and NDS7, changing these bits affects the local EXMEM register only, not that of the other CPU.
Bit 7-15 can be changed by NDS9 only, changing these bits affects both EXMEM registers, ie. both NDS9 and NDS7 can read the current NDS9 setting.
Bit 14=0 is intended for GBA mode, however, writes to this bit appear to be ignored?
When the NDS first loads your custom code, WAIT_CR_REG contains 1110100010000000 (0xE880). I checked this by reading WAIT_CR_REG on program start.
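The bit fields in the table above can also be composed with named masks instead of raw shifts. What follows is only a minimal sketch, not the code used in this project: the macro and function names are hypothetical, and the value is computed by a plain function so it can be checked off-target. It also clears bit 7 explicitly, on the assumption (taken from the table) that 0 selects the ARM9, since an OR alone cannot clear a bit that is already set in the start-up value 0xE880.

```c
#include <stdint.h>

/* Field masks taken from the EXMEMCNT bit table (names are hypothetical). */
#define EXMEM_SRAM_WAIT_MASK (3u << 0)  /* bits 0-1: SRAM access time */
#define EXMEM_ROM1_WAIT_MASK (3u << 2)  /* bits 2-3: ROM 1st access time */
#define EXMEM_ROM2_WAIT_MASK (1u << 4)  /* bit 4: ROM 2nd access time */
#define EXMEM_PHI_MASK       (3u << 5)  /* bits 5-6: PHI pin output */
#define EXMEM_SLOT2_ARM7     (1u << 7)  /* bit 7: 0 = ARM9 owns Slot-2, 1 = ARM7 */

/* Return the register value for the slowest Slot-2 timing with ARM9 access.
   Bits 8-15 of the current value are passed through unchanged. */
static uint16_t exmemcnt_slowest_arm9(uint16_t current)
{
    uint16_t v = current;

    /* Clear every Slot-2 field first so stale bits cannot survive. */
    v &= (uint16_t)~(EXMEM_SRAM_WAIT_MASK | EXMEM_ROM1_WAIT_MASK |
                     EXMEM_ROM2_WAIT_MASK | EXMEM_PHI_MASK |
                     EXMEM_SLOT2_ARM7);

    v |= (3u << 0);  /* SRAM access: 18 cycles */
    v |= (3u << 2);  /* ROM 1st access: 18 cycles */
    v |= (0u << 4);  /* ROM 2nd access: 6 cycles */
    v |= (1u << 5);  /* PHI pin: 4.19MHz */
    return v;
}
```

On the DS this would be used as WAIT_CR_REG = exmemcnt_slowest_arm9(WAIT_CR_REG);. Starting from 0xE880 it yields 0xE82F, whereas the OR-only statement yields 0xE8AF, which still has bit 7 set.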
The memory map of the NDS which pertains to the Slot-2 GBA DMA is as follows:
External Memory (Game Pak)
  08000000-09FFFFFF   Game Pak ROM/FlashROM (max 32MB) - Wait State 0
  0A000000-0BFFFFFF   Game Pak ROM/FlashROM (max 32MB) - Wait State 1
  0C000000-0DFFFFFF   Game Pak ROM/FlashROM (max 32MB) - Wait State 2
  0E000000-0E00FFFF   Game Pak SRAM (max 64 KBytes) - 8bit Bus width
The SLOT-2 \ GBA DMA address bus is 24 bits wide and the data bus is 16 bits wide. The starting address of each of the relative \ virtual address blocks above is mapped to address zero of the cartridge DMA. So when you write, for example, to address 0x08000000 on the NDS, the resulting address on the cartridge DMA is 0x000000; a simple way to look at it is to drop the most significant byte (0x08 here). Another important thing to keep in mind is that each read\write transfers one word (16 bits) of data. The address you use in the NDS code counts bytes, but the address that is transmitted on the bus and seen in logic analyzer captures counts 16-bit words. To get the bus address (the value this post labels the byte-addressable SLOT-2 GBA DMA address), first drop the most significant byte as discussed above and then divide the value by two, discarding any remainder. For example, 0x80000AC becomes 0x0000AC, and 0xAC / 2 = 0x56.
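The address conversion described above can be written as a pair of small helpers. This is a sketch under my own naming (SLOT2_ROM_BASE, slot2_bus_address and slot2_nds_address are hypothetical, not part of any NDS library), and it only covers the wait state 0 block starting at 0x08000000. It subtracts the block base rather than masking off the top byte, which gives the same result for the examples in this post.

```c
#include <stdint.h>

#define SLOT2_ROM_BASE 0x08000000u  /* start of Game Pak ROM, wait state 0 */

/* Convert an NDS memory-map address in the 0x08000000 block to the
   address that appears on the Slot-2 bus (one count per 16-bit word). */
static uint32_t slot2_bus_address(uint32_t nds_address)
{
    uint32_t offset = nds_address - SLOT2_ROM_BASE;  /* drop the block base */
    return offset >> 1;                              /* bytes -> 16-bit words */
}

/* Inverse mapping: bus address back to the (always even) NDS address. */
static uint32_t slot2_nds_address(uint32_t bus_address)
{
    return SLOT2_ROM_BASE + (bus_address << 1);
}
```

For instance, slot2_bus_address(0x080000AC) gives 0x56 and slot2_bus_address(0x0800FFFF) gives 0x7FFF, matching the addresses seen in the logic analyzer captures. Going the other way, 0x7FFF maps back to 0x0800FFFE, because the odd low bit is lost in the division.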
To write 0xF0F0 to 0x80000AC (NDS memory map) / 0x000056 (byte-addressable SLOT-2 GBA DMA address)
*(vu16 *)0x80000AC = 0xF0F0;

OR

vu16 DummyVar16b; //Declare variable
...
DummyVar16b = 0xF0F0; //Whatever value you want to write
*(vu16 *)0x80000AC = DummyVar16b;
To read from 0x800FFFF (NDS memory map address) / 0x007FFF (byte-addressable SLOT-2 GBA DMA address)
vu16 DummyVar16b; //Declare variable
DummyVar16b = *(vu16 *)0x800FFFF; //Read 16 bits into the variable
Nintendo DS Slot-2 GBA DMA Hardware Interface, Reverse Engineering:
I first started by using a logic analyzer to sniff the traffic between the Nintendo DS and a GBA cartridge while writing and reading data. I used the code described in the section above to write and read specific addresses on the GBA DMA bus and monitored the resulting data using a low cost Open Bench Logic Sniffer (OBLS). A note on the data captures shown in the sections below: the OBLS is very susceptible to noise, and if an NDS GBA DMA data line is not being pulled high\low strongly, or if the NDS leaves a line floating (HI-Z), the OBLS will pick up switching noise on these floating data lines. You will also see some switching noise on the clock line, which is inactive and should be ignored.
Writing to Slot-2 (GBA cartridge):
When writing to the SLOT-2 / GBA slot, the NDS DMA first sets up the address to be written to on the A0-A23 lines, then pulls CS_ROM_ low (the trailing “_” indicates active low), which latches the address in the cartridge. Next the data to be written is put onto the bus on the AD0-AD15 data lines and the WR_ line is pulled low. The data is latched into the cartridge on the falling edge of WR_. The CS_ROM_ and WR_ lines are then released simultaneously, allowing setup for another read or write operation.
Reading from Slot-2 (GBA cartridge):
When reading from the SLOT-2 / GBA slot, the NDS DMA first sets up the address to be read on the A0-A23 lines, then pulls CS_ROM_ low, which latches the address in the cartridge. This is the same as in a write sequence. The NDS then releases control of the data lines on the bus and pulls RD_ low, and the cartridge responds by loading the data at the selected address onto the data lines. The NDS reads the data off the data lines on the rising edge of the RD_ line.
NDS reading 16 bits from 0x800FFFF (NDS memory map address) / 0x007FFF (byte-addressable SLOT-2 GBA DMA address). A ROM cartridge is installed and a feed-through adapter is used to capture traffic between the NDS and the cartridge.
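To make the write and read sequences above easier to follow, here is a small behavioural model of the cartridge side in C. It is only a sketch of the sequence as I described it, not the CPLD logic: all type and function names are hypothetical, the backing store is a tiny array, and address and data are kept as separate fields for simplicity.

```c
#include <stdint.h>
#include <stdbool.h>

#define CART_WORDS 256u  /* tiny backing store for the sketch (assumption) */

typedef struct {
    bool cs_n, wr_n, rd_n;  /* active-low strobes, true = high (idle) */
    uint32_t addr;          /* value on the address lines */
    uint16_t data;          /* value on the data lines */
} Slot2Pins;

typedef struct {
    Slot2Pins prev;         /* pin state seen on the previous step */
    uint32_t latched_addr;  /* address captured when CS_ROM_ fell */
    uint16_t mem[CART_WORDS];
} CartModel;

static void cart_init(CartModel *c)
{
    c->prev.cs_n = c->prev.wr_n = c->prev.rd_n = true;
    c->prev.addr = 0;
    c->prev.data = 0;
    c->latched_addr = 0;
    for (unsigned i = 0; i < CART_WORDS; i++)
        c->mem[i] = 0;
}

/* Advance the model by one observation of the pins. */
static void cart_step(CartModel *c, Slot2Pins *pins)
{
    if (c->prev.cs_n && !pins->cs_n)                 /* falling CS_ROM_ */
        c->latched_addr = pins->addr;
    if (!pins->cs_n && c->prev.wr_n && !pins->wr_n)  /* falling WR_ */
        c->mem[c->latched_addr % CART_WORDS] = pins->data;
    if (!pins->cs_n && !pins->rd_n)                  /* RD_ low: drive data */
        pins->data = c->mem[c->latched_addr % CART_WORDS];
    c->prev = *pins;
}

/* NDS-side write sequence: address, CS_ROM_ low, data + WR_ low, release. */
static void bus_write(CartModel *c, uint32_t addr, uint16_t value)
{
    Slot2Pins p = { true, true, true, addr, 0 };
    cart_step(c, &p);
    p.cs_n = false;
    cart_step(c, &p);
    p.data = value;
    p.wr_n = false;
    cart_step(c, &p);
    p.cs_n = true;           /* CS_ROM_ and WR_ are released together */
    p.wr_n = true;
    cart_step(c, &p);
}

/* NDS-side read sequence: address, CS_ROM_ low, RD_ low, sample, release. */
static uint16_t bus_read(CartModel *c, uint32_t addr)
{
    Slot2Pins p = { true, true, true, addr, 0 };
    cart_step(c, &p);
    p.cs_n = false;
    cart_step(c, &p);
    p.rd_n = false;
    cart_step(c, &p);        /* cartridge now drives the data lines */
    uint16_t sampled = p.data;  /* value present when RD_ rises */
    p.rd_n = true;
    p.cs_n = true;
    cart_step(c, &p);
    return sampled;
}
```

After cart_init, a bus_write to bus address 0x56 followed by a bus_read from the same address returns the written value, which is the behaviour any replacement for the cartridge (such as a CPLD) has to reproduce.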
Nintendo DS Slot-2 GBA DMA Hardware Interface:
Even with the NDS on its slowest DMA setting, the interface is still too fast for a uC such as an Atmel AVR running at 20 MHz to grab data off the bus on an interrupt. So I decided to use a CPLD to implement an interface between the NDS and the FTDI parallel first in first out (PFIFO) interface that is used on the FTDI Vinculum USB host controller. The FTDI Vinculum supports UART, non-standard SPI, and PFIFO interfaces; I chose the PFIFO because it is the easiest to develop logic for when converting from NDS DMA. My first logic design turned out to be overly complicated for what was needed. In the first design the logic let the NDS write to the FTDI PFIFO, go off and do other tasks, and just check a bit to see if the write was finished. The CPLD handled all the operations involved in checking whether the FTDI was ready for a write, writing, maintaining proper timing, and keeping the write \ read ready flag bits. This turned out to be unnecessary because the FTDI Vinculum is just far too slow, and it made writing large amounts of data overly complicated. So for the second design I implemented a much simpler logic design that basically turns the GBA Slot-2 into a specialized GPIO. The NDS code is in charge of checking the FTDI PFIFO before each read and write and of adding the delays necessary to stay within the FTDI interface minimum timing constraints.
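The "check before each read and write" approach of the second design can be sketched in C as below. This post does not give the CPLD register layout, so everything here is hypothetical: the status bit positions, the PfifoPort callback structure, and the function names are assumptions for illustration only. The FakeFifo loopback exists purely so the polling logic can be exercised without hardware.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical status bits as the CPLD might present them to the NDS. */
#define PFIFO_STATUS_RXF_N (1u << 0)  /* 0 = a byte is waiting to be read */
#define PFIFO_STATUS_TXE_N (1u << 1)  /* 0 = the FIFO can accept a byte */

typedef struct {
    uint16_t (*read_status)(void *ctx);
    uint8_t  (*read_data)(void *ctx);
    void     (*write_data)(void *ctx, uint8_t byte);
    void     (*delay)(void *ctx);     /* spacing between bus accesses */
    void *ctx;
} PfifoPort;

/* Poll the TXE flag, then write one byte. Returns false on timeout. */
static bool pfifo_write_byte(const PfifoPort *p, uint8_t byte, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if ((p->read_status(p->ctx) & PFIFO_STATUS_TXE_N) == 0) {
            p->write_data(p->ctx, byte);
            p->delay(p->ctx);
            return true;
        }
        p->delay(p->ctx);
    }
    return false;
}

/* Poll the RXF flag, then read one byte. Returns false on timeout. */
static bool pfifo_read_byte(const PfifoPort *p, uint8_t *out, unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        if ((p->read_status(p->ctx) & PFIFO_STATUS_RXF_N) == 0) {
            *out = p->read_data(p->ctx);
            p->delay(p->ctx);
            return true;
        }
        p->delay(p->ctx);
    }
    return false;
}

/* Loopback stand-in for the hardware, used only to exercise the code. */
typedef struct {
    uint8_t buf[4];
    unsigned count;
    unsigned head;
} FakeFifo;

static uint16_t fake_status(void *ctx)
{
    FakeFifo *f = ctx;
    uint16_t s = 0;
    if (f->count == 0)
        s |= PFIFO_STATUS_RXF_N;      /* nothing to read */
    if (f->count == sizeof f->buf)
        s |= PFIFO_STATUS_TXE_N;      /* no room to write */
    return s;
}

static uint8_t fake_read(void *ctx)
{
    FakeFifo *f = ctx;
    uint8_t b = f->buf[f->head];
    f->head = (f->head + 1) % sizeof f->buf;
    f->count--;
    return b;
}

static void fake_write(void *ctx, uint8_t byte)
{
    FakeFifo *f = ctx;
    f->buf[(f->head + f->count) % sizeof f->buf] = byte;
    f->count++;
}

static void fake_delay(void *ctx) { (void)ctx; }

static PfifoPort fake_port(FakeFifo *f)
{
    PfifoPort p = { fake_status, fake_read, fake_write, fake_delay, f };
    return p;
}
```

On the DS, the callbacks would wrap volatile 16-bit accesses to whichever Slot-2 addresses the CPLD decodes, and the delay callback would be sized to the timing figures in the datasheet of the FTDI part in use.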
During prototyping I used a Xilinx Coolrunner-II CPLD development board. It is important to note that the dev board I used has a lot of parts that are not needed, so a PCB cartridge design with a CPLD would be much smaller and would use a CPLD with a lower pin count. Also, if a PCB is ever made from this design, it would be lower cost and lower part count to use a VQFP-44 pin XC9572XL, which has a 3.3V core and IO; the Coolrunner-II uses a 1.8V core, which would require a 3.3V -> 1.8V supply.
I tested and debugged the CPLD interface with an FTDI FT2232H (seen in the picture above) instead of an FTDI Vinculum; both use the same parallel FIFO interface. Using the FT2232H just lets the DS communicate with a computer terminal instead of the USB host controller for easy debugging. To allow the DS to talk to USB slaves, like a camera, I just disconnect the FT2232H eval board and connect the FTDI Vinculum USB host controller eval board.
The FTDI parallel FIFO interface used on the FTDI Vinculum USB host controller is an 8-bit asynchronous interface with 4 control lines, the same interface used on the FTDI FT245 & FT2232H. I will not go into much detail about this interface because it is described in the FTDI Vinculum datasheet (page 20) and FT2232H datasheet (page 13 & 26). It is important to note that the PFIFO interface has different timing constraints on each FTDI device.
Attached to this post is the Xilinx logic; it was created with schematic entry instead of VHDL. This is not good practice and it really should be re-done in VHDL or Verilog. Though schematic entry is by no means ideal, it does work and meets the needed timing constraints. If I end up using this interface for a future FTDI Vinculum-II design, the logic will be re-done in VHDL.
Attached File: NDS SLOT-2 to FTDI PFIFO interface (NDS_SLOT-2_interface.zip)
The attached design files contain both the complex and the simple design; the simple design is what I ended up using. Xilinx ISE Design Suite is required to open these files.
For the continuation of this project please check out my next blog post which includes the source code, schematic, and video for a basic PTP/MTP USB controller that runs on an AVR & NDS.