Write-up on Tech Geek History: Sega Genesis (Mega Drive)


                                                Literature Review

Introduction

The Genesis contains five chips that together carry out all of its operations. The main processor is the Motorola 68000 (68k), a 16-bit processor running at 7.67 MHz. Its coprocessor is the Zilog Z80, an 8-bit processor running at 3.58 MHz. Video output is handled by a Yamaha YM7101 video processor. There are two sound chips, the Yamaha YM2612 and the Texas Instruments SN76489. The Genesis operates by interacting with the cartridge, which is simply a ROM holding at most 4 MB of data. The processor steps through the cartridge’s instructions like any other CPU and, based on them, transfers data to or writes registers in the other modules. Input from the controllers lets the user steer the game’s execution flow.

COMPONENT DESCRIPTION

Motorola 68000: This is the main CPU of the system. Its primary task is to read and execute instructions from the game ROM, then forward data or write to registers in the Z80, video processor, or sound chips. The 68k has a memory-mapped address space with access to the game ROM, a 64 KB work RAM, the video processor, I/O pins, and the entire Z80 address space. It basically functions as the task manager of the system: it executes game code, communicates with the peripheral I/O devices, and makes sure that every other processor has the information it needs to operate. The 68k is a CISC processor designed by Motorola (now Freescale) and introduced in 1979. It has a 24-bit external address bus, eight 32-bit general purpose registers, and 56 instructions with a minimum size of 16 bits. The processor uses a read/write (R/W) line, upper and lower data strobes to specify valid bytes on its data lines, and a data acknowledge line to handshake with any peripheral module. There are also 3 bits of interrupt control, allowing for 7 levels of external interrupts (the Sega Genesis uses 3 of these).

Zilog Z80: The Z80 is the coprocessor of the system. This chip mainly deals with controlling the two sound chips. In most games, the Z80 loops through a small portion of driver code stored in the 8 KB sound RAM and feeds data to the sound chips to play music. It has its own memory-mapped address space that can access either sound chip, the sound RAM, or the entire 68k address space through a bank switching mechanism. Since all elements of the system are shared, either processor needs to request the other processor’s bus in order to access modules that are not directly addressed in its own address space. For example, the Z80 can access game ROM only by requesting the bus and using the appropriate bank to access the ROM portion of the 68k address space. The Z80 is an 8-bit processor designed by Zilog and sold from 1976 onwards. It dominated the microcomputer market from the mid 70’s to the mid 80’s. Its instruction set and registers are very similar to the ones found in x86. It uses a dual register set and has 252 different opcodes and 4 opcode prefixes.

Yamaha YM7101: This is the video processor (VDP) responsible for translating game data into the images seen on the screen. The VDP is capable of rendering two background layers and one sprite layer per frame, with adjustable priorities between layers. Up to 32 simultaneous colors are available using color palettes stored in a 64×9-bit color RAM. This includes 16 colors for sprites and background and 16 for background only. The VDP supports resolutions of 256×192 and 256×224 pixels, 8×8 or 8×16 characters and sprites, as well as horizontal, vertical, and partial screen scrolling. A 64 KB video RAM (VRAM) stores pattern data, sprite tables, and other assorted information needed to render the scene. The VDP also has a 40×10-bit vertical scroll RAM and 23 8-bit registers. Finally, a Direct Memory Access (DMA) engine is used to quickly transfer data from either the game ROM or work RAM to the VDP.

Yamaha YM2612: This is a 6-channel FM synthesizer chip used to produce the main background music and sound effects in games. Each channel uses four “operators” combined in different ways to produce notes in different instrument voices. An adjustable attack-decay-sustain-release envelope controls the attack and duration of notes. There are also two interval timers, a low frequency oscillator, and an undocumented SSG-EG mode. An optional 8-bit PCM stream can replace channel 6 to play raw audio data directly from the game ROM. The YM2612 is completely controlled by writing to 213 different 8-bit registers. The chip performs internal calculations using sine and power lookup tables to generate frequency modulated notes. The output is 14-bit signed PCM data generated at 53 kHz.

Texas Instruments SN76489: This is a Programmable Sound Generator (PSG) capable of producing 3 channels of square wave tones and 1 channel of noise. Like the YM2612, the PSG is controlled by writing to different registers to set the frequency and attenuation of each channel. The noise channel is capable of producing either white or periodic noise. The output of this chip is an 11-bit signed value which is summed with the output from the YM2612 to produce the final audio data. The PSG was typically used to produce simple notes, sound effects, and noise which the YM2612 may not be capable of synthesizing.
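To make the PSG's "frequency and attenuation" registers concrete, here is a minimal sketch of how register values map to sound. It assumes the commonly documented SN76489 behaviour, none of which is stated in the text above: a 10-bit tone divider, an input clock of about 3.58 MHz divided by 32, and a 4-bit attenuation in 2 dB steps where 15 means silence. The function names are made up for illustration.

```python
PSG_CLOCK_HZ = 3_579_545  # assumption: NTSC PSG input clock, same as the Z80 clock


def psg_tone_hz(divider: int, clock_hz: int = PSG_CLOCK_HZ) -> float:
    """Square-wave frequency for a 10-bit tone divider (1..1023)."""
    if not 1 <= divider <= 0x3FF:
        raise ValueError("divider must be non-zero and fit in 10 bits")
    # Assumed formula: the chip divides its clock by 32, then by the divider.
    return clock_hz / (32 * divider)


def psg_gain(attenuation: int) -> float:
    """Linear amplitude for a 4-bit attenuation value (2 dB per step)."""
    if not 0 <= attenuation <= 15:
        raise ValueError("attenuation must fit in 4 bits")
    if attenuation == 15:
        return 0.0  # assumed: the maximum attenuation value mutes the channel
    return 10 ** (-2 * attenuation / 20)
```

Under these assumptions a divider of 254 lands close to concert A (about 440 Hz), and an attenuation of 10 leaves one tenth of the full amplitude.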

F12_Genesis.pdf

Chapter 1.1  Hardware

The SEGA Mega Drive Address Checker was a development tool used to check the integrity of software for the system. It warned the user of any writes to invalid memory addresses, which made it a useful debugging aid and also let SEGA verify that cartridges were of sufficient quality to manufacture.

The Sega Dev Card by Western Technologies was a development kit that functioned in a similar way to modern-day flash cartridges. It connected to the developer’s PC (running MS-DOS) via a port on the back of the cartridge.

Western Technologies created the 2MB RAM cartridge that was then distributed by SEGA to developers.

This development cartridge has a port for connecting to a development PC, which loads data into the RAM chips. The two 8 KB EPROMs on the board are apparently for a bootloader program.

The Sega Virtua Processor (SVP) was an additional processor for handling 3D geometry, contained inside the Mega Drive cartridge itself. You can think of it as SEGA’s answer to the Super FX chip.

In order to develop games that used this new processor, development hardware had to be created with the processor on board. The SVP Dev board was just that: it carried the SVP along with slots for EPROM chips holding the custom game code.

The bus system

When the Z80 accesses ROM, the bus arbiter needs to pause the 68k, let the Z80 finish its request, then unpause the 68k again. Since all those components run asynchronously, timing is barely predictable; also, if the 68k is just about to access the bus itself, it finishes its own access cycle first before releasing the bus. This typically means that the Z80’s 68k bus accesses will be delayed by 2 to 5 cycles. IIRC, reads and writes behave exactly the same way. The average Z80 delay is around 3.3 Z80 cycles, whereas the average 68k delay is around 11 68k cycles.

If the Z80 tries to access the bus while the VDP is doing a DMA from RAM/ROM, or while the 68k is halted because it tried to read from the VDP with its FIFO empty or write to the VDP with its FIFO full, the Z80 has to wait for that to complete first.

IIRC, if the Z80 tries to access ROM while the VDP is doing a DMA from 68k RAM, this can lead to corruption of RAM contents due to glitchy signals on the address bus (similar to the C64’s VSP bug).
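The Z80-side ROM accesses discussed above go through the bank switching mechanism mentioned in the component description. The following sketch shows the address translation under assumptions that are not stated in this write-up: that the Z80 sees a 32 KB window at $8000..$FFFF and that a 9-bit bank number selects which 32 KB slice of the 68k address space appears there. The function names are hypothetical.

```python
BANK_WINDOW_START = 0x8000  # assumption: first Z80 address of the banked window
BANK_BITS = 9               # assumption: width of the bank number


def bank_for(m68k_addr: int) -> int:
    """Bank number that makes a given 68k address visible to the Z80."""
    return (m68k_addr >> 15) & ((1 << BANK_BITS) - 1)


def z80_to_68k_address(z80_addr: int, bank: int) -> int:
    """Translate a Z80 address inside the window to a 24-bit 68k address."""
    if not BANK_WINDOW_START <= z80_addr <= 0xFFFF:
        raise ValueError("address is outside the banked window")
    if not 0 <= bank < (1 << BANK_BITS):
        raise ValueError("bank number does not fit in 9 bits")
    # The low 15 bits come from the Z80 address, the upper 9 from the bank.
    return (bank << 15) | (z80_addr & 0x7FFF)
```

For example, to read the last byte of a 4 MB cartridge ($3FFFFF), a sound driver would select the bank returned by `bank_for(0x3FFFFF)` and then read Z80 address $FFFF.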

The 68k is a bit slower than expected because something on the bus steals 2 out of every 128 cycles. This is visible when measuring the bus: every time 128 cycles have passed, the next bus access is slowed down by 2 cycles. The slowdown even affects the Z80 if it happens to hit its ROM access. Interestingly, if this hits when the 68k is writing to the VDP (no matter which port), the slowdown doesn’t happen. Tampering with internal registers didn’t seem to affect this at all. And this doesn’t seem to be the RAM refresh logic, since RAM accesses get their own further slowdowns. (I did not try to investigate this further, I just noticed that code accessing RAM instead of ROM is slightly slower.) Maybe it’s a leftover from development kits that are based on DRAM cartridges, which need to be refreshed as well. After all these slowdowns the 68k seems to be running at roughly 480 cycles per raster line instead of the theoretical 488+4/7 ones.

I did not see hints of those slowdowns during DMAs; probably the VDP accounts for that itself by not fetching fresh data during DMA for 2 consecutive cycles whenever it’s spending 1 internal cycle on VRAM refresh. For fast DMA this means that 10 slots are lost every raster line (these are limited by how fast the VDP can fetch fresh data), while for slow DMA (to VRAM in 64k mode) only 5 slots are lost (limited by how fast the VDP can write to VRAM).
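The numbers in that paragraph can be checked with a bit of arithmetic. The sketch below assumes 3420 master clocks per raster line and a 68k clock of master/7 (which gives exactly the quoted 488+4/7), and it reads "2 out of every 128" literally as 126 useful cycles per 128. If the 2 cycles are instead added on top of each 128, the result is about 481 rather than 480.9, so either reading is compatible with "roughly 480".

```python
from fractions import Fraction

MASTER_CLOCKS_PER_LINE = 3420  # assumption: master clocks per raster line
M68K_DIVIDER = 7               # assumption: 68k clock = master clock / 7

# Theoretical 68k cycles per raster line: 3420 / 7 = 488 + 4/7.
nominal_cycles_per_line = Fraction(MASTER_CLOCKS_PER_LINE, M68K_DIVIDER)


def effective_cycles_per_line(stolen: int = 2, period: int = 128) -> Fraction:
    """68k cycles left per line if `stolen` of every `period` cycles are lost."""
    return nominal_cycles_per_line * Fraction(period - stolen, period)


if __name__ == "__main__":
    print(float(nominal_cycles_per_line))      # theoretical value
    print(float(effective_cycles_per_line()))  # after the 2-in-128 steal
```

The remaining gap down to 480 would then be the extra RAM-access slowdowns the paragraph mentions.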

The YM2612

YM2612 registers $30..$FF are special as they only exist in a shift register that rotates all the time, which means that the YM2612 must wait until it has reached the desired register before it can overwrite it. That’s what limits the maximum write speed to ½ the duration of a YM2612 cycle, or 33.6 Z80 cycles (probably slightly more due to latching overhead). I.e. after writing a register value one needs to wait that long before writing the next register’s address.
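The 33.6 figure can be reproduced from clock ratios. This sketch assumes values not given in the text: an NTSC master clock of 53.693175 MHz, the YM2612 clocked at master/7, the Z80 at master/15, and one YM2612 sample cycle lasting 144 YM2612 clocks (which is consistent with the roughly 53 kHz output rate mentioned elsewhere in this write-up).

```python
from fractions import Fraction

MASTER_CLOCK_HZ = 53_693_175  # assumption: NTSC master clock
YM_DIVIDER = 7                # assumption: YM2612 clock = master / 7
Z80_DIVIDER = 15              # assumption: Z80 clock = master / 15
YM_CLOCKS_PER_SAMPLE = 144    # assumption: YM2612 clocks per sample cycle

# Output sample rate of the YM2612 in Hz.
ym_sample_rate_hz = Fraction(MASTER_CLOCK_HZ, YM_DIVIDER * YM_CLOCKS_PER_SAMPLE)

# Length of one YM2612 sample cycle, measured in Z80 cycles.
z80_cycles_per_ym_cycle = Fraction(YM_CLOCKS_PER_SAMPLE * YM_DIVIDER, Z80_DIVIDER)

# Half a YM2612 cycle: the minimum gap between register writes quoted above.
min_write_gap_z80_cycles = z80_cycles_per_ym_cycle / 2

if __name__ == "__main__":
    print(float(ym_sample_rate_hz))
    print(float(min_write_gap_z80_cycles))
```

In a sound driver this translates to padding each register write with enough instructions to cover that gap, plus a small margin for the latching overhead.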

The YM2612’s busy flag is useless as already documented elsewhere.

It’s not possible to reliably write both register number and data with a single Z80 word write; that seems to be too fast for the YM2612.

The YM2612 samples the PCM register once every cycle. The point in time when that happens differs between the MD1s and MD2s I tested (and is different again in emulators, but unfortunately I can’t give any precise values here, I’d have to dig for my test code). It also seemed to be more stable on MD1, while on MD2 it varied a bit or was sampled for longer than just an instant.

I tried to account for both in my music routine to keep jitter down to an inaudible or at least barely audible level. And beware of the music routine in general: I developed it from scratch to support 26 kHz sample playback with nearly no jitter, and for that used every available trick. It also reads the VDP’s V counter, so make sure your emulator supports this if there are music issues.

The DAC works a bit differently in MD1 and MD2. On MD1 it just outputs a short pulse for each voice, one voice after another in a cycle; on MD2 it holds the value for quite a while (but still just a fraction of that voice’s slot, which is 1/6th of the YM2612 cycle (about 53 kHz)). Maybe the YM2612 allows for replacing the DAC value for as long as the DAC is being held, which would explain the behaviour seen on MD2s.

The VDP ports

The VDP’s command port is directly connected to its internal address buffer without any latching (or rather, the respective half thereof; after writing the 1st half of an address the next write will go to the 2nd half unless you e.g. write to the data port first). However depending on the value written certain actions can be triggered:

  • If the topmost 2 bits of the 1st half are 10 then this is a register write: the command port is not directed to the 2nd half, instead a register write is initiated (and executed after the next internal VDP clock = once every 2 pixels). So this way the command buffer serves as a latch.
  • If bit 7 of the 2nd half is set this initiates a DMA.
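The two trigger conditions above can be sketched as simple bit tests on the 16-bit words written to the command port. The layout of a register write (register number in bits 12..8, value in bits 7..0) is an assumption taken from the commonly documented VDP register-set format, and the function names are made up for illustration.

```python
def is_register_write(first_half: int) -> bool:
    """True if the topmost 2 bits of a command-port word are binary 10."""
    return ((first_half & 0xFFFF) >> 14) == 0b10


def decode_register_write(word: int) -> tuple[int, int]:
    """Split a register-write word into (register number, value)."""
    if not is_register_write(word):
        raise ValueError("not a register write")
    register = (word >> 8) & 0x1F  # assumed: 5-bit register number in bits 12..8
    value = word & 0xFF            # assumed: 8-bit value in the low byte
    return register, value


def second_half_starts_dma(second_half: int) -> bool:
    """True if bit 7 of the 2nd command word is set, which initiates a DMA."""
    return bool(second_half & 0x80)
```

With this layout, the word $8F02 would decode as "write the value 2 to register $F", i.e. set the address auto-increment to 2.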

The address buffer is incremented after each VRAM/VSRAM/CRAM read or write (by adding the value from register $F). When a value is written to the VDP’s data port (or the VDP does that internally through DMA), both the value and the current address are appended to its internal FIFO, so the address is incremented immediately even if the VDP could not actually write the value to its memory yet. And if the FIFO was full then the CPU is blocked for a bit first. In either case this means that the CPU cannot access the control port while the VDP is incrementing the address internally, so it cannot cause any trouble… except for 2 corner cases:
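As a rough illustration of the data-port behaviour just described, here is a toy model in which every write queues an (address, value) pair and bumps the address right away. It is only a sketch: the FIFO depth of 4 is an assumption not stated in the text, the class and method names are hypothetical, and a full FIFO is reported by a return value instead of stalling a CPU.

```python
from collections import deque


class VdpDataPortModel:
    """Toy model of the VDP data port: FIFO plus auto-incrementing address."""

    FIFO_DEPTH = 4  # assumption: number of pending writes the FIFO can hold

    def __init__(self, address: int = 0, auto_increment: int = 2) -> None:
        self.address = address & 0xFFFF
        self.auto_increment = auto_increment & 0xFF  # value of register $F
        self.fifo: deque = deque()

    def write_data(self, value: int) -> bool:
        """Queue a write; returns False when the real CPU would be blocked."""
        if len(self.fifo) >= self.FIFO_DEPTH:
            return False
        # Value and current address enter the FIFO together...
        self.fifo.append((self.address, value & 0xFFFF))
        # ...and the address advances immediately, before memory is updated.
        self.address = (self.address + self.auto_increment) & 0xFFFF
        return True

    def drain_one(self, memory: dict) -> None:
        """Commit the oldest pending write to `memory` (an access slot)."""
        if self.fifo:
            address, value = self.fifo.popleft()
            memory[address] = value
```

The point of the model is the ordering: the address register has already moved on while the corresponding value may still be waiting in the FIFO.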

DMA fill/copy: these run internally in the VDP and the CPU is not blocked from accessing the control port. Sega already warned about leaving the VDP alone while such a DMA is ongoing.

Setting up the VDP for read: after the read command is written, the VDP immediately fetches the desired value from its internal memory into a buffer (same FIFO? or separate?) for the CPU to read; once the CPU has read that value, the VDP reads the next one, and after each read it increments the address register. But especially when reading from VRAM (2 reads needed instead of 1?) and during active scan (the VDP needs to wait until the next access slot), if the 68k writes to the command port immediately after reading a value from VRAM through the data port, this can cause a conflict, because the VDP only increments the address after having fetched the next value. Funny effects occur if the 68k tried to write to a VDP register, typically leading to the register number and value being ANDed with the address after increment, or with the address before increment and then incremented, before being used as the new register number and value to write to. (Since a register write was already triggered by the 2 topmost bits, it doesn’t matter if ANDing alters these 2 bits; the register write will still occur.)

1.2 Memory Card

§ 7 DMA TRANSFER

DMA (Direct Memory Access) is a high speed technique for memory accesses to the VRAM, CRAM and VSRAM. During DMA, VRAM, CRAM and VSRAM accesses occur at the fastest possible rate (please see the section on access timing). There are three modes of DMA access, as can be seen below, all of which may be done to VRAM or CRAM or VSRAM. The 68K is stopped during memory to VRAM/CRAM/VSRAM DMA, but the Z80 continues to run as long as it does not attempt access to the 68K memory space. The DMA is quite fast during VBLANK, about double the tightest possible 68K loop’s speed, but during active scan the speed is the same as a 68K loop.

Please note that after this point, VRAM is used as a generic term for VRAM/CRAM/VSRAM.

DMD1  DMD0   DMA MODE             SIZE
0     SA23   A. MEMORY TO VRAM    WORD to BYTE(H)&(L)
1     0      B. VRAM FILL         BYTE to BYTE
1     1      C. VRAM COPY         BYTE to BYTE

DMD1, DMD0: REG #23
* DMD0 = SA23

Source addresses are $000000-$3FFFFF (ROM) and $FF0000-$FFFFFF (RAM) for memory to VRAM transfers. In the case of ROM to VRAM transfers, a hardware feature causes occasional failure of DMA unless the following two conditions are observed:

  • The destination address write (to address $C00004) must be a word write.
  • The final write must use the work RAM.

There are two ways to accomplish this: by copying the DMA program into RAM, or by doing a final “move.w ram address $C00004”.

MEMORY TO VRAM

The function transfers data from 68K memory to VRAM, CRAM or VSRAM. During this DMA all 68K processing stops. The source address is $000000-$3FFFFF for ROM or $FF0000-$FFFFFF for RAM. The DMA reads are word wide; writes are byte wide for VRAM and word wide for CRAM and VSRAM.
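The mode table and source address ranges from this excerpt can be expressed as two small helpers. This is only a sketch: the excerpt says DMD1 and DMD0 live in register #23 but not where, so placing them in bits 7 and 6 is an assumption, and the function names are hypothetical.

```python
def dma_mode(reg23: int) -> str:
    """Decode the DMA mode from the value of VDP register #23."""
    dmd1 = (reg23 >> 7) & 1  # assumed bit position of DMD1
    dmd0 = (reg23 >> 6) & 1  # assumed bit position of DMD0
    if dmd1 == 0:
        # In this mode DMD0 is not a mode bit: it carries source address bit SA23.
        return "memory to VRAM"
    return "VRAM copy" if dmd0 else "VRAM fill"


def is_valid_dma_source(address: int) -> bool:
    """True if a 68K address is a legal source for a memory to VRAM DMA."""
    in_rom = 0x000000 <= address <= 0x3FFFFF
    in_ram = 0xFF0000 <= address <= 0xFFFFFF
    return in_rom or in_ram
```

A transfer routine could call `is_valid_dma_source` on both the first and last source address before programming the VDP, since a block that starts in ROM must not run past $3FFFFF.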

1.3 Audio

The Yamaha 2612 Frequency Modulation (FM) sound synthesis IC resembles the Yamaha 2151 (used in Sega’s coin-op machines) and the chips used in Yamaha’s synthesizers. Its capabilities include:

  • 6 channels of FM sound
  • An 8-bit Digitized Audio channel (as replacement for one of the FM channels)
  • Stereo output capability
  • One LFO (low frequency oscillator) to distort the FM sounds
  • 2 timers, for use by software

To define these terms more carefully: an FM channel is capable of expressing, with a high degree of realism, a single note in almost any instrument’s voice. Chords are generally created by using multiple FM channels. The standard FM channels each have a single overall frequency and data for how to turn this frequency into the complex final wave form (the voice). This conversion process uses four dedicated channel components called ‘operators’, each possessing a frequency (a variant of the overall frequency), an envelope, and the capability to modulate its input using the frequency and envelope. The operator frequencies are offsets of integral multiples of the overall frequency.

There are two sets of three FM channels, named channels 1 to 3 and 4 to 6 respectively. Channels 3 and 6, the last in each set, have the capability to use a totally separate frequency for each operator rather than offsets of integral multiples. This works well (I believe) for percussion instruments, which have harmonics at odd multiples such as 1.4 or 1.7 of the fundamental.

The 8-bit Digitized Audio exists as a replacement of FM channel 6, meaning that turning on the DAC turns off FM channel 6. Unfortunately, all timing must be done by software, meaning that unless the software has been very cleverly constructed, it is impossible to use any of the FM channels at the same time as the DAC.
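The idea of an operator modulating its input can be sketched in a few lines. This is a conceptual illustration only, not a model of the chip: it uses two operators instead of four, floating-point sine instead of lookup tables, constant envelopes, and the textbook phase-modulation form of FM. All names and default values are made up for the example.

```python
import math


def operator(t: float, freq_hz: float, envelope: float, phase_mod: float = 0.0) -> float:
    """One operator: a sine at freq_hz, scaled by its envelope level.

    phase_mod is the output of another operator feeding this one.
    """
    return envelope * math.sin(2 * math.pi * freq_hz * t + phase_mod)


def two_operator_voice(t: float, base_hz: float, multiple: int = 2,
                       mod_level: float = 1.5, carrier_level: float = 1.0) -> float:
    """A modulator at an integral multiple of the base frequency drives a carrier."""
    modulator = operator(t, base_hz * multiple, mod_level)
    return operator(t, base_hz, carrier_level, phase_mod=modulator)


if __name__ == "__main__":
    sample_rate = 53_267  # roughly the 53 kHz output rate mentioned in this write-up
    samples = [two_operator_voice(n / sample_rate, 440.0) for n in range(8)]
    print(samples)
```

Changing `multiple` and `mod_level` changes the harmonic content of the note, which is the sense in which the operator settings define the instrument "voice"; with `mod_level` at zero the result is a plain sine wave.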

Genesis_Technical_Overview_v1.00_1991_Sega_US.pdf

1.4 Compiler

BasiEgaXorz is a BASIC compiler for the Sega Genesis consoles. That means that, by using this compiler, you can program in a form of the BASIC language to create awesome programs or games for your old Sega Genesis game console. The compiler can also produce CD ISOs for the Sega CD attachment and ROMs that use the features of the 32X extension, not to mention ROMs for the regular console without attachments. Today, when most programmers think of the BASIC language, they think about Visual Basic. The language BasiEgaXorz uses is not like Visual Basic, and it certainly wasn’t derived from it. This compiler is aimed at speed, so there are many things that cannot be dynamic within the environment; everything stays static (variables, for example: there is no such thing as REDIM). BasiEgaXorz is intended as a beginner’s platform, giving an opportunity to make fun and simple games easily on an awesome gaming console!

What are the main features?

BasiEgaXorz language features:

  • Support for both line numbered labels and just regular plain old labels
  • User defined subroutines and functions
  • Integer (16 bit), Long (32 bit), and String (limited 8 bit) data types
  • Multiplication, division, addition, subtraction, bit shift, modulo, compare, and logical operator support
  • A wide variety of string data type commands and functions
  • Argunerics
  • Do….Loop, For….Next, While….Wend looping
  • Data storage using the classical BASIC Data statement approach
  • Single dimension arrays for integer and long data types


BasiEgaXorz system implementation features:

  • Full text displaying and color features with the PRINT and INK commands
  • User input using the INPUT command
  • Background tile graphics (for both planes) and sprite graphics of the VDP supported
  • Joypad functions included
  • Palette changing and loading commands
  • Tile graphics changing and loading commands
  • Limited tile mapping supported
  • Background plane scrolling supported
  • PSG sound effects support


References:

F12_Genesis.pdf

GitHub – And-0/awesome-megadrive: A curated list of Sega Mega Drive development resources

Genesis_Technical_Overview_v1.00_1991_Sega_US.pdf
