DDR RAM
by Gene Cooperman

Basic Concepts

DDR RAM Operation

States and State Transitions of DDR RAM

DDR Commands and Registers

Timing Parameters

Timing Constraints

Pinout

Timing Example (Worst Case)

Basic Concepts

DDR RAM is Double Data Rate RAM. Although DDR RAM can be designed for various clock rates, we will concentrate on DDR-266 RAM. It operates with a 133 MHz clock, but it uses both the leading and trailing edge of the clock cycle. Hence, it produces data at an equivalent clock rate of 266 MHz, which is a double data rate. DDR RAM achieves its double data rate even though the internal RAM core operates at only the 133 MHz clock rate. Effectively, if a DDR RAM chip produces 8 data bits for every cycle of the 266 MHz clock, then it is internally producing 16 data bits for every cycle of the 133 MHz clock, but delivering on demand only 8 bits at a time to the I/O data pins.

DDR-266 RAM is sometimes also called PC-2100 RAM, since on a 64 bit or 8 byte system (memory) bus, which is typical for PCs, DDR-266 RAM has a bandwidth of 8 x 266 = 2128 MB/s, which is rounded down to 2100 MB/s. (Note the system bus is also called the FSB or Front Side Bus.)
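The arithmetic behind the DDR-266 and PC-2100 names can be checked in a few lines of Python. This is only a sketch: the function name is made up for illustration, and the clock is taken at its nominal 133 MHz.

```python
def peak_bandwidth_mb_per_s(clock_mhz, bus_bytes=8, transfers_per_clock=2):
    """Peak rate in MB/s: clock rate x transfers per clock x bus width in bytes."""
    return clock_mhz * transfers_per_clock * bus_bytes

effective_mhz = 133 * 2                  # data moves on both clock edges
ddr266 = peak_bandwidth_mb_per_s(133)    # 8-byte bus, as on a typical PC
print(effective_mhz, ddr266)
```

With an 8-byte bus this gives 2128 MB/s, the figure that is marketed as PC-2100.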

DDR RAM grew out of the original SDRAM (synchronous DRAM), later called PC-66 RAM. Synchronous, here, means clocked. Under the impetus of Intel, this was replaced by PC-100 and later PC-133 RAM (the same technology, but distinguished by higher clock rates and more detailed specifications for interoperation). As of this writing, DDR II RAM has been standardized. It is the third generation in this series: SDRAM/DDR RAM/DDR II RAM. DDR II RAM will have an effective clock rate of 400 MHz at introduction.

DDR RAM is organized in rows or memory pages. The memory pages are divided into four sections, called banks. Each bank has a kind of register associated with it. In order to address a row of DDR RAM (a memory page), one must specify on the pins both a memory bank and a row address. A memory bank can be active, in which case there is an open page associated with the register of the memory bank.

Note that the address lines on the address bus of the CPU will be "wired" to the row address, memory bank, column address and chip select. The address lines can be wired arbitrarily, so that a section of RAM associated with a memory bank may appear to the CPU either to be contiguous or interleaved with other memory banks. Since reading from the same memory bank can be faster, memory banks are generally wired to be contiguous, although it is possible to wire the addresses of different chips as interleaved with each other.

In this example, we assume a 256 Mb chip organized as 32 Meg x 8 (8 data pins), or 8 Meg x 8 data pins x 4 banks. A set of eight data bits is specified by using pins A0-A12 for the row address, pins BA0-BA1 for the bank, and pins A0-A9 for the column address. Hence, there are 8K rows, 4 banks, and 1K columns, or 8Kx4x1K=32 Meg of sets of 8 data bits. The bank and row address are specified in a first phase (with the ACTIVE command), and the bank and column address are specified in a second phase (with the READ or WRITE command).
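As a rough illustration of this addressing, the following Python sketch splits a chip-local index into bank, row, and column fields. The bit widths come from the pin counts above. The field order (bank in the high bits, so that each bank looks contiguous) and the function name are assumptions made for the example; as noted earlier, the address lines can be wired in other ways.

```python
ROW_BITS, BANK_BITS, COL_BITS = 13, 2, 10    # A0-A12, BA0-BA1, A0-A9

def split_address(index):
    """Split a chip-local index into (bank, row, column).

    Assumed layout, from high bits to low bits: bank | row | column.
    """
    if not 0 <= index < 1 << (ROW_BITS + BANK_BITS + COL_BITS):
        raise ValueError("index out of range for a 32 Meg x 8 chip")
    column = index & ((1 << COL_BITS) - 1)
    row = (index >> COL_BITS) & ((1 << ROW_BITS) - 1)
    bank = index >> (COL_BITS + ROW_BITS)
    return bank, row, column

# 8K rows x 4 banks x 1K columns = 32 Meg locations of 8 bits each
locations = (1 << ROW_BITS) * (1 << BANK_BITS) * (1 << COL_BITS)
```

Under this assumed layout, index 1024 is the first column of row 1 in bank 0, and the last index maps to bank 3, row 8191, column 1023.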

For detailed information, consult a vendor's datasheets, such as the Micron 256 Mb (Megabit) DDR RAM chip from their RAM web page.

DDR RAM Operation

DDR RAM executes commands, which are usually issued by the chipset. The details of its pinout, commands, etc., are similar, although not identical, to those of PC-133 RAM. In order to activate a bank with a row for reading or writing, the bank must first be charged. The charge allows the chip to sense a particular row, and amplify the signal from that row. A memory bank is usually precharged, rather than waiting for a read/write request and then charging. Precharging one memory bank can usually be overlapped with accessing a second memory bank.

Reads and writes occur in bursts. The burst lengths are 2, 4, or 8. Bursts of four are the most common, and will be the standard in DDR II. Note that the Intel Pentium II and Pentium III had cache blocks of length 32 bytes. Since the system (memory) bus is 8 bytes, a burst of 4 yields 4x8=32 bytes, exactly enough to fill a cache block.
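The relation between burst length, bus width, and cache block size can be written out as a short Python sketch. The helper names are invented for illustration; the 8-byte bus and the two transfers per clock cycle are the figures used in this document.

```python
BUS_BYTES = 8    # 64-bit system (memory) bus

def burst_bytes(burst_length, bus_bytes=BUS_BYTES):
    """Bytes delivered by one burst across the whole bus."""
    return burst_length * bus_bytes

def burst_clock_cycles(burst_length):
    """Clock cycles occupied by a burst: two transfers per clock cycle."""
    return burst_length / 2

for bl in (2, 4, 8):
    print(bl, burst_bytes(bl), burst_clock_cycles(bl))
```

A burst of four fills a 32-byte cache block and keeps the data bus busy for two clock cycles.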

We will assume that auto-precharge is not being used. This appears to be the most common mode of operation of current chipsets. The auto-precharge option for READ and WRITE allows for a fast, automatic PRECHARGE command after READ/WRITE. However, PRECHARGE will close the current page. So, auto-precharge should only be used if it is known that the next access to this memory bank will be to a different memory page. This is a prediction about the future. Most chipsets prefer to assume spatial locality, and so do not use auto-precharge.

We also neglect refresh cycles. Since DRAM cells are capacitors, they must be periodically refreshed. Many chipsets appear to periodically broadcast a REFRESH ALL command, although it is possible to individually refresh rows when their bank is not being used. A row must be refreshed periodically (often every 64 ms) or else its cells will lose their charge.
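To get a feel for how often refresh commands are needed, the sketch below divides the 64 ms period by the 8K rows of the example chip. It assumes, purely for illustration, that refreshes are spread evenly over the period, one row per refresh command.

```python
REFRESH_PERIOD_MS = 64      # each row must be refreshed within this period
ROWS = 8 * 1024             # 8K rows in the example 256 Mb chip

# Assumption: refresh commands are spaced evenly, one row per command.
interval_us = REFRESH_PERIOD_MS * 1000 / ROWS
print(interval_us)
```

Under that assumption, one refresh command is due roughly every 7.8 microseconds.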

States and State Transitions of DDR RAM

The following are the states of DDR RAM. They refer both to commands, and to timing parameters (t_XXX), which are described later under Timing Parameters. The states are based on Truth Table 3, pp. 39 and following of the Micron specs.

States (open for further commands):

Idle:: bank has been precharged and t_RP has been met
Row Active:: row in bank has been activated, and t_RCD has been met, no data accesses or bursts in progress
Read:: READ burst has been initiated with auto precharge disabled, and has not yet terminated
Write:: WRITE burst has been initiated with auto precharge disabled, and has not yet terminated

States (must not be interrupted by command in same bank):

Precharging:: prior PRECHARGE command, t_RP not yet met
Row Activating:: prior ACTIVE command, t_RCD not yet met
Read w/ Auto-Precharge Enabled:: prior READ command with auto precharge, t_RP not yet met
Write w/ Auto-Precharge Enabled:: prior WRITE command with auto precharge, t_RP not yet met

Refreshing:: prior AUTO REFRESH command, ends when t_RC met, leaves all banks in idle state
Accessing Mode Register:: prior LOAD MODE REGISTER command, ends when t_MRD met, leaves all banks in idle state
Precharging All:: prior PRECHARGE ALL command, ends when t_RP met, leaves all banks in idle state

For the states associated with commands, a bank of RAM will see the following state transitions.

---READ/WRITE--->Read/Write---READ/WRITE--->Read/Write ...
...--->Read/Write---PRECHARGE--->Idle---ACTIVE--->Row Active---READ/WRITE--->
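The transitions in this diagram can be sketched as a small table-driven state machine in Python. This is a simplified illustration only: it covers the four open states listed above, ignores all timing parameters and the non-interruptible states, and the names TRANSITIONS and run are invented for the example.

```python
# (current state, command) -> next state, following the diagram above
TRANSITIONS = {
    ("Idle", "ACTIVE"): "Row Active",
    ("Row Active", "READ"): "Read",
    ("Row Active", "WRITE"): "Write",
    ("Read", "READ"): "Read",
    ("Read", "WRITE"): "Write",
    ("Write", "READ"): "Read",
    ("Write", "WRITE"): "Write",
    ("Read", "PRECHARGE"): "Idle",
    ("Write", "PRECHARGE"): "Idle",
}

def run(commands, state="Idle"):
    """Apply a sequence of commands to one bank and return its final state."""
    for command in commands:
        key = (state, command)
        if key not in TRANSITIONS:
            raise ValueError(f"{command} not allowed in state {state}")
        state = TRANSITIONS[key]
    return state

print(run(["ACTIVE", "READ", "WRITE", "PRECHARGE"]))
```

A READ issued to an idle bank is rejected by this sketch, which mirrors the rule that an ACTIVE command must come first.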

DDR Commands and Registers

These commands sometimes make reference to pins. The most important here are BA0-BA1 (Bank Address, 2 lines) and DM (Data Mask, must be low for data to be valid).

Commands

DESELECT:: Equivalent to setting CS# high to deselect chip. A chip recognizes commands only if it is selected (CS# low). There is a command latency after selecting chip again.
NOP:: do nothing, but chip remains selected
ACTIVE:: Activate row of given bank; No READ or WRITE command can be entered until at least t_RCD; No ACTIVE command until after t_RC; Row remains active until PRECHARGE command issued; PRECHARGE required before opening different row in same bank.
READ:: Initiate burst read at bank BA0,BA1 starting at column A0-A9; The A10 value determines if Auto Precharge is selected, causing precharge at end of READ burst. (If you don't know if you will be activating a different row on next access to this bank, then auto precharge is safer. If you know that you will be using the same row of this bank, auto precharge is bad.)
WRITE:: Similar to READ; also subject to DM (Data Mask pin) being low.
PRECHARGE:: Deactivate an open row ("closes" row) in one or all banks. Bank(s) cannot be used again until after t_RP; After precharging, a bank is in the _idle_ state, and requires an ACTIVE command before any READ or WRITE commands.
AUTO PRECHARGE (with READ or WRITE):: As if PRECHARGE issued after t_RAS interval of READ/WRITE. Bank cannot be used again until after t_RP
BURST TERMINATE:: terminate burst
AUTO REFRESH:: refresh next row and autoincrement row counter. (AUTO because you don't have to specify which row to refresh.) Each row must be refreshed within 64 ms, or data of that row may no longer be valid.
SELF REFRESH:: As part of power down, can order chip to refresh self on schedule without external clock.

Registers:

Mode Register (Bx): (aka Base Mode Register)
- Operating Mode: Normal Operation, Normal Operation/Reset DLL (DLL set for normal operation, reset for debugging)
- CAS Latency: 2 or 2.5
- Burst Type: Sequential or Interleaved (Interleaved designed if both sides of DIMM populated with x8 chips; Can alternate accesses to each side of DIMM chip)
- Burst Length: 2, 4, or 8
Extended Mode Register (Ex): (details omitted)

Timing Parameters

The specifications for PC-133 and for DDR-RAM dictate the required timing parameters. The chips are sometimes quoted according to their most important timing parameters. The times (in clock cycles) refer to one of the following. (Definitions of the timing parameters are provided below.)

t_CL - t_RCD - t_RP
t_CL - t_RCD - t_RP - t_RAS
t_CL - t_RCD - t_RP - t_RAS - T1

For example, a common timing of a PC-133 RAM chip is 3-2-2 or 2-2-2. A common timing of a DDR-266 RAM chip is 2.5-3-3-6 and a common timing of a DDR-333 chip is 2.5-3-3-7. The DDR specifications allow for either 2.5 or 2.0 CL for the first timing parameter.

Recall that DDR stands for Double Data Rate. Hence, DDR-266 timings refer to the number of 133 MHz clock cycles. Similarly, DDR-333 timings refer to the number of 167 MHz clock cycles. In particular, if t_CL is 2.5 for 2.5-3-3-6 DDR-266, then t_CL (or more accurately, t_CAC) is guaranteed to be no more than 2.5/(133 MHz) = 18.8 ns and t_RCD is guaranteed to be no more than 3/(133 MHz) = 22.5 ns. The actual timing specs are:

DDR-266: t_CAC=15ns, t_RCD=20ns, t_RP=20ns, t_RAS=45ns
DDR-333: t_CAC=20ns, t_RCD=18ns, t_RP=18ns, t_RAS=42ns
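The conversion from clock cycles to nanoseconds for the DDR-266 case can be sketched in Python as follows. The sketch assumes a clock period of exactly 7.5 ns for the nominal 133 MHz clock, and it uses the 2.5-3-3-6 timing and the DDR-266 figures quoted above; the variable names are invented for illustration.

```python
T_CLK_NS = 7.5   # assumed period of the nominal 133 MHz clock

# 2.5-3-3-6 timing, in clock cycles
timing_cycles = {"t_CL": 2.5, "t_RCD": 3, "t_RP": 3, "t_RAS": 6}

# DDR-266 figures quoted above, in ns (t_CAC is listed under t_CL)
spec_ns = {"t_CL": 15, "t_RCD": 20, "t_RP": 20, "t_RAS": 45}

# Time allowed by each cycle count, and whether the quoted figure fits in it
budget_ns = {name: cycles * T_CLK_NS for name, cycles in timing_cycles.items()}
fits = {name: spec_ns[name] <= budget_ns[name] for name in spec_ns}

for name in timing_cycles:
    print(name, budget_ns[name], spec_ns[name], fits[name])
```

Each quoted DDR-266 figure fits within the time allowed by its cycle count, with t_RAS using its 6 cycles (45 ns) exactly.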

A read or write access passes through three stages internally on the chip.

RAS (row address strobe): read row address
RAS-to-CAS: decode the row address; the sense amps (sense amplifiers) must have been precharged if a new row is requested
CAS (column address strobe): read selected columns and send to data pins (or write from data pins)

Some of the timings are defined as:

t_RP (Time for Row Precharge): time to charge sense amps, activate bank; Command to same bank must wait at least t_RP after PRECHARGE command
t_RCD (Time for Ras to Cas Delay): internal row signal settles enough for sensor to amplify it; earliest time to issue a READ or WRITE command after ACTIVE (round up to next full clock cycle, since commands are issued only on rising edge of clock signal)
t_RC (Time for Row Cycling): minimum time interval between successive ACTIVE commands to same bank; sum of t_RAS + t_RP
t_RRD (time for Ras to Ras Delay ?): minimum time interval between successive ACTIVE commands to different banks
t_CAC (Column Access ... / CAS Latency): data appears on output pins (see t_CL, below)
t_RAS (Time for activation / RAS Active Strobe(?)): time to activate a row of a bank (minimum time bank stays open before it can be closed/precharged again)
t_CLK: clock cycle time (also known as t_CK)
t_CL (CAS latency): t_CAC/t_CLK
t_DQSS (Time for Data Clock ... Strobe ?): Minimum time interval between WRITE command and valid data (nominally 1 clock cycle)
t_WTR (Time for Write To Read): Minimum time interval between end of WRITE and READ command (1 clock cycle)
t_WR (Time for Write Recovery): Minimum time interval between end of WRITE and PRECHARGE command (2 clock cycles)
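Because commands are issued only on rising clock edges, a timing given in nanoseconds has to be rounded up to a whole number of clock cycles. The Python sketch below shows that rounding, along with t_RC computed as t_RAS + t_RP. It assumes the 7.5 ns clock period of DDR-266 and the DDR-266 figures quoted earlier; the function name is invented for illustration.

```python
import math

def ns_to_cycles(t_ns, t_clk_ns=7.5):
    """Whole clock cycles to wait for a timing parameter given in ns."""
    return math.ceil(t_ns / t_clk_ns)

t_rcd_ns, t_rp_ns, t_ras_ns = 20, 20, 45     # DDR-266 figures quoted earlier
t_rc_ns = t_ras_ns + t_rp_ns                 # t_RC = t_RAS + t_RP

print(ns_to_cycles(t_rcd_ns), ns_to_cycles(t_ras_ns), ns_to_cycles(t_rc_ns))
```

With these figures, t_RCD and t_RP each round up to 3 cycles, t_RAS is exactly 6 cycles, and t_RC (65 ns) rounds up to 9 cycles.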

Timing Constraints

In reading these timing constraints, it is useful to have at the same time some example timing diagrams. Such timing diagrams can usually be found in vendors' datasheets, such as that of Micron.

  [ We only consider timing for bursts of 4 cycles, since this is what
    most CPUs will issue, for sake of cache line fill ]
  READ latency:  number of clock cycles between READ command and valid data
  READ must be completed before WRITE command is issued.
t_RP:  Command to same bank must wait at least t_RP after PRECHARGE command
t_DQSS:  Time between WRITE command and valid data (nominally 1 clock cycle)
t_WTR:  Time between end of WRITE and READ command (1 clock cycle)
t_WR:  Time between end of WRITE and PRECHARGE command (2 clock cycles)

Implications:
  (Recall in timing diagrams, DQ read associated with leading clock
	 transition and DQ write associated with centered clock transition.)
Auto Precharge disabled, command to _same_ bank:
  READ burst followed by READ burst:  no idle data bus
  WRITE burst followed by WRITE burst:  no idle data bus
  WRITE burst followed by READ:  after data, bus idle for t_WTR + CL * t_CLK
  WRITE burst followed by PRECHARGE:  t_WR, PRECHARGE command, t_RP
  READ burst followed by WRITE:  t_DQSS
  READ burst followed by PRECHARGE:  PRECHARGE issued 2 clock cycles after
		READ, DQ read starts CL clocks after READ, No further
		commands until t_RP after PRECHARGE
Auto Precharge enabled, command to _same_ bank:
  As if PRECHARGE issued at earliest possible moment (after t_RAS interval),
  then wait t_RP during precharging.

Auto Precharge:
NOTATION: BL: Burst Length, t_CK: clock, CL: CAS latency, [CL] (rounded up to int)
  Auto Precharge command followed by command to _different_ bank:
  WRITE burst w/AP followed by READ:  (1+(BL/2)) t_CK + t_WTR
  WRITE burst w/AP followed by WRITE:  (BL/2) t_CK
  WRITE burst w/AP followed by PRECHARGE:  t_CK
  WRITE burst w/AP followed by ACTIVE:  t_CK
  READ burst w/AP followed by READ:  (BL/2) t_CK
  READ burst w/AP followed by WRITE:  ([CL] + (BL/2)) t_CK
  READ burst w/AP followed by PRECHARGE:  t_CK
  READ burst w/AP followed by ACTIVE:  t_CK
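The table above can be evaluated for concrete values with a short Python sketch. The function name and its defaults are made up for illustration: a burst length of 4, CL = 2.5, and t_WTR taken as 1 clock cycle, as given under Timing Parameters. All results are in clock cycles (units of t_CK).

```python
import math

def min_delay_cycles(first, second, bl=4, cl=2.5, t_wtr=1):
    """Cycles from a burst with auto precharge to a command to a different bank."""
    table = {
        ("WRITE", "READ"): 1 + bl / 2 + t_wtr,
        ("WRITE", "WRITE"): bl / 2,
        ("WRITE", "PRECHARGE"): 1,
        ("WRITE", "ACTIVE"): 1,
        ("READ", "READ"): bl / 2,
        ("READ", "WRITE"): math.ceil(cl) + bl / 2,   # [CL] is CL rounded up
        ("READ", "PRECHARGE"): 1,
        ("READ", "ACTIVE"): 1,
    }
    return table[(first, second)]

print(min_delay_cycles("WRITE", "READ"), min_delay_cycles("READ", "WRITE"))
```

For a burst of four with CL = 2.5, a WRITE with auto precharge followed by a READ to a different bank costs 4 cycles, and a READ with auto precharge followed by a WRITE costs 5 cycles.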

Pinout

PINOUT (DDR, 32 Meg x 8, www.micron.com/datasheets/, MT46V32M8 8Megx8x4banks):
  Configuration:  32 Meg x 8 (8 Meg x 8 x 4 banks)
  Refresh Count:  8K (one refresh per row)
  Row Addressing: 8K (A0-A12)
  Bank Addressing: 4 (BA0-BA1)
  Column Addressing: 1K (A0-A9)
                     A10/AP

For pin definitions, # means low is active.

Pins associated with commands:
  CS#: (chip select, commands only recognized when CS# active)
  WE#: (write enable)
  CAS#: (column address strobe)
  RAS#: (row address strobe)
  DM: (input data mask) input data is masked out (not used) when DM high,
             input data written during write request if DM low
  AP/A10: (auto precharge) during a READ or WRITE command,
         A10 high enables auto precharge, A10 low disables it;
         during a PRECHARGE command,
         A10 low means precharge only bank BA0-BA1,
         A10 high means precharge all banks
Selected Other Pins:
  Data:  DQ0-DQ7  (data input/output pins)
  Data Strobe:  DQS  (data clock strobe, edge-aligned w/ read data,
		 centered for write data)
  Clock Signal:  CK and CK#  (# or line over signal means low is active)
  V_DD = 2.5 V  (power and heat dissipation rises with voltage)
  V_SS (ground)
  V_REF (reference voltage)

The following is quoted from the Micron datasheet:

Data of RAM organized into rows, which are split among separate banks; 
A bank can have at most one active row within it.

A single read or write access for the 256Mb DDR SDRAM effectively consists
of a single 2n-bit wide, one-clock-cycle data transfer at the internal
DRAM core and two corresponding n-bit wide, one-half-clock-cycle data
transfers at the I/O pins.

Commands (address and control signals) are registered at every positive
edge of CK. Input data is registered on both edges of DQS, and output
data is referenced to both edges of DQS, as well as to both edges of CK.

Accesses begin with the registration of an ACTIVE command, which is
then followed by a READ or WRITE command. The address bits registered
coincident with the ACTIVE command are used to select the bank and row
to be accessed. The address bits registered coincident with the READ
or WRITE command are used to select the bank and the starting column
location for the burst access.

An auto precharge function may be enabled to provide a self-timed row
precharge that is initiated at the end of the burst access.

As with standard SDR SDRAMs, the pipelined, multibank architecture of DDR
SDRAMs allows for concurrent operation, thereby providing high effective
bandwidth by hiding row precharge and activation time.

Timing Example (Worst Case)

We illustrate a worst case timing: WRITE, then READ from a different page (read page miss), but from the same bank.

1 cycle: After first write command, wait t_DQSS for valid data
2 cycles: Assuming a burst length of four, wait two clock cycles (Recall that if a DDR RAM chip produces n bits each half clock cycle, then its core produces 2n bits each clock cycle.)
2 cycles: Minimum time interval between end of write and PRECHARGE, wait t_WR; Then issue PRECHARGE command
3 cycles: Command to same bank must wait at least t_RP after PRECHARGE command; Then issue ACTIVE command. If the WRITE were to a different bank, then we could overlap the precharge of the first bank with the write to a second memory bank. So, using the same memory bank hurts us. The write address of the CPU dictates which memory bank we have to use.
1.5 cycles: Wait CL=2.5 cycles before data from READ can begin. We waited t_DQSS=1 cycle after WRITE command for valid data. So, on net, we wait 2.5-1=1.5 cycles. If this were a page hit, we could have issued the READ command early, and overlapped the CL with some other delays. If we were doing WRITE after WRITE, then we would have paid t_DQSS = 1 cycle to wait for valid data. But we already counted that cost in the first step above.

Hence, there is a delay of 9.5 cycles. Two of those cycles are used to write a data burst of length four. So, there are 7.5 idle cycles. Since commands have to be issued on the leading edge of a clock cycle, the next command after the READ may incur an extra 0.5 cycle delay.
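The tally above can be reproduced with a short Python sketch. It adds up only the steps listed in this example, with their cycle counts taken directly from the text; the step labels are paraphrases.

```python
# (step, clock cycles), as listed in the worst-case example above
steps = [
    ("t_DQSS: WRITE command to valid data", 1.0),
    ("write burst of four", 2.0),
    ("t_WR: end of write to PRECHARGE", 2.0),
    ("t_RP: PRECHARGE to ACTIVE", 3.0),
    ("CL minus t_DQSS: net wait for read data", 2.5 - 1.0),
]

total_cycles = sum(cycles for _, cycles in steps)
burst_cycles = 2.0                       # cycles spent transferring data
idle_cycles = total_cycles - burst_cycles

print(total_cycles, idle_cycles)
```

The sum is 9.5 cycles, of which 7.5 are idle, matching the figures in the text.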
