Tokyo, New York and Sunnyvale (CA) – Yesterday, Toshiba, IBM and AMD announced that they have jointly developed an SRAM cell measuring just 0.128 μm². The cell is more than 50% smaller than the previous record holder, a nonplanar-FET cell measuring 0.274 μm². SRAM is used in a computer’s cache. Smaller SRAM cells mean less heat, greater performance and lower production costs, because less silicon real estate is needed to produce a cache of the same size.
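To put the two cell sizes in perspective, the short sketch below works out the reduction and the raw silicon area of the storage cells for a cache. The 8 MB cache size and the function names are illustrative assumptions, and the estimate counts one bit per cell only, ignoring tags, error correction and the surrounding control circuitry that a real cache also needs.

```python
# Cell areas quoted in the announcement, in square micrometres.
NEW_CELL_UM2 = 0.128
OLD_CELL_UM2 = 0.274


def area_reduction(new_um2, old_um2):
    """Fraction by which the new cell is smaller than the old one."""
    return 1.0 - new_um2 / old_um2


def raw_array_area_mm2(cache_bytes, cell_um2):
    """Area of the bit cells alone (one bit per cell), in square millimetres."""
    bits = cache_bytes * 8
    return bits * cell_um2 / 1_000_000  # 1 mm^2 = 1,000,000 um^2


reduction = area_reduction(NEW_CELL_UM2, OLD_CELL_UM2)

# Assumption: an 8 MB cache, chosen only for illustration.
cache_bytes = 8 * 1024 * 1024
new_area = raw_array_area_mm2(cache_bytes, NEW_CELL_UM2)
old_area = raw_array_area_mm2(cache_bytes, OLD_CELL_UM2)

print(f"Cell area reduction: {reduction:.1%}")
print(f"8 MB of cells at 0.274 um^2: {old_area:.2f} mm^2")
print(f"8 MB of cells at 0.128 um^2: {new_area:.2f} mm^2")
```

Under these assumptions the reduction comes to a little over 53%, and the cells for the same cache occupy less than half the silicon.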
The cell was developed using IBM’s high-k/metal gate (HKMG) materials and process technology at 32nm. Whereas previous attempts to reduce SRAM cell size have required modifications to the doping (impurities introduced into the silicon to create desirable electrical effects), the new design is undoped. This allows the cell size to be dramatically reduced relative to other processes while also increasing stability.
According to IBM, projections indicate the cell will remain stable even below the projected 22nm process node, though products will first arrive at the 32nm process node (possibly in 2010 or 2011).
CPUs use cache memory to increase performance. At their core, CPUs read data, process it, and then write the results. At a few billion cycles per second, with multiple cores that can each do more than one thing at a time, this adds up to a demand of several GB per second of bandwidth. The main memory bus operates at a low clock speed relative to the CPU, typically 1.0 GHz to 2.0 GHz, which makes main memory slow compared to the needs of a CPU.
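The gap can be sketched with some back-of-the-envelope arithmetic. Every figure below other than the 1.0 GHz bus clock is a hypothetical assumption chosen for illustration (a 3 GHz clock, two cores, one 8-byte access per core per cycle, and an 8-byte-wide memory bus), and the helper name is made up for this example.

```python
def bandwidth_gb_per_s(clock_hz, bytes_per_cycle, units=1):
    """Peak bandwidth in GB/s for `units` parallel requesters or channels."""
    return clock_hz * bytes_per_cycle * units / 1e9


# Assumption: two cores at 3 GHz, each touching 8 bytes per cycle.
cpu_demand = bandwidth_gb_per_s(3.0e9, 8, units=2)

# Assumption: a 1.0 GHz memory bus that moves 8 bytes per cycle.
bus_supply = bandwidth_gb_per_s(1.0e9, 8)

# How many times more data the cores could consume than the bus delivers.
shortfall = cpu_demand / bus_supply

print(f"CPU demand: {cpu_demand:.0f} GB/s")
print(f"Bus supply: {bus_supply:.0f} GB/s")
print(f"Shortfall:  {shortfall:.0f}x")
```

Real processors and memory systems differ widely from these numbers, but the shape of the result is the point: the cores can ask for data several times faster than the bus can supply it.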
It can be visualized by imagining a pedestrian desiring to walk onto a busy 4-lane interstate. The cars are going by so fast that, by comparison, the walker seems slow, though both may be operating at maximum speeds relative to their abilities.
The cache memory operates like the busy 4-lane highway, going amazingly fast. Still, the cache is broken out into layers designed for specific things. The L1 cache operates the fastest, but is also the smallest in capacity. Its speed requirement, however, means that it takes up considerably more silicon for a given amount of memory than the L2 or L3 caches do. The L1 cache is responsible for all immediate operations in tight blocks of code. Without the L1, the performance of CPUs would be significantly depressed.
Level 2 cache (L2 cache) is generally the powerhouse of the CPU, as most large workloads can be stored there. L2 caches are typically 512KB or larger in modern CPUs and can handle large data sets without needing to access main memory at all.
Level 3 caches are a relatively new addition to the CPU die. They provide a further buffer between main memory and the CPU’s needs and are usually quite large, 8 MB or more in some cases. They allow huge data sets to be kept directly on the CPU without going to main memory.
All of these can be considered as progressively faster levels on our imaginary highway. The pedestrian on the outside is main memory. Next in is the service road near the interstate. This is L3 cache. Beyond that is the interstate and its busy 4 lanes. This is the L2 cache. And then there’s the through-lane or carpool-lane where operations are not slowed by regular on/off traffic. The carpool-lane is a high speed pathway between point A and B, and this would be like the L1 cache.
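The effect of stacking these layers can be illustrated with the standard average memory access time calculation: each level costs its own latency, plus the cost of the next level down for the fraction of requests it misses. The latencies and hit rates below are hypothetical round numbers picked for illustration, not measurements of any real CPU, and the function name is invented for this sketch.

```python
def average_access_cycles(levels, memory_cycles):
    """Average cost of a memory access, in CPU cycles.

    levels: list of (hit_latency_cycles, hit_rate) tuples, L1 first.
    memory_cycles: latency of main memory, paid when every level misses.
    """
    cost = memory_cycles
    # Work outward from main memory back toward the L1 cache.
    for latency, hit_rate in reversed(levels):
        cost = latency + (1.0 - hit_rate) * cost
    return cost


# Assumptions: illustrative latencies (cycles) and hit rates per level.
hierarchy = [
    (4, 0.90),   # L1: the carpool lane
    (12, 0.80),  # L2: the interstate
    (40, 0.70),  # L3: the service road
]
MEMORY_CYCLES = 200  # main memory: the pedestrian

with_caches = average_access_cycles(hierarchy, MEMORY_CYCLES)
without_caches = average_access_cycles([], MEMORY_CYCLES)

print(f"Average access with caches:    {with_caches:.1f} cycles")
print(f"Average access without caches: {without_caches:.1f} cycles")
```

With these made-up figures the average access costs roughly 7 cycles instead of 200, which is why even a small, fast L1 backed by larger, slower levels makes such a difference.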