Cache Memory

Overview#

Cache memory is a special type of high-speed memory. Though it is much smaller than main memory or disk storage, it offers far greater speed.

Cache can be used effectively to lower the average time needed to access data from main memory.

Types of Cache Memory#

L1 or Primary Cache#

Extremely fast, but relatively small. Usually embedded within the processor chip as CPU cache.

L2 Cache or Secondary Cache#

More capacity than L1. It can be embedded on the CPU or on a separate chip, with a high-speed alternate system bus connecting the cache and the CPU.

L3 Cache#

Specialised memory to improve the performance of the L1/L2 caches. For example, in multicore processors, each core can have dedicated L1 and L2 caches while all cores share a common L3 cache.

Cache Performance#

Cache Hit#

When the processor finds the requested memory location in the cache, a cache hit occurs.

Cache Miss#

When the processor is not able to find the memory location in the cache, a cache miss occurs. The cache then allocates a new entry and copies data from main memory to fulfill the request.

Hit Ratio#

The hit ratio can be given as hits / (hits + misses).
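
As an illustrative sketch, a small dictionary-backed cache can count its own hits and misses to report this ratio (`TrackedCache` and the `load` callback are hypothetical names, not a real library API):

```python
class TrackedCache:
    """A dict-backed cache that counts hits and misses (illustrative sketch)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self._store:      # cache hit
            self.hits += 1
            return self._store[key]
        self.misses += 1            # cache miss: load from the slower source
        value = load(key)
        self._store[key] = value    # allocate a new entry for next time
        return value

    @property
    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A caller would pass its expensive fetch function, e.g. `cache.get(42, load=fetch_from_disk)`, and read `cache.hit_ratio` after a workload.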

Cache Mapping#

Direct Mapping#

In a direct-mapped cache, each block of main memory maps to exactly one cache location. The cache works like a table with 3 columns: the cache block holding the data, a tag identifying which memory address the data came from, and a flag bit indicating whether the entry contains valid data.
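
A minimal sketch of how an address splits into tag, index and offset, assuming a hypothetical cache of 256 lines with 64-byte blocks (both parameters are illustrative):

```python
NUM_BLOCKS = 256   # hypothetical number of cache lines
BLOCK_SIZE = 64    # hypothetical block size in bytes

def direct_map(address):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache."""
    block_number = address // BLOCK_SIZE   # which memory block the address is in
    offset = address % BLOCK_SIZE          # byte position within the block
    index = block_number % NUM_BLOCKS      # the one cache line this block can use
    tag = block_number // NUM_BLOCKS       # stored with the data to verify a hit
    return tag, index, offset
```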

Fully Associative Mapping#

In this type, associative memory is used to store both the content and the address of the memory word. This allows a memory block to be mapped to any cache location, rather than to a single prespecified location as in direct mapping.
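
A sketch of the lookup, assuming each cache line is a small dict with illustrative `valid`, `tag` and `data` fields:

```python
def associative_lookup(cache_lines, block_number):
    """Fully associative lookup: the block may sit in any line, so the
    tag (here, the whole block number) is compared against every valid line."""
    for line in cache_lines:
        if line["valid"] and line["tag"] == block_number:
            return line["data"]   # hit
    return None                   # miss: any line may receive the block
```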

Set Associative Mapping#

This method is a compromise between the two methods above. Each memory block maps to exactly one set of cache locations, but can occupy any line within that set. With N lines per set, this is known as N-way set associative mapping.
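
A sketch of the lookup under the same illustrative line layout as above, assuming a hypothetical 4-way cache with 64 sets:

```python
NUM_SETS = 64   # hypothetical number of sets
WAYS = 4        # lines per set: a 4-way set associative cache

def set_associative_lookup(sets, address, block_size=64):
    """Map the block to exactly one set, then search only that set's lines."""
    block_number = address // block_size
    set_index = block_number % NUM_SETS   # which set the block must live in
    tag = block_number // NUM_SETS        # distinguishes blocks sharing a set
    for line in sets[set_index]:          # at most WAYS tag comparisons
        if line["valid"] and line["tag"] == tag:
            return line["data"]           # hit
    return None                           # miss: fill any line of this set
```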

Data Writing Policies#

Write Through#

In this mode, data is written to both the cache and main memory simultaneously.

This requires more write operations, which adds latency.
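
A minimal sketch of the idea, with plain dictionaries standing in for the cache and main memory:

```python
def write_through(cache, memory, addr, value):
    """Write-through: every store updates the cache and main memory together."""
    cache[addr] = value
    memory[addr] = value   # the extra memory write on every store adds latency
```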

Write Back#

In this mode, data is initially written only to the cache. It may be written to main memory later, but this is not mandatory.

This makes write operations more efficient; however, the data may not be synchronised or consistent between main memory and the cache.

One way to manage these inconsistencies is to use a dirty bit: when set, it indicates that the cached copy has been modified and is more recent than main memory, so the block must be written back before it is evicted.
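
Continuing the same sketch, with a set standing in for the per-block dirty bits:

```python
def write_back(cache, dirty, addr, value):
    """Write-back: update only the cache and set the block's dirty bit."""
    cache[addr] = value
    dirty.add(addr)        # dirty bit: main memory is now stale for this block

def evict(cache, dirty, memory, addr):
    """A dirty block must be flushed to main memory before eviction."""
    if addr in dirty:
        memory[addr] = cache[addr]   # synchronise the stale copy
        dirty.discard(addr)
    del cache[addr]
```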

Locality of Reference#

Cache memory helps improve performance based on the locality of reference.

Temporal#

Temporal locality is exploited when the same resources are accessed repeatedly within short intervals of time.

Spatial#

Spatial locality is exploited when accessing data or resources stored near each other in memory, for example when iterating through an array.
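
An illustrative traversal of a matrix stored row-major in a flat list; in CPython the effect is blunted by the object model, so treat this purely as an illustration of the access pattern that caches reward:

```python
def sum_row_major(flat, rows, cols):
    """The inner loop advances one element at a time (stride-1), so
    consecutive accesses fall within the same cache block."""
    total = 0
    for r in range(rows):
        for c in range(cols):
            total += flat[r * cols + c]   # adjacent addresses: good spatial locality
    return total

def sum_column_major(flat, rows, cols):
    """Same sum, but the inner loop jumps `cols` elements each step,
    hopping across cache blocks and wasting the rest of each block."""
    total = 0
    for c in range(cols):
        for r in range(rows):
            total += flat[r * cols + c]   # strided access: poor spatial locality
    return total
```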

Cache Replacement Mechanisms#

There are many mechanisms to choose from when we need to replace existing items in a cache. We encounter this when the cache is reaching or has reached its full capacity.

Least Recently Used (LRU)#

In this mechanism, we evict the least recently used cache block. Since this block has gone unused for the longest time, it is the least likely to be needed soon.
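
A minimal LRU sketch using `collections.OrderedDict`, which keeps keys in access order:

```python
from collections import OrderedDict

class LRUCache:
    """LRU eviction: most recently used keys move to the end of the
    OrderedDict, so the front always holds the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None                 # miss
        self._data.move_to_end(key)     # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)   # evict the least recently used
```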

Least Frequently Used (LFU)#

Here, we choose the least frequently used cache block and replace it with the new incoming block.
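
A simple LFU sketch, tracking an access count per key and evicting the key with the smallest count (ties are broken arbitrarily here):

```python
from collections import Counter

class LFUCache:
    """LFU eviction: each access increments a per-key counter; on overflow,
    the key with the lowest count is replaced."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = {}
        self._counts = Counter()

    def get(self, key):
        if key not in self._data:
            return None               # miss
        self._counts[key] += 1        # record the access
        return self._data[key]

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            victim = min(self._data, key=self._counts.__getitem__)
            del self._data[victim]    # evict the least frequently used block
            del self._counts[victim]
        self._data[key] = value
        self._counts[key] += 1
```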

FIFO#

We can implement a simple FIFO queue mechanism where the cache evicts blocks in the order they were added.
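
A minimal FIFO sketch using `collections.deque` to remember insertion order:

```python
from collections import deque

class FIFOCache:
    """FIFO eviction: blocks leave in the order they arrived, regardless of
    how recently or frequently they were used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = {}
        self._order = deque()   # keys in insertion order

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        if key not in self._data:
            if len(self._data) >= self.capacity:
                oldest = self._order.popleft()   # evict the first-in block
                del self._data[oldest]
            self._order.append(key)
        self._data[key] = value   # updating an existing key keeps its position
```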

References used:

  1. https://www.geeksforgeeks.org/cache-memory-in-computer-organization/
  2. https://searchstorage.techtarget.com/definition/cache-memory