Cache memory is a small, high-speed memory that sits between the processor and main memory. Because it is much faster than main memory or disk, it can be used effectively to reduce the average time taken to access data from main memory.
L1 cache: extremely fast but relatively small. Usually embedded within the processor chip as CPU cache.
L2 cache: larger capacity than L1. It can be embedded on the CPU or on a separate chip, with a high-speed alternate system bus connecting the cache and the CPU.
L3 cache: specialised memory that improves the performance of the L1/L2 caches. For example, in multicore processors, each core can have dedicated L1 and L2 caches while sharing a single L3 cache.
When the processor finds the requested memory location in the cache, a cache hit occurs.
When the processor does not find the memory location in the cache, a cache miss occurs. The cache then allocates a new entry and copies the data from main memory to fulfil the request.
The hit ratio can be given as
hits / (hits + misses).
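The formula above can be sketched as a small helper; the counter values here are purely illustrative.

```python
# A minimal sketch: computing the hit ratio from hit/miss counters.
def hit_ratio(hits, misses):
    """Fraction of accesses served directly from the cache."""
    total = hits + misses
    return hits / total if total else 0.0

# e.g. 90 hits and 10 misses give a hit ratio of 0.9
print(hit_ratio(90, 10))  # 0.9
```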
In a direct mapped cache, each memory block maps to exactly one cache location. It works like a table with three columns: the cache block holding the data, a tag holding the address the data was fetched from, and a flag bit indicating whether the entry contains valid data.
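The direct mapping described above can be sketched as simple arithmetic on an address; the block size and cache size below are illustrative, not taken from any particular processor.

```python
# Sketch of direct-mapped placement: the cache slot is derived
# directly from the memory address (sizes are illustrative).
BLOCK_SIZE = 64      # bytes per cache block
NUM_BLOCKS = 256     # blocks in the cache

def direct_mapped_index(address):
    block_number = address // BLOCK_SIZE   # which memory block this byte is in
    return block_number % NUM_BLOCKS       # the one cache slot it can occupy

def tag(address):
    # Stored alongside the data so a lookup can verify it is a real hit.
    return (address // BLOCK_SIZE) // NUM_BLOCKS

# Two addresses one block apart land in adjacent cache slots:
print(direct_mapped_index(0x0000))   # 0
print(direct_mapped_index(0x0040))   # 1
```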
In fully associative mapping, associative memory is used to store both the content and the address of the memory word. This allows a memory block to be mapped to any cache location, rather than to a single prespecified location as in direct mapping.
This method is a compromise between the two methods above: each block maps to a small subset of cache locations (a set), and may occupy any slot within that set. It is commonly known as N-way set associative mapping.
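The set selection step can be sketched the same way as direct mapping, except the address now picks a set rather than a single slot. The sizes and the 4-way choice here are assumptions for illustration.

```python
# Sketch of N-way set-associative placement (illustrative sizes).
BLOCK_SIZE = 64
NUM_SETS = 64
WAYS = 4             # 4-way set associative: 4 candidate slots per set

def set_index(address):
    # The address determines the set; the block may then occupy
    # any of the WAYS slots within that set.
    return (address // BLOCK_SIZE) % NUM_SETS

print(set_index(0x1000))   # 0  (4096 // 64 = 64, and 64 % 64 = 0)
print(set_index(0x1040))   # 1
```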
In write-through mode, data is written to both the cache and main memory simultaneously.
This requires more write operations, which adds latency.
In write-back mode, data is initially written only to the cache. It may later be written to main memory, but this is not mandatory at write time.
This makes write operations more efficient; however, the data in the cache and in main memory may become inconsistent.
One way to track this inconsistency is a dirty bit: when set, it indicates that the cached copy has been modified and is more recent than the copy in main memory.
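The write-back behaviour and the dirty bit can be sketched as follows; the class and its methods are a hypothetical structure for illustration, not a real hardware interface.

```python
# Sketch of write-back caching with a dirty bit: writes update only
# the cache, and main memory is updated lazily when a block is evicted.
class WriteBackCache:
    def __init__(self):
        self.cache = {}    # address -> (value, dirty)
        self.memory = {}   # simulated main memory

    def write(self, addr, value):
        self.cache[addr] = (value, True)   # dirty: memory is now stale

    def evict(self, addr):
        value, dirty = self.cache.pop(addr)
        if dirty:                          # write back only if modified
            self.memory[addr] = value

cache = WriteBackCache()
cache.write(0x10, 42)
print(0x10 in cache.memory)   # False - main memory is stale until eviction
cache.evict(0x10)
print(cache.memory[0x10])     # 42
```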
Cache memory helps improve performance based on the locality of reference.
Temporal locality is exploited when the same resources are accessed repeatedly within short intervals of time.
Spatial locality is exploited when accessing data or resources stored near each other in memory, for example when iterating through an array.
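The two access patterns can be illustrated with a small example. This is a sketch of the patterns themselves, not a performance benchmark (Python's interpreter overhead hides hardware cache effects).

```python
# Illustrative access patterns for the two kinds of locality.
data = list(range(8))

# Temporal locality: the same element is reused in a short window,
# so it stays resident in the cache between accesses.
total = 0
for _ in range(3):
    total += data[0]        # data[0] accessed repeatedly

# Spatial locality: neighbouring elements are accessed in order,
# so one fetched cache block serves several consecutive accesses.
for value in data:          # sequential walk through the array
    total += value

print(total)  # 28  (3*0 from the first loop, plus 0+1+...+7)
```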
There are many mechanisms to choose from when replacing existing items in a cache. This situation arises when the cache is reaching, or has reached, its full capacity.
In the least recently used (LRU) mechanism, we evict the cache block that has not been used for the longest time, on the assumption that it is the least likely to be needed again soon.
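LRU eviction can be sketched with an ordered dictionary that records recency of use; the class name and tiny capacity are illustrative.

```python
# Minimal LRU sketch: evict the entry that was used longest ago.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.items = OrderedDict()   # ordered oldest -> most recent

    def access(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)     # mark as most recently used
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2)
cache.access("a", 1)
cache.access("b", 2)
cache.access("a", 1)      # "a" is now the most recently used
cache.access("c", 3)      # evicts "b", the least recently used
print(list(cache.items))  # ['a', 'c']
```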
In the least frequently used (LFU) mechanism, we choose the cache block with the fewest accesses and replace it with the new incoming block.
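LFU can be sketched by keeping an access count per block and evicting the block with the smallest count; the structure below is illustrative.

```python
# Minimal LFU sketch: evict the entry with the fewest accesses.
class LFUCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.values = {}
        self.counts = {}   # per-key access counts

    def access(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # least often used
            del self.values[victim], self.counts[victim]
        self.values[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1

cache = LFUCache(capacity=2)
cache.access("a", 1)
cache.access("a", 1)      # "a" now has two accesses
cache.access("b", 2)
cache.access("c", 3)      # "b" has the lowest count, so it is evicted
print(sorted(cache.values))  # ['a', 'c']
```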
In first in, first out (FIFO), we implement a simple queue: the cache evicts blocks in the order they were added, regardless of how often or how recently they were used.
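The FIFO queue described above can be sketched with a deque holding insertion order; again, the class and capacity are illustrative.

```python
# Minimal FIFO sketch: evict in insertion order, ignoring usage.
from collections import deque

class FIFOCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.order = deque()   # insertion order of keys
        self.values = {}

    def access(self, key, value):
        if key not in self.values:
            if len(self.values) >= self.capacity:
                oldest = self.order.popleft()   # first in, first out
                del self.values[oldest]
            self.order.append(key)
        self.values[key] = value

cache = FIFOCache(capacity=2)
cache.access("a", 1)
cache.access("b", 2)
cache.access("a", 1)         # re-access does not change eviction order
cache.access("c", 3)         # evicts "a", the oldest insertion
print(sorted(cache.values))  # ['b', 'c']
```

Unlike LRU, FIFO ignores how recently a block was touched, which makes it simpler to implement but less adaptive to access patterns.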