Fully associative cache: just park wherever, we will help you find your car.
In a direct-mapped cache, each piece of data can go in only one location.
Assume that the cache has $2^n$ entries.
$2^n$ is small compared to the size of main memory.
From an Intel page:
L1 Data: 32KB per core
L1 Instruction: 32KB per core
For a 4-core chip, that is 256KB of L1 total.
Generally caches follow the memory triangle (larger is slower and cheaper).
Notice the different caches for instruction and for data.
Let's assume an address of 10 bits and a cache size of $2^4$.
This means that there are 16 cache locations (0000 through 1111)
We will use the bottom 4 bits of an address to determine where an item goes in cache.
We will record the top 6 bits as a tag.
Bits 9 through 4: TAG
Bits 3 through 0: cache address (index)
A tag is a field in a table used for the memory hierarchy that contains the address information required to identify whether the associated block in the hierarchy corresponds to a requested word.
The example continues
Where would the data from address 0x24F be found?
Where would the data from address 0x032 be found?
Where would the data from address 0x03F be found?
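The tag/index split from the example can be sketched in a few lines of Python (the function name `split_address` is mine, not from the notes):

```python
# 10-bit address, 16-entry direct-mapped cache:
# bits 3..0 are the cache index, bits 9..4 are the tag.
INDEX_BITS = 4

def split_address(addr):
    """Split a 10-bit address into (tag, index)."""
    index = addr & ((1 << INDEX_BITS) - 1)  # bottom 4 bits
    tag = addr >> INDEX_BITS                # top 6 bits
    return tag, index

for addr in (0x24F, 0x032, 0x03F):
    tag, index = split_address(addr)
    print(f"0x{addr:03X}: index=0x{index:X}, tag=0x{tag:02X}")
# 0x24F -> index 0xF, tag 0x24
# 0x032 -> index 0x2, tag 0x03
# 0x03F -> index 0xF, tag 0x03
```

Note that 0x24F and 0x03F land at the same index (0xF); only the tag tells them apart.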
When looking to use the cache, we need to know if the data in the cache location is valid or not.
To do this, we use a valid bit.
If the bit is set the entry is valid.
We initialize the cache with all valid bits set to false.
When data is requested from cache (given a memory address)
Calculate the cache index and tag.
if not cache[index].valid
fail
else if cache[index].tag == tag
return cache[index].value
else
fail
What do we do on a fail (a cache miss)?
Retrieve the data from memory
Store it in the cache.
And return it to the CPU.
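Putting the lookup and the miss handling together, a minimal direct-mapped cache simulation might look like this (the names `CacheLine`, `read`, and `MEMORY`, and the fake backing store, are all illustrative assumptions):

```python
# Minimal direct-mapped cache read path: check valid bit and tag,
# and on a miss fetch from memory, fill the line, and return the value.
INDEX_BITS = 4

class CacheLine:
    def __init__(self):
        self.valid = False
        self.tag = 0
        self.value = None

cache = [CacheLine() for _ in range(1 << INDEX_BITS)]
MEMORY = {addr: addr * 10 for addr in range(1 << 10)}  # fake backing store

def read(addr):
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = addr >> INDEX_BITS
    line = cache[index]
    if line.valid and line.tag == tag:
        return line.value, "hit"
    # fail: retrieve from memory, store it in the cache, return it
    line.valid, line.tag, line.value = True, tag, MEMORY[addr]
    return line.value, "miss"

print(read(0x032))  # first access: a miss that fills the line
print(read(0x032))  # same address again: a hit
```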
They note that in MIPS the memory is byte addressable, but all memory transactions use word addresses.
So with a 1024 entry cache
10 bits are used for the index.
2 bits can be ignored as they are always 0 (the byte offset within a word)
20 bits are used for the tag.
And 1 bit for the valid bit
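That field breakdown (20-bit tag, 10-bit index, 2-bit byte offset) can be checked with a short sketch; the function name `fields` and the sample addresses are mine:

```python
# 32-bit byte address, 1024-entry cache, one word per line:
# bits 1..0 byte offset (always 0), bits 11..2 index, bits 31..12 tag.
def fields(byte_addr):
    index = (byte_addr >> 2) & 0x3FF  # 10 index bits
    tag = byte_addr >> 12             # 20 tag bits
    return tag, index

# Two addresses 4KB apart share an index but differ in tag:
print(fields(0x1004))  # tag 1, index 1
print(fields(0x2004))  # tag 2, index 1
```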
Look at the picture on page 407.
This is good, as it avoids fetching the same data twice, but we can do better.
If we take the next 2 bits of the address (just above the byte offset) to use as a word offset into the cache line,
we can store 4 adjacent words in each line of cache.
The address now breaks down as: TAG | INDEX | OFFSET | 00 (byte offset)
When we fetch any word in the block, we fetch the entire block.
Then spatial locality says we will probably use the three words adjacent to the word we fetched.
So if we use the word at address 0x??????40, we will also use the words at 0x??????44, 0x??????48, and 0x??????4c.
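A quick sketch of the block arithmetic (function names are my own): with 4-word (16-byte) blocks, clearing the bottom 4 bits of an address gives the start of its block, and bits 3..2 pick the word within it.

```python
# 4-word blocks: bits 1..0 are the byte offset, bits 3..2 the word offset.
def block_base(byte_addr):
    return byte_addr & ~0xF           # clear bottom 4 bits: start of block

def word_offset(byte_addr):
    return (byte_addr >> 2) & 0x3     # which of the 4 words in the block

# All four words in the block starting at 0x40 share a base address:
for addr in (0x40, 0x44, 0x48, 0x4C):
    print(hex(block_base(addr)), word_offset(addr))
# 0x40 with offsets 0, 1, 2, 3
```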
See figure 5.12 on page 414.
What happens when we write to cache?
There are two schemes
write through - write to the cache and to memory
write back - write to the cache, and only write to memory when the cache block is evicted (replaced).
Each scheme has speed advantages and disadvantages.
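The contrast can be sketched on a single cache line. Everything here (the `Line` class, the dirty bit, the function names) is my own illustration; the notes don't spell out the dirty-bit mechanism, but real write-back caches use one so that only modified blocks are written back on eviction.

```python
# Write-through updates memory on every store; write-back defers the
# memory update until the (dirty) line is evicted.
memory = {0x10: 0}

class Line:
    def __init__(self):
        self.valid, self.dirty, self.tag, self.value = False, False, 0, 0

def write_through(line, addr, value):
    line.valid, line.tag, line.value = True, addr >> 4, value
    memory[addr] = value              # memory updated on every write

def write_back(line, addr, value):
    line.valid, line.dirty, line.tag, line.value = True, True, addr >> 4, value
    # memory is NOT touched here

def evict(line, addr):
    if line.valid and line.dirty:
        memory[addr] = line.value     # the delayed write finally happens
    line.valid = line.dirty = False

line = Line()
write_back(line, 0x10, 99)
print(memory[0x10])  # still the old value: 0
evict(line, 0x10)
print(memory[0x10])  # now 99
```

Write-through keeps memory simple and consistent but pays the memory latency on every store; write-back batches writes at the cost of tracking dirty state.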