Memory
- Read Page 343 for all of the 4 letter words.
- I don't expect you to memorize these, just read them through once
- And remember in 4 years it is most likely that they will all be different!
- Cache
- Terms
- Hit - we find the item we are looking for in the cache
- Miss - we do not find the item we are looking for in cache
- Hit Rate - percentage of times we have a cache hit (80% or even higher)
- Miss Rate - 1 minus the Hit Rate
- Hit Time - or access time - the time to get an entry from cache
- Miss Penalty - the time to get an entry from memory
- These terms are actually applicable at lower levels too (see picture page 236),
but we usually use different terms
- Page Fault, etc for memory to disk
- Off Line, etc for disk to other storage
- Principle of locality
- This applies to both data and instructions
- This applies to both time and space
- If we used something, we are likely to use it again soon, or to use something near it
- Temporal locality - in time
- Spatial Locality - in space
- We tend to write programs that loop, or go in predictable patterns
- We tend to access the same, or predictable memory locations (arrays)
- Cache can exploit these patterns to attempt to keep the right thing available
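- A tiny illustration of both kinds (hypothetical Python, not from the book): summing an array front to back reuses the same variables every iteration (temporal) and touches neighboring memory locations in order (spatial)

      # Temporal locality: 'total' and 'i' are reused on every iteration.
      # Spatial locality: a[0], a[1], a[2], ... sit next to each other in
      # memory, so once a miss pulls in a block, the next accesses are hits.
      a = list(range(1000))
      total = 0
      for i in range(len(a)):
          total += a[i]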
- Mapping Schemes
- The problem is we want to make a tiny bit of memory look like all of memory
- So we can't simply index it by full addresses (like memory does)
- Two and a half schemes
- Direct Mapped Cache
- Assume we have 32 words of main memory 00000 - 11111
- Assume we have 4 words of cache memory 00 - 11
- Direct mapped caching uses the bottom two (or top two or middle two, but let's be reasonable) bits of the actual address
to determine where the data will be stored
- For example 00010, 01010, and 11110 will all be stored at cache location 10
- And 00000, 01000, and 11100 will all be stored at cache location 00
- To tell which one, we store the top three bits with it in a field called a tag
- On top of that we have a single bit, called the valid bit, to tell if
the cache entry has something stored in it.
- To check for a word tttmm
- Check the tag at cache location mm to see if it is ttt
- Check the valid bit at cache location mm to see if it is 1
- If both conditions are met, return the data,
- Otherwise fetch the data from memory location tttmm and
- Store it at cache location mm
- Return it to the CPU
- In this case, if we look for something in cache and find a different
address's data already stored there, it is called a collision - two addresses
map to the same cache location (this kind of cache miss is a conflict miss)
- If we are looking for data that has never been brought into the cache, we have
encountered a compulsory miss (no way around this one)
- Trace the fetch of memory addresses 00000, 00001, 11010, 00010, 11100, 00000 (the sketch below runs this trace)
- In this scheme, every memory location is mapped to a single cache entry
- But each cache entry can be mapped to from multiple memory locations
- We know where to look, but it might not be there
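- A minimal Python sketch of the tttmm lookup above, assuming 5-bit addresses, a 4-word cache, and a list standing in for main memory (all names illustrative):

      # Direct-mapped cache: 4 slots, each holding a valid bit, tag, and data.
      # Address layout: top 3 bits = tag (ttt), bottom 2 bits = index (mm).
      memory = [f"word{i}" for i in range(32)]   # 32 words of "RAM"
      cache = [{"valid": False, "tag": None, "data": None} for _ in range(4)]

      def read(addr):
          index = addr & 0b11            # mm: which cache slot to check
          tag = addr >> 2                # ttt: which address maps there
          entry = cache[index]
          if entry["valid"] and entry["tag"] == tag:
              return entry["data"], "hit"
          # Miss: fetch from memory and overwrite whatever was in this slot.
          entry["valid"], entry["tag"], entry["data"] = True, tag, memory[addr]
          return entry["data"], "miss"

      # The trace from the notes: every access misses, because 00010/11010
      # collide at slot 10 and 00000/11100 collide at slot 00.
      for addr in [0b00000, 0b00001, 0b11010, 0b00010, 0b11100, 0b00000]:
          print(f"{addr:05b}: {read(addr)[1]}")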
- Fully Associative Cache
- Collisions are a problem with direct mapped cache.
- What if we are accessing a set of instructions at 0000xxx
- And a set of data at 1111xxx?
- We would have collisions even though cache was mostly unused
- So how about placing things anywhere in cache?
- The tag field needs to become the full size of an address
- We still need a data field and a valid bit
- To check for something
- Search all of the tags to see if they match the address we are looking for
- If not, fetch the data from memory, replace one of the entries with it, and send it to the CPU
- In this case, we might not have room for everything we want to store
in cache, and this is called a capacity miss.
- Replacement Schemes
- A victim block must be chosen to be replaced with the new data
- Multiple schemes exist
- We want to eliminate the one that will never be used again, or at least is needed furthest in the future
- This is hard (impossible without running the program to see what it is)
- So we can pick the least recently used (but this requires us to keep a time stamp on cache entries and search the time stamps when looking for an entry to replace)
- Or use First In First Out
- Or random (which is just about as good as the others, but is really inexpensive to implement)
- Trace the fetch of memory addresses 00000, 00001, 11010, 00010, 11100, 00000 (run in the sketch below)
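- A sketch of a fully associative version with FIFO replacement (one of the schemes above), same 4-word cache, the full address serving as the tag; Python's dict and deque stand in for the hardware's parallel tag search and replacement bookkeeping:

      from collections import deque

      memory = [f"word{i}" for i in range(32)]
      CACHE_SIZE = 4
      cache = {}          # tag (full address) -> data; any entry can go anywhere
      fifo = deque()      # insertion order, for picking the victim block

      def read(addr):
          if addr in cache:                  # "search all of the tags"
              return cache[addr], "hit"
          if len(cache) == CACHE_SIZE:       # full: must evict a victim
              victim = fifo.popleft()        # FIFO picks the oldest entry
              del cache[victim]
          cache[addr] = memory[addr]
          fifo.append(addr)
          return cache[addr], "miss"

      # The final 00000 still misses here: it was the oldest entry,
      # so FIFO evicted it when 11100 arrived and the cache was full.
      for addr in [0b00000, 0b00001, 0b11010, 0b00010, 0b11100, 0b00000]:
          print(f"{addr:05b}: {read(addr)[1]}")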
- N-Way Set Associative Cache
- Fully associative cache requires searching every tag on every access
- So N-way set associative cache can be formed as a compromise
- In this case, assume we have 8 words in our cache.
- We will arrange them in 4 groups of two words
- Each word will still have a tag and a valid bit
- But words ending in 00 can be stored in location 000 or 100
- This is 2-way set associative.
- It is subject to all of the above miss problems, but not as bad
- It is subject to replacement policy problems above, but not as bad
- Trace the fetch of memory addresses 00000, 00001, 11010, 00010, 11100, 00000 (run in the sketch below)
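- A sketch of the 2-way case above (hypothetical Python): 8 cache words arranged as 4 sets of 2, the bottom two address bits picking the set, LRU within each set (cheap when a set has only two ways):

      memory = [f"word{i}" for i in range(32)]
      sets = [[] for _ in range(4)]   # each set: up to 2 (tag, data) pairs, LRU first

      def read(addr):
          index = addr & 0b11                  # bottom 2 bits pick the set
          tag = addr >> 2
          ways = sets[index]
          for i, (t, d) in enumerate(ways):
              if t == tag:                     # only this set's 2 tags are searched
                  ways.append(ways.pop(i))     # move to most-recently-used position
                  return d, "hit"
          if len(ways) == 2:                   # set full: evict the LRU way
              ways.pop(0)
          ways.append((tag, memory[addr]))
          return memory[addr], "miss"

      # Same trace: the final 00000 is now a hit, because 00000 and 11100
      # share set 00 instead of colliding over a single slot.
      for addr in [0b00000, 0b00001, 0b11010, 0b00010, 0b11100, 0b00000]:
          print(f"{addr:05b}: {read(addr)[1]}")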
- These schemes can work with larger chunks of memory (multi-word blocks) too.
- Writing to Cache
- Two schemes
- Write Through
- Each time we write to cache, write through to memory as well
- No bookkeeping
- Is slower
- And we might perform unnecessary memory writes
- Write Back
- Only write to memory when the entry is to be removed from the cache
- Needs a dirty bit to indicate that it needs to be written
- It suffers a double penalty - write old data, read new data
- But does not perform unnecessary writes when variables are changing often.
- No matter what, you can write a program that messes up cache if you
really try. But they are getting smarter.
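- A sketch of the write-back bookkeeping (hypothetical Python, direct-mapped slots holding [tag, data, dirty]); write-through would instead be a single memory[addr] = value on every write:

      memory = [0] * 32
      cache = [None] * 4              # each slot: [tag, data, dirty] or None

      def write(addr, value):
          index, tag = addr & 0b11, addr >> 2
          entry = cache[index]
          if entry is None or entry[0] != tag:
              fill(index, tag, addr)  # may have to write old data back first
              entry = cache[index]
          entry[1], entry[2] = value, True   # update the cache only; mark dirty

      def fill(index, tag, addr):
          old = cache[index]
          if old is not None and old[2]:         # dirty victim: the double penalty
              old_addr = (old[0] << 2) | index   # rebuild the victim's address
              memory[old_addr] = old[1]          # 1) write the old data back
          cache[index] = [tag, memory[addr], False]  # 2) read the new data in

      write(0b00010, 7)        # miss, fill slot 10, the write stays in cache
      write(0b11110, 9)        # same slot: 7 is written back to memory first
      print(memory[0b00010])   # 7 - it only reached memory on eviction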
- Access Time = Hit Rate * Hit Time + (1 - Hit Rate) * Miss Penalty
- This gets worse when considering writes.
- And you like math so well so ...
- Cache Hit rate of 85%, Access time 5 ns, Miss Penalty 25 ns
- AT = .85*5 + .15*25 = 4.25 + 3.75 = 8 ns
- But it could be placed into our model for processor performance.
- And we would use the 8ns as our memory speed with cache
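- The same arithmetic in Python (values from the example above), handy for plugging in other hit rates:

      hit_rate, hit_time, miss_penalty = 0.85, 5, 25    # times in ns
      access_time = hit_rate * hit_time + (1 - hit_rate) * miss_penalty
      print(access_time)   # 4.25 + 3.75 = 8.0 ns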
- Everyone read 6.0 - 6.4.5
- Zimmer's class read 6.5 - end