//****************************************************************************//
//*********************** Caching - July 2nd, 2018 **************************//
//**************************************************************************//

- A lecture on Monday? IMPROBABLE!
    - ...well, if I disregard the T-Square announcement
    - Professor Moss said he won't cover the whole lecture in recitation, but will go over enough so that we don't fall too far behind
--------------------------------------------------------

- So, last Thursday, we mentioned the "Translation Lookaside Buffer" to try and get around a big disadvantage of paging: every access takes 2 memory accesses (one for the page table, one for the data itself)!
    - That's a SIGNIFICANT slowdown, so how can we get around this? We can use a CACHE!
        - A CACHE is a small amount of VERY fast memory that stores recent memory accesses - or, in some other way, stores values to help speed up the processor
            - Why don't we use this fast memory for everything? Well, primarily, because it's expensive - furthermore, faster memory has issues with power consumption, heat, etc.
            - Why is the cache faster in the first place, though? Well, "hardware" is the easy answer for now...
    - The TLB is a cache that stores the last few VPN-to-frame-number translations; if the VPN we're looking for is in there, we can get the frame number from the TLB instead of going to the page table in memory!
        - This works well because of the PRINCIPLE OF LOCALITY - if we're executing a line in a program, there's a good chance that the next thing we execute will be pretty close to the previous line in memory
    - A HIT is when we find what we're looking for in the cache; if it's a MISS, then we just go to memory as usual
        - You can think of this like page faults in paging; we hope the page is valid, but if it isn't, well, we have to go to the disk
        - Following this, we have the HIT/MISS RATE to measure what percentage of our attempts are hits/misses
            - The MISS PENALTY is how much extra time it takes to handle a miss in the cache
            - EMAT - "Effective Memory Access Time" - is how long it takes on average to access memory, given the miss/hit rates
                - This has a slightly recursive formula:

                    EMAT_n = T_n + missRate_n * EMAT_(n+1)

                - (EMAT_(n+1) is the EMAT of the next level down; T_n and missRate_n are this level's access time and miss rate)
                - Calculating EMAT is a pretty regular test question! (see the worked example just below)
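                - (Worked example with made-up numbers, not from the lecture: say checking the cache takes 2 ns, it misses 10% of the time, and the memory behind it takes 100 ns)

                        EMAT_cache = T_cache + missRate_cache * EMAT_memory
                                   = 2 ns + (0.10 * 100 ns)
                                   = 12 ns

                    - So even with a 10% miss rate, the average access stays much closer to the cache's 2 ns than to memory's 100 ns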

- How do we organize this cache, though? There're a few different schemes:
    - DIRECT-MAPPED caches are where, say, if we have 8 cache locations and 16 memory locations, cache slot 0 maps to memory addresses 0 and 8, slot 1 maps to 1 and 9, slot 2 maps to 2 and 10...etc.
        - This way, EVERY location in memory has exactly one place where it can be stored in the cache (since we've broken up our memory into cache-sized chunks)
            - Practically, we'd do this with a modulo (there's a quick worked example after this list):

                cache slot = (memory address) % (cache size)

        - The issue, of course, is that we can't have all of our memory in the cache at the same time - but if we organize it well enough, that shouldn't matter; we can still use it effectively
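        - (A quick made-up example of that modulo, not from the slides - with 8 cache slots:)

                address 13 -> slot 13 % 8 = 5
                address  5 -> slot  5 % 8 = 5

            - So addresses 5 and 13 fight over the same slot and can never be cached at the same time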
- To look stuff up in the cache, we can break the memory address into 2 parts:
    - The CACHE INDEX is the "memory address % cache size"
    - The CACHE TAG is the "memory address / cache size"
        - When we save something in the cache, we store it at its index along with its tag; using the tag, we can quickly figure out which chunk of memory it came from, so we know if it's what we're looking for
    - Great! But there's a problem: what if we haven't put anything in the cache? How do we tell a value of 0x0 from an unused slot in the cache?
        - Well, just like in paging, we'll have a separate valid bit in the cache entry that'll be 1 if it's holding a real value and zero if it's not being used
    - Hardware-wise, checking the valid bit/tag just takes a comparator (for the tag) and an AND gate (to combine it with the valid bit) - that's all
    - In algorithm form, then, cache retrieval basically looks like this:

            <code on the slides>
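
        - (The slide code isn't copied here, but here's a rough C sketch of what direct-mapped retrieval boils down to - the sizes, names, and the "memory" array are all made up for illustration:)

            #include <stdbool.h>
            #include <stdint.h>

            #define NUM_SLOTS 8                      /* made-up cache size          */
            #define MEM_WORDS 1024                   /* made-up "main memory" size  */

            struct cache_entry {
                bool     valid;                      /* holding a real value?              */
                uint32_t tag;                        /* which chunk of memory it came from */
                uint32_t data;                       /* the cached value itself            */
            };

            static struct cache_entry cache[NUM_SLOTS];
            static uint32_t           memory[MEM_WORDS];

            uint32_t cache_read(uint32_t addr)
            {
                uint32_t index = addr % NUM_SLOTS;   /* cache index = address % cache size */
                uint32_t tag   = addr / NUM_SLOTS;   /* cache tag   = address / cache size */

                if (cache[index].valid && cache[index].tag == tag)
                    return cache[index].data;        /* HIT: the cache had what we wanted  */

                /* MISS: go to memory as usual, then remember the value for next time */
                cache[index].valid = true;
                cache[index].tag   = tag;
                cache[index].data  = memory[addr];
                return cache[index].data;
            }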

- What if 2 processes want different things stored in the same cache slot, so they keep kicking each other's data out?
    - Well, that's the idea behind SET ASSOCIATIVITY: instead of pure direct-mapping, we split the cache into a bunch of "single entry" caches that can each hold a value from ANY address
        - The trouble now is that cache retrieval is more complicated: since we don't know which of those caches the value is in, we have to check ALL of them simultaneously and then MUX the correct value through (there's a rough sketch of this after the list)
        - This is annoying to build, of course
            - We won't be focusing on this too much in class, but make sure you're aware it exists, and know the tradeoffs of using it
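
        - (Rough, made-up C sketch of that lookup - in real hardware all the comparisons happen at once, but in C we just loop over every entry; since any entry can hold any address, the whole address acts as the tag:)

            #include <stdbool.h>
            #include <stdint.h>

            #define NUM_ENTRIES 8                    /* made-up size */

            struct assoc_entry {
                bool     valid;
                uint32_t addr;                       /* full address doubles as the tag */
                uint32_t data;
            };

            static struct assoc_entry assoc_cache[NUM_ENTRIES];

            /* Returns true on a hit and fills *out; on a miss the caller goes to
             * memory as usual (hardware would do all the comparisons in parallel
             * and MUX the matching entry's data through) */
            bool assoc_lookup(uint32_t addr, uint32_t *out)
            {
                for (int i = 0; i < NUM_ENTRIES; i++) {
                    if (assoc_cache[i].valid && assoc_cache[i].addr == addr) {
                        *out = assoc_cache[i].data;  /* HIT */
                        return true;
                    }
                }
                return false;                        /* MISS */
            }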

- Alright, with that being said, we'll end a bit early today - have a wonderful 4th of July!