Tuesday, 10 December 2013

Memory Organization (Direct Mapped Cache / Accessing Cache / Cache Misses) by SONG CHONG HARN B031310216

Direct Mapped Cache

• Direct-mapped caches are the opposite extreme from fully associative caches
• In a direct-mapped cache, each memory address can only be stored in one location in the cache
• When a memory operation is sent to a direct-mapped cache, a subset of the bits in the address is used to select the line in the cache that may contain the address
• Another subset of the bits is used to select the byte within a cache line that the address points to
• In general, the n lowest-order bits in the address determine the position of the address within its cache line, where n is the base-2 logarithm of the number of bytes in the line
• The m next higher-order bits, where m is the base-2 logarithm of the number of lines in the cache, select the line in which the address may be stored


The cache line number is calculated using the following function:

i = j modulo m

where
i = cache line number
j = main memory block number (not the full address)
m = number of lines in the cache
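
As a sketch, the mapping function can be written directly in C; when the number of lines m is a power of two (as it always is in real hardware), the modulo is just a mask of the low-order bits of the block number. The names here are illustrative, not from any particular codebase.

#include <stdio.h>
#include <stdint.h>

/* i = j modulo m: the cache line for main memory block j. */
static uint32_t cache_line(uint32_t j, uint32_t m)
{
    return j % m;    /* equals j & (m - 1) when m is a power of two */
}

int main(void)
{
    uint32_t m = 1024;                       /* assume 1024 lines */
    /* Blocks 5, 5 + m and 5 + 2m all land on line 5. */
    printf("%u %u %u\n",
           cache_line(5, m), cache_line(5 + m, m), cache_line(5 + 2 * m, m));
    return 0;
}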

Each main memory address can be divided into three fields:
• The least significant w bits identify a unique word within a block
• The remaining s bits specify which block in memory; these are divided into two fields:
  • The least significant r bits of these s bits identify which line in the cache the block maps to
  • The most significant s − r bits form the tag, which uniquely identifies the block among all the blocks that map to the same cache line
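
A minimal sketch of this field extraction in C, treating addresses as word addresses for simplicity and assuming 4-word blocks (w = 2) and 1024 lines (r = 10); the constants are illustrative only:

#include <stdio.h>
#include <stdint.h>

#define W 2     /* word-within-block bits: 4-word blocks (assumed) */
#define R 10    /* line-index bits: 1024 cache lines (assumed)     */

int main(void)
{
    uint32_t addr = 0x12345678;   /* an arbitrary word address */

    uint32_t word = addr & ((1u << W) - 1);            /* low w bits     */
    uint32_t line = (addr >> W) & ((1u << R) - 1);     /* next r bits    */
    uint32_t tag  = addr >> (W + R);                   /* top s - r bits */

    printf("word = %u, line = %u, tag = 0x%X\n", word, line, tag);
    return 0;
}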





Direct Mapping Cache Line Table

Cache line    Main memory blocks held
0             0, m, 2m, 3m, …, 2^s − m
1             1, m+1, 2m+1, …, 2^s − m + 1
…             …
m − 1         m − 1, 2m − 1, 3m − 1, …, 2^s − 1

Direct Mapping Pros & Cons
• Simple
• Inexpensive
• Fixed location for a given block: if a program repeatedly accesses two blocks that map to the same line, the miss rate becomes very high (thrashing), as the example below shows
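
For illustration, here is a made-up access pattern that thrashes a hypothetical 16 KiB direct-mapped cache (1024 lines × 16-byte blocks): two addresses exactly one cache-size apart share a line, so they evict each other on every iteration. On a real machine, measuring this loop would show the slowdown.

#include <stdio.h>
#include <stdlib.h>

/* Assumed geometry: 1024 lines x 16-byte blocks = 16 KiB cache. */
#define CACHE_BYTES (1024 * 16)

int main(void)
{
    char *buf = malloc(2 * CACHE_BYTES);
    if (!buf) return 1;

    volatile char *a = buf;                /* maps to some line i        */
    volatile char *b = buf + CACHE_BYTES;  /* same index bits: line i too */

    long sum = 0;
    for (int i = 0; i < 1000000; i++) {
        sum += *a;    /* miss: evicts b's block */
        sum += *b;    /* miss: evicts a's block */
    }
    printf("%ld\n", sum);
    free(buf);
    return 0;
}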


Accessing Cache

Total number of bits needed for a cache

Assume a 32-bit byte address, a direct-mapped cache with 2^n blocks, and a block size of 2^m 32-bit words (note that n and m here are the base-2 logarithms of the block count and the words per block, not the quantities used in the mapping function above).

The size of the tag field is
32 − (n + m + 2)
since n bits index the block, m bits select the word within the block, and 2 bits select the byte within the word.

The total number of bits in a direct-mapped cache is
2^n × (block size + tag size + valid field size)

Block size = 2^m words (2^(m+5) bits), plus 1 bit for the valid field, so the total number of bits in the cache is
2^n × (2^m × 32 + (32 − n − m − 2) + 1)
= 2^n × (2^m × 32 + 31 − n − m)

Example
How many total bits are required for a direct-mapped cache with 16 KiB of data and 4-word blocks, assuming a 32-bit address?
16 KiB is 4K (2^12) words.
With a block size of 4 words (2^2), there are 1024 (2^10) blocks.
Each block has 4 × 32 = 128 bits of data, plus a tag of 32 − 10 − 2 − 2 = 18 bits, plus a valid bit.
Total cache size = 2^10 × (4 × 32 + (32 − 10 − 2 − 2) + 1)
                 = 2^10 × 147
                 = 147 Kbits
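
As a quick sanity check, here is the same arithmetic as a small C program; cache_bits is an illustrative helper, not a standard routine:

#include <stdio.h>

/* Total bits in a direct-mapped cache with 2^n blocks of 2^m words,
 * 32-bit addresses: 2^n * (2^m * 32 + (32 - n - m - 2) + 1). */
static long cache_bits(int n, int m)
{
    long blocks = 1L << n;
    long data   = (1L << m) * 32;       /* data bits per block */
    long tag    = 32 - n - m - 2;       /* tag bits            */
    return blocks * (data + tag + 1);   /* +1 for valid bit    */
}

int main(void)
{
    /* 16 KiB of data, 4-word blocks: n = 10, m = 2 */
    printf("%ld bits = %ld Kbits\n",
           cache_bits(10, 2), cache_bits(10, 2) / 1024);
    /* prints 150528 bits = 147 Kbits */
    return 0;
}

147 Kbits is about 18.4 KB, so the tags and valid bits add roughly 15% of overhead on top of the 16 KiB of data.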

Cache Misses

A cache miss is a failed attempt to read or write a piece of data in the cache, which results in a main memory access with much longer latency. There are three kinds of cache misses: instruction read miss, data read miss, and data write miss.

A cache read miss from an instruction cache generally causes the most delay, because the processor, or at least the thread of execution, has to wait (stall) until the instruction is fetched from main memory.

A cache read miss from a data cache usually causes less delay, because instructions not dependent on the cache read can be issued and continue execution until the data is returned from main memory, and the dependent instructions can resume execution.

A cache write miss to a data cache generally causes the least delay, because the write can be queued and there are few limitations on the execution of subsequent instructions. The processor can continue until the queue is full.

Since the processor has to wait for the data to be fetched from the next cache level or from main memory before it can continue to execute, cache misses directly influence the performance of the application.

It is hard to tell from just the number of misses if cache misses are causing performance problems in an application. The same number of misses will cause a much greater relative slowdown in a short-running application than in a long-running one.

If we know the miss rate for reads in a cache memory, we can calculate the number of read-stall cycles as follows:

Read-stall cycles = (Reads / Program) × Read miss rate × Read miss penalty

For writes, the expression is similar, except that the effect of the write buffer must be added in:

Write-stall cycles = (Writes / Program) × Write miss rate × Write miss penalty + Write buffer stalls

In many cache memory systems, the penalties are the same for a cache read or write. Here we can use a single miss rate and miss penalty:

Memory-stall cycles = (Memory accesses / Program) × Miss rate × Miss penalty

Example:
Assume a cache miss rate of 5% (a hit rate of 95%), a cache memory with a 20 ns cycle time, and main memory with a 150 ns cycle time. We can calculate the average memory access time as

(1 − 0.05) × 20 ns + 0.05 × 150 ns = 26.5 ns
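
For reference, the same calculation as a C sketch; the variable names are mine:

#include <stdio.h>

int main(void)
{
    double miss_rate = 0.05;     /* 5% of accesses miss     */
    double cache_ns  = 20.0;     /* cache cycle time        */
    double memory_ns = 150.0;    /* main memory cycle time  */

    /* Hits pay the cache time, misses pay the memory time:
     * 0.95 * 20 ns + 0.05 * 150 ns = 19 + 7.5 = 26.5 ns.   */
    double avg = (1.0 - miss_rate) * cache_ns + miss_rate * memory_ns;
    printf("average access time = %.1f ns\n", avg);
    return 0;
}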


Write-Through

Write through is a storage method in which data is written into the cache and the corresponding main memory location at the same time. The cached data allows for fast retrieval on demand, while the same data in main memory ensures that nothing will get lost if a crash, power failure, or other system disruption occurs.

Although write through minimizes the risk of data loss, every write operation must be done twice, and this redundancy takes time. The active application program must wait until each block of data has been written into both the main memory and the cache before starting the next operation. The "data insurance" therefore comes at the expense of system speed.

Write through is the preferred method of data storage in applications where data loss cannot be tolerated, such as banking and medical device control. In less critical applications, and especially when data volume is large, an alternative method called write back accelerates system performance because updates are normally written exclusively to the cache, and are backed up in the main memory only at specified intervals or under certain conditions.

Write-Back

Write back is a storage method in which data is written into the cache every time a change occurs, but is written into the corresponding location in main memory only at specified intervals or under certain conditions.

When a data location is updated in write back mode, the data in cache is called fresh, and the corresponding data in main memory, which no longer matches the data in cache, is called stale. If a request for stale data in main memory arrives from another application program, the cache controller updates the data in main memory before the application accesses it.

Write back optimizes the system speed because it takes less time to write data into cache alone, as compared with writing the same data into both cache and main memory. However, this speed comes with the risk of data loss in case of a crash or other adverse event.

Write back is the preferred method of data storage in applications where occasional data loss events can be tolerated. In more critical applications such as banking and medical device control, an alternative method called write through practically eliminates the risk of data loss because every update gets written into both the main memory and the cache. In write through mode, the main memory data always stays fresh.

• Write-through: the information is written both to the block in the cache and to the block in the next lower level of the memory hierarchy
  • Write-through is usually combined with a write buffer, so that waiting on writes to lower-level memory can be eliminated (as long as the write buffer doesn't fill)
• Write-back: the information is written only to the block in the cache; the modified cache block is written to main memory only when it is replaced
  • Needs a dirty bit to keep track of whether each block is clean or dirty (see the sketch below)
• Pros and cons of each:
  • Write-through: read misses don't result in writes (so it is simpler and cheaper)
  • Write-back: repeated writes to a block require only one write to the lower level
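
A minimal sketch of both write policies for a single cache line. The names (cache_line_t, memory_write) are made up for illustration, memory_write is only a stub for the next level of the hierarchy, and the read and replacement paths are omitted.

#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 16

typedef struct {
    uint32_t tag;
    int      valid;
    int      dirty;                  /* only needed by write-back */
    uint8_t  data[BLOCK_BYTES];
} cache_line_t;

/* Stand-in for writing a block to the next lower level (hypothetical). */
static void memory_write(uint32_t tag, const uint8_t *data)
{
    (void)tag; (void)data;
}

/* Write-through: update the cache block and lower-level memory together. */
static void write_through(cache_line_t *line, const uint8_t *block)
{
    memcpy(line->data, block, BLOCK_BYTES);
    memory_write(line->tag, line->data);   /* every write also goes below */
}

/* Write-back: update only the cache and mark the block dirty. */
static void write_back(cache_line_t *line, const uint8_t *block)
{
    memcpy(line->data, block, BLOCK_BYTES);
    line->dirty = 1;                       /* memory is now stale */
}

/* On replacement, a dirty block must be flushed before it is discarded. */
static void evict(cache_line_t *line)
{
    if (line->valid && line->dirty)
        memory_write(line->tag, line->data);
    line->valid = line->dirty = 0;
}

The dirty bit is what lets write-back coalesce repeated writes: only the final contents of the block reach memory, at eviction time.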
