Lab Assignment 1: Cache Memories

Objective

The purpose of this lab is to understand how cache memories work and to gain insight into the trade-offs involved in designing systems with cache memories.

Time allocation

4 hours (2 lab sessions) are allocated for this lab.

Preparation

You should review the following resources before you start working on this lab:

  • Lab 0: Using the Toolset
  • Lecture notes about cache memories
  • Chapter 4 (on cache memories) in the course book
  • This link briefly describes different caching policies. It also provides tools to help visualize different cache structures.

Assignments

  1. Cache basics
    Solve the following problems and include the solution in the submitted report.

  2. Locality of data
    This assignment requires you to use sim-cache for the simulations on two different architectures.
    You need to run two test programs (test1.ss, test2.ss) on each of the two architectures.

    The source code and the binaries of the test programs have the suffixes .c and .ss, respectively. The configuration files for the simulations have the suffix .cfg. The source code, the test programs and the configuration files are already available in the simplescalar/cde-root directory, which you copied in Lab 0.

    To complete this assignment, please follow the instructions listed below:
    • Look into the configuration files, cache1.cfg and cache2.cfg, used for the simulations.
      Compare the caches in the configuration files.
    • Report which fragments of the source code (in both test programs) exhibit spatial and temporal locality of the data.
    • Consider the nested loops in both programs (i.e., lines 22-26 in test1.c and lines 22-27 in test2.c) and all four combinations of test program and cache configuration (i.e., test1.c and test2.c, each with cache1.cfg and cache2.cfg). For each combination, report the number of times the variable "sum" is incremented between two consecutive cache misses, that is, the number of cache hits between two consecutive misses. Assume that the first access to the cache results in a miss and that an int is 4 bytes. Then find the combinations of test program and cache configuration with the best and the worst miss ratio.
      For this part, assume that only the array elements are stored in the cache.
    • Run simulations for each test program (test1.ss and test2.ss) using both configuration files (cache1.cfg and cache2.cfg). To run simulations with a given configuration, follow the instructions in the Preparation material. Find the combinations of test program and cache configuration with the best and the worst miss ratio. Verify that your simulation results agree with your answer to the previous question. [Note that the miss ratios obtained from the simulations will not be identical to those from your analysis.]
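The counting of hits between two consecutive misses can be cross-checked with a small script. The following is a minimal sketch, not part of the lab material: it assumes a direct-mapped cache that holds only the array elements, and the line size, number of sets and address traces in the usage note are hypothetical values, not the contents of cache1.cfg, cache2.cfg, test1.c or test2.c.

```python
def hits_between_misses(addresses, line_size, num_sets):
    """Simulate a direct-mapped cache (an assumption of this sketch).

    Returns a list with one entry per miss: the number of hits that
    follow that miss before the next miss (or the end of the trace).
    """
    tags = [None] * num_sets          # one stored tag per set
    runs = []
    for addr in addresses:
        block = addr // line_size     # memory block holding this byte
        index = block % num_sets      # set the block maps to
        tag = block // num_sets       # identifies the block within the set
        if tags[index] == tag:
            runs[-1] += 1             # hit: extend the current run
        else:
            tags[index] = tag         # miss: load the line, start a new run
            runs.append(0)
    return runs


def miss_ratio(runs):
    """Misses divided by total accesses for a list of hit runs."""
    accesses = len(runs) + sum(runs)
    return len(runs) / accesses
```

For example, with hypothetical 16-byte lines and 8 sets, a sequential walk over 4-byte ints, `hits_between_misses([4 * i for i in range(64)], 16, 8)`, gives three hits after every miss, whereas a walk with a 64-byte stride, `[64 * i for i in range(16)]`, misses on every access. Substitute the line size, number of sets and access order you read from the actual configuration files and source code.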
  3. Evaluation of cache configurations

    This assignment requires you to use sim-cheetah in order to evaluate the performance of several cache configurations.

    • Create a configuration file for sim-cheetah (by first dumping the default configuration file of sim-cheetah, as explained in Lab 0) so that the simulations cover the following cache configurations:
      • caches with a number of sets between 128 and 4096,
      • caches with a level of associativity between 1 and 8,
      • caches with a line size of 32 bytes and an LRU replacement policy.
      ATTENTION: For arguments whose comment says "log base 2", enter the base-2 logarithm of the value rather than the value itself. For example, to set the line size of the cache to 64 bytes, enter 6 instead of 64.

      Using the go.ss benchmark, run the simulation for the three cases in which the cache stores only data (data cache), only instructions (instruction cache), or both (unified cache). To select which of these the cache stores, modify the -refs parameter (# reference stream to analyze, i.e., {inst|data|unified}) in the configuration file.

      Here is an example of running go.ss on SimpleScalar:

        ./sim-cheetah -config config-file go.ss 3 7
      where config-file is the configuration file to use and go.ss is a simple program that makes the computer play the game "Go" against itself. The first parameter (3) indicates the level of the player, while the second parameter (7) sets the size of the Go board.
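To make the "log base 2" rule concrete, the sketch below lists the configurations covered by the required ranges together with their logarithms and total cache sizes. It is an illustration only: it assumes that the number of sets and the associativity are stepped in powers of two, and the helper names are hypothetical, not sim-cheetah options.

```python
LINE_SIZE = 32  # bytes, as required by the assignment


def log2_exact(value):
    """Return log2(value); raise ValueError if value is not a power of two."""
    exp = value.bit_length() - 1
    if value <= 0 or (1 << exp) != value:
        raise ValueError(f"{value} is not a power of two")
    return exp


def sweep(min_sets=128, max_sets=4096, min_assoc=1, max_assoc=8):
    """List (sets, associativity, cache size in bytes) for the sweep.

    Assumes both parameters double at each step (power-of-two values).
    """
    rows = []
    sets = min_sets
    while sets <= max_sets:
        assoc = min_assoc
        while assoc <= max_assoc:
            rows.append((sets, assoc, sets * assoc * LINE_SIZE))
            assoc *= 2
        sets *= 2
    return rows


if __name__ == "__main__":
    print("log2(line size) =", log2_exact(LINE_SIZE))
    for sets, assoc, size in sweep():
        print(f"sets={sets} (log2 {log2_exact(sets)}), "
              f"assoc={assoc}, size={size} bytes")
```

Under these assumptions the values to enter for the "log base 2" arguments are 5 for the 32-byte line size and 7 to 12 for 128 to 4096 sets; check the comments in your dumped configuration file to see which arguments actually expect a logarithm.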

    • For each of the three simulations, extract the results from the output files and create a plot that shows how the miss ratio depends on the associativity.
    • Answer the following questions.
      • How does the miss ratio depend on the associativity and on the number of sets?
      • Assume that "go.ss" performs almost the same number of instruction and data accesses. Further assume that the budget allows for only one cache (of the same size in each case). Based on your plots, explain which cache variant (instruction, data or unified) you would choose for running this application.
      • Consider your plot for the instruction cache. Assume that a designer can spend at most 8000 SEK on an instruction cache and that each byte of the instruction cache costs 0.01 SEK. With this budget and cost, the largest affordable instruction cache is 800000 bytes. For the given maximum budget and cost per byte, find the exact cache configuration that minimizes the following cost function:

        cost function = (100000 * miss ratio) + (0.01 * cache size in bytes)
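Once the miss ratios have been read from the simulation output, the search for the cheapest configuration can be automated. The sketch below is one possible way to do it: it assumes the cache size is sets × associativity × line size with 32-byte lines, and the function names and the `results` dictionary layout are hypothetical.

```python
def cost(miss_ratio, cache_size_bytes):
    """Cost function from the assignment."""
    return 100000 * miss_ratio + 0.01 * cache_size_bytes


def best_config(results, line_size=32, budget_sek=8000, sek_per_byte=0.01):
    """Return (cost, sets, assoc, size) of the cheapest affordable cache.

    `results` maps (sets, assoc) -> measured miss ratio. Configurations
    whose size exceeds the budget are skipped. Returns None if no
    configuration fits the budget.
    """
    max_bytes = budget_sek / sek_per_byte
    best = None
    for (sets, assoc), miss_ratio in results.items():
        size = sets * assoc * line_size
        if size > max_bytes:
            continue                      # too expensive for the budget
        c = cost(miss_ratio, size)
        if best is None or c < best[0]:
            best = (c, sets, assoc, size)
    return best
```

A call such as `best_config({(128, 1): m1, (128, 2): m2})` would use your own measured miss ratios `m1`, `m2`, and so on. Note that, with 32-byte lines, a cache with 4096 sets and associativity 8 holds 1048576 bytes and therefore exceeds the 800000-byte limit.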

What to report

  • Answers (with sufficient explanations) to each of the previous problems. You may use figures when necessary.

  • Page responsible: Zebo Peng
    Last updated: 2023-09-27