Homework 3 -- LRU Approximations

Overview

Due: EOD Mar. 3rd

The goals of this homework are:

  1. Get experience hacking some piece of gem5
  2. Practice considering tradeoffs in cache design
  3. Suffer through simulation with more complex workloads (useful for projects!).

Setup

In this lab, we’ll investigate some approximations of the LRU cache replacement policy that use less bits of information per-set than full-LRU.

Step 1: Implement Two New LRU Approximations:

Gem5 implements a pseudo-LRU replacement policy based on trees (tree_plru_rp.hh). Try to implement two additional policies:

This tutorial will tell you a bit about how to accomplish this within gem5.

Step 2: Beyond LRU

LRU is not the best policy for all workloads. One example is when the working set is just slightly bigger than cache capacity: imagine a simple case where you’re streaming through a vector may times. LRU will always keep the more recently used data, so by the time you get to the beginning again, the beginning of the vector is evicted.

Look through the available cache replacement policies in gem5 (gem5/src/mem/cache/replacement_policies/), and select one that is thrash resistant.

Step 3: Running With SPEC

You can test your policies on the previous microbenchmarks, but the point of this assignment is to get some experience with a “better” benchmark. Specifically, we’ll use SPEC 2017.

Follow the directions here to run SPEC 2017.

Step 3: Architectural exploration

Suppose you are designing the L1D and L2 data caches of a new processor, based on gem5’s out-of-order O3CPU. You’re trying to decide whether it’s worth it to implement a complex policy or higher associativity, or both.

Run the simulations of the discussed policies (NMRU,NRU,PLRU,LRU,THRASH) on the SPEC workloads, and assume you have a 64KB L1D, and a 2MB L2. Assume both L1D and L2 policies are the same, and that you are using 8-way set associativity by default. Please use these additional parameters: L1I + LRU, 2GHz frequency, 8-issue OOO core.

Now Answer the following questions:

Your report should describe the configurations you simulated, and use some charts/graphs to back up your answers and analysis. Feel free to choose the experiments as appropriate, or use microbenchmarks if you like.

What to Hand In

Please turn in to Canvas separately:

  1. A PDF of the report
  2. A zip file containing the config.ini and stats.txt for the simulations in gem5.

How we will grade this:

50 points for completing the implementation/experiments, and 50 points for the analysis.

Citation

This homework is based on gem5 101 homework 5, but we use cooler workloads now. Thanks Zhengrong!