Brief Note on Simulating SPEC2017 on GEM5

This is a brief note on how to simulate SPEC2017, a standard (if complicated) CPU benchmark, on gem5.

Options

SPEC2017 has complicated compile scripts. So, we provide two paths forward, one easy, one not so much.

Option 1: Use Prebuilt SPEC 2017 and Use Provided Script

We provide a prebuilt version of SPEC2017 and a script to run the simulation in /usr/eda/CS251A/zhengrong/spec2017. The following commands copy the folder to your docker root directory (assuming you followed the docker instructions and mounted /usr/eda/CS251A/your_username to /root in docker) and start the simulation.

# log into tetracosa first, and then copy.
cp -r /usr/eda/CS251A/zhengrong/spec2017 /usr/eda/CS251A/<YOUR USERNAME>/
# log into docker.
docker exec -ti cs251a-<YOUR USERNAME> /bin/bash
# there should be a spec2017 in /root
cd /root/spec2017
# Important: Change spec2017's makefile to use your configuration script. (this one uses the old se.py)
vim Makefile #etc...
# build all benchmarks
make buildall
# simulate with gem5.
make simall

After the simulation you should see something like:

Exiting @ tick 159318111500 because a thread reached the max instruction count

This is because we limit the number of simulated instructions (otherwise it takes forever to finish). The default simall takes about 3 hours to finish on tetracosa. You can modify the variable SIM_INSTS in spec2017/Makefile to adjust the simulation time (currently 50 million instructions).
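When you run many benchmarks, it can help to check each log for this exit message automatically. The following is a minimal sketch, assuming the message format shown above; the function name `parse_exit_line` is hypothetical and is not part of gem5 or the provided Makefile.

```python
import re

# Matches the gem5 exit message shown above, e.g.
# "Exiting @ tick 159318111500 because a thread reached the max instruction count"
EXIT_RE = re.compile(r"Exiting @ tick (\d+) because (.+)")


def parse_exit_line(log_text):
    """Return (tick, reason) for the last exit message in log_text, or None."""
    matches = EXIT_RE.findall(log_text)
    if not matches:
        return None
    tick, reason = matches[-1]
    return int(tick), reason.strip()


# The sample is the exit line quoted in this note.
sample = (
    "Exiting @ tick 159318111500 because a thread reached "
    "the max instruction count"
)
print(parse_exit_line(sample))
```

If the reported reason is something other than the max instruction count, the run probably ended before reaching SIM_INSTS, so that benchmark's output is worth a closer look.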

Notes:

You will need to read through spec2017/Makefile to understand what’s going on and modify the simulation options to do your own architectural exploration. As noted, this version uses the old se.py. You’ll need to substitute whatever new configuration approach you are using in your work (or use the deprecated se.py).
The simulation results are located in the spec2017 folder. For example, for lbm_s, the results can be found at:
```
cd /root/spec2017/benchspec/CPU/619.lbm_s/run/run_base_refspeed_mytest-m64.0000/m5out
```
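gem5 writes its statistics to a stats.txt file inside m5out, one statistic per line: a name, a value, and a `#` description. Below is a minimal sketch for loading them into a dictionary. The function name `read_stats` is hypothetical, and the stat names and values in the sample are placeholders for illustration only (names differ across gem5 versions and configurations).

```python
def read_stats(text):
    """Parse stats.txt content into {name: float}, skipping non-numeric rows."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing description
        if not line or line.startswith("-"):  # blank lines and Begin/End banners
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # value is not a plain number; ignore this row
    return stats


# Placeholder content: stat names and values are assumptions, not measured results.
sample = """
---------- Begin Simulation Statistics ----------
simInsts                  50000000      # Number of instructions simulated
system.cpu.ipc            1.250000      # placeholder IPC entry
---------- End Simulation Statistics   ----------
"""
stats = read_stats(sample)
print(stats["simInsts"], stats["system.cpu.ipc"])
```

For a real run, pass the file contents instead of the sample, e.g. `read_stats(open("m5out/stats.txt").read())`, and look up the names that actually appear in your file.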
By default, this script is set up to fast-forward, i.e. to use a simpler CPU to quickly skip some of the less important instructions. This may cause incompatibilities if you set CPU parameters (e.g. issue width) directly on system.cpu[i] in se.py: gem5 will complain that the CPU is an AtomicSimpleCPU, which has no issue width to set. The reason is that the SPEC2017 script fast-forwards through the initialization phase using AtomicSimpleCPU, so system.cpu[i] is not the detailed CPU. The actual CPU (O3CPU) is created in Simulation.py. Specifically, in run() there is switch_cpus[i], which is the actual O3CPU; set the issue width and other parameters there. You can always print() in the Python script to verify that you have set the parameter correctly.
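The fast-forward pitfall can be illustrated without gem5 itself. The sketch below uses two stand-in classes (hypothetical, not gem5's real CPU classes) to show why a parameter such as `issueWidth` has to be set on the switched-in detailed CPU rather than on the fast-forward CPU. The helper `set_cpu_params` is also hypothetical; in a real configuration the same idea would apply to the switch_cpus list in Simulation.py.

```python
class FakeAtomicCPU:
    """Stand-in for the simple fast-forward CPU: it has no width parameters."""

    def __init__(self, name):
        self.name = name


class FakeO3CPU:
    """Stand-in for the detailed out-of-order CPU."""

    def __init__(self, name):
        self.name = name
        self.issueWidth = 8  # placeholder default for this stand-in only


def set_cpu_params(cpus, **params):
    """Set each parameter on every CPU that has it; return the skipped pairs."""
    skipped = []
    for cpu in cpus:
        for key, value in params.items():
            if hasattr(cpu, key):
                setattr(cpu, key, value)
            else:
                skipped.append((cpu.name, key))
    return skipped


ff_cpus = [FakeAtomicCPU("cpu0")]            # plays the role of system.cpu
switch_cpus = [FakeO3CPU("switch_cpu0")]     # plays the role of switch_cpus

skipped_ff = set_cpu_params(ff_cpus, issueWidth=4)
skipped_o3 = set_cpu_params(switch_cpus, issueWidth=4)
print(skipped_ff)                  # the simple CPU has no issueWidth
print(skipped_o3)                  # nothing skipped on the detailed CPU
print(switch_cpus[0].issueWidth)   # confirm the value, as the note suggests
```

Printing the value after setting it mirrors the print() check suggested above and makes it obvious when a parameter silently ended up on the wrong CPU.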

Option 2: Download yourself and install

If you want to manually do this without a script, we provide the following more detailed steps.

First, download and install SPEC 2017.

(SPEC 2017 Download – Licensed for UCLA only, technically for tetracosa)

From here, the basic workflow is to compile it, do a fake run to get the arguments for the binary, and finally simulate it in gem5. These are by no means the official instructions, nor are they guaranteed to work on your machine. You can also follow the instructions on the official SPEC2017 website.

Compile SPEC2017

First, go to the SPEC2017 folder and set up the environment. This gives you many useful commands to navigate through SPEC2017, compile it, and run it.

> source shrc

Here we use lbm_s as an example. For other workloads it should be similar. Now let’s go to where lbm_s is:

> go lbm_s

The first thing to do is a fake run. This lets the build system set up all the folders, inputs, and so on. You can also do a full run, which takes much longer to finish but is a good way to verify that your SPEC2017 works on your native machine.

# Remove existing build
> rm -r build
> runcpu --fake --config gcc-linux-x86 lbm_s

This should create the build and run folders. Now let’s compile the program:

> cd build/build_base_mytest-m64.0000
> specmake

This should compile the program and produce an lbm_s binary in the folder.

Simulate it in GEM5

First we need to get the arguments to run the binary. Go to the run directory.

> go lbm_s 
> cd run/run_base_refspeed_mytest-m64.0000
> specinvoke -n
../run_base_refspeed_mytest-m64.0000/lbm_s_base.mytest-m64 2000 reference.dat 0 0 200_200_260_ldc.of > lbm.out 2>> lbm.err

This gives us the command-line arguments to run lbm_s (everything between the binary path and the output redirections):

2000 reference.dat 0 0 200_200_260_ldc.of
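For other workloads, separating the arguments from the binary path and the redirections by hand gets tedious. The following is a minimal sketch that does the split, assuming the invocation is a single line in the format printed above; the function name `split_invocation` is hypothetical, and real specinvoke output may contain additional lines that you would need to filter out first.

```python
import shlex

REDIRECTS = (">", ">>", "2>", "2>>", "<")


def split_invocation(line):
    """Split one command line into (binary, args, redirects)."""
    tokens = shlex.split(line)
    binary, rest = tokens[0], tokens[1:]
    args, redirects = [], []
    i = 0
    while i < len(rest):
        tok = rest[i]
        if tok in REDIRECTS and i + 1 < len(rest):
            redirects.append((tok, rest[i + 1]))  # operator and its target file
            i += 2
        else:
            args.append(tok)
            i += 1
    return binary, args, redirects


# The invocation line printed by specinvoke -n in this note.
line = (
    "../run_base_refspeed_mytest-m64.0000/lbm_s_base.mytest-m64 "
    "2000 reference.dat 0 0 200_200_260_ldc.of > lbm.out 2>> lbm.err"
)
binary, args, redirects = split_invocation(line)
print(" ".join(args))
```

The joined `args` string is what goes into the `--options` flag of se.py.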

Now simulate it in gem5. This command starts simulating lbm_s using AtomicSimpleCPU. You can also specify other CPU types and add caches, and there are detailed instructions here.

> /where/gem5/is/build/X86/gem5.opt \
> /where/gem5/is/configs/example/se.py \
> --cmd=../../build/build_base_mytest-m64.0000/lbm_s \
> --options="2000 reference.dat 0 0 200_200_260_ldc.of" \
> --mem-size=8GB

For example, to simulate using the O3 CPU:

> /where/gem5/is/build/X86/gem5.opt \
> /where/gem5/is/configs/example/se.py \
> --cmd=../../build/build_base_mytest-m64.0000/lbm_s \
> --options="2000 reference.dat 0 0 200_200_260_ldc.of" \
> --mem-size=8GB \
> --cpu-type=DerivO3CPU \
> --caches --l2cache \
> --l1d_size=32kB --l1i_size=32kB --l2_size=512kB

Finally, here is how you can fast forward using AtomicSimpleCPU for 1 million instructions and then switch to DerivO3CPU:

> /where/gem5/is/build/X86/gem5.opt \
> /where/gem5/is/configs/example/se.py \
> --cmd=../../build/build_base_mytest-m64.0000/lbm_s \
> --options="2000 reference.dat 0 0 200_200_260_ldc.of" \
> --mem-size=8GB \
> --cpu-type=DerivO3CPU \
> --caches --l2cache \
> --l1d_size=32kB --l1i_size=32kB --l2_size=512kB \
> --fast-forward=1000000

The same warning about changing parameters with fast-forwarding and se.py from the previous section applies here as well.
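The three command lines above differ only in a few flags, so they can be generated from one helper when scripting many runs. This is a minimal sketch that uses only the flags shown in this note; the function name `gem5_command` and its parameters are hypothetical, and `/where/gem5/is` is the same placeholder path used above.

```python
import shlex


def gem5_command(gem5_root, binary, options, cpu_type=None,
                 caches=False, fast_forward=None, mem_size="8GB"):
    """Build the argv list for an se.py run using the flags from this note."""
    cmd = [
        f"{gem5_root}/build/X86/gem5.opt",
        f"{gem5_root}/configs/example/se.py",
        f"--cmd={binary}",
        f"--options={options}",  # one argv element, so no extra quoting needed
        f"--mem-size={mem_size}",
    ]
    if cpu_type is not None:
        cmd.append(f"--cpu-type={cpu_type}")
    if caches:
        cmd += ["--caches", "--l2cache",
                "--l1d_size=32kB", "--l1i_size=32kB", "--l2_size=512kB"]
    if fast_forward is not None:
        cmd.append(f"--fast-forward={fast_forward}")
    return cmd


lbm_binary = "../../build/build_base_mytest-m64.0000/lbm_s"
lbm_options = "2000 reference.dat 0 0 200_200_260_ldc.of"

atomic = gem5_command("/where/gem5/is", lbm_binary, lbm_options)
o3_ff = gem5_command("/where/gem5/is", lbm_binary, lbm_options,
                     cpu_type="DerivO3CPU", caches=True, fast_forward=1000000)

# Print a shell-ready version of the fast-forward command.
print(shlex.join(o3_ff))
```

The returned list can be passed directly to `subprocess.run(cmd)` from the run directory, which avoids shell quoting problems with the space-separated `--options` value.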
