Institute for Computing Systems Architecture - School of Informatics - University of Edinburgh

Instruction set simulators are indispensable tools both in ASIP design space exploration and in software development and optimisation for existing platforms. A functional simulator is a focal point of the tool-flow for embedded systems and ASIC design, acting as a Golden Reference Model for the complete system. Hence, a functional simulator has three primary uses, each with distinct and sometimes conflicting requirements, as illustrated below:
[Figure: High Speed Simulation - Goals]
To meet these three requirements, the PASTA project developed a functional simulator that combines high-speed JIT compilation with cycle-accurate modelling, while still maintaining a precise model of the architectural state of the processor it simulates. It can therefore be used as a back-end target for a debugger to assist in software development, as a Golden Reference Model for our co-simulation environment, and as a source of detailed cycle counts and other performance measurements.
The use of just-in-time (JIT) dynamic binary translation (DBT) techniques allows us to create very high speed functional simulators capable of simulating an embedded system at speeds approaching (or even exceeding) real time. The simulator developed within this activity is used extensively by the PASTA team, particularly for co-simulation and the development of the CoSy compiler.

JIT Translation

The simulator operates by interpreting and profiling the target code over a short period of time, called an epoch. At the end of each epoch, frequently executed blocks of target code are translated to the host architecture, using an optimising compiler backend. The result of dynamic binary translation can be seen in the example below, which shows a single basic block (from the Linux kernel):
[Figure: High Speed Simulation - Assembly code]
Below we see the sequence of host instructions created when the target sequence is translated.
[Figure: High Speed Simulation - Translated code]
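The epoch-based scheme described in this section can be illustrated with a minimal sketch. It is not the PASTA simulator's implementation. The toy instruction set, the `PROGRAM` layout, the `epoch_length` and `hot_threshold` parameters and all function names are assumptions made for illustration. Python source generation via `exec` stands in for the optimising compiler backend that emits host code.

```python
from collections import Counter

# Hypothetical toy target program: block address -> list of instructions.
# Every block ends in a terminator that selects the next block address.
PROGRAM = {
    0: [("movi", "r0", 0), ("movi", "r1", 1000), ("jmp", 1)],
    1: [("addi", "r0", 3), ("subi", "r1", 1), ("bnez", "r1", 1, 2)],
    2: [("halt",)],
}


def interpret_block(block, regs):
    """Execute one basic block instruction by instruction.

    Returns the address of the next block, or None on halt.
    """
    for ins in block:
        op = ins[0]
        if op == "movi":
            regs[ins[1]] = ins[2]
        elif op == "addi":
            regs[ins[1]] += ins[2]
        elif op == "subi":
            regs[ins[1]] -= ins[2]
        elif op == "jmp":
            return ins[1]
        elif op == "bnez":
            return ins[2] if regs[ins[1]] != 0 else ins[3]
        elif op == "halt":
            return None
    raise ValueError("basic block has no terminator")


def translate_block(block):
    """Translate a block into a host function (here: generated Python)."""
    lines = ["def translated(regs):"]
    for ins in block:
        op = ins[0]
        if op == "movi":
            lines.append(f"    regs[{ins[1]!r}] = {ins[2]}")
        elif op == "addi":
            lines.append(f"    regs[{ins[1]!r}] += {ins[2]}")
        elif op == "subi":
            lines.append(f"    regs[{ins[1]!r}] -= {ins[2]}")
        elif op == "jmp":
            lines.append(f"    return {ins[1]}")
        elif op == "bnez":
            lines.append(
                f"    return {ins[2]} if regs[{ins[1]!r}] != 0 else {ins[3]}"
            )
        elif op == "halt":
            lines.append("    return None")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated host code
    return namespace["translated"]


def simulate(program, epoch_length=50, hot_threshold=10):
    """Interpret and profile; translate hot blocks at each epoch boundary."""
    regs = {"r0": 0, "r1": 0}
    translations = {}      # block address -> translated host function
    profile = Counter()    # interpreted executions per block, this epoch
    stats = {"interpreted": 0, "translated": 0}
    pc, executed = 0, 0
    while pc is not None:
        if pc in translations:
            pc = translations[pc](regs)
            stats["translated"] += 1
        else:
            profile[pc] += 1
            pc = interpret_block(program[pc], regs)
            stats["interpreted"] += 1
        executed += 1
        if executed % epoch_length == 0:  # end of epoch
            for addr, count in profile.items():
                if count >= hot_threshold:
                    translations[addr] = translate_block(program[addr])
            profile.clear()
    return regs, stats, translations


if __name__ == "__main__":
    print(simulate(PROGRAM))
```

In this sketch only the loop body (block 1) becomes hot enough to be translated. The entry and exit blocks run once each and stay in the interpreter, which is the intended trade-off: translation cost is paid only for code that is executed frequently.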

Results

When operating in JIT mode, using individual basic blocks as the translation unit, we see simulation rates in the range 260-730 native MIPS. The measurements shown below were taken on an Intel Xeon 5160 3.0 GHz server with 32 KB instruction and data caches and a 4 MB L2 cache per dual-core CPU.
[Figure: High Speed Simulation - Simulation speed]
The simulator has persistent translations, allowing it to learn how to speed up the simulation of each application by keeping useful translations from one run to the next. The chart below shows how the simulation rate, during the booting of a Linux kernel, increases over a sequence of 7 runs.
[Figure: High Speed Simulation - Persistence]
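One way to picture persistent translations is a cache that is written to disk at the end of a run and reloaded at the start of the next. The sketch below is an assumption about how such a store could work, not a description of the PASTA simulator: keying entries by a SHA-256 hash of the target block's bytes, the JSON file format, the `TranslationStore` class and the placeholder string used in place of real host code are all hypothetical.

```python
import hashlib
import json
import os
import tempfile


def block_key(block_bytes):
    """Identify a target block by the hash of its instruction bytes (assumed scheme)."""
    return hashlib.sha256(block_bytes).hexdigest()


class TranslationStore:
    """Hypothetical on-disk store of translations that survives between runs."""

    def __init__(self, path):
        self.path = path
        self.entries = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.entries = json.load(f)

    def lookup(self, block_bytes):
        return self.entries.get(block_key(block_bytes))

    def add(self, block_bytes, host_code):
        self.entries[block_key(block_bytes)] = host_code

    def save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.entries, f)


def run_once(store, hot_blocks):
    """Simulate one run; return how many blocks needed a fresh translation."""
    fresh = 0
    for block in hot_blocks:
        if store.lookup(block) is None:
            # Placeholder for the real translation step.
            store.add(block, "host code for " + block.hex())
            fresh += 1
    store.save()
    return fresh


def demo():
    """Two consecutive runs sharing one store file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "translations.json")
        first = run_once(TranslationStore(path), [b"\x01\x02", b"\x03\x04"])
        second = run_once(
            TranslationStore(path), [b"\x01\x02", b"\x03\x04", b"\x05"]
        )
        return first, second


if __name__ == "__main__":
    print(demo())
```

The second run translates only the block it has not seen before, so less time goes to translation and more to executing translated code, which is consistent with the rising simulation rate across runs.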
Interestingly, the simulator running on a high-end Xeon server is typically 4 times faster than a full implementation of the EnCore processor running in an FPGA, and is comparable to the real-time speed of a silicon implementation.

Publications