The optimum pipeline depth for a microprocessor
A. Hartstein, Thomas R. Puzak
Abstract
The impact of pipeline length on the performance of a
microprocessor is explored both theoretically and by simulation. An
analytical theory is presented that shows that two opposing
architectural parameters affect the optimal pipeline length: a higher
degree of instruction-level parallelism (superscalar issue) decreases the
optimal pipeline length, while a lower incidence of pipeline stalls
increases the optimal pipeline length. This theory is tested by
analyzing the optimal pipeline length for 35 applications
representing three classes of workloads. Trace tapes are collected
from SPEC95 and SPEC2000 applications, traditional (legacy)
database and on-line transaction processing (OLTP) applications,
and modern (e.g., web) applications primarily written in Java and
C++. The results show that there is a clear and significant
difference in the optimal pipeline length between the SPEC
workloads and both the legacy and modern applications. The SPEC
applications, written in C, optimize to a shorter pipeline length
than the legacy applications, largely written in assembler
language, with relatively little overlap in the two distributions.
Additionally, the optimal pipeline length distribution for the C++
and Java workloads overlaps with the legacy applications,
suggesting similar workload characteristics. These results are
explored across a wide range of superscalar processors, both
in-order and out-of-order.
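
The two opposing effects described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not necessarily the formulation used in the paper: the symbols t_p (total logic delay), t_o (per-stage latch overhead), alpha (average instructions issued per cycle), gamma (fraction of the pipeline stalled per hazard), and hazard_rate (hazards per instruction) are all hypothetical names introduced here, and the numeric values are made up for illustration.

```python
import math


def time_per_instruction(p, t_p, t_o, alpha, gamma, hazard_rate):
    """Toy estimate of average time per instruction at pipeline depth p.

    Assumed model (illustrative only):
      cycle time  = t_o + t_p / p
      busy time   = cycle / alpha                    (alpha instructions per cycle)
      stall time  = gamma * hazard_rate * p * cycle  (a hazard stalls gamma * p stages)
    """
    cycle = t_o + t_p / p
    busy = cycle / alpha
    stall = gamma * hazard_rate * p * cycle
    return busy + stall


def optimal_depth(t_p, t_o, alpha, gamma, hazard_rate):
    """Depth that minimizes the toy model, from setting dT/dp = 0."""
    return math.sqrt(t_p / (alpha * gamma * hazard_rate * t_o))


if __name__ == "__main__":
    # Made-up parameter values, chosen only to show the trends.
    base = dict(t_p=140.0, t_o=2.5, alpha=2.0, gamma=0.5, hazard_rate=0.1)
    print("baseline optimum:      ", optimal_depth(**base))
    print("wider issue (alpha=4): ", optimal_depth(**{**base, "alpha": 4.0}))
    print("fewer hazards (0.05):  ", optimal_depth(**{**base, "hazard_rate": 0.05}))
```

In this toy model, raising alpha shortens the optimal depth and lowering hazard_rate lengthens it, which matches the direction of the two effects stated in the abstract; the absolute numbers carry no significance.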