An efficient CPI stack counter architecture for superscalar processors

Osman Allam (UGent) , Stijn Eyerman (UGent) and Lieven Eeckhout (UGent)
Abstract
Cycles-Per-Instruction (CPI) stacks provide intuitive and insightful performance information to software developers. Performance bottlenecks are easily identified from CPI stacks, which hint at software changes for improving performance. Computing CPI stacks on contemporary superscalar processors is non-trivial, though, because of various overlap effects. Prior work proposed a CPI counter architecture for computing CPI stacks on out-of-order processors. The accuracy of the obtained CPI stacks was evaluated previously; however, the hardware overhead analysis was not based on a detailed hardware implementation. In this paper, we implement the previously proposed CPI counter architecture in hardware and find that the previous design can be further optimized. We propose a novel hardware- and power-efficient CPI counter architecture that reduces chip area by 44% and power consumption by 47% over the best possible prior design, while maintaining nearly the same level of performance and accuracy.
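
As a rough illustration of what a CPI stack conveys, the sketch below divides per-component cycle counts by the instruction count, so that the components add up to the overall CPI. It is not the counter architecture from the paper: the component names, the counter values, and the `cpi_stack` helper are all hypothetical, and the sketch assumes the cycles have already been attributed to components, which is the hard part on an out-of-order processor.

```python
from typing import Dict


def cpi_stack(cycle_counts: Dict[str, int], instructions: int) -> Dict[str, float]:
    """Convert per-component cycle counts into CPI components.

    Assumes each cycle has already been attributed to exactly one
    component, so the components sum to the total CPI.
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return {name: cycles / instructions for name, cycles in cycle_counts.items()}


# Hypothetical counter readings, for illustration only.
counters = {
    "base": 500_000,
    "l2_dcache_miss": 300_000,
    "branch_mispredict": 150_000,
    "icache_miss": 50_000,
}
stack = cpi_stack(counters, instructions=1_000_000)
total_cpi = sum(stack.values())

# Print the largest components first: these are the likely bottlenecks.
for name, cpi in sorted(stack.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:18s} {cpi:.2f} CPI ({cpi / total_cpi:.0%} of total)")
```

Reading the stack from the largest component down is what makes it useful to a software developer: in this made-up example, data-cache misses would be the first thing to investigate after the base component.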

Citation

Chicago
Allam, Osman, Stijn Eyerman, and Lieven Eeckhout. 2012. “An Efficient CPI Stack Counter Architecture for Superscalar Processors.” In GLSVLSI ’12: Proceedings of the Great Lakes Symposium on VLSI, 55–58. New York, NY, USA: Association for Computing Machinery (ACM).
APA
Allam, O., Eyerman, S., & Eeckhout, L. (2012). An efficient CPI stack counter architecture for superscalar processors. GLSVLSI ’12: Proceedings of the Great Lakes Symposium on VLSI (pp. 55–58). Presented at the 22nd Great Lakes Symposium on VLSI (GLSVLSI 2012), New York, NY, USA: Association for Computing Machinery (ACM).
Vancouver
1. Allam O, Eyerman S, Eeckhout L. An efficient CPI stack counter architecture for superscalar processors. GLSVLSI ’12: Proceedings of the Great Lakes Symposium on VLSI. New York, NY, USA: Association for Computing Machinery (ACM); 2012. p. 55–8.
MLA
Allam, Osman, Stijn Eyerman, and Lieven Eeckhout. “An Efficient CPI Stack Counter Architecture for Superscalar Processors.” GLSVLSI ’12: Proceedings of the Great Lakes Symposium on VLSI. New York, NY, USA: Association for Computing Machinery (ACM), 2012. 55–58. Print.
BibTeX
@inproceedings{3073169,
  abstract     = {Cycles-Per-Instruction (CPI) stacks provide intuitive and insightful performance information to software developers. Performance bottlenecks are easily identified from CPI stacks, which hint towards software changes for improving performance.
Computing CPI stacks on contemporary superscalar processors is non-trivial, though, because of various overlap effects. Prior work proposed a CPI counter architecture for computing CPI stacks on out-of-order processors. The accuracy of the obtained CPI stacks was evaluated previously; however, the hardware overhead analysis was not based on a detailed hardware implementation.
In this paper, we implement the previously proposed CPI counter architecture in hardware and we find that the previous design can be further optimized. We propose a novel hardware- and power-efficient CPI counter architecture that reduces chip area by 44\% and power consumption by 47\% over the best possible prior design, while maintaining nearly the same level of performance and accuracy.},
  author       = {Allam, Osman and Eyerman, Stijn and Eeckhout, Lieven},
  booktitle    = {GLSVLSI '12 : proceedings of the Great Lakes symposium on VLSI},
  isbn         = {9781450312448},
  language     = {eng},
  location     = {Salt Lake City, UT, USA},
  pages        = {55--58},
  publisher    = {Association for Computing Machinery (ACM)},
  title        = {An efficient CPI stack counter architecture for superscalar processors},
  url          = {http://dx.doi.org/10.1145/2206781.2206796},
  year         = {2012},
}