Ghent University Academic Bibliography

Speedup stacks: identifying scaling bottlenecks in multi-threaded applications

Stijn Eyerman (UGent), Kristof Du Bois (UGent) and Lieven Eeckhout (UGent) (2012) IEEE international symposium on performance analysis of systems and software, Proceedings. p. 145-155
abstract
Multi-threaded workloads typically show sublinear speedup on multi-core hardware, i.e., the achieved speedup is not proportional to the number of cores and threads. Sublinear scaling may have multiple causes, such as poorly scalable synchronization leading to spinning and/or yielding, and interference in shared resources such as the last-level cache (LLC) as well as the main memory subsystem. It is vital for programmers and processor designers to understand scaling bottlenecks in existing and emerging workloads in order to optimize application performance and design future hardware. In this paper, we propose the speedup stack, which quantifies the impact of the various scaling delimiters on multi-threaded application speedup in a single stack. We describe a mechanism for computing speedup stacks on a multi-core processor, and we find speedup stacks to be accurate within 5.1% on average for sixteen-threaded applications. We present several use cases: we discuss how speedup stacks can be used to identify scaling bottlenecks, classify benchmarks, optimize performance, and understand LLC performance.
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-2093717
author: Stijn Eyerman, Kristof Du Bois, Lieven Eeckhout
organization:
year: 2012
type: conference
publication status: published
subject:
keyword: multi-threaded, performance analysis, multi-core, Computer systems
in: IEEE international symposium on performance analysis of systems and software, Proceedings
pages: 145-155
publisher: IEEE
place of publication: New York, NY, USA
conference name: IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS - 2012)
conference location: New Brunswick, NJ, USA
conference start: 2012-04-01
conference end: 2012-04-03
ISBN: 9781467311441
language: English
UGent publication?: yes
classification: C1
copyright statement: I have transferred the copyright for this publication to the publisher
id: 2093717
handle: http://hdl.handle.net/1854/LU-2093717
date created: 2012-04-24 16:14:58
date last changed: 2012-04-25 13:51:05
@inproceedings{2093717,
  abstract     = {Multi-threaded workloads typically show sublinear speedup on multi-core hardware, i.e., the achieved speedup is not proportional to the number of cores and threads. Sublinear scaling may have multiple causes, such as poorly scalable synchronization leading to spinning and/or yielding, and interference in shared resources such as the last-level cache (LLC) as well as the main memory subsystem. It is vital for programmers and processor designers to understand scaling bottlenecks in existing and emerging workloads in order to optimize application performance and design future hardware. In this paper, we propose the speedup stack, which quantifies the impact of the various scaling delimiters on multi-threaded application speedup in a single stack. We describe a mechanism for computing speedup stacks on a multi-core processor, and we find speedup stacks to be accurate within 5.1\% on average for sixteen-threaded applications. We present several use cases: we discuss how speedup stacks can be used to identify scaling bottlenecks, classify benchmarks, optimize performance, and understand LLC performance.},
  author       = {Eyerman, Stijn and Du Bois, Kristof and Eeckhout, Lieven},
  booktitle    = {IEEE international symposium on performance analysis of systems and software, Proceedings},
  isbn         = {9781467311441},
  keyword      = {multi-threaded,performance analysis,multi-core,Computer systems},
  language     = {eng},
  location     = {New Brunswick, NJ, USA},
  pages        = {145--155},
  publisher    = {IEEE},
  title        = {Speedup stacks: identifying scaling bottlenecks in multi-threaded applications},
  year         = {2012},
}

Chicago
Eyerman, Stijn, Kristof Du Bois, and Lieven Eeckhout. 2012. “Speedup Stacks: Identifying Scaling Bottlenecks in Multi-threaded Applications.” In IEEE International Symposium on Performance Analysis of Systems and Software, Proceedings, 145–155. New York, NY, USA: IEEE.
APA
Eyerman, S., Du Bois, K., & Eeckhout, L. (2012). Speedup stacks: identifying scaling bottlenecks in multi-threaded applications. IEEE international symposium on performance analysis of systems and software, Proceedings (pp. 145–155). Presented at the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS - 2012), New York, NY, USA: IEEE.
Vancouver
1. Eyerman S, Du Bois K, Eeckhout L. Speedup stacks: identifying scaling bottlenecks in multi-threaded applications. IEEE international symposium on performance analysis of systems and software, Proceedings. New York, NY, USA: IEEE; 2012. p. 145–55.
MLA
Eyerman, Stijn, Kristof Du Bois, and Lieven Eeckhout. “Speedup Stacks: Identifying Scaling Bottlenecks in Multi-threaded Applications.” IEEE International Symposium on Performance Analysis of Systems and Software, Proceedings. New York, NY, USA: IEEE, 2012. 145–155. Print.