
GPGPU-MiniBench: accelerating GPGPU micro-architecture simulation

(2015) IEEE TRANSACTIONS ON COMPUTERS. 64(11). p.3153-3166
Abstract
Graphics processing units (GPUs), due to their massive computational power with up to thousands of concurrent threads and general-purpose GPU (GPGPU) programming models such as CUDA and OpenCL, have opened up new opportunities for speeding up general-purpose parallel applications. Unfortunately, pre-silicon architectural simulation of modern-day GPGPU architectures and workloads is extremely time-consuming. This paper addresses the GPGPU simulation challenge by proposing a framework, called GPGPU-MiniBench, for generating miniature, yet representative GPGPU workloads. GPGPU-MiniBench first summarizes the inherent execution behavior of existing GPGPU workloads in a profile. The central component in the profile is the Divergence Flow Statistics Graph (DFSG), which characterizes the dynamic control flow behavior, including loops and branches, of a GPGPU kernel. GPGPU-MiniBench generates a synthetic miniature GPGPU kernel that exhibits similar execution characteristics to the original workload, yet its execution time is much shorter, thereby dramatically speeding up architectural simulation. Our experimental results show that GPGPU-MiniBench can speed up GPGPU architectural simulation by a factor of 49 on average and up to 589, with an average IPC error of 4.7 percent across a broad set of GPGPU benchmarks from the CUDA SDK, Rodinia and Parboil benchmark suites. We also demonstrate the usefulness of GPGPU-MiniBench for driving GPU architecture exploration.
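
The abstract's central idea, summarizing a kernel's dynamic control flow (loops and branches) in a profile and then emitting a much shorter synthetic kernel with a similar control-flow mix, can be pictured with a small sketch. The Python code below is a hypothetical illustration, not the paper's DFSG format or generation algorithm: it assumes a control-flow trace is available as a list of basic-block IDs, tallies branch-transition statistics, and random-walks those statistics to produce a miniature trace.

# Hypothetical sketch (not the paper's DFSG): summarize a kernel's dynamic
# control flow as per-block branch-transition counts, then sample a much
# shorter synthetic trace that preserves a similar branch/loop mix.
from collections import defaultdict
import random

def build_flow_stats(trace):
    """trace: list of basic-block IDs executed by one warp, in order."""
    counts = defaultdict(lambda: defaultdict(int))
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1          # how often block src branched to dst
    return counts

def synthesize(stats, start, length, seed=0):
    """Random-walk the transition statistics to emit a miniature trace."""
    rng = random.Random(seed)
    block, mini = start, [start]
    for _ in range(length - 1):
        succs = stats.get(block)
        if not succs:                  # sink block: restart from the entry
            block = start
        else:
            dsts, weights = zip(*succs.items())
            block = rng.choices(dsts, weights=weights)[0]
        mini.append(block)
    return mini

if __name__ == "__main__":
    # Toy original trace: entry block 0, a 1000-iteration loop over blocks
    # 1 and 2, then exit block 3.
    original = [0] + [1, 2] * 1000 + [3]
    stats = build_flow_stats(original)
    mini = synthesize(stats, start=0, length=20)
    print(mini)                        # ~100x shorter, still loop-dominated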
Keywords
computer architecture, workload synthesis, workload characterization, GPGPU, CUDA


Citation


MLA
Yu, Zhibin et al. “GPGPU-MiniBench: Accelerating GPGPU Micro-architecture Simulation.” IEEE TRANSACTIONS ON COMPUTERS 64.11 (2015): 3153–3166. Print.
APA
Yu, Z., Eeckhout, L., Goswami, N., Li, T., John, L. K., Jin, H., Xu, C.-Z., et al. (2015). GPGPU-MiniBench: accelerating GPGPU micro-architecture simulation. IEEE TRANSACTIONS ON COMPUTERS, 64(11), 3153–3166.
Chicago author-date
Yu, Zhibin, Lieven Eeckhout, Nilanjan Goswami, Tao Li, Lizy K John, Hai Jin, Cheng-Zhong Xu, and Junmin Wu. 2015. “GPGPU-MiniBench: Accelerating GPGPU Micro-architecture Simulation.” IEEE Transactions on Computers 64 (11): 3153–3166.
Chicago author-date (all authors)
Yu, Zhibin, Lieven Eeckhout, Nilanjan Goswami, Tao Li, Lizy K John, Hai Jin, Cheng-Zhong Xu, and Junmin Wu. 2015. “GPGPU-MiniBench: Accelerating GPGPU Micro-architecture Simulation.” IEEE Transactions on Computers 64 (11): 3153–3166.
Vancouver
1. Yu Z, Eeckhout L, Goswami N, Li T, John LK, Jin H, et al. GPGPU-MiniBench: accelerating GPGPU micro-architecture simulation. IEEE TRANSACTIONS ON COMPUTERS. 2015;64(11):3153–66.
IEEE
[1] Z. Yu et al., “GPGPU-MiniBench: accelerating GPGPU micro-architecture simulation,” IEEE TRANSACTIONS ON COMPUTERS, vol. 64, no. 11, pp. 3153–3166, 2015.
BibTeX
@article{6977972,
  abstract     = {Graphics processing units (GPUs), due to their massive computational power with up to thousands of concurrent threads and general-purpose GPU (GPGPU) programming models such as CUDA and OpenCL, have opened up new opportunities for speeding up general-purpose parallel applications. Unfortunately, pre-silicon architectural simulation of modern-day GPGPU architectures and workloads is extremely time-consuming. This paper addresses the GPGPU simulation challenge by proposing a framework, called GPGPU-MiniBench, for generating miniature, yet representative GPGPU workloads. GPGPU-MiniBench first summarizes the inherent execution behavior of existing GPGPU workloads in a profile. The central component in the profile is the Divergence Flow Statistics Graph (DFSG), which characterizes the dynamic control flow behavior, including loops and branches, of a GPGPU kernel. GPGPU-MiniBench generates a synthetic miniature GPGPU kernel that exhibits similar execution characteristics to the original workload, yet its execution time is much shorter, thereby dramatically speeding up architectural simulation. Our experimental results show that GPGPU-MiniBench can speed up GPGPU architectural simulation by a factor of 49 on average and up to 589, with an average IPC error of 4.7 percent across a broad set of GPGPU benchmarks from the CUDA SDK, Rodinia and Parboil benchmark suites. We also demonstrate the usefulness of GPGPU-MiniBench for driving GPU architecture exploration.},
  author       = {Yu, Zhibin and Eeckhout, Lieven and Goswami, Nilanjan and Li, Tao and John, Lizy K and Jin, Hai and Xu, Cheng-Zhong and Wu, Junmin},
  issn         = {0018-9340},
  journal      = {IEEE TRANSACTIONS ON COMPUTERS},
  keywords     = {computer architecture,workload synthesis,workload characterization,GPGPU,CUDA},
  language     = {eng},
  number       = {11},
  pages        = {3153--3166},
  title        = {GPGPU-MiniBench: accelerating GPGPU micro-architecture simulation},
  url          = {http://dx.doi.org/10.1109/TC.2015.2395427},
  volume       = {64},
  year         = {2015},
}
