Shorter on-line warmup for sampled simulation of multi-threaded applications

Authors
Chuntao Jiang, Zhibin Yu, Hai Jin, Xiaofei Liao, Lieven Eeckhout, Yonggang Zeng, and Cheng-Zhong Xu
Abstract
Warmup is a crucial issue in sampled microarchitectural simulation to avoid performance bias by constructing accurate states for micro-architectural structures before each sampling unit. Not until very recently have researchers proposed Time-Based Sampling (TBS) for the sampled simulation of multi-threaded applications. However, warmup in TBS is challenging and complicated, because (i) full functional warmup in TBS causes very high overhead, limiting overall simulation speed; (ii) traditional adaptive functional warmup for sampling single-threaded applications cannot be readily applied to TBS; and (iii) checkpointing is inflexible (even invalid) due to the huge storage requirements and the variations across different runs for multi-threaded applications. In this work, we propose Shorter On-Line (SOL) warmup, which employs a two-stage strategy, using 'prime' warmup in the first stage, and an extended 'No-State-Loss (NSL)' method in the second stage. SOL is a single-pass, on-line warmup technique that addresses the warmup challenges posed in TBS in parallel simulators. SOL is highly accurate and efficient, providing a good trade-off between simulation accuracy and speed, and is easily deployed to different TBS techniques. For the PARSEC benchmarks on a simulated 8-core system, two state-of-the-art TBS techniques with SOL warmup provide a 7.2x and 37x simulation speedup over detailed simulation, respectively, compared to 3.1x and 4.5x under full warmup. SOL sacrifices only 0.3% in absolute execution time prediction accuracy on average.
Keywords
STATE-LOSS, CACHE SIMULATION, sampling, cold-start, micro-architectural simulation, warmup, multi-threaded applications

Citation

MLA
Jiang, Chuntao et al. “Shorter On-line Warmup for Sampled Simulation of Multi-threaded Applications.” Proceedings of the International Conference on Parallel Processing. IEEE Computer Society, 2015. 350–359. Print.
APA
Jiang, C., Yu, Z., Jin, H., Liao, X., Eeckhout, L., Zeng, Y., & Xu, C.-Z. (2015). Shorter on-line warmup for sampled simulation of multi-threaded applications. Proceedings of the International Conference on Parallel Processing (pp. 350–359). Presented at the 44th Annual International Conference on Parallel Processing Workshops (ICPPW), IEEE Computer Society.
Chicago author-date
Jiang, Chuntao, Zhibin Yu, Hai Jin, Xiaofei Liao, Lieven Eeckhout, Yonggang Zeng, and Cheng-Zhong Xu. 2015. “Shorter On-line Warmup for Sampled Simulation of Multi-threaded Applications.” In Proceedings of the International Conference on Parallel Processing, 350–359. IEEE Computer Society.
Vancouver
1. Jiang C, Yu Z, Jin H, Liao X, Eeckhout L, Zeng Y, et al. Shorter on-line warmup for sampled simulation of multi-threaded applications. Proceedings of the International Conference on Parallel Processing. IEEE Computer Society; 2015. p. 350–9.
IEEE
[1] C. Jiang et al., “Shorter on-line warmup for sampled simulation of multi-threaded applications,” in Proceedings of the International Conference on Parallel Processing, Beijing, China, 2015, pp. 350–359.
@inproceedings{7023417,
  abstract     = {Warmup is a crucial issue in sampled microarchitectural simulation to avoid performance bias by constructing accurate states for micro-architectural structures before each sampling unit. Not until very recently have researchers proposed Time-Based Sampling (TBS) for the sampled simulation of multi-threaded applications. However, warmup in TBS is challenging and complicated, because (i) full functional warmup in TBS causes very high overhead, limiting overall simulation speed; (ii) traditional adaptive functional warmup for sampling single-threaded applications cannot be readily applied to TBS; and (iii) checkpointing is inflexible (even invalid) due to the huge storage requirements and the variations across different runs for multi-threaded applications. 

In this work, we propose Shorter On-Line (SOL) warmup, which employs a two-stage strategy, using 'prime' warmup in the first stage, and an extended 'No-State-Loss (NSL)' method in the second stage. SOL is a single-pass, on-line warmup technique that addresses the warmup challenges posed in TBS in parallel simulators. SOL is highly accurate and efficient, providing a good trade-off between simulation accuracy and speed, and is easily deployed to different TBS techniques. For the PARSEC benchmarks on a simulated 8-core system, two state-of-the-art TBS techniques with SOL warmup provide a 7.2x and 37x simulation speedup over detailed simulation, respectively, compared to 3.1x and 4.5x under full warmup. SOL sacrifices only 0.3% in absolute execution time prediction accuracy on average.},
  author       = {Jiang, Chuntao and Yu, Zhibin and Jin, Hai and Liao, Xiaofei and Eeckhout, Lieven and Zeng, Yonggang and Xu, Cheng-Zhong},
  booktitle    = {Proceedings of the International Conference on Parallel Processing},
  isbn         = {978-1-4673-7588-7},
  issn         = {0190-3918},
  keywords     = {STATE-LOSS,CACHE SIMULATION,sampling,cold-start,micro-architectural simulation,warmup,multi-threaded applications},
  language     = {eng},
  location     = {Beijing, China},
  pages        = {350--359},
  publisher    = {IEEE Computer Society},
  title        = {Shorter on-line warmup for sampled simulation of multi-threaded applications},
  url          = {http://dx.doi.org/10.1109/ICPP.2015.44},
  year         = {2015},
}
