Optimizing the datacenter for data-centric workloads

Stijn Polfliet (UGent), Frederick Ryckbosch (UGent) and Lieven Eeckhout (UGent)
Project
HPC-UGent: the central High Performance Computing infrastructure of Ghent University
Abstract
The amount of data produced on the internet is growing rapidly. Along with data explosion comes the trend towards more and more diverse data, including rich media such as audio and video. Data explosion and diversity lead to the emergence of data-centric workloads to manipulate, manage and analyze the vast amounts of data. These data-centric workloads are likely to run in the background and include application domains such as data mining, indexing, compression, encryption, audio/video manipulation, data warehousing, etc. Given that datacenters are very much cost sensitive, reducing the cost of a single component by a small fraction immediately translates into huge cost savings because of the large scale. Hence, when designing a datacenter, it is important to understand data-centric workloads and optimize the ensemble for these workloads so that the best possible performance per dollar is achieved. This paper studies how the emerging class of data-centric workloads affects design decisions in the datacenter. Through the architectural simulation of minutes of run time on a validated full-system x86 simulator, we derive the insight that for some data-centric workloads, a high-end server optimizes performance per total cost of ownership (TCO), whereas for other workloads, a low-end server is the winner. This observation suggests heterogeneity in the datacenter, in which a job is run on the most cost-efficient server. Our experimental results report that a heterogeneous datacenter achieves cost-efficiency improvements of up to 88%, 24% and 17% over a homogeneous high-end, commodity and low-end server datacenter, respectively.
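The heterogeneity argument in the abstract (run each job on the server with the best performance per TCO) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the server names, the TCO values, the per-workload performance scores and the function names are hypothetical and are not taken from the paper or its simulator.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Server:
    name: str
    tco: float  # total cost of ownership, arbitrary units (hypothetical value)


def cost_efficiency(performance: float, server: Server) -> float:
    """Performance per unit of TCO; higher is better."""
    return performance / server.tco


def best_server(perf_by_server: Mapping[str, float],
                servers: Mapping[str, Server]) -> str:
    """Return the name of the server with the highest performance per TCO."""
    return max(
        perf_by_server,
        key=lambda name: cost_efficiency(perf_by_server[name], servers[name]),
    )


# Hypothetical server classes and costs (not measured data).
servers = {
    "high_end": Server("high_end", tco=4.0),
    "low_end": Server("low_end", tco=1.0),
}

# Hypothetical performance of two workloads on each server class.
workloads = {
    "cpu_bound_job": {"high_end": 10.0, "low_end": 2.0},
    "io_bound_job": {"high_end": 3.0, "low_end": 2.0},
}

# A heterogeneous datacenter places each job on its most cost-efficient server.
placement = {job: best_server(perf, servers) for job, perf in workloads.items()}
print(placement)
```

With these made-up numbers the first job lands on the high-end server and the second on the low-end server, which mirrors the abstract's observation that no single server class wins for every workload.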
Keywords
computer architecture, data center, workload characterization

Downloads

  • (...).pdf: full text, UGent only, PDF, 293.27 KB

Citation

Chicago
Polfliet, Stijn, Frederick Ryckbosch, and Lieven Eeckhout. 2011. “Optimizing the Datacenter for Data-centric Workloads.” In ICS ’11: Proceedings of the International Conference on Supercomputing, 182–191. New York, NY, USA: Association for Computing Machinery (ACM).
APA
Polfliet, S., Ryckbosch, F., & Eeckhout, L. (2011). Optimizing the datacenter for data-centric workloads. ICS ’11: Proceedings of the International Conference on Supercomputing (pp. 182–191). Presented at the 25th International Conference on Supercomputing (ICS 2011), New York, NY, USA: Association for Computing Machinery (ACM).
Vancouver
1. Polfliet S, Ryckbosch F, Eeckhout L. Optimizing the datacenter for data-centric workloads. ICS ’11: proceedings of the international conference on supercomputing. New York, NY, USA: Association for Computing Machinery (ACM); 2011. p. 182–91.
MLA
Polfliet, Stijn, Frederick Ryckbosch, and Lieven Eeckhout. “Optimizing the Datacenter for Data-centric Workloads.” ICS ’11: Proceedings of the International Conference on Supercomputing. New York, NY, USA: Association for Computing Machinery (ACM), 2011. 182–191. Print.
BibTeX
@inproceedings{1977531,
  abstract     = {The amount of data produced on the internet is growing rapidly. Along with data explosion comes the trend towards more and more diverse data, including rich media such as audio and video. Data explosion and diversity lead to the emergence of data-centric workloads to manipulate, manage and analyze the vast amounts of data. These data-centric workloads are likely to run in the background and include application domains such as data mining, indexing, compression, encryption, audio/video manipulation, data warehousing, etc.
Given that datacenters are very much cost sensitive, reducing the cost of a single component by a small fraction immediately translates into huge cost savings because of the large scale. Hence, when designing a datacenter, it is important to understand data-centric workloads and optimize the ensemble for these workloads so that the best possible performance per dollar is achieved.
This paper studies how the emerging class of data-centric workloads affects design decisions in the datacenter.
Through the architectural simulation of minutes of run time on a validated full-system x86 simulator, we derive the insight that for some data-centric workloads, a high-end server optimizes performance per total cost of ownership (TCO), whereas for other workloads, a low-end server is the winner. This observation suggests heterogeneity in the datacenter, in which a job is run on the most cost-efficient server. Our experimental results report that a heterogeneous datacenter achieves cost-efficiency improvements of up to 88\%, 24\% and 17\% over a homogeneous high-end, commodity and low-end server datacenter, respectively.},
  author       = {Polfliet, Stijn and Ryckbosch, Frederick and Eeckhout, Lieven},
  booktitle    = {ICS '11: Proceedings of the International Conference on Supercomputing},
  isbn         = {9781450301022},
  keyword      = {computer architecture,data center,workload characterization},
  language     = {eng},
  location     = {Tucson, AZ, USA},
  pages        = {182--191},
  publisher    = {Association for Computing Machinery (ACM)},
  title        = {Optimizing the datacenter for data-centric workloads},
  url          = {http://dx.doi.org/10.1145/1995896.1995926},
  year         = {2011},
}
