Abstract
Superscalar out-of-order cores deliver high performance at the cost of increased complexity and power budget. In-order cores, in contrast, are less complex and have a smaller power budget, but offer low performance. A processor architecture should ideally provide high performance in a power- and cost-efficient manner. Recently proposed slice-out-of-order (sOoO) cores identify backward slices of memory operations which they execute out-of-order with respect to the rest of the dynamic instruction stream for increased instruction-level and memory-hierarchy parallelism. Unfortunately, constructing backward slices is imprecise and hardware-inefficient, leaving performance on the table. In this paper, we propose Forward Slice Core (FSC), a novel core microarchitecture that builds on a stall-on-use in-order core and extracts more instruction-level and memory-hierarchy parallelism than slice-out-of-order cores. FSC does so by identifying and steering forward slices (rather than backward slices) to dedicated in-order FIFO queues. Moreover, FSC puts load-consumers that depend on L1 D-cache misses on the side to enable younger independent load-consumers to execute faster. Finally, FSC eliminates the need for dynamic memory disambiguation by replicating store-address instructions across queues. FSC improves performance by 9.7% on average compared to Freeway, the state-of-the-art sOoO core, across the SPEC CPU2017 benchmarks, while incurring reduced hardware complexity and a similar power budget.
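To make the term "forward slice" concrete, the following is a minimal software sketch, not the hardware mechanism from the paper. It assumes slices are rooted at loads and that dependences are tracked through registers only; the `Instr` type, the register names, and the `forward_slice_flags` helper are hypothetical and introduced purely for illustration. An instruction is flagged when any of its source registers currently holds a value derived from a load.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record (illustrative, not from the paper)."""
    name: str
    dests: Tuple[str, ...]   # registers written
    srcs: Tuple[str, ...]    # registers read
    is_load: bool = False


def forward_slice_flags(stream: List[Instr]) -> List[bool]:
    """Flag each instruction that transitively consumes a loaded value.

    Assumption: dependences flow through registers only, and a register
    stops being load-dependent once a non-dependent instruction overwrites it.
    The root load itself is not flagged, only its consumers.
    """
    load_dependent = set()   # registers holding load-derived values
    flags = []
    for ins in stream:
        in_slice = any(r in load_dependent for r in ins.srcs)
        flags.append(in_slice)
        for r in ins.dests:
            if ins.is_load or in_slice:
                load_dependent.add(r)
            else:
                load_dependent.discard(r)
    return flags


stream = [
    Instr("ld  r1 <- [r0]", ("r1",), ("r0",), is_load=True),
    Instr("add r2 <- r1, r3", ("r2",), ("r1", "r3")),   # consumes the load
    Instr("mul r4 <- r5, r6", ("r4",), ("r5", "r6")),   # independent
    Instr("sub r1 <- r5, r6", ("r1",), ("r5", "r6")),   # overwrites r1
    Instr("add r7 <- r1, r4", ("r7",), ("r1", "r4")),   # r1 no longer load-derived
    Instr("st  [r2]", (), ("r2",)),                     # address depends on the load
]

for ins, flag in zip(stream, forward_slice_flags(stream)):
    print(f"{'slice' if flag else '     '}  {ins.name}")
```

In this sketch, the flagged instructions are the ones a design in the spirit of the abstract would steer to a separate in-order queue, so that the unflagged instructions are not stalled behind a pending load. How FSC actually identifies slices, handles L1 D-cache misses, and replicates store-address instructions is described in the paper itself and is not modeled here.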


Citation

MLA
Lakshminarasimhan, Kartik, et al. “The Forward Slice Core Microarchitecture.” PACT’20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 2020, pp. 361–72, doi:10.1145/3410463.3414629.
APA
Lakshminarasimhan, K., Naithani, A., Feliu, J., & Eeckhout, L. (2020). The forward slice core microarchitecture. PACT’20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 361–372. https://doi.org/10.1145/3410463.3414629
Chicago author-date
Lakshminarasimhan, Kartik, Ajeya Naithani, Josué Feliu, and Lieven Eeckhout. 2020. “The Forward Slice Core Microarchitecture.” In PACT’20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 361–72. https://doi.org/10.1145/3410463.3414629.
Chicago author-date (all authors)
Lakshminarasimhan, Kartik, Ajeya Naithani, Josué Feliu, and Lieven Eeckhout. 2020. “The Forward Slice Core Microarchitecture.” In PACT’20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, 361–372. doi:10.1145/3410463.3414629.
Vancouver
1. Lakshminarasimhan K, Naithani A, Feliu J, Eeckhout L. The forward slice core microarchitecture. In: PACT’20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques. 2020. p. 361–72.
IEEE
[1] K. Lakshminarasimhan, A. Naithani, J. Feliu, and L. Eeckhout, “The forward slice core microarchitecture,” in PACT’20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques, Virtual Event, GA, USA, 2020, pp. 361–372.
BibTeX
@inproceedings{8692318,
  abstract     = {{Superscalar out-of-order cores deliver high performance at the cost of increased complexity and power budget. In-order cores, in contrast, are less complex and have a smaller power budget, but offer low performance. A processor architecture should ideally provide high performance in a power- and cost-efficient manner. Recently proposed slice-out-of-order (sOoO) cores identify backward slices of memory operations which they execute out-of-order with respect to the rest of the dynamic instruction stream for increased instruction-level and memory-hierarchy parallelism. Unfortunately, constructing backward slices is imprecise and hardware-inefficient, leaving performance on the table.

In this paper, we propose Forward Slice Core (FSC), a novel core microarchitecture that builds on a stall-on-use in-order core and extracts more instruction-level and memory-hierarchy parallelism than slice-out-of-order cores. FSC does so by identifying and steering forward slices (rather than backward slices) to dedicated in-order FIFO queues. Moreover, FSC puts load-consumers that depend on L1 D-cache misses on the side to enable younger independent load-consumers to execute faster. Finally, FSC eliminates the need for dynamic memory disambiguation by replicating store-address instructions across queues. FSC improves performance by 9.7% on average compared to Freeway, the state-of-the-art sOoO core, across the SPEC CPU2017 benchmarks, while incurring reduced hardware complexity and a similar power budget.}},
  author       = {{Lakshminarasimhan, Kartik and Naithani, Ajeya and Feliu, Josué and Eeckhout, Lieven}},
  booktitle    = {{PACT'20 : Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques}},
  isbn         = {{9781450380751}},
  issn         = {{1089-795X}},
  language     = {{eng}},
  location     = {{Virtual Event, GA, USA}},
  pages        = {{361--372}},
  title        = {{The forward slice core microarchitecture}},
  url          = {{https://doi.org/10.1145/3410463.3414629}},
  year         = {{2020}},
}
