
A pipelining-based heterogeneous scheduling and energy-throughput optimization scheme for CNNs leveraging Apache TVM
- Author
- Delia Velasco-Montero, Bart Goossens (UGent), Jorge Fernández-Berni, Ángel Rodríguez-Vázquez and Wilfried Philips (UGent)
- Abstract
- Extracting information of interest from continuous video streams is a strongly demanded computer vision task. For the realization of this task at the edge using the current de-facto standard approach, i.e., deep learning, it is critical to optimize key performance metrics such as throughput and energy consumption according to prescribed application requirements. This allows achieving timely decision-making while extending the battery lifetime as much as possible. In this context, we propose a method to boost neural-network performance based on a co-execution strategy that exploits hardware heterogeneity on edge platforms. The enabling tool is Apache TVM, a highly efficient machine-learning compiler compatible with a diversity of hardware back-ends. The proposed approach solves the problem of network partitioning and distributes the workloads to make concurrent use of all the processors available on the board following a pipeline scheme. We conducted experiments on various popular CNNs compiled with TVM on the Jetson TX2 platform. The experimental results based on measurements show a significant improvement in throughput with respect to a single-processor execution, ranging from 14% to 150% over all tested networks. Power-efficient configurations were also identified, accomplishing energy reductions above 10%.
- Keywords
- General Engineering, General Materials Science, General Computer Science, Electrical and Electronic Engineering, Hardware, Graphics processing units, Runtime, Throughput, Costs, Convolutional neural networks, Performance evaluation, Apache TVM, continuous inference, convolutional neural networks, edge vision, heterogeneous processing, Jetson TX2, performance optimization, DEEP NEURAL-NETWORKS, MOBILE, INFERENCE, INTERNET
Downloads
A Pipelining-Based Heterogeneous Scheduling and Energy-Throughput Optimization Scheme for CNNs Leveraging Apache TVM.pdf
- full text (Published version)
- open access
- 2.52 MB
Citation
Please use this URL to cite or link to this publication: http://hdl.handle.net/1854/LU-01GY7JZKDHSCDVB3V2XA5PRBAB
- MLA
- Velasco-Montero, Delia, et al. “A Pipelining-Based Heterogeneous Scheduling and Energy-Throughput Optimization Scheme for CNNs Leveraging Apache TVM.” IEEE Access, vol. 11, 2023, pp. 35007–21, doi:10.1109/access.2023.3264828.
- APA
- Velasco-Montero, D., Goossens, B., Fernández-Berni, J., Rodríguez-Vázquez, Á., & Philips, W. (2023). A pipelining-based heterogeneous scheduling and energy-throughput optimization scheme for CNNs leveraging Apache TVM. IEEE Access, 11, 35007–35021. https://doi.org/10.1109/access.2023.3264828
- Chicago author-date
- Velasco-Montero, Delia, Bart Goossens, Jorge Fernández-Berni, Ángel Rodríguez-Vázquez, and Wilfried Philips. 2023. “A Pipelining-Based Heterogeneous Scheduling and Energy-Throughput Optimization Scheme for CNNs Leveraging Apache TVM.” IEEE Access 11: 35007–21. https://doi.org/10.1109/access.2023.3264828.
- Chicago author-date (all authors)
- Velasco-Montero, Delia, Bart Goossens, Jorge Fernández-Berni, Ángel Rodríguez-Vázquez, and Wilfried Philips. 2023. “A Pipelining-Based Heterogeneous Scheduling and Energy-Throughput Optimization Scheme for CNNs Leveraging Apache TVM.” IEEE Access 11: 35007–35021. doi:10.1109/access.2023.3264828.
- Vancouver
- 1. Velasco-Montero D, Goossens B, Fernández-Berni J, Rodríguez-Vázquez Á, Philips W. A pipelining-based heterogeneous scheduling and energy-throughput optimization scheme for CNNs leveraging Apache TVM. IEEE Access. 2023;11:35007–21.
- IEEE
- [1] D. Velasco-Montero, B. Goossens, J. Fernández-Berni, Á. Rodríguez-Vázquez, and W. Philips, “A pipelining-based heterogeneous scheduling and energy-throughput optimization scheme for CNNs leveraging Apache TVM,” IEEE Access, vol. 11, pp. 35007–35021, 2023.
@article{01GY7JZKDHSCDVB3V2XA5PRBAB,
  abstract     = {{Extracting information of interest from continuous video streams is a strongly demanded computer vision task. For the realization of this task at the edge using the current de-facto standard approach, i.e., deep learning, it is critical to optimize key performance metrics such as throughput and energy consumption according to prescribed application requirements. This allows achieving timely decision-making while extending the battery lifetime as much as possible. In this context, we propose a method to boost neural-network performance based on a co-execution strategy that exploits hardware heterogeneity on edge platforms. The enabling tool is Apache TVM, a highly efficient machine-learning compiler compatible with a diversity of hardware back-ends. The proposed approach solves the problem of network partitioning and distributes the workloads to make concurrent use of all the processors available on the board following a pipeline scheme. We conducted experiments on various popular CNNs compiled with TVM on the Jetson TX2 platform. The experimental results based on measurements show a significant improvement in throughput with respect to a single-processor execution, ranging from 14\% to 150\% over all tested networks. Power-efficient configurations were also identified, accomplishing energy reductions above 10\%.}},
  author       = {{Velasco-Montero, Delia and Goossens, Bart and Fernández-Berni, Jorge and Rodríguez-Vázquez, Ángel and Philips, Wilfried}},
  issn         = {{2169-3536}},
  journal      = {{IEEE Access}},
  keywords     = {{General Engineering,General Materials Science,General Computer Science,Electrical and Electronic Engineering,Hardware,Graphics processing units,Runtime,Throughput,Costs,Convolutional neural networks,Performance evaluation,Apache TVM,continuous inference,convolutional neural networks,edge vision,heterogeneous processing,Jetson TX2,performance optimization,DEEP NEURAL-NETWORKS,MOBILE,INFERENCE,INTERNET}},
  language     = {{eng}},
  pages        = {{35007--35021}},
  title        = {{A pipelining-based heterogeneous scheduling and energy-throughput optimization scheme for CNNs leveraging Apache TVM}},
  url          = {{http://doi.org/10.1109/access.2023.3264828}},
  volume       = {{11}},
  year         = {{2023}},
}