Research Article
DOI: 10.1145/3520312.3534863

A graph neural network-based performance model for deep learning applications

Published: 13 June 2022
Abstract

The unprecedented proliferation of machine learning-based software brings an ever-increasing need to optimize the implementation of such applications. State-of-the-art compilers for neural networks, such as Halide and TVM, incorporate a machine learning-based performance model to search the space of valid implementations of a given deep learning algorithm. For a given application, the model predicts the value of performance metrics such as run time without executing the application on hardware. Such models speed up the compilation process by obviating the need to benchmark an enormous number of candidate implementations, referred to as schedules, on hardware. Existing performance models employ feed-forward networks, recurrent networks, or decision tree ensembles to estimate the performance of different implementations of a neural network. Graphs present a natural and intuitive way to model deep-learning networks, where each node represents a computational stage or operation. Incorporating the inherent graph structure of these workloads in the performance model can enable a better representation and learning of inter-stage interactions. The accuracy of the performance model has direct implications on the efficiency of the search strategy, making it a crucial component of this class of deep-learning compilers. In this work, we develop a novel performance model that adopts a graph representation. In our model, each stage of computation represents a node characterized by features that capture the operations performed by the stage. The interaction between nodes is achieved using graph convolutions. Experimental evaluation shows a 7.75× and 12× reduction in prediction error compared to the existing Halide and TVM models, respectively.
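
The abstract describes the model only at a high level: each computation stage is a node carrying a feature vector that encodes the stage's operations, graph convolutions propagate information between connected stages, and the resulting node embeddings are pooled into a single run-time prediction. As a concrete illustration, here is a minimal sketch in PyTorch using the symmetrically normalized graph convolution of Kipf and Welling; the class name, feature dimensions, layer count, and pooling choice are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class StageGraphCostModel(nn.Module):
        # Hypothetical sketch: one node per computation stage; node features
        # describe the stage's operations; edges follow producer-consumer links.
        def __init__(self, in_dim: int, hidden_dim: int = 64):
            super().__init__()
            self.lin1 = nn.Linear(in_dim, hidden_dim)
            self.lin2 = nn.Linear(hidden_dim, hidden_dim)
            self.readout = nn.Linear(hidden_dim, 1)

        def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
            # Treat the stage graph as undirected, add self-loops, and apply the
            # symmetric normalization A_hat = D^{-1/2} (A + I) D^{-1/2}.
            a = adj + adj.t() + torch.eye(adj.size(0))
            d = a.sum(dim=1).pow(-0.5)
            a_hat = d.unsqueeze(1) * a * d.unsqueeze(0)
            # Two graph convolutions: each stage aggregates neighbor features.
            h = torch.relu(a_hat @ self.lin1(x))
            h = torch.relu(a_hat @ self.lin2(h))
            # Mean-pool stage embeddings and regress a single run-time estimate.
            return self.readout(h.mean(dim=0))

    # Toy usage: a 4-stage pipeline with 8 hand-crafted features per stage.
    model = StageGraphCostModel(in_dim=8)
    x = torch.randn(4, 8)                      # per-stage feature vectors
    adj = torch.tensor([[0., 1., 0., 0.],      # stage 0 feeds stage 1, etc.
                        [0., 0., 1., 1.],
                        [0., 0., 0., 1.],
                        [0., 0., 0., 0.]])
    predicted_runtime = model(x, adj)          # scalar tensor of shape (1,)

Such a model would be trained by regressing measured run times of benchmarked schedules; the compiler's search strategy can then rank candidate schedules without executing them on hardware.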




    Published In

    MAPS 2022: Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming
    June 2022
    79 pages
ISBN: 9781450392730
DOI: 10.1145/3520312

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Code Optimization
    2. Deep Learning Compilers
    3. Graph Neural Networks
    4. Performance Modeling

    Qualifiers

    • Research-article

    Conference

    MAPS '22

