skip to main content
10.1145/1242531.1242553acmconferencesArticle/Chapter ViewAbstractPublication PagescfConference Proceedingsconference-collections
Article

Fast compiler optimisation evaluation using code-feature based performance prediction

Published: 07 May 2007 Publication History

Abstract

Performance tuning is an important and time consuming task which may have to be repeated for each new application and platform. Although iterative optimisation can automate this process, it still requires many executions of different versions of the program. As execution time is frequently the limiting factor in the number of versions or transformed programs that can be considered, what is needed is a mechanism that can automatically predict the performance of a modified program without actually having to run it. This paper presents a new machine learning based technique to automatically predict the speedup of a modified program using a performance model based on the code features of the tuned programs. Unlike previous approaches it does not require any prior learning over a benchmark suite. Furthermore, it can be used to predict the performance of any tuning and is not restricted to a prior seen trans-formation space. We show that it can deliver predictions with a high correlation coefficient and can be used to dramatically reduce the cost of search.

References

[1]
AGAKOV, F., BONILLA, E., CAVAZOS, J., FRANKE, B., FURSIN, G., O'BOYLE, M. F. P., THOMSON, J., TOUSSAINT, M., AND WILLIAMS, C. K. I. Using machine learning to focus iterative optimization. In CGO '06: Proceedings of the International Symposium on Code Generation and Optimization (Washington, DC, USA, March 2006), IEEE Computer Society, pp. 295--305.
[2]
ALMAGOR, L., COOPER, K. D., GROSUL, A., HARVEY, T. J., REEVES, S. W., SUBRAMANIAN, D., TORCZON, L., AND WATERMAN, T. Finding effective compilation sequences. In LCTES '04: Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems (New York, NY, USA, 2004), ACM Press, pp. 231--239.
[3]
BISHOP, C. Neural Networks for Pattern Recognition. Oxford University Press, 2005.
[4]
CAVAZOS, J., DUBACH, C., AGAKOV, F., BONILLA, E., O'BOYLE, M. F. P., FURSIN, G., AND TEMAM, O. Automatic performance model construction for the fast software exploration of new hardware designs. In CASES '06: Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems (New York, NY, USA, October 2006), ACM Press, pp. 24--34.
[5]
CAVAZOS, J., ELIOT, J., AND MOSS, B. Inducing heuristics to decide whether to schedule. In PLDI '04: Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation (New York, NY, USA, 2004), ACM Press, pp. 183--194.
[6]
COOPER, K., GROSUL, A., HARVEY, T., REEVES, S., SUBRAMANIAN, D., TORCZON, L., AND WATERMAN, T. Searching for compilation sequences. Tech. rep., Rice University, 2005.
[7]
COOPER, K. D., GROSUL, A., HARVEY, T. J., REEVES, S., SUBRAMANIAN, D., TORCZON, L., AND WATERMAN, T. Acme: adaptive compilation made efficient. In LCTES '05: Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems (New York, NY, USA, 2005), ACM Press, pp. 69--77.
[8]
EECKHOUT, L., JR., R. H. B., STOUGIE, B., BOSSCHERE, K. D., AND JOHN, L. K. Control flow modeling in statistical simulation for accurate and efficient processor design studies. SIGARCH Comput. Archit. News 32, 2 (2004), 350.
[9]
FRANKE, B., O'BOYLE, M., THOMSON, J., AND FURSIN, G. Probabilistic source-level optimisation of embedded programs. In LCTES '05: Proceedings of the 2005 ACM SIGPLAN/SIGBED conference on Languages, compilers, and tools for embedded systems (New York, NY, USA, 2005), ACM Press, pp. 78--86.
[10]
FURSIN, G., COHEN, A., O'BOYLE, M., AND TEMAM, O. A practical method for quickly evaluating program optimizations. In Proceedings of the 1st International Conference on High Performance Embedded Architectures & Compilers (HiPEAC) (2005), pp. 29--46.
[11]
FURSIN, G., O'BOYLE, M., AND KNIJNENBURG, P. Evaluating iterative compilation. In Proceedings of the 15th Workshop on Languages and Compilers for Parallel Computers (LCPC) (2002), pp. 305--315.
[12]
HALL, M. W., ANDERSON, J. M., AMARASINGHE, S. P., MURPHY, B. R., LIAO, S.-W., BUGNION, E., AND LAM, M. S. Maximizing multiprocessor performance with the suif compiler. Computer 29, 12 (1996), 84--89.
[13]
IPEK, E., MCKEE, S. A., CARUANA, R., DE SUPINSKI, B. R., AND SCHULZ, M. Efficiently exploring architectural design spaces via predictive modeling. Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-XII) 34, 5 (October 2006), 195--206.
[14]
KARKHANIS, T. S., AND SMITH, J. E. A first-order superscalar processor model. In Proceedings of the 1st Annual International Symposium on Computer Architecture (ISCA'04) (2004), p. 338.
[15]
KARURI, K., FARUQUE, M. A. A., KRAEMER, S., LEUPERS, R., ASCHEID, G., AND MEYR, H. Fine-grained application source code profiling for asip design. In DAC '05: Proceedings of the 42nd annual conference on Design automation (New York, NY, USA, 2005), ACM Press, pp. 329--334.
[16]
KULKARNI, P., ZHAO, W., MOON, H., CHO, K., WHALLEY, D., DAVIDSON, J., BAILEY, M., PAEK, Y., AND GALLIVAN, K. Finding effective optimization phase sequences. In Proceedings of the Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES) (2003), pp. 12--23.
[17]
LEE, C. Utdsp benchmark suite. In http://www.eecg.toronto.edu/~corinna/DSP/infrastructure/UTDSP.html (1998).
[18]
MONSIFROT, A., BODIN, F., AND QUINIOU, R. A machine learning approach to automatic production of compiler heuristics. In AIMSA '02: Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications (London, UK, 2002), Springer-Verlag, pp. 41--50.
[19]
PAN, Z., AND EIGENMANN, R. Fast, automatic, procedure-level performance tuning. In PACT '06: Proceedings of the 15th international conference on Parallel architectures and compilation techniques (New York, NY, USA, 2006), ACM Press, pp. 173--181.
[20]
SAGHIR, M., CHOW, P., AND LEE, C. A comparison of traditional and vliw dsp architecture for compiled dsp applications. In Proceedings of the International Workshop on Compiler and Architecture Support for Embedded Systems (CASES) (1998).
[21]
STEPHENSON, M., AND AMARASINGHE, S. Predicting unroll factors using supervised classification. In CGO '05: Proceedings of the international symposium on Code generation and optimization (Washington, DC, USA, 2005), IEEE Computer Society, pp. 123--134.
[22]
TRIANTAFYLLIS, S., VACHHARAJANI, M., VACHHARAJANI, N., AND AUGUST, D. I. Compiler optimization-space exploration. In CGO '03: Proceedings of the international symposium on Code generation and optimization (Washington, DC, USA, 2003), IEEE Computer Society, pp. 204--215.
[23]
WENISCH, T. F., WUNDERLICH, R. E., FALSAFI, B., AND HOE, J. C. Turbosmarts: accurate microarchitecture simulation sampling in minutes. SIGMETRICS Perform. Eval. Rev. 33, 1 (2005), 408--409.
[24]
WUNDERLICH, R. E., WENISCH, T. F., FALSAFI, B., AND HOE, J. C. Smarts: accelerating microarchitecture simulation via rigorous statistical sampling. In ISCA '03: Proceedings of the 30th annual international symposium on Computer architecture (New York, NY, USA, 2003), ACM Press, pp. 84--97.

Cited By

View all
  • (2023)TpuGraphsProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669205(70355-70375)Online publication date: 10-Dec-2023
  • (2022)An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks2022 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC55918.2022.00017(79-91)Online publication date: Nov-2022
  • (2022)GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation2022 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC55918.2022.00012(14-26)Online publication date: Nov-2022
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
CF '07: Proceedings of the 4th international conference on Computing frontiers
May 2007
300 pages
ISBN:9781595936837
DOI:10.1145/1242531
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 May 2007

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. architecture
  2. artificial neural networks
  3. compiler optimisation
  4. learning
  5. machine
  6. performance modelling

Qualifiers

  • Article

Conference

CF07
Sponsor:
CF07: Computing Frontiers Conference
May 7 - 9, 2007
Ischia, Italy

Acceptance Rates

Overall Acceptance Rate 273 of 785 submissions, 35%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)35
  • Downloads (Last 6 weeks)2
Reflects downloads up to 14 Sep 2024

Other Metrics

Citations

Cited By

View all
  • (2023)TpuGraphsProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3669205(70355-70375)Online publication date: 10-Dec-2023
  • (2022)An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks2022 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC55918.2022.00017(79-91)Online publication date: Nov-2022
  • (2022)GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation2022 IEEE International Symposium on Workload Characterization (IISWC)10.1109/IISWC55918.2022.00012(14-26)Online publication date: Nov-2022
  • (2022)Reduced O3 subsequence labelling: a stepping stone towards optimisation sequence predictionConnection Science10.1080/09540091.2022.204476134:1(2860-2877)Online publication date: 1-Mar-2022
  • (2021)Inherent Parallelism and Speedup Estimation of Sequential ProgramsAnnals of Emerging Technologies in Computing10.33166/AETiC.2021.02.0065:2(62-77)Online publication date: 1-Apr-2021
  • (2021)Using machine learning to predict the code size impact of duplication heuristics in a dynamic compilerProceedings of the 18th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes10.1145/3475738.3480943(127-135)Online publication date: 29-Sep-2021
  • (2021)Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF)ACM Transactions on Architecture and Code Optimization10.1145/342709318:1(1-26)Online publication date: 20-Jan-2021
  • (2021)Enabling Reliability-Driven Optimization Selection with Gate Graph Attention Neural NetworkInternational Journal of Software Engineering and Knowledge Engineering10.1142/S021819402040024030:11n12(1641-1665)Online publication date: 21-Jan-2021
  • (2021)A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT)10.1109/PACT52795.2021.00008(1-16)Online publication date: Sep-2021
  • (2020)High-level hardware feature extraction for GPU performance prediction of stencilsProceedings of the 13th Annual Workshop on General Purpose Processing using Graphics Processing Unit10.1145/3366428.3380769(21-30)Online publication date: 23-Feb-2020
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media