skip to main content
10.1145/2872362.2872387acmconferencesArticle/Chapter ViewAbstractPublication PagesasplosConference Proceedingsconference-collections
research-article
Open access

Scaling up Superoptimization

Published: 25 March 2016 Publication History

Abstract

Developing a code optimizer is challenging, especially for new, idiosyncratic ISAs. Superoptimization can, in principle, discover machine-specific optimizations automatically by searching the space of all instruction sequences. If we can increase the size of code fragments a superoptimizer can optimize, we will be able to discover more optimizations. We develop LENS, a search algorithm that increases the size of code a superoptimizer can synthesize by rapidly pruning away invalid candidate programs. Pruning is achieved by selectively refining the abstraction under which candidates are considered equivalent, only in the promising part of the candidate space. LENS also uses a bidirectional search strategy to prune the candidate space from both forward and backward directions. These pruning strategies allow LENS to solve twice as many benchmarks as existing enumerative search algorithms, while LENS is about 11-times faster.
Additionally, we increase the effective size of the superoptimized fragments by relaxing the correctness condition using contexts (surrounding code). Finally, we combine LENS with complementary search techniques into a cooperative superoptimizer, which exploits the stochastic search to make random jumps in a large candidate space, and a symbolic (SAT-solver-based) search to synthesize arbitrary constants. While existing superoptimizers consistently solve 9--16 out of 32 benchmarks, the cooperative superoptimizer solves 29 benchmarks. It can synthesize code fragments that are up to 82% faster than code generated by gcc -O3 from WiBench and MiBench.

References

[1]
Souper. http://github.com/google/souper. URL http://github.com/google/souper.
[2]
T. Akiba, K. Imajo, H. Iwami, Y. Iwata, T. Kataoka, N. Takahashi, M. Moskal, and N. Swamy. Calibrating research in program synthesis using 72,000 hours of programmer time. Technical report, MSR, 2013.
[3]
R. Alur, R. Bodik, E. Dallal, D. Fisman, P. Garg, G. Juniwal, H. Kress-Gazit, P. Madhusudan, M. M. K. Martin, M. Raghothaman, S. Saha, S. A. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, and A. Udupa. Syntax-guided synthesis. In SyGus Competition, 2014.
[4]
ARM. Cortex-A9: Technical Reference Manual, 2012. URL http://infocenter.arm.com/help/topic/com.arm.doc.ddi0388i/DDI0388I_cortex_a9_r4p1_trm.pdf.
[5]
S. Bansal and A. Aiken. Automatic generation of peephole superoptimizers. In ASPLOS, 2006.
[6]
G. Barthe, J. M. Crespo, S. Gulwani, C. Kunz, and M. Marron. From relational verification to simd loop synthesis. In PPoPP, 2013.
[7]
J. Bungo. The use of compiler optimizations for embedded systems software. Crossroads, 15 (1): 8--15, Sept. 2008.
[8]
A. Duller, D. Towner, G. Panesar, A. Gray, and W. Robbins. picoarray technology: the tool's story. In Design, Automation and Test in Europe, 2005.
[9]
J. Galenson, P. Reames, R. Bodik, B. Hartmann, and K. Sen. Codehint: Dynamic and interactive synthesis of code snippets. In ICSE, 2014.
[10]
V. Govindaraju, C.-H. Ho, T. Nowatzki, J. Chhugani, N. Satish, K. Sankaralingam, and C. Kim. Dyser: Unifying functionality and parallelism specialization for energy-efficient computing. Micro, IEEE, Sept 2012.
[11]
T. Granlund and R. Kenner. Eliminating branches using a superoptimizer and the gnu c compiler. In PLDI, 1992.
[12]
GreenArrays. Product Brief: GreenArrays GA144, 2010. URL http://www.greenarraychips.com/home/documents/greg/PB001--100503-GA144--1--10.pdf.
[13]
S. Gulwani, S. Jha, A. Tiwari, and R. Venkatesan. Synthesis of loop-free programs. In PLDI, 2011.
[14]
M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. Mudge, R. Brown, and T. Austin. Mibench: a free, commercially representative embedded benchmark suite. In IEEE International Symposium on Workload Characterization, 2001.
[15]
Intel. Reducing Data Center Energy Consumption. Technical report, 2008.
[16]
M. Kandemir, N. Vijaykrishnan, and M. Irwin. Compiler optimizations for low power systems. In Power Aware Computing, Series in Computer Science. Springer US, 2002.
[17]
N. P. Lopes, D. Menendez, S. Nagarakatte, and J. Regehr. Provably correct peephole optimizations with alive. In PLDI, 2015.
[18]
H. Massalin. Superoptimizer: a look at the smallest program. In ASPLOS, 1987.
[19]
P. Merolla, J. Arthur, F. Akopyan, N. Imam, R. Manohar, and D. Modha. A digital neurosynaptic core using embedded crossbar memory with 45pj per spike in 45nm. In Custom Integrated Circuits Conference (CICC), 2011 IEEE, 2011.
[20]
P. M. Phothilimthana, T. Jelvis, R. Shah, N. Totla, S. Chasins, and R. Bodik. Chlorophyll: Synthesis-aided compiler for low-power spatial architectures. In PLDI, 2014.
[21]
P. M. Phothilimthana, A. Thakur, R. Bodik, and D. Dhurjati. Greenthumb: Superoptimizer construction framework. In Proceedings of International Conference on Compiler Construction, 2016.
[22]
W. Qadeer, R. Hameed, O. Shacham, P. Venkatesan, C. Kozyrakis, and M. A. Horowitz. Convolution engine: Balancing efficiency and flexibility in specialized computing. In ISCA, 2013.
[23]
A. Reynolds, M. Deters, V. Kuncak, C. Tinelli, and C. Barrett. Counterexample-guided quantifier instantiation for synthesis in smt. In CAV, 2015.
[24]
E. Schkufza, R. Sharma, and A. Aiken. Stochastic superoptimization. In ASPLOS, 2013.
[25]
E. Schkufza, R. Sharma, and A. Aiken. Stochastic optimization of floating-point programs with tunable precision. In PLDI, 2014.
[26]
R. Sharma. Personal communication, June 2015.
[27]
A. Solar-Lezama, L. Tancau, R. Bodik, S. Seshia, and V. Saraswat. Combinatorial sketching for finite programs. In ASPLOS, 2006.
[28]
V. Srinivasan and T. Reps. Synthesis of machine code from semantics. In PLDI, 2015.
[29]
The Linley Group. Processor watch: Getting way out of box. http://www.linleygroup.com/newsletters/newsletter_detail.php?num=5038, 2013. Accessed: 2014--11--13.
[30]
E. Torlak and R. Bodik. Growing solver-aided languages with Rosette. In Onward!, 2013.
[31]
E. Torlak and R. Bodik. A lightweight symbolic virtual machine for solver-aided host languages. In PLDI, 2014.
[32]
A. Udupa, A. Raghavan, J. V. Deshmukh, S. Mador-Haim, M. M. Martin, and R. Alur. Transit: Specifying protocols with concolic snippets. In PLDI, 2013.
[33]
H. S. Warren. Hacker's Delight. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2002.
[34]
H. S. Warren. A hacker's assistant. Oct. 2008. URL http://www.hackersdelight.org/aha/aha.pdf.
[35]
Wikipedia. List of arm microarchitectures. http://en.wikipedia.org/wiki/List_of_ARM_microarchitectures, 2014. Accessed: 2014--11--13.
[36]
C. Zhang. Dynamically Reconfigurable Architectures for Real-time Baseband Processing. PhD thesis, Lund University, 2014.
[37]
Q. Zheng, Y. Chen, R. Dreslinski, C. Chakrabarti, A. Anastasopoulos, S. Mahlke, and T. Mudge. Wibench: An open source kernel suite for benchmarking wireless systems. In Workload Characterization (IISWC), 2013 IEEE International Symposium on, 2013.

Cited By

View all
  • (2024)Equivalence by Canonicalization for Synthesis-Backed RefactoringProceedings of the ACM on Programming Languages10.1145/36564538:PLDI(1879-1904)Online publication date: 20-Jun-2024
  • (2024)SuperStack: Superoptimization of Stack-Bytecode via Greedy, Constraint-Based, and SAT TechniquesProceedings of the ACM on Programming Languages10.1145/36564358:PLDI(1437-1462)Online publication date: 20-Jun-2024
  • (2024)Superfusion: Eliminating Intermediate Data Structures via Inductive SynthesisProceedings of the ACM on Programming Languages10.1145/36564158:PLDI(939-964)Online publication date: 20-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ASPLOS '16: Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems
March 2016
824 pages
ISBN:9781450340915
DOI:10.1145/2872362
  • General Chair:
  • Tom Conte,
  • Program Chair:
  • Yuanyuan Zhou
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 March 2016

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. SMT
  2. program synthesis
  3. superoptimization

Qualifiers

  • Research-article

Funding Sources

Conference

ASPLOS '16

Acceptance Rates

ASPLOS '16 Paper Acceptance Rate 53 of 232 submissions, 23%;
Overall Acceptance Rate 535 of 2,713 submissions, 20%

Upcoming Conference

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)367
  • Downloads (Last 6 weeks)48
Reflects downloads up to 05 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Equivalence by Canonicalization for Synthesis-Backed RefactoringProceedings of the ACM on Programming Languages10.1145/36564538:PLDI(1879-1904)Online publication date: 20-Jun-2024
  • (2024)SuperStack: Superoptimization of Stack-Bytecode via Greedy, Constraint-Based, and SAT TechniquesProceedings of the ACM on Programming Languages10.1145/36564358:PLDI(1437-1462)Online publication date: 20-Jun-2024
  • (2024)Superfusion: Eliminating Intermediate Data Structures via Inductive SynthesisProceedings of the ACM on Programming Languages10.1145/36564158:PLDI(939-964)Online publication date: 20-Jun-2024
  • (2024)Hydra: Generalizing Peephole Optimizations with Program SynthesisProceedings of the ACM on Programming Languages10.1145/36498378:OOPSLA1(725-753)Online publication date: 29-Apr-2024
  • (2024)A Case for Synthesis of Recursive Quantum Unitary ProgramsProceedings of the ACM on Programming Languages10.1145/36329018:POPL(1759-1788)Online publication date: 5-Jan-2024
  • (2024)Efficient Bottom-Up Synthesis for Programs with Local VariablesProceedings of the ACM on Programming Languages10.1145/36328948:POPL(1540-1568)Online publication date: 5-Jan-2024
  • (2024)Optimal Program Synthesis via Abstract InterpretationProceedings of the ACM on Programming Languages10.1145/36328588:POPL(457-481)Online publication date: 5-Jan-2024
  • (2024)Boost Linear Algebra Computation Performance via Efficient VNNI UtilizationProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 310.1145/3620666.3651333(149-163)Online publication date: 27-Apr-2024
  • (2024)Hydride: A Retargetable and Extensible Synthesis-based Compiler for Modern Hardware ArchitecturesProceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 210.1145/3620665.3640385(514-529)Online publication date: 27-Apr-2024
  • (2024)Morpheus: A Run Time Compiler and Optimizer for Software Data PlanesIEEE/ACM Transactions on Networking10.1109/TNET.2023.334628632:3(2269-2284)Online publication date: 1-Jun-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media