research-article

Open Access

All you need is superword-level parallelism: systematic control-flow vectorization with SLP

PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022, Pages 301–315, https://doi.org/10.1145/3519939.3523701

Superword-level parallelism (SLP) vectorization is a proven technique for vectorizing straight-line code. It works by replacing independent, isomorphic instructions with equivalent vector instructions. Larsen and Amarasinghe originally proposed using ...

research-article

Open Access

Autoscheduling for sparse tensor algebra with an asymptotic cost model

PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022, Pages 269–285, https://doi.org/10.1145/3519939.3523442

While loop reordering and fusion can make big impacts on the constant-factor performance of dense tensor programs, the effects on sparse tensor programs are asymptotic, often leading to orders of magnitude performance differences in practice. Sparse ...

research-article

Open Access

Automatic generation of efficient sparse tensor format conversion routines

PLDI 2020: Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2020, Pages 823–838, https://doi.org/10.1145/3385412.3385963

This paper shows how to generate code that efficiently converts sparse tensors between disparate storage formats (data layouts) such as CSR, DIA, ELL, and many others. We decompose sparse tensor conversion into three logical phases: coordinate remapping,...

research-article

The three pillars of machine programming

MAPL 2018: Proceedings of the 2nd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages, June 2018, Pages 69–80, https://doi.org/10.1145/3211346.3211355

In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research. Those pillars are: (i) intention, (ii) invention, and (iii) adaptation. Intention emphasizes advancements ...

research-article

Open Access

Helium: lifting high-performance stencil kernels from stripped x86 binaries to halide DSL code

PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015, Pages 391–402, https://doi.org/10.1145/2737924.2737974

Highly optimized programs are prone to bit rot, where performance quickly becomes suboptimal in the face of new hardware and compiler techniques. In this paper we show how to automatically lift performance-critical stencil kernels from a stripped x86 ...

Also Published in:

ACM SIGPLAN Notices: Volume 50 Issue 6, June 2015

research-article

Public Access

Autotuning algorithmic choice for input sensitivity

PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015, Pages 379–390, https://doi.org/10.1145/2737924.2737969

A daunting challenge faced by program performance autotuning is input sensitivity, where the best autotuned configuration may vary with different input sets. This paper presents a novel two-level input learning algorithm to tackle the challenge for an ...

Also Published in:

ACM SIGPLAN Notices: Volume 50 Issue 6, June 2015

research-article

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

PLDI '13: Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2013, Pages 519–530, https://doi.org/10.1145/2491956.2462176

Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. ...

Also Published in:

ACM SIGPLAN Notices: Volume 48 Issue 6, June 2013

research-article

PetaBricks: a language and compiler for algorithmic choice

PLDI '09: Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2009, Pages 38–49, https://doi.org/10.1145/1542476.1542481

It is often impossible to obtain a one-size-fits-all solution for high performance algorithms when considering different choices for data distributions, parallelism, transformations, and blocking. The best solution to these choices is often tightly ...

Also Published in:

ACM SIGPLAN Notices: Volume 44 Issue 6, June 2009

proceeding

PLDI '08: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation

The Program Committee and I are pleased to present the proceedings of the ACM SIGPLAN 2008 Conference on Programming Language Design and Implementation. This volume contains 34 papers selected out of 184 submissions. We believe this volume demonstrates ...

section

Session details: Compiler and simulator construction

Saman Amarasinghe

PLDI '04: Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation, June 2004, https://doi.org/10.1145/3244311

Article

Meta optimization: improving compiler heuristics with machine learning

PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, June 2003, Pages 77–90, https://doi.org/10.1145/781131.781141

Compiler writers have crafted many heuristics over the years to approximately solve NP-hard problems efficiently. Finding a heuristic that performs well on a broad range of applications is a tedious and difficult process. This paper introduces Meta ...

Also Published in:

ACM SIGPLAN Notices: Volume 38 Issue 5, May 2003

Article

Linear analysis and optimization of stream programs

PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, June 2003, Pages 12–25, https://doi.org/10.1145/781131.781134

As more complex DSP algorithms are realized in practice, there is an increasing need for high-level stream abstractions that can be compiled without sacrificing efficiency. Toward this end, we present a set of aggressive optimizations that target linear ...

Also Published in:

ACM SIGPLAN Notices: Volume 38 Issue 5, May 2003

Article

A unified framework for schedule and storage optimization

PLDI '01: Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, June 2001, Pages 232–242, https://doi.org/10.1145/378795.378852

We present a unified mathematical framework for analyzing the tradeoffs between parallelism and storage allocation within a parallelizing compiler. Using this framework, we show how to find a good storage mapping for a given schedule, a good schedule ...

Also Published in:

ACM SIGPLAN Notices: Volume 36 Issue 5, May 2001

Article

Free

Exploiting superword level parallelism with multimedia instruction sets

PLDI '00: Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation, August 2000, Pages 145–156, https://doi.org/10.1145/349299.349320

Increasing focus on multimedia applications has prompted the addition of multimedia extensions to most existing general purpose microprocessors. This added functionality comes primarily with the addition of short SIMD instructions. Unfortunately, ...

Also Published in:

ACM SIGPLAN Notices: Volume 35 Issue 5, May 2000

Article

Free

Bitwidth analysis with application to silicon compilation

PLDI '00: Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation, August 2000, Pages 108–120, https://doi.org/10.1145/349299.349317

This paper introduces Bitwise, a compiler that minimizes the bitwidth (the number of bits used to represent each operand) for both integers and pointers in a program. By propagating static information both forward and backward in the program dataflow ...

Also Published in:

ACM SIGPLAN Notices: Volume 35 Issue 5, May 2000

Article

Free

Communication optimization and code generation for distributed memory machines

PLDI '93: Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation, August 1993, Pages 126–138, https://doi.org/10.1145/155090.155102

This paper presents several algorithms to solve code generation and optimization problems specific to machines with distributed address spaces. Given a description of how the computation is to be partitioned across the processors in a machine, our ...

Also Published in:

ACM SIGPLAN Notices: Volume 28 Issue 6, June 1993
