- research-article, August 2024
Planter: Rapid Prototyping of In-Network Machine Learning Inference
- Changgang Zheng,
- Mingyuan Zang,
- Xinpeng Hong,
- Liam Perreault,
- Riyad Bensoussane,
- Shay Vargaftik,
- Yaniv Ben-Itzhak,
- Noa Zilberman
ACM SIGCOMM Computer Communication Review (SIGCOMM-CCR), Volume 54, Issue 1, Pages 2–21. https://doi.org/10.1145/3687230.3687232

In-network machine learning inference provides high throughput and low latency. It is ideally located within the network, power efficient, and improves applications' performance. Despite its advantages, the bar to in-network machine learning research is ...
- research-article, December 2023
Grape: Practical and Efficient Graphed Execution for Dynamic Deep Neural Networks on GPUs
MICRO '23: Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, Pages 1364–1380. https://doi.org/10.1145/3613424.3614248

Achieving high performance in machine learning workloads is a crucial yet difficult task. To achieve high runtime performance on hardware platforms such as GPUs, graph-based executions such as CUDA graphs are often used to eliminate CPU runtime ...
- research-article, June 2021
Pure tensor program rewriting via access patterns (representation pearl)
- Gus Henry Smith,
- Andrew Liu,
- Steven Lyubomirsky,
- Scott Davidson,
- Joseph McMahan,
- Michael Taylor,
- Luis Ceze,
- Zachary Tatlock
MAPS 2021: Proceedings of the 5th ACM SIGPLAN International Symposium on Machine Programming, Pages 21–31. https://doi.org/10.1145/3460945.3464953

Tensor kernels in machine learning (ML) often correspond to pure mathematical expressions, making term rewriting an attractive strategy for optimization and mapping to specialized hardware accelerators. However, existing ML intermediate representations (...