- research-article, October 2024
Adaptive Spatio-temporal Graph Learning for Bus Station Profiling
ACM Transactions on Spatial Algorithms and Systems (TSAS), Volume 10, Issue 3, Article No.: 25, Pages 1–23, https://doi.org/10.1145/3636459
Understanding and managing public transportation systems require capturing complex spatio-temporal correlations within datasets. Existing studies often use predefined graphs in graph learning frameworks, neglecting shifted spatial and long-term temporal ...
- research-article, September 2024
The Cost of Profiling in the HotSpot Virtual Machine
MPLR 2024: Proceedings of the 21st ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes, Pages 112–126, https://doi.org/10.1145/3679007.3685055
Modern language runtimes use just-in-time compilation to execute applications natively. Typically, multiple compiler tiers cooperate so that compilation at a later stage can leverage profiling information generated by earlier tiers. This allows for ...
- research-article, June 2024
Evaluating Finalization-Based Object Lifetime Profiling
ISMM 2024: Proceedings of the 2024 ACM SIGPLAN International Symposium on Memory Management, Pages 30–42, https://doi.org/10.1145/3652024.3665514
Using object lifetime information enables performance improvement through memory optimizations such as pretenuring and tuning garbage collector parameters. However, profiling object lifetimes is nontrivial and often requires a specialized virtual machine ...
- research-article, May 2024
EVScout2.0: Electric Vehicle Profiling through Charging Profile
ACM Transactions on Cyber-Physical Systems (TCPS), Volume 8, Issue 2, Article No.: 11, Pages 1–24, https://doi.org/10.1145/3565268
Electric Vehicles (EVs) represent a green alternative to traditional fuel-powered vehicles. To enforce their widespread use, both the technical development and the security of users shall be guaranteed. Users’ privacy represents a possible threat that ...
- research-article, April 2024
Characterizing Power Management Opportunities for LLMs in the Cloud
- Pratyush Patel,
- Esha Choukse,
- Chaojie Zhang,
- Íñigo Goiri,
- Brijesh Warrier,
- Nithish Mahalingam,
- Ricardo Bianchini
ASPLOS '24: Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, Pages 207–222, https://doi.org/10.1145/3620666.3651329
Recent innovation in large language models (LLMs), and their myriad use cases have rapidly driven up the compute demand for datacenter GPUs. Several cloud providers and other enterprises plan to substantially grow their datacenter capacity to support ...
- research-article, April 2024
Comparative Profiling: Insights Into Latent Diffusion Model Training
EuroMLSys '24: Proceedings of the 4th Workshop on Machine Learning and Systems, Pages 176–183, https://doi.org/10.1145/3642970.3655847
Generative AI models are at the forefront of advancing creative and analytical tasks, pushing the boundaries of what machines can generate and comprehend. Among these, latent diffusion models represent significant advancements in generating high-fidelity ...
- research-article, September 2024
EasyView: Bringing Performance Profiles into Integrated Development Environments
CGO '24: Proceedings of the 2024 IEEE/ACM International Symposium on Code Generation and Optimization, Pages 386–398, https://doi.org/10.1109/CGO57630.2024.10444840
Dynamic program performance analysis (also known as profiling) is well-known for its powerful capabilities of identifying performance inefficiencies in software packages. Although a large number of profiling techniques are developed in academia and ...
- research-article, February 2024
Optimal Model Partitioning with Low-Overhead Profiling on the PIM-based Platform for Deep Learning Inference
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 29, Issue 2, Article No.: 28, Pages 1–22, https://doi.org/10.1145/3628599
Recently Processing-in-Memory (PIM) has become a promising solution to achieve energy-efficient computation in data-intensive applications by placing computation near or inside the memory. In most Deep Learning (DL) frameworks, a user manually partitions ...
- research-article, April 2024
nnPerf: Demystifying DNN Runtime Inference Latency on Mobile Platforms
SenSys '23: Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems, Pages 125–137, https://doi.org/10.1145/3625687.3625797
We present nnPerf, a real-time on-device profiler designed to collect and analyze the DNN model run-time inference latency on mobile platforms. nnPerf demystifies the hidden layers and metrics used for pursuing DNN optimizations and adaptations at the ...
- research-article, November 2023
PEAK: a Light-Weight Profiler for HPC Systems
SC-W '23: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, Pages 677–680, https://doi.org/10.1145/3624062.3624143
In the context of the expanding landscape of contemporary High-Performance Computing (HPC) applications from petascale to exascale, the pursuit of performance optimization emerges as a significant impediment within software development endeavors. In the ...
- research-article, October 2023, Best Paper
Going Incognito in the Metaverse: Achieving Theoretically Optimal Privacy-Usability Tradeoffs in VR
UIST '23: Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, Article No.: 61, Pages 1–16, https://doi.org/10.1145/3586183.3606754
Virtual reality (VR) telepresence applications and the so-called “metaverse” promise to be the next major medium of human-computer interaction. However, with recent studies demonstrating the ease at which VR users can be profiled and deanonymized, ...
How Profilers Can Help Navigate Type Migration
Proceedings of the ACM on Programming Languages (PACMPL), Volume 7, Issue OOPSLA2, Article No.: 241, Pages 544–573, https://doi.org/10.1145/3622817
Sound migratory typing envisions a safe and smooth refactoring of untyped code bases to typed ones. However, the cost of enforcing safety with run-time checks is often prohibitively high, thus performance regressions are a likely occurrence. ...
- short-paper, July 2023
Profiling and Visualizing Dynamic Pruning Algorithms
SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 3125–3129, https://doi.org/10.1145/3539618.3591806
Efficiently retrieving the top-k documents for a given query is a fundamental operation in many search applications. Dynamic pruning algorithms accelerate top-k retrieval over inverted indexes by skipping documents that are not able to enter the current ...
- research-article, July 2023
DroidPerf: Profiling Memory Objects on Android Devices
ACM MobiCom '23: Proceedings of the 29th Annual International Conference on Mobile Computing and Networking, Article No.: 6, Pages 1–15, https://doi.org/10.1145/3570361.3592503
Optimizing performance inefficiencies in memory hierarchies is well-known for native languages, such as C and C++. There are few studies, however, on exploring memory inefficiencies in Android Runtime (ART). Running in ART, managed languages, such as ...
- research-article, June 2023
Optimization-Aware Compiler-Level Event Profiling
ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 45, Issue 2, Article No.: 10, Pages 1–50, https://doi.org/10.1145/3591473
Tracking specific events in a program’s execution, such as object allocation or lock acquisition, is at the heart of dynamic analysis. Despite the apparent simplicity of this task, quantifying these events is challenging due to the presence of compiler ...
- research-article, June 2023
Profiling Hyperscale Big Data Processing
- Abraham Gonzalez,
- Aasheesh Kolli,
- Samira Khan,
- Sihang Liu,
- Vidushi Dadu,
- Sagar Karandikar,
- Jichuan Chang,
- Krste Asanovic,
- Parthasarathy Ranganathan
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, Article No.: 47, Pages 1–16, https://doi.org/10.1145/3579371.3589082
Computing demand continues to grow exponentially, largely driven by "big data" processing on hyperscale data stores. At the same time, the slowdown in Moore's law is leading the industry to embrace custom computing in large-scale systems. Taken together, ...
- research-article, June 2023
CDPU: Co-designing Compression and Decompression Processing Units for Hyperscale Systems
- Sagar Karandikar,
- Aniruddha N. Udipi,
- Junsun Choi,
- Joonho Whangbo,
- Jerry Zhao,
- Svilen Kanev,
- Edwin Lim,
- Jyrki Alakuijala,
- Vrishab Madduri,
- Yakun Sophia Shao,
- Borivoje Nikolic,
- Krste Asanovic,
- Parthasarathy Ranganathan
ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture, Article No.: 39, Pages 1–17, https://doi.org/10.1145/3579371.3589074
General-purpose lossless data compression and decompression ("(de)compression") are used widely in hyperscale systems and are key "datacenter taxes". However, designing optimal hardware compression and decompression processing units ("CDPUs") is ...
- research-article, June 2023
Flexible and Effective Object Tiering for Heterogeneous Memory Systems
ISMM 2023: Proceedings of the 2023 ACM SIGPLAN International Symposium on Memory Management, Pages 163–175, https://doi.org/10.1145/3591195.3595277
Computing platforms that package multiple types of memory, each with their own performance characteristics, are quickly becoming mainstream. To operate efficiently, heterogeneous memory architectures require new data management solutions that are able ...
- research-article, May 2023
Profiling and Monitoring Deep Learning Training Tasks
EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems, Pages 18–25, https://doi.org/10.1145/3578356.3592589
The embarrassingly parallel nature of deep learning training tasks makes CPU-GPU co-processors the primary commodity hardware for them. The computing and memory requirements of these tasks, however, do not always align well with the available GPU ...
- demonstration, April 2023
PerfoRT: A Tool for Software Performance Regression
ICPE '23 Companion: Companion of the 2023 ACM/SPEC International Conference on Performance Engineering, Pages 119–120, https://doi.org/10.1145/3578245.3584928
In this paper, we present PerfoRT, a tool to ease software performance regression measurement of Java systems. Its main characteristics include: minimal configuration to ease automation and hide complexity to the end user; a broad scope of performance ...