Research article | Open access
DOI: 10.1145/3589250.3596142

RaceInjector: Injecting Races to Evaluate and Learn Dynamic Race Detection Algorithms

Published: 06 June 2023

Abstract

There exist no sound, scalable methods for assembling comprehensive datasets of concurrent programs annotated with data races. As a consequence, it is unclear how well the many heuristic and SMT-based algorithms proposed over the last three decades to detect data races actually perform. To address this problem, we propose RaceInjector, an SMT-based approach which, for any given program, creates arbitrarily many traces of that program containing injected data races. The injected races are guaranteed to follow the given program's semantics. RaceInjector can therefore produce an arbitrarily large, labeled benchmark that is independent of how detection algorithms work. We demonstrate RaceInjector by injecting races into popular program benchmarks and generating a small dataset of racy traces. Among the traces RaceInjector generates, we already find counterexamples that four state-of-the-art race detection algorithms fail to detect. We thus demonstrate the utility of generating such datasets and recommend using them to train machine learning-based models that can potentially replace and improve upon existing race-detection heuristics.
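To make the SMT-based idea more concrete, below is a minimal, illustrative sketch (not the paper's actual encoding) of how a solver can search for a reordering of a recorded trace in which two conflicting accesses become adjacent while per-thread program order is preserved. It assumes the z3-solver Python package; the four-event toy trace, the event names, and the simplified constraints are hypothetical.

# Hypothetical sketch: use an SMT solver to find a reordering of a tiny
# recorded trace that exhibits a data race. This is an illustration of the
# general encoding style, not RaceInjector's implementation.
from z3 import Int, Solver, Distinct, And, Or, sat

# Toy trace: thread 1 acquires/releases a lock around a write to x,
# thread 2 writes x without holding the lock.
events = ["t1_acq", "t1_wr_x", "t1_rel", "t2_wr_x"]
pos = {e: Int(e) for e in events}          # position of each event in the reordered trace

s = Solver()
s.add([And(0 <= pos[e], pos[e] < len(events)) for e in events])
s.add(Distinct(*pos.values()))             # positions form a permutation of the trace

# Per-thread program order must be preserved (thread 1 here).
s.add(pos["t1_acq"] < pos["t1_wr_x"], pos["t1_wr_x"] < pos["t1_rel"])

# Race condition: the two conflicting writes to x are adjacent in the
# reordered trace, i.e. no synchronization event separates them.
s.add(Or(pos["t2_wr_x"] - pos["t1_wr_x"] == 1,
         pos["t1_wr_x"] - pos["t2_wr_x"] == 1))

if s.check() == sat:
    m = s.model()
    witness = sorted(events, key=lambda e: m[pos[e]].as_long())
    print("racy reordering:", witness)     # e.g. ['t1_acq', 't1_wr_x', 't2_wr_x', 't1_rel']

A realistic encoding must additionally model lock semantics, values read and written, and control-flow feasibility so that the injected race respects the program's semantics; this sketch only shows the shape of the constraint problem.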


Published In

SOAP 2023: Proceedings of the 12th ACM SIGPLAN International Workshop on the State Of the Art in Program Analysis
June 2023, 70 pages
ISBN: 9798400701702
DOI: 10.1145/3589250
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States
Author Tags

  1. Dataset generation
  2. Dynamic race detection algorithms
  3. Race injection
  4. SMT-solvers

Qualifiers

  • Research-article

Conference

SOAP '23

Acceptance Rates

Overall Acceptance Rate 11 of 11 submissions, 100%
