skip to main content
10.1145/3243734.3243838acmconferencesArticle/Chapter ViewAbstractPublication PagesccsConference Proceedingsconference-collections
research-article
Public Access

Effective Program Debloating via Reinforcement Learning

Published: 15 October 2018 Publication History

Abstract

Prevalent software engineering practices such as code reuse and the "one-size-fits-all" methodology have contributed to significant and widespread increases in the size and complexity of software. The resulting software bloat has led to decreased performance and increased security vulnerabilities. We propose a system called Chisel to enable programmers to effectively customize and debloat programs. Chisel takes as input a program to be debloated and a high-level specification of its desired functionality. The output is a reduced version of the program that is correct with respect to the specification. Chisel significantly improves upon existing program reduction systems by using a novel reinforcement learning-based approach to accelerate the search for the reduced program and scale to large programs. Our evaluation on a suite of 10 widely used UNIX utility programs each comprising 13-90 KLOC of C source code demonstrates that Chisel is able to successfully remove all unwanted functionalities and reduce attack surfaces. Compared to two state-of-the-art program reducers C-Reduce and Perses, which time out on 6 programs and 2 programs espectively in 12 hours, Chisel runs up to 7.1x and 3.7x faster and finishes on all programs.

Supplementary Material

MP4 File (p380-heo.mp4)

References

[1]
Hiralal Agrawal and Joseph R. Horgan. 1990. Dynamic Program Slicing. In Proceedings of the ACM SIGPLAN 1990 Conference on Programming Language Design and Implementation (PLDI '90).
[2]
Samuel Bates and Susan Horwitz. 1993. Incremental Program Testing Using Program Dependence Graphs. In Proceedings of the 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '93).
[3]
Suparna Bhattacharya, Kanchi Gopinath, and Mangala Gowri Nanda. 2013. Combining Concern Input with Program Analysis for Bloat Detection. In Proceedings of the 2013 ACM SIGPLAN International Conference on Object Oriented Programming Systems Languages & Applications (OOPSLA '13).
[4]
Michael Carbin, Sasa Misailovic, Michael Kling, and Martin C. Rinard. 2011. Detecting and Escaping Infinite Loops with Jolt. Proceedings of the 25th European Conference on Object-oriented Programming (ECOOP'11) .
[5]
Kostas Ferles, Valentin Wüstholz, Maria Christakis, and Isil Dillig. 2017. Failure-directed Program Trimming. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE '17).
[6]
Sumit Gulwani, Oleksandr Polozov, and Rishabh Singh. 2017. Program Synthesis. Foundations and Trends in Programming Languages (2017).
[7]
Satia Herfert, Jibesh Patra, and Michael Pradel. 2017. Automatically Reducing Tree-structured Test Inputs. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE '17).
[8]
Abram Hindle, Earl T. Barr, Mark Gabel, Zhendong Su, and Premkumar T. Devanbu. 2016. On the Naturalness of Software. Communications of ACM (CACM) (2016).
[9]
Christian Holler, Kim Herzig, and Andreas Zeller. 2012. Fuzzing with Code Fragments. In Proceedings of the 21th USENIX Security Symposium (USENIX Security '12').
[10]
Ranjit Jhala and Rupak Majumdar. 2005. Path Slicing. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '05).
[11]
Yufei Jiang, Qinkun Bao, Shuai Wang, Xiao Liu, and Dinghao Wu. 2018. RedDroid: Android Application Redundancy Customization Based on Static Analysis. In Proceedings of the 29th IEEE International Symposium on Software Reliability Engineering (ISSRE '18) .
[12]
Yufei Jiang, Dinghao Wu, and Peng Liu. 2016. Jred: Program Customization and Bloatware Mitigation Based on Static Analysis. In Proceedings of the 40th IEEE Computer Society International Conference Computer on Software and Applications Conference (COMPSAC '16) .
[13]
Yufei Jiang, Can Zhang, Dinghao Wu, and Peng Liu. 2015. A Preliminary Analysis and Case Study of Feature-Based Software Customization (Extended Abstract). In IEEE International Conference on Software Quality, Reliability and Security (QRS '15) .
[14]
Yufei Jiang, Can Zhang, Dinghao Wu, and Peng Liu. 2016. Feature-Based Software Customization: Preliminary Analysis, Formalization, and Methods. In 17th IEEE International Symposium on High Assurance Systems Engineering (HASE '16) .
[15]
Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondvrej Lhoták, J. Nelson Amaral, Bor-Yuh Evan Chang, Samuel Guyer, Uday Khedker, Anders Møller, and Dimitrios Vardoulakis. 2015. In defense of soundiness: A manifesto. Communications of the ACM (CACM) (2015).
[16]
Ghassan Misherghi and Zhendong Su. 2006. HDD: Hierarchical Delta Debugging. In Proceedings of the 28th International Conference on Software Engineering (ICSE '06).
[17]
Khanh Nguyen, Kai Wang, Yingyi Bu, Lu Fang, and Guoqing Xu. 2018. Understanding and Combating Memory Bloat in Managed Data-Intensive Systems. ACM Transactions on Software Engineering and Methodology (TOSEM) (2018).
[18]
Anh Quach, Aravind Prakash, and Lok-Kwong Yan. 2018. Debloating Software through Piece-Wise Compilation and Loading. CoRR, Vol. abs/1802.00759 (2018). http://arxiv.org/abs/1802.00759
[19]
Vaibhav Rastogi, Drew Davidson, Lorenzo De Carli, Somesh Jha, and Patrick McDaniel. 2017. Cimplifier: Automatically Debloating Containers. Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering (ESEC/FSE '17).
[20]
John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-case Reduction for C Compiler Bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '12) .
[21]
Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-Guided Program Reduction. In Proceedings of the 40th Internaltional Conference on Software Engineering (ICSE '18) .
[22]
Richard S. Sutton and Andrew G. Barto. 1998. Reinforcement Learning: An Introduction.
[23]
Mark Weiser. 1981. Program Slicing. In Proceedings of the 5th International Conference on Software Engineering (ICSE '81).
[24]
Guoqing Xu, Matthew Arnold, Nick Mitchell, Atanas Rountev, and Gary Sevitsky. 2009. Go with the Flow: Profiling Copies to Find Runtime Bloat. In Proceedings of the 30th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '09).
[25]
Guoqing Xu, Nick Mitchell, Matthew Arnold, Atanas Rountev, Edith Schonberg, and Gary Sevitsky. 2014. Scalable Runtime Bloat Detection Using Abstract Dynamic Slicing. ACM Transactions on Software Engineering and Methodology (TOSEM) (2014).
[26]
Guoqing Xu, Nick Mitchell, Matthew Arnold, Atanas Rountev, and Gary Sevitsky. 2010. Software Bloat Analysis: Finding, Removing, and Preventing Performance Problems in Modern Large-scale Object-oriented Applications. In Proceedings of the FSE/SDP workshop on Future of software engineering research .
[27]
Guoqing Xu and Atanas Rountev. 2010. Detecting Inefficiently-used Containers to Avoid Bloat. In Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '10).
[28]
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and Understanding Bugs in C Compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI '11).
[29]
Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and Isolating Failure-Inducing Input. IEEE Transactions on Software Engineering (TSE) (2002).

Cited By

View all
  • (2024)LeanBin: Harnessing Lifting and Recompilation to Debloat BinariesProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695515(1434-1446)Online publication date: 27-Oct-2024
  • (2024)Machine Learning Systems are Bloated and VulnerableACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365506452:1(37-38)Online publication date: 13-Jun-2024
  • (2024)Bloat beneath Python’s Scales: A Fine-Grained Inter-Project Dependency AnalysisProceedings of the ACM on Software Engineering10.1145/36608211:FSE(2584-2607)Online publication date: 12-Jul-2024
  • Show More Cited By

Index Terms

  1. Effective Program Debloating via Reinforcement Learning

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CCS '18: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security
    October 2018
    2359 pages
    ISBN:9781450356930
    DOI:10.1145/3243734
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 October 2018

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. program debloating
    2. reinforcement learning

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    CCS '18
    Sponsor:

    Acceptance Rates

    CCS '18 Paper Acceptance Rate 134 of 809 submissions, 17%;
    Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)337
    • Downloads (Last 6 weeks)36
    Reflects downloads up to 01 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)LeanBin: Harnessing Lifting and Recompilation to Debloat BinariesProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695515(1434-1446)Online publication date: 27-Oct-2024
    • (2024)Machine Learning Systems are Bloated and VulnerableACM SIGMETRICS Performance Evaluation Review10.1145/3673660.365506452:1(37-38)Online publication date: 13-Jun-2024
    • (2024)Bloat beneath Python’s Scales: A Fine-Grained Inter-Project Dependency AnalysisProceedings of the ACM on Software Engineering10.1145/36608211:FSE(2584-2607)Online publication date: 12-Jul-2024
    • (2024)Machine Learning Systems are Bloated and VulnerableAbstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems10.1145/3652963.3655064(37-38)Online publication date: 10-Jun-2024
    • (2024)C2D2: Extracting Critical Changes for Real-World Bugs with Dependency-Sensitive Delta DebuggingProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652129(300-312)Online publication date: 11-Sep-2024
    • (2024)LPR: Large Language Models-Aided Program ReductionProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3652126(261-273)Online publication date: 11-Sep-2024
    • (2024)Towards Speedy Permission-Based Debloating for Android AppsProceedings of the IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems10.1145/3647632.3651390(84-87)Online publication date: 14-Apr-2024
    • (2024)Improving Program Debloating with 1-DU Chain MinimalityProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings10.1145/3639478.3643518(384-385)Online publication date: 14-Apr-2024
    • (2024)Machine Learning Systems are Bloated and VulnerableProceedings of the ACM on Measurement and Analysis of Computing Systems10.1145/36390328:1(1-30)Online publication date: 21-Feb-2024
    • (2024)Efficiently Trimming the Fat: Streamlining Software Dependencies with Java Reflection and Dependency AnalysisProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639123(1-12)Online publication date: 20-May-2024
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media