DOI: 10.1145/3643991.3644921
research-article

DRMiner: A Tool For Identifying And Analyzing Refactorings In Dockerfile

Published: 02 July 2024

    Abstract

    Software containerization using Docker has recently become the de facto standard for delivering reusable software artifacts. Integral to Docker's functionality are Dockerfiles, scripts that define the layers and components to be incorporated within a container. Although these files are the bedrock of container creation, maintaining them is challenging, and refactoring them is especially complex. Despite the apparent importance of refactoring in the Docker ecosystem, detecting it remains difficult: developers rarely document their refactoring efforts and often combine them with other changes.
    Previous research on Docker refactoring has focused predominantly on empirical foundations, resulting in a narrow viewpoint. There remains a clear gap for a comprehensive tool that can handle the complexities of Dockerfile refactoring detection. To fill this gap, we introduce DRMiner, the first tool for identifying and analyzing refactorings in Dockerfiles. Our solution, designed, implemented, and evaluated in terms of correctness and generalization, relies on a novel Enhanced Abstract Syntax Tree (E-AST) based component-matching algorithm and a set of detection rules to determine refactoring candidates. This work will serve as a fundamental building block for refactoring detection in the realm of Docker.

      Published In

      MSR '24: Proceedings of the 21st International Conference on Mining Software Repositories
      April 2024
      788 pages
      ISBN: 9798400705878
      DOI: 10.1145/3643991
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. docker
      2. dockerfiles
      3. AST
      4. refactoring
      5. commit
      6. git

      Qualifiers

      • Research-article

      Conference

      MSR '24