DOI: 10.1109/ASE.2019.00052
research-article

V2: fast detection of configuration drift in Python

Published: 07 February 2020

Abstract

Code snippets are prevalent, but are hard to reuse because they often lack an accompanying environment configuration. Most are not actively maintained, allowing for drift between the most recent possible configuration and the code snippet as the snippet becomes out-of-date over time. Recent work has identified the problem of validating and detecting out-of-date code snippets as the most important consideration for code reuse. However, determining if a snippet is correct, but simply out-of-date, is a non-trivial task. In the best case, breaking changes are well documented, allowing developers to manually determine when a code snippet contains an out-of-date API usage. In the worst case, determining if and when a breaking change was made requires an exhaustive search through previous dependency versions.
We present V2, a strategy for determining if a code snippet is out-of-date by detecting discrete instances of configuration drift, where the snippet uses an API which has since undergone a breaking change. Each instance of configuration drift is classified by a failure encountered during validation and a configuration patch, consisting of dependency version changes, which fixes the underlying fault. V2 uses feedback-directed search to explore the possible configuration space for a code snippet, reducing the number of potential environment configurations that need to be validated. When run on a corpus of public Python snippets from prior research, V2 identifies 248 instances of configuration drift.
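The idea of a feedback-directed search over dependency versions can be illustrated with a small sketch. This is not V2's actual algorithm, which the abstract does not specify; it is a toy under stated assumptions: each failed validation is assumed to name a single package to blame, releases are assumed to be ordered newest first, and the package names (`libfoo`, `libbar`), version lists, and `toy_validate` function are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, str]
# Hypothetical validator: returns None on success, otherwise the name of
# the package blamed for the failure (the "feedback").
Validator = Callable[[Config], Optional[str]]


def feedback_directed_search(
    versions: Dict[str, List[str]],
    validate: Validator,
) -> Tuple[Optional[Config], int]:
    """Step back only the package blamed by each failed validation.

    `versions` maps each package to its releases, newest first.
    Returns (working configuration or None, number of validations run).
    """
    index = {pkg: 0 for pkg in versions}  # start from the newest releases
    attempts = 0
    while True:
        config = {pkg: versions[pkg][i] for pkg, i in index.items()}
        attempts += 1
        blamed = validate(config)
        if blamed is None:
            return config, attempts
        if blamed not in index or index[blamed] + 1 >= len(versions[blamed]):
            return None, attempts  # no older release of the blamed package
        index[blamed] += 1  # use the feedback: downgrade only that package


def config_patch(newest: Config, working: Config) -> Config:
    """Dependency version changes that turn `newest` into `working`."""
    return {pkg: ver for pkg, ver in working.items() if newest[pkg] != ver}


# Hypothetical configuration space: 4 x 3 = 12 possible configurations.
VERSIONS = {
    "libfoo": ["3.0", "2.1", "2.0", "1.0"],
    "libbar": ["1.2", "1.1", "1.0"],
}


def toy_validate(config: Config) -> Optional[str]:
    # Assumption for the demo: libfoo made a breaking change in 2.1.
    if config["libfoo"] in ("3.0", "2.1"):
        return "libfoo"
    return None


newest = {pkg: releases[0] for pkg, releases in VERSIONS.items()}
working, attempts = feedback_directed_search(VERSIONS, toy_validate)
patch = config_patch(newest, working)
print(working, attempts, patch)
```

In this toy case the search validates three configurations instead of all twelve, and the resulting patch touches only the one dependency implicated by the failures. A real validator would have to derive the blamed package from the observed failure, which is the hard part this sketch leaves out.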

Published In

ASE '19: Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering
November 2019
1333 pages
ISBN: 9781728125084

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Author Tags

  1. configuration drift
  2. configuration management
  3. configuration repair
  4. dependencies
  5. environment inference

Qualifiers

  • Research-article

Conference

ASE '19

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%
