Accelerating Continuous Integration by Caching Environments and Inferring Dependencies

Published: 01 June 2022

Abstract

To facilitate the rapid release cadence of modern software (on the order of weeks, days, or even hours), software development organizations invest in practices like Continuous Integration (CI), where each change submitted by developers is built (e.g., compiled, tested, linted) to detect problematic changes early. A fast and efficient build process is crucial to provide timely CI feedback to developers. If CI feedback is too slow, developers may switch contexts to other tasks, which is known to be a costly operation for knowledge workers. Thus, minimizing the build execution time for CI services is an important task. While recent work has made several important advances in the acceleration of CI builds, optimizations often depend upon explicitly defined build dependency graphs (e.g., make, Gradle, CloudBuild, Bazel). These hand-maintained graphs may be (a) underspecified, leading to incorrect build behaviour; or (b) overspecified, leading to missed acceleration opportunities. In this paper, we propose Kotinos, a language-agnostic approach to infer data from which build acceleration decisions can be made without relying upon build specifications. After inferring this data, our approach accelerates CI builds by caching the build environment and skipping unaffected build steps. Kotinos is at the core of a commercial CI service with a growing customer base. To evaluate Kotinos, we mine 14,364 historical CI build records spanning three proprietary and seven open-source software projects. We find that: (1) at least 87.9 percent of the builds activate at least one Kotinos acceleration; and (2) 74 percent of accelerated builds achieve a two-fold speed-up with respect to their non-accelerated counterparts. Moreover, (3) the benefits of Kotinos can also be replicated in open source software systems; and (4) Kotinos imposes minimal resource overhead (i.e., < 1 percent median CPU usage, 2 MB to 2.2 GB median memory usage, and 0.4 GB to 5.2 GB median storage overhead) and does not compromise build outcomes. Our results suggest that migration to Kotinos yields substantial benefits with minimal investment of effort (e.g., no migration of build systems is necessary).
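The idea of skipping unaffected build steps can be illustrated with a small sketch. This is not the Kotinos implementation, which the abstract does not describe in detail. It assumes that an earlier, fully executed build recorded which files each step read and wrote, and that a step is safe to skip when none of its inputs are stale. The names `StepRecord` and `plan_build`, and the example steps and file paths, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StepRecord:
    """Files one build step touched in a previous full build (hypothetical shape)."""
    name: str
    reads: set[str] = field(default_factory=set)   # files the step read
    writes: set[str] = field(default_factory=set)  # files the step produced


def plan_build(steps: list[StepRecord], changed: set[str]) -> dict[str, bool]:
    """Map each step name to True (must run) or False (can be skipped).

    `steps` must be listed in execution order.
    """
    stale = set(changed)  # files whose cached contents can no longer be trusted
    plan: dict[str, bool] = {}
    for step in steps:
        must_run = bool(step.reads & stale)
        plan[step.name] = must_run
        if must_run:
            # A re-run step may produce different outputs, so anything that
            # reads those outputs later is affected too.
            stale |= step.writes
    return plan


# Illustrative data only: a four-step build and a change to one source file.
steps = [
    StepRecord("install", {"package.json"}, {"node_modules"}),
    StepRecord("compile", {"src/app.ts", "node_modules"}, {"dist/app.js"}),
    StepRecord("test", {"dist/app.js", "test/app.test.js"}),
    StepRecord("lint-docs", {"README.md"}),
]
plan = plan_build(steps, changed={"src/app.ts"})
for name, must_run in plan.items():
    print(f"{name}: {'run' if must_run else 'skip (restore from cache)'}")
```

In this sketch, staleness propagates forward through recorded outputs, so a change to a source file re-runs the compile and test steps while the install and documentation-lint steps are restored from the cached environment. The read and write sets come from observing a build rather than from a hand-maintained dependency graph, which is the contrast the abstract draws.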

Published In

IEEE Transactions on Software Engineering, Volume 48, Issue 6, June 2022, 355 pages

Publisher

IEEE Press
