research-article

Bug Analysis in Jupyter Notebook Projects: An Empirical Study

Authors:

Taijara Loiola De Santana,

Paulo Anselmo Da Mota Silveira Neto,

Eduardo Santana De Almeida,

Iftekhar Ahmed

ACM Transactions on Software Engineering and Methodology, Volume 33, Issue 4

Article No.: 101, Pages 1 - 34

https://doi.org/10.1145/3641539

Published: 18 April 2024

Abstract

Computational notebooks, such as Jupyter, have been widely adopted by data scientists to write code for analyzing and visualizing data. Despite their growing adoption and popularity, few studies have examined Jupyter development challenges from the practitioners’ point of view. This article presents a systematic study of the bugs and challenges that Jupyter practitioners face, based on a large-scale empirical investigation. We first mined 14,740 commits from 105 GitHub open source projects with Jupyter Notebook code. Next, we analyzed 30,416 Stack Overflow posts, which gave us insights into bugs that practitioners face when developing Jupyter Notebook projects. We then conducted 19 interviews with data scientists to uncover more details about Jupyter bugs and to gain insight into Jupyter developers’ challenges. Finally, to validate the study results and proposed taxonomy, we conducted a survey with 91 data scientists. We highlight bug categories, their root causes, and the challenges that Jupyter practitioners face.

Cited By

Sato S, Nakamaru T (2024) Multiverse Notebook: Shifting Data Scientists to Time Travelers. Proceedings of the ACM on Programming Languages 8:OOPSLA1 (754-783). DOI: 10.1145/3649838. Online publication date: 29-Apr-2024
https://dl.acm.org/doi/10.1145/3649838

Index Terms

Bug Analysis in Jupyter Notebook Projects: An Empirical Study
1. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published In

ACM Transactions on Software Engineering and Methodology Volume 33, Issue 4

May 2024

940 pages

EISSN:1557-7392

DOI:10.1145/3613665

Editor:
Mauro Pezzè
USI Università della Svizzera italiana and SIT Schaffhausen Institute of Technology, Switzerland

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 April 2024

Online AM: 22 January 2024

Accepted: 03 January 2024

Revised: 20 December 2023

Received: 11 October 2022

Published in TOSEM Volume 33, Issue 4

Qualifiers

Research-article

Funding Sources

INES, CNPq
CAPES
FACEPE
PRONEX
FAPESB INCITE
