DOI: 10.1145/3524610.3527890
research-article
Open access

Estimating developers' cognitive load at a fine-grained level using eye-tracking measures

Published: 20 October 2022

Abstract

The comprehension of source code is a task inherent to many software development activities. Code changes, code review, and debugging are examples of activities that depend heavily on developers' understanding of the source code. This ability is threatened when developers' cognitive load approaches the limits of their working memory, which in turn affects their understanding and makes them more prone to errors. Measures capturing humans' behavior and changes in their physiological state have been proposed in a number of studies to investigate developers' cognitive load. However, the majority of the existing approaches operate at a coarse-grained task level, estimating the difficulty of the source code as a whole. Hence, they cannot be used to pinpoint its mentally demanding parts. We address this limitation in this paper through a non-intrusive approach based on eye-tracking. We collect users' behavioral and physiological features while they engage with source code and train a set of machine learning models to estimate the mentally demanding parts of the code. The evaluation of our models returns F1, recall, accuracy and precision scores up to 85.65%, 84.25%, 86.24% and 88.61%, respectively, when estimating the mentally demanding fragments of code. Our approach enables a fine-grained analysis of cognitive load and allows identifying the parts that challenge the comprehension of source code. Such an approach provides the means to test new hypotheses addressing the characteristics of specific parts within the source code and paves the way for novel techniques for code review and adaptive e-learning.
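
For readers unfamiliar with the four scores reported above, the following is a minimal sketch of how precision, recall, F1 and accuracy are conventionally derived from the confusion counts of a binary classifier. It assumes the "positive" class corresponds to a code fragment labelled as mentally demanding. The class name `Confusion`, the function `scores`, and the counts in the example are hypothetical illustrations; they are not the authors' implementation or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion counts for a binary classifier.

    Assumption for illustration: 'positive' means a code fragment
    was labelled as mentally demanding.
    """
    tp: int  # demanding fragments predicted as demanding
    fp: int  # non-demanding fragments predicted as demanding
    fn: int  # demanding fragments predicted as non-demanding
    tn: int  # non-demanding fragments predicted as non-demanding


def scores(c: Confusion) -> dict[str, float]:
    """Return precision, recall, F1 and accuracy as fractions in [0, 1].

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn
    total = c.tp + c.fp + c.fn + c.tn

    precision = c.tp / predicted_pos if predicted_pos else 0.0
    recall = c.tp / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (c.tp + c.tn) / total if total else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = Confusion(tp=40, fp=10, fn=10, tn=40)
result = scores(example)

if __name__ == "__main__":
    for name, value in result.items():
        print(f"{name}: {value:.2%}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two, which is consistent with the reported F1 falling between the reported recall and precision.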


        Published In

        ICPC '22: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension
        May 2022
        698 pages
        ISBN:9781450392983
        DOI:10.1145/3524610
        • Conference Chairs: Ayushi Rastogi, Rosalia Tufano
        • General Chair: Gabriele Bavota
        • Program Chairs: Venera Arnaoudova, Sonia Haiduc
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

        Sponsors

        In-Cooperation

        • IEEE CS

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. cognitive load
        2. eye-tracking
        3. machine learning
        4. program comprehension
        5. source code

        Qualifiers

        • Research-article

        Funding Sources

        • International Postdoctoral Fellowship (IPF) Grant, University of St. Gallen
