
To Trust or to Think: Cognitive Forcing Functions Can Reduce Overreliance on AI in AI-assisted Decision-making

Published: 22 April 2021
Abstract

    People supported by AI-powered decision support tools frequently overrely on the AI: they accept an AI's suggestion even when that suggestion is wrong. Adding explanations to the AI decisions does not appear to reduce overreliance, and some studies suggest that it might even increase it. Informed by the dual-process theory of cognition, we posit that people rarely engage analytically with each individual AI recommendation and explanation, and instead develop general heuristics about whether and when to follow the AI suggestions. Building on prior research on medical decision-making, we designed three cognitive forcing interventions to compel people to engage more thoughtfully with the AI-generated explanations. We conducted an experiment (N=199) in which we compared our three cognitive forcing designs to two simple explainable AI approaches and to a no-AI baseline. The results demonstrate that cognitive forcing significantly reduced overreliance compared to the simple explainable AI approaches. However, there was a trade-off: people assigned the least favorable subjective ratings to the designs that reduced overreliance the most. To audit our work for intervention-generated inequalities, we investigated whether our interventions equally benefited people with different levels of Need for Cognition (i.e., motivation to engage in effortful mental activities). Our results show that, on average, cognitive forcing interventions benefited participants higher in Need for Cognition more. Our research suggests that human cognitive motivation moderates the effectiveness of explainable AI solutions.
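
    To make the overreliance measure concrete, here is a minimal illustrative sketch (not code from the paper): following the abstract's definition, it treats overreliance as accepting the AI's suggestion on trials where that suggestion is wrong. The Trial structure and field names are assumptions made for this example only.

    ```python
    # Illustrative sketch: computing an overreliance rate from hypothetical
    # trial-level data. Overreliance = fraction of AI-wrong trials on which
    # the participant nonetheless accepted the AI's suggestion.
    from dataclasses import dataclass
    from typing import List


    @dataclass
    class Trial:
        ai_correct: bool    # was the AI suggestion correct on this trial? (hypothetical field)
        accepted_ai: bool   # did the participant go with the AI suggestion? (hypothetical field)


    def overreliance_rate(trials: List[Trial]) -> float:
        """Share of AI-wrong trials on which the participant accepted the AI."""
        wrong = [t for t in trials if not t.ai_correct]
        if not wrong:
            return 0.0
        return sum(t.accepted_ai for t in wrong) / len(wrong)


    if __name__ == "__main__":
        # Hypothetical participant: accepts the AI on 3 of the 4 AI-wrong trials.
        trials = [
            Trial(ai_correct=True, accepted_ai=True),
            Trial(ai_correct=False, accepted_ai=True),
            Trial(ai_correct=False, accepted_ai=True),
            Trial(ai_correct=False, accepted_ai=False),
            Trial(ai_correct=False, accepted_ai=True),
        ]
        print(f"overreliance rate: {overreliance_rate(trials):.2f}")  # -> 0.75
    ```

    Comparing this rate across the cognitive forcing, simple explainable AI, and no-AI conditions (and by participants' Need for Cognition scores) is one way to frame the comparisons the abstract describes.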




      Information & Contributors

      Information

      Published In

Proceedings of the ACM on Human-Computer Interaction, Volume 5, Issue CSCW1 (CSCW)
April 2021
5016 pages
EISSN: 2573-0142
DOI: 10.1145/3460939
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 22 April 2021
      Published in PACMHCI Volume 5, Issue CSCW1


      Author Tags

      1. artificial intelligence
      2. cognition
      3. explanations
      4. trust

      Qualifiers

      • Research-article


      Cited By

• (2024) Balancing Act: Exploring the Interplay Between Human Judgment and Artificial Intelligence in Problem-solving, Creativity, and Decision-making. IgMin Research, 2(3), 145-158. https://doi.org/10.61927/igmin158. Online publication date: 25-Mar-2024.
• (2024) A Case Study of Big Data Analytics Capability and the Impact of Cognitive Bias in a Global Manufacturing Organisation. Artificial Intelligence of Things (AIoT) for Productivity and Organizational Transition, 1-25. https://doi.org/10.4018/979-8-3693-0993-3.ch001. Online publication date: 23-Feb-2024.
• (2024) Suggestive answers strategy in human-chatbot interaction: a route to engaged critical decision making. Frontiers in Psychology, 15. https://doi.org/10.3389/fpsyg.2024.1382234. Online publication date: 28-Mar-2024.
• (2024) Systematic research is needed on the potential effects of lifelong technology experience on cognition: a mini-review and recommendations. Frontiers in Psychology, 15. https://doi.org/10.3389/fpsyg.2024.1335864. Online publication date: 16-Feb-2024.
• (2024) The impact of AI errors in a human-in-the-loop process. Cognitive Research: Principles and Implications, 9(1). https://doi.org/10.1186/s41235-023-00529-3. Online publication date: 7-Jan-2024.
• (2024) The effects of over-reliance on AI dialogue systems on students' cognitive abilities: a systematic review. Smart Learning Environments, 11(1). https://doi.org/10.1186/s40561-024-00316-7. Online publication date: 18-Jun-2024.
• (2024) Designing Collaborative Intelligence Systems for Employee-AI Service Co-Production. Journal of Service Research. https://doi.org/10.1177/10946705241238751. Online publication date: 18-Mar-2024.
• (2024) Understanding the User Perception and Experience of Interactive Algorithmic Recourse Customization. ACM Transactions on Computer-Human Interaction. https://doi.org/10.1145/3674503. Online publication date: 28-Jun-2024.
• (2024) Measuring User Experience Inclusivity in Human-AI Interaction via Five User Problem-Solving Styles. ACM Transactions on Interactive Intelligent Systems. https://doi.org/10.1145/3663740. Online publication date: 8-May-2024.
• (2024) Does More Advice Help? The Effects of Second Opinions in AI-Assisted Decision Making. Proceedings of the ACM on Human-Computer Interaction, 8(CSCW1), 1-31. https://doi.org/10.1145/3653708. Online publication date: 26-Apr-2024.
