research-article

CoAIcoder: Examining the Effectiveness of AI-assisted Human-to-Human Collaboration in Qualitative Analysis

Published: 29 November 2023

Abstract

While AI-assisted individual qualitative analysis has been substantially studied, AI-assisted collaborative qualitative analysis (CQA), a process in which multiple researchers work together to interpret data, remains relatively unexplored. After identifying CQA practices and design opportunities through formative interviews, we designed and implemented CoAIcoder, a tool that leverages AI to enhance human-to-human collaboration within CQA through four distinct collaboration methods. Using a between-subjects design, we evaluated CoAIcoder with 32 pairs of CQA-trained participants across common CQA phases under each collaboration method. Our findings suggest that while using a shared AI model as a mediator among coders could improve CQA efficiency and foster agreement more quickly in the early coding stage, it might affect the final code diversity. We also emphasize the need to consider the independence level when using AI to assist human-to-human collaboration in various CQA scenarios. Lastly, we suggest design implications for future AI-assisted CQA systems.


    Published In

    ACM Transactions on Computer-Human Interaction, Volume 31, Issue 1
    February 2024
    517 pages
    EISSN: 1557-7325
    DOI: 10.1145/3613507

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 29 November 2023
    Online AM: 24 August 2023
    Accepted: 27 June 2023
    Revised: 09 June 2023
    Received: 06 April 2022
    Published in TOCHI Volume 31, Issue 1

    Author Tags

    1. Qualitative coding
    2. collaboration
    3. AI-assisted qualitative analysis
    4. coding quality
    5. AI-assisted human-to-human collaboration

    Cited By

    • (2024) Human-AI Collaborative Taxonomy Construction: A Case Study in Profession-Specific Writing Assistants. Proceedings of the Third Workshop on Intelligent and Interactive Writing Assistants, 51-57. DOI: 10.1145/3690712.3690726. Online publication date: 11-May-2024
    • (2024) The Role of Generative AI in Qualitative Research: GPT-4's Contributions to a Grounded Theory Analysis. Proceedings of the 2024 Symposium on Learning, Design and Technology, 17-25. DOI: 10.1145/3663433.3663456. Online publication date: 21-Jun-2024
    • (2024) Human-AI Collaboration in Thematic Analysis using ChatGPT: A User Study and Design Recommendations. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, 1-7. DOI: 10.1145/3613905.3650732. Online publication date: 11-May-2024
    • (2024) Help Me Reflect: Leveraging Self-Reflection Interface Nudges to Enhance Deliberativeness on Online Deliberation Platforms. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-32. DOI: 10.1145/3613904.3642530. Online publication date: 11-May-2024
    • (2024) CollabCoder: A Lower-barrier, Rigorous Workflow for Inductive Collaborative Qualitative Analysis with Large Language Models. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-29. DOI: 10.1145/3613904.3642002. Online publication date: 11-May-2024
    • (2024) Navigating the Unknown: How Healthcare Entrepreneurs Manage Uncertainty. Journal of Developmental Entrepreneurship 29:02. DOI: 10.1142/S1084946724500146. Online publication date: 18-Jul-2024
    • (2024) Enhancing qualitative research in higher education assessment through generative AI integration: A path toward meaningful insights and a cautionary tale. New Directions for Teaching and Learning. DOI: 10.1002/tl.20631. Online publication date: 4-Sep-2024
    • (2023) CollabCoder: A GPT-Powered WorkFlow for Collaborative Qualitative Analysis. Companion Publication of the 2023 Conference on Computer Supported Cooperative Work and Social Computing, 354-357. DOI: 10.1145/3584931.3607500. Online publication date: 14-Oct-2023
