DOI: 10.1145/3491102.3501941
research-article
Open access

How to Guide Task-oriented Chatbot Users, and When: A Mixed-methods Study of Combinations of Chatbot Guidance Types and Timings

Published: 29 April 2022
    Abstract

    The popularity of task-oriented chatbots is constantly growing, but smooth conversational progress with them remains profoundly challenging. In recent years, researchers have argued that chatbot systems should include guidance for users on how to converse with them. Nevertheless, empirical evidence about what to place in such guidance, and when to deliver it, has been lacking. Using a mixed-methods approach that integrates results from a between-subjects experiment and a reflection session, this paper compares the effectiveness of eight combinations of two guidance types (example-based and rule-based) at four guidance timings (service-onboarding, task-intro, after-failure, and upon-request), as measured by users’ task performance, improvement on subsequent tasks, and subjective experience. It establishes that each guidance type and timing has particular strengths and weaknesses, and thus that each type/timing combination has a unique impact on performance metrics, learning outcomes, and user experience. On that basis, it presents guidance-design recommendations for future task-oriented chatbots.





      Published In

      CHI '22: Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems
      April 2022
      10459 pages
      ISBN:9781450391573
      DOI:10.1145/3491102
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 29 April 2022


      Badges

      • Honorable Mention

      Author Tags

      1. chatbot
      2. guidance
      3. lab study
      4. non-progress

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      CHI '22: CHI Conference on Human Factors in Computing Systems
      April 29 - May 5, 2022
      New Orleans, LA, USA

      Acceptance Rates

      Overall Acceptance Rate 6,199 of 26,314 submissions, 24%


      Article Metrics

      • Downloads (Last 12 months): 1,227
      • Downloads (Last 6 weeks): 125
      Reflects downloads up to 14 Aug 2024


      Cited By

      • (2024) HASI: A Model for Human-Agent Speech Interaction. Proceedings of the 6th ACM Conference on Conversational User Interfaces, 1-8. DOI: 10.1145/3640794.3665885. Online publication date: 8-Jul-2024
      • (2024) Improving Grading Fairness and Transparency with Decentralized Collaborative Peer Assessment. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1, 1-24. DOI: 10.1145/3637350. Online publication date: 26-Apr-2024
      • (2024) Listening to the Voices: Describing Ethical Caveats of Conversational User Interfaces According to Experts and Frequent Users. Proceedings of the CHI Conference on Human Factors in Computing Systems, 1-18. DOI: 10.1145/3613904.3642542. Online publication date: 11-May-2024
      • (2024) Conversation-based hybrid UI for the repertory grid technique. International Journal of Human-Computer Studies 184:C. DOI: 10.1016/j.ijhcs.2024.103227. Online publication date: 17-Apr-2024
      • (2023) XAIR: A Framework of Explainable AI in Augmented Reality. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-30. DOI: 10.1145/3544548.3581500. Online publication date: 19-Apr-2023
      • (2023) “Listen to Music, Listen to Yourself”: Design of a Conversational Agent to Support Self-Awareness While Listening to Music. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-19. DOI: 10.1145/3544548.3581427. Online publication date: 19-Apr-2023
      • (2023) Collaborating with a Text-Based Chatbot: An Exploration of Real-World Collaboration Strategies Enacted during Human-Chatbot Interactions. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 1-17. DOI: 10.1145/3544548.3580995. Online publication date: 19-Apr-2023
      • (2023) Predicting and Exploring Abandonment Signals in a Banking Task-Oriented Chatbot Service. International Journal of Human–Computer Interaction, 1-15. DOI: 10.1080/10447318.2023.2282220. Online publication date: 20-Nov-2023
