DOI: 10.1145/3581641.3584037
research-article
Open access

The Programmer’s Assistant: Conversational Interaction with a Large Language Model for Software Development

Published: 27 March 2023

    Abstract

    Large language models (LLMs) have recently been applied in software engineering to perform tasks such as translating code between programming languages, generating code from natural language, and autocompleting code as it is being written. When used within development tools, these systems typically treat each model invocation independently from all previous invocations, and only a specific limited functionality is exposed within the user interface. This approach to user interaction misses an opportunity for users to more deeply engage with the model by having the context of their previous interactions, as well as the context of their code, inform the model’s responses. We developed a prototype system – the Programmer’s Assistant – in order to explore the utility of conversational interactions grounded in code, as well as software engineers’ receptiveness to the idea of conversing with, rather than invoking, a code-fluent LLM. Through an evaluation with 42 participants with varied levels of programming experience, we found that our system was capable of conducting extended, multi-turn discussions, and that it enabled additional knowledge and capabilities beyond code generation to emerge from the LLM. Despite skeptical initial expectations for conversational programming assistance, participants were impressed by the breadth of the assistant’s capabilities, the quality of its responses, and its potential for improving their productivity. Our work demonstrates the unique potential of conversational interactions with LLMs for co-creative processes like software development.

    Published In

    IUI '23: Proceedings of the 28th International Conference on Intelligent User Interfaces
    March 2023
    972 pages
    ISBN: 9798400701061
    DOI: 10.1145/3581641
    This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. code-fluent large language models
    2. conversational interaction
    3. foundation models
    4. human-centered AI

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    IUI '23

    Acceptance Rates

    Overall Acceptance Rate 746 of 2,811 submissions, 27%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 4,674
    • Downloads (Last 6 weeks): 440
    Reflects downloads up to 14 Aug 2024

    Cited By

    • (2024) Designing Home Automation Routines Using an LLM-Based Chatbot. Designs 8:3 (43). DOI: 10.3390/designs8030043. Online publication date: 13-May-2024
    • (2024) A Comparative Analysis of Large Language Models for Code Documentation Generation. Proceedings of the 1st ACM International Conference on AI-Powered Software (65-73). DOI: 10.1145/3664646.3664765. Online publication date: 10-Jul-2024
    • (2024) From Human-to-Human to Human-to-Bot Conversations in Software Engineering. Proceedings of the 1st ACM International Conference on AI-Powered Software (38-44). DOI: 10.1145/3664646.3664761. Online publication date: 10-Jul-2024
    • (2024) Unveiling the Potential of a Conversational Agent in Developer Support: Insights from Mozilla’s PDF.js Project. Proceedings of the 1st ACM International Conference on AI-Powered Software (10-18). DOI: 10.1145/3664646.3664758. Online publication date: 10-Jul-2024
    • (2024) AI and the Future of Collaborative Work: Group Ideation with an LLM in a Virtual Canvas. Proceedings of the 3rd Annual Meeting of the Symposium on Human-Computer Interaction for Work (1-14). DOI: 10.1145/3663384.3663398. Online publication date: 25-Jun-2024
    • (2024) Significant Productivity Gains through Programming with Large Language Models. Proceedings of the ACM on Human-Computer Interaction 8:EICS (1-29). DOI: 10.1145/3661145. Online publication date: 17-Jun-2024
    • (2024) Evaluation of Code Generation for Simulating Participant Behavior in Experience Sampling Method by Iterative In-Context Learning of a Large Language Model. Proceedings of the ACM on Human-Computer Interaction 8:EICS (1-19). DOI: 10.1145/3661143. Online publication date: 17-Jun-2024
    • (2024) Beyond Code Generation: An Observational Study of ChatGPT Usage in Software Engineering Practice. Proceedings of the ACM on Software Engineering 1:FSE (1819-1840). DOI: 10.1145/3660788. Online publication date: 12-Jul-2024
    • (2024) Talk2Care: An LLM-based Voice Assistant for Communication between Healthcare Providers and Older Adults. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:2 (1-35). DOI: 10.1145/3659625. Online publication date: 15-May-2024
    • (2024) Exploring ChatGPT for identifying sexism in the communication of software developers. Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments (400-403). DOI: 10.1145/3652037.3663918. Online publication date: 26-Jun-2024
