
Optimizing Delegation in Collaborative Human-AI Hybrid Teams

Online AM: 09 August 2024

Abstract

When humans and autonomous systems operate together as what we call a hybrid team, we naturally wish to ensure the team operates successfully and effectively. We refer to team members as agents. Our proposed framework addresses hybrid teams in which, at any time, only one team member (the control agent) is authorized to act as control for the team. To determine the best selection of a control agent, we propose adding an AI manager (trained via Reinforcement Learning) which learns as an outside observer of the team. The manager learns a model of behavior linking observations of agent performance and of the environment in which the team operates, and from these observations selects the most desirable control agent. Relative to the current state of the art, our manager model is novel in its support for diverse agents and for decision-maker operation across multiple time steps and decisions. In our model, we restrict the manager's task by introducing a set of constraints. These constraints define acceptable team operation, so a violation occurs when the team enters an unacceptable condition that requires manager intervention. To keep added complexity and potential inefficiency to a minimum, the manager should minimize the number of constraint violations and the subsequent interventions they require. Our manager therefore optimizes its selection of authorized agents to boost overall team performance while minimizing the frequency of intervention. We demonstrate the manager's performance in a simulated driving scenario in which the hybrid team consists of a human driver and an autonomous driving system. Our experiments include interfering vehicles, requiring collision avoidance and proper speed control.
Our results indicate a positive impact of our manager, with some cases yielding team performance up to \(\approx 187\%\) of the best solo agent's performance.
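The manager's role described above can be illustrated with a small sketch. The snippet below is not the authors' implementation; it is a minimal tabular Q-learning stand-in for the RL manager, under invented assumptions: two agents (`human`, `autonomous`), two world conditions, hypothetical per-condition safety probabilities in `P_SAFE`, and an `INTERVENTION_COST` penalty modeling a constraint violation that forces the manager to intervene. The manager learns which agent to authorize in each condition so as to maximize team reward while avoiding violations.

```python
import random

random.seed(0)

# Illustrative setup (all names and numbers are assumptions, not from the paper):
AGENTS = ["human", "autonomous"]
CONDITIONS = ["clear", "dense_traffic"]
# Hypothetical probability that each agent completes a step without a
# constraint violation, per world condition.
P_SAFE = {("human", "clear"): 0.90, ("human", "dense_traffic"): 0.75,
          ("autonomous", "clear"): 0.95, ("autonomous", "dense_traffic"): 0.60}

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration
INTERVENTION_COST = 5.0              # penalty when a violation forces intervention

# Manager's Q-table: value of authorizing agent a in condition c.
Q = {(c, a): 0.0 for c in CONDITIONS for a in AGENTS}

def step(condition, agent):
    """Reward for one team step: +1 if safe, large penalty on violation."""
    if random.random() < P_SAFE[(agent, condition)]:
        return 1.0
    return -INTERVENTION_COST

def choose(condition):
    """Epsilon-greedy selection of the control agent."""
    if random.random() < EPS:
        return random.choice(AGENTS)
    return max(AGENTS, key=lambda a: Q[(condition, a)])

for episode in range(5000):
    c = random.choice(CONDITIONS)
    a = choose(c)
    r = step(c, a)
    c2 = random.choice(CONDITIONS)                  # next world condition
    best_next = max(Q[(c2, b)] for b in AGENTS)
    Q[(c, a)] += ALPHA * (r + GAMMA * best_next - Q[(c, a)])

# Learned delegation policy: which agent the manager authorizes per condition.
policy = {c: max(AGENTS, key=lambda a: Q[(c, a)]) for c in CONDITIONS}
print(policy)
```

With these made-up probabilities, the learned policy delegates to whichever agent violates constraints least often in each condition, mirroring the paper's goal of boosting team performance while minimizing interventions; the actual system operates on a richer driving state and agent models.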




Published In

ACM Transactions on Autonomous and Adaptive Systems Just Accepted
EISSN:1556-4703
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Accepted: 01 August 2024
Revised: 03 June 2024
Received: 08 February 2024


Author Tags

  1. hybrid team
  2. reinforcement learning
  3. collaboration
  4. delegation

Qualifiers

  • Research-article
