Large language models and the perils of their hallucinations

R Azamfirei, SR Kudchadkar, J Fackler - Critical Care, 2023 - Springer
We read with great interest the paper by Salvagno et al. [1]. As they masterfully stated, “ChatGPT work should not be used as a replacement for human judgment, and the output should always be reviewed by experts before being used in any critical decision-making or application.” As is often the case in critical care, new technologies and apparent breakthroughs are touted as game-changers. However, the truth usually emerges the next day when the confetti has settled, and we have to clean up the sticky mess left by gallons of printed ink mixed with our hopeful wishes. Salvagno et al. present a ChatGPT-generated summary of three studies. As they noted, the summary was believable, albeit generic and sparse in the details. The glaring problem is that it is completely fabricated. ChatGPT cannot access the internet, and its training dataset stops in September 2021; it has no reference to any studies published in 2023 [2]. In fact, one of the trials included in the summary, Belohlavek et al. [3], showed no improvement in functional neurological outcomes, contradicting ChatGPT’s summary.