The problem of scientific references in medicine and health in AI-powered language model Chatbots

Prof. Sérgio Pinheiro

Since the public release of AI-powered language model Chatbots at the end of 2022, several questions have been raised regarding their use as a source of information in health and medicine. Health and medical information should always be disseminated responsibly and grounded in fundamental precepts such as accuracy (that is, the level of scientific evidence), clarity, relevance, accessibility, and respect for the general public. In this context, AI-powered Chatbots have significant advantages as sources of health and medical information, but their main problem, to date, is the level of scientific evidence behind the information they provide.

Among the significant advantages of AI-powered Chatbots in generating health information are speed, access to large data sets, personalization of information, and ease of communication with the user. In a way, these Chatbots also universalize access to medical and health information, given that individuals of different races, creeds, social groups, ethnicities, and socioeconomic levels can receive “accurate”, clear, and personalized responses to their questions in health and medicine.

Although the leading Chatbots highlight accuracy as one of the main advantages of the health and medical information generated by AI-powered language models, many problems arise when one tries to extract their bibliographic references. When the following question was put to four different Chatbots via poe.com (Sage, Claude, ChatGPT, and Dragonfly), “What bibliographic references do you use to prepare an informative text in the field of health and medicine?”, the answers included: peer-reviewed original and review articles, books, research reports, guidelines and recommendations from reputable medical and health organizations, and medical databases such as PubMed or Medline.

These are extremely relevant and appropriate answers. However, when asked a specific question about a rare medical condition in children and adolescents, “What bibliographic references did you use to prepare the text on nephrotic syndrome in children?”, some of the Chatbots, as in the case of ChatGPT and Dragonfly, cited outdated scientific articles and recommendations from international societies. Sage went further: “As an AI language model, I can provide general information and insights on a wide range of topics, including medical conditions. However, it is important to note that I do not substitute for medical advice or professional medical opinion. Any information provided by me should be taken as general information.” Interestingly, Claude responded that “Actually, I did not use any external bibliographic reference to prepare this text” and, when questioned about the reliability of the information it had generated on the specific topic, conceded: “You are right, I should not be considered a reliable source for formal research on medical topics.”
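Exchanges like this one are straightforward to reproduce with any Chatbot that exposes a programmatic interface. The sketch below shows the two-step exchange in Python, assuming the 2023-era openai client (openai < 1.0) and an API key in the OPENAI_API_KEY environment variable; the model name is illustrative only, and this is not how the poe.com comparison above was actually performed.

```python
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

MODEL = "gpt-3.5-turbo"  # illustrative model name, not from the comparison above

# Step 1: ask the model to generate the informative text.
history = [{
    "role": "user",
    "content": "Write an informative text on nephrotic syndrome in children.",
}]
first = openai.ChatCompletion.create(model=MODEL, messages=history)
history.append({
    "role": "assistant",
    "content": first["choices"][0]["message"]["content"],
})

# Step 2: ask which bibliographic references were used for that text.
history.append({
    "role": "user",
    "content": ("What bibliographic references did you use to prepare "
                "the text on nephrotic syndrome in children?"),
})
second = openai.ChatCompletion.create(model=MODEL, messages=history)
print(second["choices"][0]["message"]["content"])
```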

The level of evidence, a fundamental precept for the responsible dissemination of information in health and medicine by AI-powered language model Chatbots, is in question here.
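While that remains the case, one practical safeguard is to check any reference a Chatbot cites against a bibliographic database before trusting it. The sketch below, a minimal example rather than a complete verification tool, queries the public NCBI E-utilities esearch endpoint for PubMed records matching a cited title; the example title is hypothetical, not one cited by the Chatbots above.

```python
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_ids_for_title(title):
    """Return the PubMed IDs of records whose title matches `title`."""
    params = {
        "db": "pubmed",
        "term": f'"{title}"[Title]',  # restrict the match to the title field
        "retmode": "json",
    }
    reply = requests.get(ESEARCH, params=params, timeout=30)
    reply.raise_for_status()
    return reply.json()["esearchresult"]["idlist"]

# A cited article that returns no IDs deserves extra scrutiny: it may be
# outdated, mistitled, or entirely fabricated. The title below is a
# hypothetical example.
ids = pubmed_ids_for_title("Nephrotic syndrome in children")
print(ids if ids else "No matching PubMed record found")
```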

Finally, it is essential to use sources that are reliable, impartial, and evidence-based. It is equally crucial to critically evaluate the sources cited by Chatbots to ensure that they are relevant, up to date, and accurate. For now, the health and medical information generated by Chatbots may not be as accurate as science requires and should not be used as a reference for technical consultations. The hope is that this will change in the short term.



Copyright © 2023 AI-Talks.org
