ChatGPT vs. web search for patient questions: what does ChatGPT do better?
Document Type
Article
Abstract
PURPOSE: Chat generative pre-trained transformer (ChatGPT) has the potential to significantly change how patients acquire medical information online. Here, we characterize the readability and appropriateness of ChatGPT responses to a range of patient questions compared with results from traditional web searches.

METHODS: Patient questions related to the published Clinical Practice Guidelines of the American Academy of Otolaryngology-Head and Neck Surgery were sourced from existing online posts. Questions were categorized using a modified Rothwell classification system into (1) fact, (2) policy, and (3) diagnosis and recommendations, then queried using ChatGPT and traditional web search. All results were evaluated for readability (Flesch Reading Ease and Flesch-Kincaid Grade Level) and understandability (Patient Education Materials Assessment Tool, PEMAT). Accuracy was assessed by two blinded clinical evaluators using a three-point ordinal scale.

RESULTS: 54 questions were organized into fact (37.0%), policy (37.0%), and diagnosis (25.8%). The average readability of ChatGPT responses was lower than that of traditional web search (FRE: 42.3 ± 13.1 vs. 55.6 ± 10.5, p < 0.001), while PEMAT understandability was equivalent (93.8% vs. 93.5%, p = 0.17). ChatGPT scored higher than web search for questions in the 'Diagnosis' category (p < 0.01); there was no difference for questions categorized as 'Fact' (p = 0.15) or 'Policy' (p = 0.22). Additional prompting improved ChatGPT response readability (FRE 55.6 ± 13.6, p < 0.01).

CONCLUSIONS: ChatGPT outperforms web search in answering patient questions related to symptom-based diagnoses and is equivalent in providing medical facts and established policy. Appropriate prompting can further improve readability while maintaining accuracy. Further patient education is needed to relay the benefits and limitations of this technology as a source of medical information.
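The Flesch Reading Ease score used above is a standard formula over word, sentence, and syllable counts: FRE = 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words). As a minimal sketch (not the study's implementation), it can be estimated in Python with a naive vowel-group syllable counter; real readability tools use more careful tokenization and syllable rules:

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count vowel groups, drop one trailing silent 'e'."""
    word = word.lower()
    groups = re.findall(r"[aeiouy]+", word)
    n = len(groups)
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)


def flesch_reading_ease(text: str) -> float:
    """FRE = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).

    Higher scores mean easier text; ~40-50 reads at college level,
    ~60-70 at roughly an 8th-9th grade level.
    """
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(len(words), 1)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)
```

Under this approximation, a lower FRE for ChatGPT responses (42.3) than web results (55.6) indicates the ChatGPT text uses longer sentences and/or more polysyllabic words.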
Medical Subject Headings
Humans; Comprehension; Health Literacy; Internet; Patient Education as Topic (methods); Artificial Intelligence
Publication Date
June 1, 2024
Publication Title
European archives of oto-rhino-laryngology : official journal of the European Federation of Oto-Rhino-Laryngological Societies (EUFOS) : affiliated with the German Society for Oto-Rhino-Laryngology - Head and Neck Surgery
E-ISSN
1434-4726
Volume
281
Issue
6
First Page
3219
Last Page
3225
PubMed ID
38416195
Digital Object Identifier (DOI)
10.1007/s00405-024-08524-0
Recommended Citation
Shen, Sarek A.; Perez-Heydrich, Carlos A.; Xie, Deborah X.; and Nellis, Jason C., "ChatGPT vs. web search for patient questions: what does ChatGPT do better?" (2024). ENT and Skull Base Surgery. 29.
https://scholar.barrowneuro.org/ent-and-skull-base-surgery/29