Large language models: a new frontier in paediatric cataract patient education.

Authors
  • Dihan, Qais1, 2
  • Chauhan, Muhammad Z2
  • Eleiwa, Taher K3
  • Brown, Andrew D4
  • Hassan, Amr K5
  • Khodeiry, Mohamed M6
  • Elsheikh, Reem H2
  • Oke, Isdin7
  • Nihalani, Bharti R7
  • VanderVeen, Deborah K7
  • Sallam, Ahmed B2
  • Elhusseiny, Abdelrahman M8, 7
  • 1 Rosalind Franklin University of Medicine and Science Chicago Medical School, North Chicago, Illinois, USA.
  • 2 Department of Ophthalmology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA.
  • 3 Department of Ophthalmology, Benha University, Benha, Egypt.
  • 4 University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA.
  • 5 Department of Ophthalmology, South Valley University, Qena, Egypt.
  • 6 Department of Ophthalmology, University of Kentucky, Lexington, Kentucky, USA.
  • 7 Department of Ophthalmology, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts, USA.
  • 8 Department of Ophthalmology, University of Arkansas for Medical Sciences, Little Rock, Arkansas, USA [email protected].
Type
Published Article
Journal
British Journal of Ophthalmology
Publisher
BMJ
Publication Date
Sep 20, 2024
Volume
108
Issue
10
Pages
1470–1476
Identifiers
DOI: 10.1136/bjo-2024-325252
PMID: 39174290
Source
Medline
Language
English
License
Unknown

Abstract

This was a cross-sectional comparative study. We evaluated the ability of three large language models (LLMs) (ChatGPT-3.5, ChatGPT-4, and Google Bard) to generate novel patient education materials (PEMs) and to improve the readability of existing PEMs on paediatric cataract. We compared the LLMs' responses to three prompts. Prompt A requested a handout on paediatric cataract that was 'easily understandable by an average American'. Prompt B modified prompt A and requested that the handout be written at a 'sixth-grade reading level, using the Simple Measure of Gobbledygook (SMOG) readability formula'. Prompt C asked the models to rewrite existing PEMs on paediatric cataract 'to a sixth-grade reading level using the SMOG readability formula'. Responses were compared on quality (DISCERN; 1 (low quality) to 5 (high quality)), understandability and actionability (Patient Education Materials Assessment Tool; ≥70%: understandable, ≥70%: actionable), accuracy (Likert misinformation scale; 1 (no misinformation) to 5 (high misinformation)) and readability (SMOG and Flesch-Kincaid Grade Level (FKGL); grade level <7: highly readable).

All LLM-generated responses were of high quality (median DISCERN ≥4), understandable (≥70%), and accurate (Likert=1), but none met the actionability threshold (<70%). ChatGPT-3.5 and ChatGPT-4 prompt B responses were more readable than prompt A responses (p<0.001). ChatGPT-4 generated more readable responses (lower SMOG and FKGL scores; 5.59±0.5 and 4.31±0.7, respectively) than the other two LLMs (p<0.001) and consistently rewrote existing PEMs to or below the specified sixth-grade reading level (SMOG: 5.14±0.3).

LLMs, particularly ChatGPT-4, proved valuable in generating high-quality, readable, accurate PEMs and in improving the readability of existing materials on paediatric cataract. © Author(s) (or their employer(s)) 2024. No commercial re-use. See rights and permissions. Published by BMJ.
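
For context on the readability formulas cited in the abstract, below is a minimal Python sketch of how SMOG and FKGL grade levels are computed. The syllable counter is a rough vowel-group heuristic rather than a dictionary-based count, and the sample handout text is hypothetical, not taken from the study.

import re

def count_syllables(word):
    # Rough heuristic: count groups of vowels; real tools use dictionaries or better rules.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def readability(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    polysyllables = sum(1 for s in syllables if s >= 3)

    # SMOG grade (McLaughlin): 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291
    smog = 1.0430 * (polysyllables * 30 / len(sentences)) ** 0.5 + 3.1291

    # Flesch-Kincaid Grade Level: 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    fkgl = 0.39 * (len(words) / len(sentences)) + 11.8 * (sum(syllables) / len(words)) - 15.59

    return {"SMOG": round(smog, 2), "FKGL": round(fkgl, 2)}

# Hypothetical snippet of a patient handout; grade levels below 7 would count as "highly readable".
sample = ("A cataract is a cloudy spot in the lens of the eye. "
          "It can make it hard for a child to see. "
          "An eye doctor can check your child and talk with you about care.")
print(readability(sample))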
