Artificial intelligence (AI) may answer patients' frequently asked questions (FAQs) better than orthodontists, according to a study published in March in the Journal of the World Federation of Orthodontists.
Furthermore, ChatGPT-generated responses may differ from those of orthodontists and may be more comprehensive and better structured, the authors wrote.
"The conversational AI (ChatGPT-4) may outperform orthodontists in answering orthodontic FAQs, even in a non-English context," wrote the authors, led by Xinlianyi Zhou of the Sichuan University West China Hospital of Stomatology (J World Fed Orthod, March 25, 2025).
To evaluate the performance of conversational AI in answering orthodontic patient FAQs, researchers selected 30 representative questions covering the entire treatment process: 14 from the first visit, eight from the treatment plan meeting, and eight from after treatment had started.
Each FAQ was answered by ChatGPT-4 and two orthodontists with an average of four years of orthodontic experience. Then, four senior experts with an average of nine years of experience in orthodontic practice independently ranked the three responses for each question based on quality, they wrote.
Across the 120 rankings (30 questions, each ranked by four experts), ChatGPT ranked first in 61 cases (50.8%), second in 35 (29.2%), and third in 24 (20.0%), with an average rank of 1.69 ± 0.79, significantly better than orthodontist A (2.23 ± 0.79, p < 0.001) and orthodontist B (2.08 ± 0.79, p < 0.05). The Spearman correlation coefficient between ChatGPT's average ranking and inter-rater agreement was 0.69 (p < 0.001), indicating a strong positive correlation. These results suggest that ChatGPT performs best on questions with widely accepted answers among orthodontic professionals and struggles with more debated topics, they wrote.
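The reported percentages and average rank for ChatGPT can be rechecked from the rank counts alone. The short Python sketch below is not the authors' analysis; it rebuilds the list of ranks from the published counts and assumes the "± 0.79" figure is a sample standard deviation, which the article does not state. The variable names are illustrative.

```python
from statistics import mean, stdev

# Counts reported in the article: how often ChatGPT was ranked 1st, 2nd, and 3rd.
rank_counts = {1: 61, 2: 35, 3: 24}

# Expand the counts into one rank per expert judgment.
ranks = [rank for rank, count in rank_counts.items() for _ in range(count)]

total = len(ranks)        # number of expert judgments
mean_rank = mean(ranks)   # average rank (lower is better)
sd_rank = stdev(ranks)    # sample standard deviation (assumed, not confirmed by the source)

# Share of judgments at each rank, as a fraction of the total.
shares = {rank: count / total for rank, count in rank_counts.items()}

print(f"judgments: {total}")
print(f"mean rank: {mean_rank:.2f}, sd: {sd_rank:.2f}")
for rank, share in shares.items():
    print(f"rank {rank}: {share:.1%}")
```

Under that assumption, the counts are consistent with the figures quoted above. The same check cannot be run for the two orthodontists because the article does not give their rank counts.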
However, ChatGPT's responses to questions about clear aligner treatment had suboptimal accuracy, highlighting its limitations in providing up-to-date and precise information, the authors added.
"Users should keep in mind that AI serves only as a support tool and should never be overly relied upon," Zhou and colleagues wrote.