Comparison of the ability of three multimodal large language models to handle issues related to medication adherence

LIU Shuyan; MOU Jie; LI Kunyu; LIU Xiaoqing; LIU Yuxuan

doi:10.19577/j.1007-4406.2025.09.003

2025, 09, v.34 656-662

1. Tianjin University of Traditional Chinese Medicine

Email: lsrlyx2008@126.com

Release date: 2025-09-25

Publication date: 2025-09-25


Abstract:

AIM To compare the performance of three multimodal large language models [ChatGPT-4 (GPT-4), Claude 3, and Gemini 1.5] in answering questions related to medication adherence, and to give clinical pharmacists a basis for choosing artificial intelligence tools to support their work. METHODS Thirty standardized clinical questions related to medication adherence were designed (10 on missed-dose management, 10 on medication education, and 10 on drug interactions), and GPT-4, Claude 3, and Gemini 1.5 each answered them independently. In a single-blind design, five clinical pharmacists holding associate senior or higher professional titles scored the responses on three dimensions (accuracy: 0-5; practicality: 0-3; safety: 0-2). Total and category-specific scores were compared to assess how well the three models handled medication adherence-related questions. RESULTS GPT-4 achieved the maximum score (100 points) in every category (missed-dose management, medication education, and drug interactions) and was adept at complex decision-making and medication safety assessment. Claude 3 scored 72 for missed-dose management, 80 for medication education, and 72 for drug interactions. Gemini 1.5 scored 44, 60, and 29 in the respective categories, so its outputs require strict human review before use. CONCLUSION For questions related to medication adherence, GPT-4 performed best of the three multimodal large language models. Claude 3 performed relatively better in medication education than in the other two categories, while Gemini 1.5 performed poorly and its results require human review. Multimodal large language models can serve as efficient assistive tools for pharmacists in clinical work, but their use must be combined with human review to ensure patient safety, improve therapeutic efficacy, and enhance pharmacist efficiency.

Keywords: multimodal large language model; medication adherence; healthcare application; AI-assisted decision-making
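The scoring scheme in the abstract can be illustrated with a short sketch. The rubric maxima (5, 3, 2) and the 10 questions per category come from the abstract, and together they give the 100-point category maximum that it reports. How the five pharmacists' ratings were combined is not stated, so averaging across raters is an assumption here, and the function names and sample data are hypothetical.

```python
from statistics import mean

# Rubric maxima taken from the abstract: accuracy 0-5, practicality 0-3, safety 0-2.
RUBRIC_MAX = {"accuracy": 5, "practicality": 3, "safety": 2}
QUESTIONS_PER_CATEGORY = 10  # 10 questions in each of the three categories


def question_score(ratings):
    """Average each rubric dimension over raters, then sum the dimensions.

    `ratings` is a list of dicts, one per rater. Averaging over raters is an
    assumption; the abstract does not say how the five ratings were combined.
    """
    for r in ratings:
        for dim, top in RUBRIC_MAX.items():
            if not 0 <= r[dim] <= top:
                raise ValueError(f"{dim} rating {r[dim]} outside 0-{top}")
    return sum(mean(r[dim] for r in ratings) for dim in RUBRIC_MAX)


def category_score(per_question_ratings):
    """Sum the per-question scores for one category (maximum 100)."""
    return sum(question_score(q) for q in per_question_ratings)


# Hypothetical data: five raters give full marks on all ten questions.
full_marks = [[dict(RUBRIC_MAX) for _ in range(5)]
              for _ in range(QUESTIONS_PER_CATEGORY)]
max_score = category_score(full_marks)
print(max_score)  # 100
```

Under this reading, each question is worth at most 10 points, so a category score such as 72 corresponds to an average of 7.2 points per question.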



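Basic information:

CLC number: R95; TP18

Citation:

LIU Shuyan, MOU Jie, LI Kunyu, et al. Comparison of the ability of three multimodal large language models to handle issues related to medication adherence[J]. Chinese Journal of Clinical Pharmacy, 2025, 34(09): 656-662. DOI: 10.19577/j.1007-4406.2025.09.003.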

