<?xml version="1.0"?>
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/static/PubMed.dtd">
<ArticleSet>
  <Article>
    <Journal>
      <PublisherName>Barw</PublisherName>
      <JournalTitle>Barw Medical Journal</JournalTitle>
      <Issn>2960-1959</Issn>
      <Volume>3</Volume>
      <Issue>3</Issue>
      <PubDate PubStatus="epublish">
        <Year>2025</Year>
        <Month>04</Month>
        <Day>30</Day>
      </PubDate>
    </Journal>
    <ArticleTitle>Exploring Large Language Models Integration in the Histopathologic Diagnosis of Skin Diseases: A Comparative Study</ArticleTitle>
    <FirstPage>6</FirstPage>
    <LastPage>12</LastPage>
    <ELocationID EIdType="doi">10.58742/bmj.v3i3.180</ELocationID>
    <Language>eng</Language>
    <AuthorList>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Talar Sabir Ahmed</LastName>
        <Affiliation>Hiwa Cancer Hospital, Shorsh Street, Sulaymaniyah, Iraq. talar.ahmed@gmail.com</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Rawa M. Ali</LastName>
        <Affiliation>Hospital for Treatment of Victims of Chemical Weapons, Mawlawy Street, Halabja, Iraq. rawa.ali@gmail.com</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Ari M. Abdullah</LastName>
        <Affiliation>Department of Pathology, Sulaymaniyah Teaching Hospital, Sulaymaniyah, Iraq. ariabdullah1978@gmail.com</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Hadeel A. Yasseen</LastName>
        <Affiliation>College of Medicine, University of Sulaimani, Madam Mitterrand Street, Sulaymaniyah, Iraq. hadeel.yasseen@gmail.com</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Ronak S. Ahmed</LastName>
        <Affiliation>Shahid Nabaz Dermatology Teaching Center for Treating Skin Diseases, Sulaymaniyah Directorate of Health, Sulaymaniyah, Iraq. ronakahmed76@gmail.com</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Ameer M. Salih</LastName>
        <Affiliation>Scientific Affairs Department, Smart Health Tower, Madam Mitterrand Street, Sulaymaniyah, Iraq. ameer.salih@univsul.edu.iq</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Dilan S. Hiwa</LastName>
        <Affiliation>Scientific Affairs Department, Smart Health Tower, Madam Mitterrand Street, Sulaymaniyah, Iraq. dilan.sarmad.hiwa@gmail.com</Affiliation>
      </Author>
      <Author>
        <FirstName EmptyYN="Y"/>
        <LastName>Shvan H. Mohammed</LastName>
        <Affiliation>Xzmat polyclinic, Rizgari, Kalar, Sulaymaniyah, Iraq. shvanh80@gmail.com</Affiliation>
      </Author>
    </AuthorList>
    <History>
      <PubDate PubStatus="received">
        <Year>2025</Year>
        <Month>04</Month>
        <Day>05</Day>
      </PubDate>
    </History>
    <Abstract>Introduction

How large language models (LLMs) will be integrated into pathology is not yet fully understood. This study examines the accuracy, benefits, biases, and limitations of LLMs in the histopathologic diagnosis of skin diseases.

Methods

A pathologist compiled 60 real histopathology case scenarios of skin conditions from a hospital database. Two other pathologists reviewed each patient&#x2019;s demographics, clinical details, histopathology findings, and original diagnosis. These cases were presented to ChatGPT-3.5, Gemini, and an external pathologist. Each response was classified as complete agreement, partial agreement, or no agreement with the original pathologist&#x2019;s diagnosis.

Results

ChatGPT-3.5 had 29 (48.4%) complete agreements, 14 (23.3%) partial agreements, and 17 (28.3%) no agreements. Gemini had 20 (33%), 9 (15%), and 31 (52%) complete, partial, and no agreements, respectively. The external pathologist had 36 (60%), 17 (28%), and 7 (12%) complete, partial, and no agreements, respectively, relative to the original diagnosis. Diagnostic agreement differed significantly between the LLMs (ChatGPT-3.5, Gemini) and the external pathologist (P &lt; 0.001).

Conclusion

In certain instances, ChatGPT-3.5 and Gemini may provide an accurate diagnosis of skin pathologies when presented with relevant patient history and descriptions of histopathological reports. However, their overall performance is insufficient for reliable use in real-life clinical settings.
</Abstract>
  </Article>
</ArticleSet>
