ChatGPT failed its medical exam. Doctors can sleep soundly. At least in Poland

ChatGPT will not replace doctors for now. That, at least, is the conclusion of research conducted by scientists from the Nicolaus Copernicus University in Toruń. They had the AI answer questions from the “internal medicine” specialization exam. ChatGPT failed miserably.

Will AI take my job? The rapid development of generative artificial intelligence tools such as ChatGPT has led more and more people to ask themselves this question. In March last year, Goldman Sachs analysts published a report forecasting that in the coming years AI may eliminate up to 300 million jobs in the USA and Europe.

Equally disturbing conclusions come from last year’s McKinsey report, which shows that by 2030 it will be possible to automate tasks that account for up to 30 percent of the hours currently worked in the US economy. While it is quite easy to imagine a scenario in which artificial intelligence replaces us in repetitive activities, there are professions that seem resistant to the AI revolution. At least for now. Medicine, for example, is one such profession. But is that really so?

Doctors can sleep soundly. ChatGPT failed “internal medicine”

Scientists from the Collegium Medicum of the Nicolaus Copernicus University in Toruń decided to check how AI would cope with the medical exam. As they explain, one of the reasons for running such a test is that more and more patients use generative artificial intelligence tools to “self-diagnose”, which can be dangerous.

When talking about diseases with friends or patients, we often hear that someone checked their symptoms in the Google search engine and, based on this, made a diagnosis themselves.

– notes Dr. Szymon Suwała from the Department of Endocrinology and Diabetology of the Faculty of Medicine, Collegium Medicum of the Nicolaus Copernicus University. – These possibilities are even expanding, because you can talk about your diseases with ChatGPT or Gemini. Both chatbots are based on information from Google or other search engines, so don’t be fooled, their diagnosis will not be any better – he adds.

To prove this, the scientists “asked” ChatGPT (created by OpenAI, and currently the most popular language model that generates answers to users’ questions) to take the specialization exam in “internal medicine” (a common term for the branch of medicine dealing with diseases of internal organs). “We removed some of the questions that ChatGPT would not be able to answer for technical reasons, e.g. those containing images or analytical elements related to another question,” explains Dr. Suwała.

ChatGPT took the exam, which consists of a total of 120 questions, ten times, and it failed every time. In none of the attempts did the artificial intelligence even qualify for the oral part, i.e. it never scored 60% of the points. Its share of correct answers ranged between 47.5% and 53.3%. When analyzing the responses, the researchers noted that ChatGPT performed poorly on questions that hinged on a specific keyword, one that a doctor who understands what the question is about is able to spot and use. Another interesting observation was that the artificial intelligence sometimes knew the correct answer but ended up selecting a different, incorrect one.

Each time, in addition to a specific answer, we received a description of why ChatGPT chose that particular one. That’s when we noticed that it repeatedly marked the wrong answer and then described the decision-making process as if it knew another, correct answer. Why was this happening? We do not know.

– says Dr. Suwała.

ChatGPT has its successes in medicine

Interestingly, while in Poland ChatGPT failed miserably in internal medicine, there are countries where it managed to pass the medical exam. This was the case in the USA, where the artificial intelligence passed the USMLE (United States Medical Licensing Examination), a three-stage exam for future doctors who want to work in the USA.

As researchers from the Nicolaus Copernicus University remind us, in Poland a student graduating from medical school receives a medical diploma. To obtain the full right to practice, a doctor must pass the Medical Final Examination (LEK) during the postgraduate internship or the last year of studies. Passing it requires answering 56% of the 200 questions correctly. LEK seems to be the closest counterpart to the USMLE. Just as with the USMLE in the USA, ChatGPT coped with LEK in Poland, which was also checked by the researchers from Bydgoszcz. – These are simpler questions, they cover rather basic issues, because doctors answer them right after graduation – says Dr. Suwała.

– Artificial intelligence may pass the exam, but it will not be able to cure the patient – he explains.

Medical sciences, contrary to appearances, are not exact sciences. They have more in common with the humanities. It is not without reason that we talk about the art of medicine. Very often, when in contact with a patient, we see certain nuances that artificial intelligence may not notice. We often tell students that diseases do not read books. A patient may suffer from several different diseases, may have several other diseases, may be genetically different, and suddenly it turns out that the disease, which seemed simple, logical, and precisely described, develops completely differently in the patient. Will artificial intelligence be able to combine all the components? Perhaps in the future yes, but I don’t think it will be a matter of the next days, weeks, months or even years. I think it will be decades.

– emphasizes the scientist.

This, of course, does not change the fact that AI can already provide solid support for medics. Tools of this type have been in use for several years, for example to diagnose cancerous lesions, to diagnose fractures in orthopedics, to evaluate specimens in pathology, and to design drugs.

Source: Gazeta
