Evaluating the performance of AI-based large language models in radiation oncology

In a new study published in the journal AI in Precision Oncology, Nikhil Thaker of Capital Health and Bayta Systems and co-authors evaluated the performance of several large language models (LLMs), including OpenAI’s GPT-3.5-turbo, GPT-4, and GPT-4-turbo, Meta’s Llama-2 models, and Google’s PaLM-2-text-bison. The LLMs were given a 300-question exam, and their answers were compared with the performance of radiation oncology trainees.
