Study: LLMs Identify Mental Health Crises with Accuracy Comparable to Clinicians

OpenAI’s GPT-4 was able to identify suicidal ideation with similar accuracy to clinicians, but in a much shorter amount of time, a new study from Brightside Health found.

Large language models (LLMs) can identify and predict mental health crises with accuracy comparable to that of clinicians, but in a significantly shorter amount of time, a new study shows. The findings point to AI's potential to support clinicians at a time when there is a severe shortage of behavioral health providers.

The peer-reviewed study was conducted by Brightside Health and published in JMIR Mental Health. The San Francisco-based mental health company provides virtual care to patients with mild to severe clinical depression, anxiety and other mood disorders. Its platform offers psychiatry, therapy and a crisis care program for those at elevated risk of suicide. The company uses AI in several ways, including its PrecisionRx tool, which analyzes patient data to tailor treatment to the individual.

The study used deidentified data from intake questions on Brightside’s platform for 140 patients who indicated suicidal ideation at intake and 120 patients who later, during treatment, indicated suicidal ideation with a plan to act. Data were also pulled from 200 patients who never indicated suicidal ideation, for a total of 460 samples. Suicidal ideation refers to thinking about, considering or planning suicide.

Six Brightside clinicians were shown a limited view of each patient’s data: information on past suicide attempts and the patient’s written response to a question about what they were feeling or experiencing. The clinicians were then asked “a simple yes or no question regarding their prediction of endorsement of [suicidal ideation] with a plan, along with their confidence level about the prediction.” OpenAI’s GPT-4 was given the same task.
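For readers who want a concrete picture of the setup, here is a minimal sketch of how such a yes/no prediction might be posed to GPT-4 through OpenAI’s Python client. The article does not publish the study’s actual prompt or pipeline, so the prompt wording, the predict_si_with_plan helper and the example inputs below are illustrative assumptions, not Brightside’s method.

```python
# Illustrative sketch only: the study's actual prompt and pipeline are not
# described in the article; predict_si_with_plan is a hypothetical helper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def predict_si_with_plan(past_attempts: str, written_response: str) -> str:
    """Ask GPT-4 the same yes/no question posed to the clinicians."""
    prompt = (
        "You are assisting with a mental health triage exercise.\n"
        f"Past suicide attempts: {past_attempts}\n"
        f"Patient's written response about what they are feeling or "
        f"experiencing: {written_response}\n"
        "Predict whether this patient will endorse suicidal ideation with "
        "a plan. Answer 'yes' or 'no', then state your confidence (0-100%)."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for a classification-style task
    )
    return response.choices[0].message.content

# Example usage with fabricated, non-patient text:
print(predict_si_with_plan(
    past_attempts="none reported",
    written_response="I feel overwhelmed and hopeless lately.",
))
```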

The researchers found that the clinicians predicted suicidal ideation with a plan with accuracy ranging from 55.2% to 67%. GPT-4’s accuracy was 61.5%, within the clinicians’ range.

In addition, GPT-4 returned its evaluations for all 460 samples in less than 10 minutes, whereas the average clinician took more than three hours.

“Our research supports the notion that generative AI holds promise for identifying patients at risk of suicide. … With a shortage of behavioral health clinicians and burnout rates high, having tools to help clinicians triage and identify which patients need the timeliest care is extremely important, especially for higher acuity and severity patients,” said Dr. Mimi Winsberg, co-founder and chief medical officer at Brightside Health.

Suicide is currently the second leading cause of death among adults ages 18 to 45, and 12.3 million Americans ages 18 or older reported having thoughts of suicide in 2021. However, suicide can be extremely difficult to predict, according to Winsberg. In addition, 122 million Americans are living in a mental health professional shortage area, making it difficult to receive care. With AI proving itself useful in other areas of healthcare, Brightside chose to conduct the study to see if it could have a similar impact on mental health, Winsberg said.

While the study shows AI’s promise in moving the needle on these stats, Winsberg noted that the technology is meant to be used in a controlled manner with human oversight.

“This research highlights the potential of LLMs for efficient triage and clinical decision support in mental health and how technologies such as these can help alleviate clinician time shortage and empower them with risk assessment tools, which is especially crucial for patients at risk of suicide. … While generative AI has the potential to enhance clinical decision-making and patient care within the mental health sector, it will be important to use generative AI to support clinicians, not replace them, using a collaborative approach between AI and human expertise,” she said.

Picture: Yuichiro Chino, Getty Images