By Dr. Guillermo Cecchi
More than 63 million psychiatric interviews are conducted every year, but none of them is analyzed in a quantitative, codified manner. Surprising? Not really. Doctors don’t have time to find patterns in the pages of notes they keep per patient. Those pages, though, hold “big data” on psychiatric issues that analytics can help unlock, potentially predicting episodes before they occur.
Now, after a multi-year study and accompanying development of text analysis algorithms, we may finally be able to quantify patterns in these interviews, and help doctors treat patients suffering from post-traumatic stress disorder and other conditions.
The most recent effort to match machine learning to clinical text started with work by my colleague Cheryl Corcoran at Columbia University, whom I worked with on speech graphs. She had been studying speech patterns to predict psychotic episodes. The patients had known pre-psychotic symptoms, but no known outbreaks. They participated in one interview and were observed for another two-and-a-half years. The belief was that speech patterns could identify those who were pre-psychotic, regardless of how apparent – or not – the symptoms were.
The unstructured data from the interviews was just too large to sort and codify. No patterns were emerging. But maybe a smart machine, and data from a past ecstasy study, could help.
Our current collaboration on pre-psychotic speech analysis needed baseline data. We found it in the form of an ecstasy study by another Columbia University colleague, Gillinder Bedi. While at the University of Chicago, she compared interviews of people under the influence of the drug ecstasy with interviews of people taking a placebo. Although it is a drug of abuse, ecstasy also has well-established pro-social effects, and is being studied for potential psychotherapeutic use. Her study administered ecstasy to regular users for four interviews under strict monitoring protocols. Its effect on a person’s emotional state, such as increased empathy, made for effective comparison to those not under the influence – and the algorithms we wrote with the help of my colleagues Facundo Carrillo and Diego Slezak at the University of Buenos Aires uncovered even more.
We found, for the first time in the known literature, that ecstasy users’ speech fluidity increases and that they use fewer catch phrases. This knowledge helped establish a baseline against which to compare patients at risk of a psychotic episode, as the coherence of their discourse (how semantically similar consecutive phrases are) is a key symptom. Our initial results were presented at the annual meeting of the American College of Neuropsychopharmacology in 2013.
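Coherence, read as the semantic similarity of consecutive phrases, can be illustrated with a minimal sketch. It is not the algorithm used in the study: as an assumption, it stands in plain word-count vectors and cosine similarity for a real semantic model, and the function names (`tokenize`, `cosine`, `coherence`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def coherence(sentences):
    """Mean similarity between each sentence and the one that follows it.

    Assumption: word overlap is a crude proxy for semantic similarity;
    a real system would use a learned semantic representation instead.
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    pairs = list(zip(vectors, vectors[1:]))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)
```

Under this proxy, a transcript whose consecutive sentences share no words scores 0, and one that repeats the same sentence scores close to 1. Lower scores across an interview would correspond to the less coherent discourse described above.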
By using real-time machine learning to find word and phrase patterns during interviews, a psychiatrist could gain a much better view of a patient’s true state of mind.
Combining the qualitative with the quantitative
Psychiatry is full of historical literature that characterizes patient conditions. Doctors must also fill out interview scales with questions such as “how anxious is the patient, on a scale from 1 to 5?” These resources are effective, but they are purely qualitative. For example, we did not find any study attempting to predict schizophrenia, in part because practitioners cannot simply and quickly compare notes. Until now, the ability to use computers to match vast amounts of this unstructured data didn’t exist. As a result, there was a lack of objective criteria that could be agreed upon across practitioners and institutions.
Our machine learning algorithms can accurately read, analyze, and find those patterns. The next step is to give doctors a way to do this analysis in real time, by “hearing,” transcribing, and analyzing an interview as it happens – all via a mobile device.
The prototype developed by our software lab in India works through a mobile cloud platform via a smartphone interface designed for health care workers. The device acquires (“hears”) the speech of the patient being interviewed, and sends it to a server managed by the health care facility for transcription and de-identification (for patient confidentiality). The output is then analyzed by a separate IBM cloud application, which returns two results to the device: a comparison of the patient’s speech against that of previously diagnosed patients, and the density of “loops” in the patient’s speech relative to the normal population.
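To give a rough idea of what a “loop” measure might look like, here is a minimal sketch built on a word graph, where each distinct word is a node and each pair of consecutive words is a directed edge. This is an illustration, not the application’s actual analysis: the post does not define loop density, so dividing the number of short cycles by the number of distinct words is an assumption, as are the function names (`word_graph`, `count_loops`, `loop_density`).

```python
import re


def word_graph(text):
    """Directed graph: one node per distinct word, one edge per consecutive word pair."""
    words = re.findall(r"[a-z']+", text.lower())
    edges = set(zip(words, words[1:]))
    return set(words), edges


def count_loops(edges):
    """Count distinct directed cycles of length 1, 2 and 3."""
    succ = {}
    for a, b in edges:
        succ.setdefault(a, set()).add(b)
    # Length 1: a word immediately followed by itself
    one = sum(1 for a, b in edges if a == b)
    # Length 2: a -> b and b -> a; the a < b check counts each pair once
    two = sum(1 for a, b in edges if a < b and (b, a) in edges)
    three = 0
    for a in succ:
        for b in succ[a]:
            for c in succ.get(b, ()):
                # Count each 3-cycle once, starting from its smallest node
                if a < b and a < c and b != c and a in succ.get(c, ()):
                    three += 1
    return one, two, three


def loop_density(text):
    """Assumed definition: short cycles per distinct word in the transcript."""
    nodes, edges = word_graph(text)
    if not nodes:
        return 0.0
    return sum(count_loops(edges)) / len(nodes)
```

With this definition, speech that keeps returning to the same few words scores higher than speech that moves on to new words. A patient’s value would then be compared with values computed the same way for a reference population.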
We do not think of the app so much as a diagnostic tool, but rather as something akin to a blood test. It is and will always be the clinical psychiatrist who makes the diagnosis. Our measures can inform that diagnosis by capturing speech patterns that are not readily identifiable, feeding them back to the psychiatrist in real time, and keeping a record of them over time. Today, in the current research phase, participating doctors must enter their diagnosis before receiving the assessment. This aggregation and annotation of the data allows us to fine-tune the analyses.
Perhaps in the future, those 63 million annual interviews will be codified, and contribute to diagnoses that help those suffering from PTSD, depression, and other conditions – all before any psychotic episodes actually occur.
Read our findings in the paper “A Window into the Intoxicated Mind? Speech as an Index of Psychoactive Drug Effects” in the latest issue of Neuropsychopharmacology.
This work was also done in collaboration with I. Rish and J. Kozloski at IBM’s Thomas J. Watson Research Center, and S. Allam at IBM India.