Assessing Mental Health through Language: The Efficacy of Perplexity for Early Detection of Depression
COLOMBINO, MICHELE
2023/2024
Abstract
This thesis presents a novel framework for detecting depression in clinical subjects using large language models (LLMs) and the perplexity metric. The framework applies the GPT-2 model to the DAIC corpus, using perplexity as a decision criterion to distinguish depressed from non-depressed individuals based on their responses during clinical interviews. Inspired by prior studies on communication disorders such as dementia, the study hypothesizes that consistent speech patterns in depressed individuals, associated with cognitive traits like rumination and emotional blunting, can be detected with this metric. Several decision criteria were evaluated. The Minor Perplexity Criterion achieved the best performance on the DAIC-WOZ test set, with an average F1-score of 87.50%, outperforming existing benchmarks; on the validation set, however, it reached an F1-score of only 76.70%. The Medoid Perplexity Criterion was more competitive on the validation set, achieving an F1-score of 84.4%, which surpassed most previous results. The study also examined how different interview types affect classification performance and found that "only participant" interviews yielded the most reliable results, suggesting that the subjects' text, cleaned of external prompts, may better reflect mental states. Agent-based interviews, such as those conducted with the virtual agent "Ellie," introduced significant biases that reduced overall performance, which highlights the potential for agent interactions to distort the natural expression of depressive symptoms. Despite promising results, challenges such as data imbalance and transcription quality point to the need for further refinement. Future research should focus on improving the model’s generalizability and exploring its potential for diagnosing other language-related disorders in healthcare.

File | Size | Format
---|---|---
Assessing Mental Health through Language_ The Efficacy of Perplexity for Early Detection of Depression.pdf (not available). Description: This thesis develops a framework for detecting depression using GPT-2 and the perplexity metric on the DAIC corpus. | 2.23 MB | Adobe PDF
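The abstract relies on two quantities, perplexity and the F1-score, without spelling them out. The sketch below shows their standard definitions in Python. It assumes the per-token natural-log probabilities have already been obtained from a language model such as GPT-2 (that scoring step is omitted). All function names are hypothetical, and `lower_perplexity_label` is only an illustrative decision rule; the abstract does not define the Minor or Medoid Perplexity Criteria, so this should not be read as the thesis's method.

```python
import math
from typing import Mapping, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity: exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def lower_perplexity_label(ppl_by_label: Mapping[str, float]) -> str:
    """Illustrative rule (an assumption, not the thesis's criterion):
    pick the label whose model assigns the text the lowest perplexity."""
    return min(ppl_by_label, key=lambda label: ppl_by_label[label])


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denominator = 2 * tp + fp + fn
    # Convention: return 0.0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    # A text whose every token has probability 0.25 has perplexity of about 4.
    print(perplexity([math.log(0.25)] * 4))
    print(lower_perplexity_label({"depressed": 12.0, "control": 20.0}))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

Lower perplexity means the model finds the text more predictable, which is why the metric can serve as a proxy for the consistent speech patterns the study hypothesizes.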
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14240/164311