Depression Analysis Using Machine Learning and AI

Depression has become one of the major global health concerns. Technology like AI and ML can be used to analyze depression data to provide better treatments to people suffering from different types of depressive disorders. We’ll discuss depression and the ML Python code used to analyze data.

The changing lifestyle and social scenarios have brought many changes to our lives. We have access to too much information. We are way too connected with the virtual world, and the lines between real and virtual are blurring rapidly. While it sounds like a good thing to stay up to date and informed about anything under the sun, it also has severe side effects. 

The fast-paced world has resulted in a lot of anxiety and stress, leading to different psychological issues in people. Depression and poor emotional health are now among the major concerns across the globe. Thankfully, technology is coming to the rescue yet again. Machine learning engineers and researchers are working on analyzing depression in people to detect the symptoms at earlier stages and provide better ways to cope with mental health issues. 

Artificial intelligence and machine learning algorithms can be used to analyze datasets with depression-related data to deliver accurate and in-depth insights. Let’s understand what depression actually is and how ML can provide a feasible solution to help people with depression and make their lives happier. 


What is Depression? 

Depression is a serious mental illness that makes you feel sad, lonely, tired, or anxious. It makes you lose interest in things you previously enjoyed. Depression is a psychological disorder that increases negative thoughts and emotions, leading to other health conditions. It also reduces your productivity, alertness, and ability to think coherently. It affects how you think, feel, and act. 

Depression is a common condition, and many people do not realize that they are depressed. Statistics show that around 3.8% of the global population suffers from depression. This includes about 5% of adults overall and 5.7% of adults aged over sixty. 

To put it in figures, 280-310 million people have depression. What’s alarming is that roughly 800,000 people die by suicide every year, and depression is a major risk factor. Kids and teens are by no means safe from depression. The US is among the countries with the highest depression rates around the world. 

Depression (Major Depressive Disorder, MDD) is commonly known as clinical depression. An MDE (Major Depressive Episode) is a period during which a person exhibits the symptoms of depression. Note that mood swings and short bursts of anger or irritation are not considered depression. 


Different Types of Depression 

Depression is an umbrella term that covers more than one type of mental illness/ disorder. It can be classified into the following types: 

Anxiety/ Distress 

Anxiety is when you feel stressed and tense throughout the day. It brings negative thoughts about how things can go wrong or that something really bad will happen to you or your loved ones. So much worry takes over your mind and your thoughts. It also leads to anxiety and panic attacks.  

Agitation 

You feel uneasy and uncomfortable no matter what. You cannot relax and calm down. An agitated person has jerky movements and is constantly fidgeting or in motion. You cannot sit in a position for more than a few seconds. Some people also tend to talk a lot when agitated. It doesn’t make sense, but you can’t control it either. 

Melancholy 

Melancholy is intense sadness or emotional pain. It fills your mind to an extent where even good things don’t cheer you up. Activities you usually enjoy also fail to make you happy. Melancholy results in loss of appetite, sad thoughts, feeling down/ low in the mornings, disturbed and irregular sleep patterns, and suicidal thoughts. 


Persistent Depressive Disorder

Persistent Depressive Disorder is when a person is suffering from depression for more than two years. It is a chronic condition where the person is highly vulnerable and susceptible to making harmful decisions. PDD is used to describe chronic major depression and dysthymia (low-grade persistent depression). The symptoms of this disorder are: 

  • Drastic changes in appetite (starving or overindulging in food)
  • Erratic sleep schedules (sleeping for hours or not sleeping at all)
  • Low self-esteem
  • Lack of energy, motivation 
  • Disinterest in just about everything 
  • Hopelessness 
  • Inability to make decisions 
  • Lack of concentration power 
  • Weight loss or gain 
  • Feeling guilty 
  • Suicidal thoughts 

What is Bipolar Disorder?  

Bipolar disorder is also called manic depression, as it causes extreme mood swings in a person. You might experience random bursts of energy where you feel fantastic and at the top of the world. You work and overdo things until you’re exhausted. Meanwhile, on the other end of the spectrum, you’ll feel miserable and horrible about anything and everything. You feel fatigued, tired, and worthless. 

This is a vicious cycle where you alternate between two contrasting moods with no middle ground. Doctors recommend mood stabilizers like lithium and calming activities like meditation to bring some balance and stability to your mood. 


Symptoms and Warning Signs of Depression

Depression has many symptoms, some of which overlap with an ordinary low mood or exhaustion after a long day of work. Naturally, all of us feel low at some point in our lives. But when the feelings persist and take over our lives, it is a sign of depression. 

Depression isn’t general sadness or pain of loss. It is more intense and can wreak havoc in your life by gradually robbing your happiness and ability to assert yourself. You can no longer feel, think, work, enjoy, and act the way you used to do. Some people term it as ‘living in a black hole’, where the void sucks out even the last bit of energy and happiness from you. 

Some feel apathetic to their surroundings. Nothing matters to them anymore. Others have a constant sense of impending doom and cannot consider a positive alternative. Men more often show anger and restlessness, while women more often report excessive guilt, sleepiness, and hunger. These patterns vary from person to person. 

Apart from this, all the symptoms listed above are warning signs of depression. A person who exhibits such signs needs medical intervention as soon as possible. 


Datasets Used to Analyze Depression 

Using the following datasets, it is possible to build an AI and ML model for depression analysis.  

Dataset by Kaggle 

The datasets available at Kaggle can help developers and researchers build systems that can automatically detect the depression state of a person based on the given sensor data. This dataset can be used in the following ways (and more): 

  • Analyze the sleep patterns of depressed and non-depressed people 
  • Use machine learning to classify different states of depression 
  • Use motor activity data to predict MADRS score (Montgomery–Åsberg Depression Rating Scale) 

Kaggle is a Google subsidiary and an online community of machine learning practitioners and data scientists.

LGHC Adult Depression Indicator 

This dataset is the source data for the Let’s Get Healthy California indicator published at https://letsgethealthy.ca.gov/. The data is geographically limited to California and comes from the California Behavioral Risk Factor Surveillance Survey (BRFSS). The table reports the proportion of adults who were ever told they had a depressive disorder. 

The BRFSS survey is conducted by the Public Health Survey Research Program of California State University in Sacramento, under contract from CDPH (California Department of Public Health). It is an annual telephone survey conducted cross-sectionally to understand health-related issues in Californians. Aspects like chronic health conditions, risk behavior, and use of preventive services are measured. 

Reported Symptoms of Anxiety and Depression in the Last Seven Days (from US Census Bureau) 

The US Census Bureau, along with five federal agencies, started a survey to gather data about COVID-19 and its impact on American households. The Household Pulse Survey aimed to measure the extent of the pandemic’s impact on aspects like food security, employment, housing, education, disruptions, consumer spending, and the physical and mental wellness of citizens. 

The survey took place online, where participants received invitations via emails and text messages. The email addresses and phone numbers were randomly selected, making sure that only one person from a household participated in the survey. The sample data was selected from the Census Bureau Master Address File Data. The estimates were adjusted to accommodate the lack of response from some participants. It also had to match the estimates of the Census Bureau in terms of age, gender, educational qualification, race, and ethnicity. The estimates also met the NCHS Data Presentation Standards for Proportions. 

Camden Depression and Anxiety Profile 

Camden is a borough in north-west London, UK. This dataset provides information about the patterns and trends of diagnosed depression and anxiety among residents aged 18 and above. 

Summary of Depression Datasets 

S.No. | Dataset Name | Provided By | Dataset Size | Download Link
1 | The depression dataset | Kaggle | 53.12 MB | https://www.kaggle.com/datasets/arashnic/the-depression-dataset
2 | Adult Depression (LGHC Indicator) | Get Healthy California | 54.71 KB | https://data.world/chhs/5a281abf-1730-43b0-b17b-ac6a35db5760
3 | Indicators of Anxiety or Depression Based on Reported Frequency of Symptoms During Last 7 Days | U.S. Census Bureau | 1.6 MB | https://catalog.data.gov/dataset/indicators-of-anxiety-or-depression-based-on-reported-frequency-of-symptoms-during-last-7-
4 | Camden Depression and Anxiety Profile | Public Health Intelligence | 1.8 MB | https://data.world/datagov-uk/fdf83747-0aeb-4cb9-840d-75f16c8d3105

Using NLP and ML to Analyze Depression Data in Python 

Step1: Installing necessary dependencies

!pip install nltk
!pip install pandas numpy seaborn matplotlib
!pip install scikit-learn imbalanced-learn wordcloud

Step2: Importing necessary libraries:

import re

import nltk
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from nltk.stem import WordNetLemmatizer
from wordcloud import WordCloud
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
from imblearn.over_sampling import SMOTE

# WordNetLemmatizer needs the WordNet corpus, which is downloaded once
nltk.download('wordnet')

Step3: Reading the dataset 

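df = pd.read_csv("depression_data.csv")
df.head(5)                             # preview the first rows
df['message'].iloc[:1]                 # look at the first message
df.columns                             # list the column names
df = df.drop(columns=['Unnamed: 0'])   # drop the index column saved in the CSV
df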

Output:

10314 rows × 2 columns

       message                                              label
0      just had a real good moment. i missssssssss hi…      0
1      is reading manga http://plurk.com/p/mzp1e            0
2      @comeagainjen http://twitpic.com/2y2lx – http:…      0
3      @lapcat Need to send ’em to my accountant tomo…      0
4      ADD ME ON MYSPACE!!! myspace.com/LookThunder         0
...    ...                                                  ...
10309  No Depression by G Herbo is my mood from now o…      1
10310  What do you do when depression succumbs the br…      1
10311  Ketamine Nasal Spray Shows Promise Against Dep…      1
10312  dont mistake a bad day with depression! everyo…      1
10313  0                                                    1
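
The table shows that every row carries a label of 0 or 1. Before plotting, it can help to check numerically how balanced the two classes are. The snippet below is a minimal sketch that uses only the standard library; the CSV text and the `label_counts` helper are invented for illustration and are not part of the original dataset or code. With the DataFrame above, `df['label'].value_counts()` should give the same kind of summary.

```python
import csv
import io
from collections import Counter

# Hypothetical stand-in for depression_data.csv; these rows are invented.
SAMPLE_CSV = """message,label
great day with friends,0
is reading manga,0
cannot sleep and feel hopeless,1
new phone arrived,0
"""

def label_counts(csv_text):
    """Count how many rows carry each label in a CSV with a 'label' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["label"]) for row in reader)

counts = label_counts(SAMPLE_CSV)
majority = max(counts.values())
minority = min(counts.values())
print(counts)
print(f"imbalance ratio: {majority / minority:.1f}")
```

A ratio well above 1 is the situation that the resampling step later in the walkthrough is meant to address.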

Step4: Data Analysis and Data Visualization:

sns.countplot(x=df['label'])

Output:

<AxesSubplot:xlabel='label', ylabel='count'>

[Figure: count plot of the label column]

# Clean each message: keep letters only, lowercase, and lemmatize every word
wo = WordNetLemmatizer()
corpus = []
for text in df['message']:
    message = re.sub('[^a-zA-Z]', ' ', text)
    message = message.lower().split()
    message = [wo.lemmatize(word) for word in message]
    corpus.append(' '.join(message))
corpus[2]

# Word cloud of the messages labeled as depressive (label 1)
depressive_words = ' '.join(df[df['label'] == 1]['message'])
depressive_wc = WordCloud(width=512, height=512, collocations=False, colormap="Blues").generate(depressive_words)
plt.figure(figsize=(10, 8), facecolor='k')
plt.imshow(depressive_wc)
plt.axis('off')
plt.tight_layout(pad=0)
plt.show()

Output:

[Figure: word cloud of the messages labeled depressive]

# Word cloud of the messages labeled as positive (label 0)
positive_words = ' '.join(df[df['label'] == 0]['message'])
positive_wc = WordCloud(width=512, height=512, collocations=False, colormap="Blues").generate(positive_words)
plt.figure(figsize=(10, 8), facecolor='k')
plt.imshow(positive_wc)
plt.axis('off')
plt.tight_layout(pad=0)
plt.show()

Output:

[Figure: word cloud of the messages labeled positive]
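
The cleaning loop in this step keeps letters only, so the pieces of a URL or an @mention survive as ordinary words and can show up in the corpus. The sketch below illustrates that effect and one possible way to avoid it. The `URL_OR_MENTION` pattern and both helper functions are assumptions added for illustration, not part of the original code, and lemmatization is left out to keep the example dependency-free.

```python
import re

# Hypothetical pattern: matches http(s) links, www links, and @mentions.
URL_OR_MENTION = re.compile(r"https?://\S+|www\.\S+|@\w+")

def basic_clean(text):
    """Letters-only cleaning as in the loop above, without lemmatization."""
    return " ".join(re.sub("[^a-zA-Z]", " ", text).lower().split())

def clean_without_links(text):
    """Drop URLs and @mentions first, then apply the letters-only cleaning."""
    return basic_clean(URL_OR_MENTION.sub(" ", text))

tweet = "is reading manga http://plurk.com/p/mzp1e"
print(basic_clean(tweet))          # is reading manga http plurk com p mzp e
print(clean_without_links(tweet))  # is reading manga
```

Whether link fragments help or hurt the classifier depends on the data, so this is worth testing rather than assuming.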

Step5: Train and Test Data Preparation:

# Hold out 25% of the cleaned messages for testing
X_train, X_test, y_train, y_test = train_test_split(corpus, df['label'], test_size=0.25, random_state=42)
# Fit TF-IDF on the training text only, then reuse the same vocabulary for the test text
vectorizer = TfidfVectorizer(ngram_range=(1, 3), stop_words='english', max_features=15000)
X_train_vect = vectorizer.fit_transform(X_train)
X_test_vect = vectorizer.transform(X_test)
X_train_vect.shape
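
To make the vectorization step less of a black box, here is a minimal sketch of TF-IDF weighting in plain Python. It follows what is, to the best of my understanding, the default scheme in scikit-learn's `TfidfVectorizer` (raw term counts, a smoothed inverse document frequency of ln((1 + n) / (1 + df)) + 1, then L2 normalization). It handles single words only and ignores stop words, n-grams, and `max_features`. The `tfidf` function and the sample sentences are invented for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        raw = {t: c * idf[t] for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors.append({t: w / norm for t, w in raw.items()} if norm else {})
    return vectors

docs = ["feel sad and tired", "feel happy today", "sad sad day"]
vectors = tfidf(docs)
for vec in vectors:
    print({t: round(w, 3) for t, w in vec.items()})
```

Words that appear in fewer documents receive a higher weight, which is why a rare word like "tired" outweighs a shared word like "feel" within the first sentence.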

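Step6: Handling Class Imbalance in the Data:

# Oversample the minority class in the training data with SMOTE
x_resample, y_resample = SMOTE().fit_resample(X_train_vect, y_train)
# Note: the test set is resampled here as well, so the scores reported below are
# computed on a balanced test set that includes synthetic samples; evaluating on
# the untouched X_test_vect and y_test gives a more realistic estimate
x_test_resample, y_test_resample = SMOTE().fit_resample(X_test_vect, y_test)
# Print the shapes of x and y after resampling
print(x_resample.shape)
print(y_resample.shape)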
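
The core idea behind SMOTE is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbors. The following is a simplified sketch of that idea in plain Python, not the imbalanced-learn implementation. The `smote_sketch` function, its parameters, and the sample points are all invented for illustration. The sketch is applied to training points only, since synthetic samples are usually kept out of the evaluation data.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Invented minority-class training points (two features each).
minority_train = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
new_points = smote_sketch(minority_train, n_new=4)
print(new_points)
```

Because every synthetic point lies on a line segment between two real minority points, the new samples stay inside the region the minority class already occupies.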

Step7: Model Training Using Logistic Regression Algorithm

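# Train logistic regression on the resampled training data
clf = LogisticRegression(solver='lbfgs')
clf.fit(x_resample, y_resample)
# Predict on the test data and report the scores
y_pred = clf.predict(x_test_resample)
accuracy_score(y_test_resample, y_pred)
print(classification_report(y_test_resample, y_pred))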

Output:

                  precision    recall  f1-score   support

               0       0.96      0.99      0.97      2011
               1       0.99      0.95      0.97      2011

        accuracy                           0.97      4022
   macro average       0.97      0.97      0.97      4022
weighted average       0.97      0.97      0.97      4022
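
For readers unfamiliar with the columns in this report, the sketch below shows how precision, recall, and F1 are computed for one class from true and predicted labels. The `class_metrics` helper and the eight labels are invented for illustration and have no connection to the scores above.

```python
def class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, f1) treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented labels: 1 = depressive, 0 = positive.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

print(class_metrics(y_true, y_pred, positive=1))  # (0.75, 0.75, 0.75)
print(class_metrics(y_true, y_pred, positive=0))  # (0.75, 0.75, 0.75)
```

In a screening setting, recall for the depressive class is often the number to watch, because a missed case is usually costlier than a false alarm.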

Step8: Model Training Using Naive Bayes Algorithm

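# Train multinomial Naive Bayes on the same resampled training data
mnb = MultinomialNB()
mnb.fit(x_resample, y_resample)
# Predict on the test data and report the scores
y_pred = mnb.predict(x_test_resample)
accuracy_score(y_test_resample, y_pred)
print(classification_report(y_test_resample, y_pred))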

Output:

                  precision    recall  f1-score   support

               0       0.96      0.95      0.96      2011
               1       0.95      0.96      0.96      2011

        accuracy                           0.96      4022
   macro average       0.96      0.96      0.96      4022
weighted average       0.96      0.96      0.96      4022
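
As a rough picture of what a multinomial Naive Bayes classifier does internally, here is a small sketch in plain Python. It works on raw word counts with add-one (Laplace) smoothing rather than on TF-IDF features, so it is a simplification and not a reproduction of the model trained above. The function names and the four training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log word likelihoods per class."""
    vocab = {w for d in docs for w in d.split()}
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.split())
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(word_counts[label].values())
        log_prior = math.log(n_docs / len(labels))
        log_lik = {
            w: math.log((word_counts[label][w] + alpha) / (total + alpha * len(vocab)))
            for w in vocab
        }
        model[label] = (log_prior, log_lik)
    return model

def predict_nb(model, doc):
    """Pick the class with the highest log posterior; unseen words are skipped."""
    scores = {}
    for label, (log_prior, log_lik) in model.items():
        scores[label] = log_prior + sum(log_lik[w] for w in doc.split() if w in log_lik)
    return max(scores, key=scores.get)

# Invented training data: 0 = positive, 1 = depressive.
docs = ["happy birthday friend", "great sunny day",
        "feel hopeless and tired", "so tired of feeling sad"]
labels = [0, 0, 1, 1]
model = train_nb(docs, labels)

print(predict_nb(model, "tired and sad"))  # 1
print(predict_nb(model, "happy day"))      # 0
```

The smoothing term keeps a word that never appeared in one class from forcing that class's probability to zero.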

Step9: Testing the Model with User Inputs:

def preprocess(data):
    # Apply the same cleaning used for the training corpus
    a = re.sub('[^a-zA-Z]', ' ', data)
    a = a.lower().split()
    a = [wo.lemmatize(word) for word in a]
    return ' '.join(a)

strr = input('Enter Your Message: ')
print("-------------------------------")
a = preprocess(strr)
# Reuse the fitted TF-IDF vectorizer; never refit it on new input
example_counts = vectorizer.transform([a])
prediction = mnb.predict(example_counts)
if prediction[0] == 0:
    print('Positive')
elif prediction[0] == 1:
    print('Depressive')

Output:

Enter Your Message: happy birthday
-------------------------------
Positive
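
For repeated use, the interactive steps above could be wrapped in a single function that takes the preprocessing function, the fitted vectorizer, and the trained model as arguments. The sketch below is one possible shape for such a helper. `classify_message` is a hypothetical name, and the two small classes are stand-ins that only mimic the `transform` and `predict` methods so the example runs without the trained objects.

```python
def classify_message(text, preprocess, vectorizer, model):
    """Return 'Positive' or 'Depressive' for a raw message."""
    features = vectorizer.transform([preprocess(text)])
    label = model.predict(features)[0]
    return "Depressive" if label == 1 else "Positive"

class KeywordVectorizer:
    """Stand-in for the fitted TfidfVectorizer: counts two hypothetical keywords."""
    def transform(self, texts):
        return [[t.split().count("hopeless"), t.split().count("happy")] for t in texts]

class KeywordModel:
    """Stand-in for the trained classifier: predicts 1 when 'hopeless' outweighs 'happy'."""
    def predict(self, rows):
        return [1 if row[0] > row[1] else 0 for row in rows]

def lowercase_only(text):
    """Stand-in for the article's preprocess function."""
    return text.lower()

print(classify_message("Happy birthday", lowercase_only, KeywordVectorizer(), KeywordModel()))
print(classify_message("I feel hopeless", lowercase_only, KeywordVectorizer(), KeywordModel()))
```

With the objects from this walkthrough, the call would look like `classify_message(strr, preprocess, vectorizer, mnb)`.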

Conclusion 

Machine learning models can be used to analyze depression and its types, anxiety, PTSD, bipolar disorder, and a variety of other mental disorders that affect people in all parts of the world. As more data about depression patterns and symptoms becomes available, these models can be retrained to deliver more accurate predictions. 

This will help identify depression in its early stages and enable medical practitioners to help people recognize their condition and opt for appropriate treatment to control depression and lead happier lives. 
