March 03, 2022
Mental Health in Tech Workplace
In this project we'll try to understand which factors contribute to a person's mental health. This dataset is from a 2014 survey that measures attitudes towards mental health and the frequency of mental health disorders in the tech workplace. There are always a lot of great projects that show different ways of solving a problem, but only a handful address domain knowledge and getting started. I will build on that to explain the dataset, perform EDA and then build a baseline model.
Background
What exactly do we mean by Mental Health at workplace?
Mental health includes our emotional, psychological and social well-being. It affects how we think, feel, and act. It also helps determine how we handle stress, relate to others, and make choices. In the workplace, communication and inclusion are key skills for successful, high-performing teams and employees. For an organization, poor mental health can mean more days absent from work and a decrease in productivity and engagement. In the United States, approximately 70% of adults with depression are in the workforce. Employees with depression will miss an estimated 35 million workdays a year due to mental illness. Workers experiencing unresolved depression are estimated to encounter a 35% drop in their productivity, costing employers $105 billion each year.
What can your employer do about this?
One answer is Mental Health First Aid. Mental Health First Aid teaches participants how to notice and support an individual who may be experiencing a mental health or substance use concern or crisis, and how to connect them with the appropriate employee resources. It teaches employees critical communication and support skills that can influence your organization's bottom line.
Research shows that employees who go through Mental Health First Aid have an increased awareness of mental health among themselves and their co-workers. It allows them to recognize the signs of someone who may be struggling and teaches them when to reach out and what resources are available. This in turn creates beneficial intervention that increases engagement and creates an environment of inclusion and support.
Employers can also offer robust benefit packages to support employees who go through mental health issues. These include Employee Assistance Programs, wellness programs that focus on mental and physical health, health and disability insurance, flexible working schedules and time-off policies.
Organizations that incorporate mental health awareness help to create a healthy and productive work environment that reduces the stigma associated with mental illness, increases the organization's mental health literacy and teaches the skills to safely and responsibly respond to a co-worker's mental health concern. Incorporating mental health awareness in the workplace can help lead the way for mental health issues throughout your community by equipping people with the tools they need to start a dialogue so that more people can get the help they need.
Data Fields
Timestamp — Timestamp
Age — Age
Gender — Gender
Country — Country
state — If you live in the United States, which state or territory do you live in?
Self Employed — Are you self-employed?
Family History — Do you have a family history of mental illness?
Treatment — Have you sought treatment for a mental health condition?
Work Interfere — If you have a mental health condition, do you feel that it interferes with your work?
No Employees — How many employees does your company or organization have?
Remote Work — Do you work remotely (outside of an office) at least 50% of the time?
Tech Company — Is your employer primarily a tech company/organization?
Benefits — Does your employer provide mental health benefits?
Care Options — Do you know the options for mental health care your employer provides?
Wellness Program — Has your employer ever discussed mental health as part of an employee wellness program?
Anonymity — Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources?
Leave — How easy is it for you to take medical leave for a mental health condition?
Mentalhealthconsequence — Do you think that discussing a mental health issue with your employer would have negative consequences?
Physhealthconsequence — Do you think that discussing a physical health issue with your employer would have negative consequences?
Coworkers — Would you be willing to discuss a mental health issue with your coworkers?
Physhealthinterview — Would you bring up a physical health issue with a potential employer in an interview?
Mentalvsphysical — Do you feel that your employer takes mental health as seriously as physical health?
Obs_consequence — Have you heard of or observed negative consequences for coworkers with mental health conditions in your workplace?
Comments — Any additional notes or comments
Import Package and Data
import warnings
warnings.filterwarnings('ignore')
#basic libraries
import numpy as np
import pandas as pd
#visualization libraries
import missingno as msno
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
%matplotlib inline
For this exercise, the dataset (in .csv format) is uploaded to my GitHub repository, read into the Jupyter notebook and stored in a pandas DataFrame.
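The loading step could look roughly like the sketch below. The helper name `load_survey` and the tiny in-memory CSV are assumptions made for illustration; in the notebook the argument would be the raw GitHub URL or a local path to the survey file.

```python
import io
import pandas as pd

def load_survey(source):
    """Read the survey CSV from a path, URL or file-like object into a DataFrame."""
    return pd.read_csv(source)

# Tiny in-memory stand-in (made-up rows) so the snippet runs without the real file.
# In the notebook, pass the raw GitHub URL or a local path instead.
sample_csv = io.StringIO(
    "Timestamp,Age,Gender,Country,treatment\n"
    "2014-08-27 11:29:31,37,Female,United States,Yes\n"
    "2014-08-27 11:29:37,44,M,United States,No\n"
)

df = load_survey(sample_csv)
print(df.shape)
print(df.head())
```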
Data Preparation and Cleaning
Checking some features of the dataset
There are a total of 26 columns in the dataset. Except for the age column, all the columns are of object datatype. The comments column contains the most null values (about 70%), which makes sense because it was an optional text box, so it's reasonable to expect that many (most) respondents would leave it blank. We will drop the timestamp column because it only contains the date and time at which the respondent took the questionnaire, which is irrelevant for us. The state column also contains a lot of null values. We'll dig deeper into that.
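A minimal sketch of this inspection is shown below. The small DataFrame is made-up stand-in data, and the column names `Timestamp`, `state` and `comments` are assumed to match the CSV header.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the survey data; only the pattern matters here.
df = pd.DataFrame({
    "Timestamp": ["2014-08-27 11:29:31", "2014-08-27 11:29:37",
                  "2014-08-27 11:30:22", "2014-08-27 11:31:22"],
    "Age": [37, 44, 32, 31],
    "state": ["IL", "IN", np.nan, np.nan],
    "comments": [np.nan, np.nan, np.nan, "some free text"],
})

# Column datatypes: everything except Age is non-numeric.
print(df.dtypes)

# Percentage of null values per column, highest first.
null_share = df.isnull().mean().mul(100).round(1).sort_values(ascending=False)
print(null_share)

# The timestamp carries no information about the respondent, so drop it.
df = df.drop(columns=["Timestamp"])
```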
It would be really misleading to conclude that a certain country faces more problems with the mental health of its employees, because around 60% of the respondents are from the US. Moreover, a lot of countries have only one respondent. The country column thus becomes pointless, so we will drop it. A quick look at the state column suggests that it applies only to respondents in the US, so we'll drop it as well.
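The check behind this decision can be sketched as follows, on made-up rows and with `Country` and `state` assumed as column names.

```python
import pandas as pd

# Made-up stand-in rows; the real survey has many more countries.
df = pd.DataFrame({
    "Country": ["United States", "United States", "United States", "Canada", "India"],
    "state": ["CA", "NY", "TX", None, None],
    "treatment": ["Yes", "No", "Yes", "No", "Yes"],
})

# Share of respondents per country, in percent.
country_share = df["Country"].value_counts(normalize=True).mul(100).round(1)
print(country_share)

# Number of countries represented by a single respondent.
single_respondent = int((df["Country"].value_counts() == 1).sum())
print(single_respondent)

# Both location columns are dropped before modelling.
df = df.drop(columns=["Country", "state"])
```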
How can age be negative? And age below 15 years? Are they even legally allowed to work?
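One possible way to handle such values is sketched below. The bounds `MIN_AGE` and `MAX_AGE` and the choice to replace implausible ages with the median of the valid ones are assumptions for illustration, not necessarily what the notebook does.

```python
import pandas as pd

# Assumed plausibility bounds for a working respondent (not from the survey).
MIN_AGE, MAX_AGE = 15, 100

# Made-up ages, including the kinds of impossible values seen in the data.
ages = pd.Series([37, 44, -29, 5, 329, 32, 29], name="Age")

# Flag the ages that fall inside the plausible range.
valid = ages.between(MIN_AGE, MAX_AGE)

# Replace implausible ages with the median of the plausible ones.
median_valid = ages[valid].median()
cleaned = ages.where(valid, median_valid)
print(cleaned.tolist())
```

Dropping the affected rows instead of imputing them is an equally reasonable choice when only a handful of rows are involved.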
Regarding gender, people have described themselves as male or female in many different ways.
Let's get back to our work and correct these responses. While this may not be the best way, we will use this approach for the gender column: we will rename and combine all the categories that mean the same thing into one.
Male, or cis male, means assigned male at birth and identifying as male.
Female, or cis female, means assigned female at birth and identifying as female.
Other is a catch-all for responses that describe identities other than cisgender male or female, for example transgender, non-binary or queer.
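The grouping above can be sketched as a small normalization function. The spelling sets below are illustrative assumptions and would need to be extended with every variant that actually appears in the Gender column.

```python
import pandas as pd

# Illustrative spellings only; the real column contains many more variants.
MALE = {"male", "m", "man", "cis male", "cis man", "male (cis)"}
FEMALE = {"female", "f", "woman", "cis female", "female (cis)"}

def normalize_gender(raw):
    """Map a free-text gender response to 'Male', 'Female' or 'Other'."""
    value = str(raw).strip().lower()
    if value in MALE:
        return "Male"
    if value in FEMALE:
        return "Female"
    # Everything else, including missing values, falls into the catch-all group.
    return "Other"

gender = pd.Series(["Male", "male ", "M", "Cis Male", "Female", "f", "Woman",
                    "non-binary", "queer"])
cleaned = gender.map(normalize_gender)
print(cleaned.value_counts())
```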
Along the way, we have stumbled upon the fact that there are about 4 times as many males as females in the dataset. We must keep this in mind and avoid faulty conclusions, such as males being more susceptible to mental health issues. Alternatively, we may conclude that there are many more males than females in the tech industry (this survey was conducted specifically for the tech industry).
At first we noticed only one remaining column with null values, 'work_interfere'. On a closer look there is another one, 'self_employed', which contains around 18 null values. For now we will proceed without any imputation.
Exploratory Data Analysis
Before beginning the EDA, we should learn about the organization that collected this data. Open Sourcing Mental Illness (OSMI) is a non-profit corporation dedicated to raising awareness, educating, and providing resources to support mental wellness in the tech and open source communities. OSMI began in 2013, with Ed Finkler speaking at tech conferences about his personal experiences as a web developer and open source advocate with a mental health disorder. The response was overwhelming, and thus OSMI was born.
Every year, OSMI comes out with a new survey to see how employees in tech companies around the world feel about getting mental health treatment. I picked the survey from 2014.
The survey was filled in by respondents in tech companies who may suffer from mental health disorders (medically diagnosed, undiagnosed, or even just a feeling), and it lets us see which factors affect whether an employee gets treatment or not.
Based on this data, a machine learning model can help HR see which factors the company needs to support so that employees are willing to get mental health treatment.
These are the respondents' answers to the question, 'Have you sought treatment for a mental health condition?'
This is our target variable. Looking at the first graph, we see that the percentage of respondents who have sought treatment is very close to 50%. Workplaces that promote mental health and support people with mental disorders are more likely to see increased productivity, reduced absenteeism, and associated economic gains. Employees who enjoy good mental health can:
Be more productive
Take an active part in employee engagement activities and build better relationships, both at the workplace and in personal life.
Be more joyous and make people around them happy.
After analysing the target variable, we will try to explore the individual columns and what they mean.
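Each column in this section is described by two graphs: the share of respondents per answer, and the split between treatment and no treatment within each answer. The numbers behind both can be computed with a small helper like the one below. The name `summarize_column` is hypothetical and the rows are made up.

```python
import pandas as pd

def summarize_column(df, column, target="treatment"):
    """Return (share of each answer in %, treatment split within each answer in %)."""
    share = df[column].value_counts(normalize=True).mul(100).round(1)
    treated = pd.crosstab(df[column], df[target], normalize="index").mul(100).round(1)
    return share, treated

# Made-up rows standing in for the cleaned survey data.
df = pd.DataFrame({
    "family_history": ["Yes", "Yes", "No", "No", "No"],
    "treatment":      ["Yes", "Yes", "No", "No", "Yes"],
})

share, treated = summarize_column(df, "family_history")
print(share)    # data behind the first graph
print(treated)  # data behind the second graph
```

Comparing percentages within each answer, rather than raw counts, keeps large and small groups comparable.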
This is the respondents' answer to the question, 'Are you self-employed?'.
We see that around 10% of the respondents are self-employed. Most of the people who responded to the survey work for an employer. We also see that although the two groups differ vastly in size, the proportion of people who seek treatment is more or less similar in both. Thus, we may conclude that whether a person is self-employed or not does not largely affect whether they seek mental health treatment.
This is the respondents' answer to the question, 'Do you have a family history of mental illness?'.
Close to 40% of the respondents say that they have a family history of mental illness, and the plot shows that they are considerably more likely to want treatment than those without a family history. This is plausible, since people with a family history tend to pay more attention to mental illness. Family history is a significant risk factor for many mental health disorders. Thus, this is an important factor that has to be taken into consideration, as it influences the behaviour of the employees to a significant extent.
This was the respondent's answer to the question, 'If you have a mental health condition, do you feel that it interferes with your work?'.
From the first graph we see that around 48% of people say that their mental health condition sometimes interferes with their work. Now 'Sometimes' is a really vague response to a question, and more often than not these are the people who actually face a condition but are too shy or reluctant to choose the extreme category.
Coming to our second graph, we see that the 'Sometimes' group had the highest number of people who actually had a mental condition. A similar pattern was shown by the people in the 'Often' category.
What is more surprising is that even among people whose mental health has 'Never' interfered with work, there is a small group that still wants to get treatment before it turns into job stress. Job stress can be triggered by a variety of reasons, for example when the requirements of the job do not match the capabilities, resources or needs of the worker.
We will skip the 'number_of_employees' category and move forward with the next column, which is 'remote_work'.
This was the respondent's answer to the question, 'Do you work remotely (outside of an office) at least 50% of the time?'.
Around 70% of respondents don't work remotely, which suggests that most mental health problems in this sample arise at the workplace. On the other hand, the split between employees who want to get treatment and those who don't differs only slightly between the two groups. The proportion of people who seek treatment is more or less similar in both categories, so remote work does not appear to affect our target variable. Let's move forward with our next variable, which is 'tech_company'.
This is the respondents' answer to the question, 'Is your employer primarily a tech company/organization?'.
Although the survey was designed specifically for the tech field, close to 18% of the companies belong to the non-tech field. However, looking at the second graph, one may conclude that mental health is a big problem whether or not a person works in the tech field.
On a deeper look, however, we find that the number of employees in the tech sector who want to get treatment is slightly lower than the number who don't. In the non-tech field the situation is reversed.
The next category that we'll be looking into is benefits.
This was the respondent's answer to the question, 'Does your employer provide mental health benefits?'.
We see that around 38% of the respondents said that their employer provided them with mental health benefits, whereas a significant number (32%) didn't even know whether they were provided this benefit.
Coming to the second graph, we see that of the people who said YES to mental health benefits, around 63% said that they were seeking medical help.
Surprisingly, of the people who said NO to mental health benefits provided by the company, close to 45% want to seek mental health treatment.
This was the respondent's answer to the question, 'Do you know the options for mental health care your employer provides?'. Since this graph is more or less similar to the benefits one, we won't discuss it in more detail. Moving forward, the next category is wellness program. Let's try to understand that!
This is the respondents' answer to the question, 'Has your employer ever discussed mental health as part of an employee wellness program?'.
About 19% of the respondents say YES, their employer has discussed mental health as part of an employee wellness program, and 60% of those employees want to get treatment.
One shocking revelation is that more than 65% of respondents say that their company does not provide any wellness programs. Close to half of those respondents want to get treatment, which means the company needs to fulfil its duty and provide such programs soon.
The next category is seek_help. We will skip it, as it is more or less similar to care_options, benefits and wellness_program. Our next category is anonymity.
This is the respondent's answer to the question, 'Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources?'.
Around 65% of the people were not aware whether anonymity was provided to them and 30% said yes to the provision of anonymity by the company.
Looking at the second graph, we see that out of the people who answered yes to the provision of anonymity, around 60% were seeking help regarding their mental condition. A possible reason is that the employee feels that the company protects their privacy and can be trusted with knowing the mental health condition of its workers. The most basic reason for hiding this from fellow workers can be the social stigma attached to mental health.
The next factor that we will be discussing is 'leave.'
This is the respondent's answer to the question, 'How easy is it for you to take medical leave for a mental health condition?'.
While close to 50% of the people answered that they did not know, surprisingly around 45% of those people sought help for their condition.
A small percentage of people (around 8%) said that it was very difficult for them to get leave for mental health, and 75% of those sought help.
Among employees who said it was 'somewhat easy' or 'very easy' to get leave, almost 50% sought medical help.
The next category that we'd be looking into is mental health consequence.
This is the respondent's answer to the question, 'Do you think that discussing a mental health issue with your employer would have negative consequences?'.
Roughly the same number of people (around 40% each) answered Maybe and No when asked whether discussing a mental health issue with their employer would have negative consequences, and about 23% said Yes.
23% is a significant share of people who feel that discussing their mental health with their employer might have negative consequences. This may be because of the stigma, decreased productivity, impact on promotions or any other preconceived notion.
It is nice to know that out of the people who answered No, only around 40% actually sought help, whereas in both the other categories it is more than 50%.
The next factor that we are going to discuss is physical health consequence. It will be interesting to compare the two.
This is the respondent's answer to the question, 'Do you think that discussing a physical health issue with your employer would have negative consequences?'.
There is a stark difference between the responses to the same question for mental and physical health. More than 70% of the employees believe that discussing a physical health issue with their employer would not have negative consequences, and only 5% believe that it would.
While it may be incorrect to draw any conclusions about whether respondents seek mental health help on the basis of this answer, because the treatment rate is more or less the same for all three categories, we must keep in mind how differently mental and physical health are treated as a whole.
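To put the two questions side by side, the answer shares can be placed in one table, as in the sketch below. The column names `mental_health_consequence` and `phys_health_consequence` are assumed, and the rows are made up.

```python
import pandas as pd

# Made-up answers; column names are assumed to match the cleaned DataFrame.
df = pd.DataFrame({
    "mental_health_consequence": ["No", "Maybe", "Yes", "Maybe", "No"],
    "phys_health_consequence":   ["No", "No", "No", "Maybe", "No"],
})

# One column of percentages per question, aligned on the answer label.
comparison = pd.concat(
    {col: df[col].value_counts(normalize=True).mul(100).round(1) for col in df.columns},
    axis=1,
).fillna(0.0)  # an answer nobody chose for one question shows up as 0%
print(comparison)
```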
This is the respondent's answer to the question, 'Would you be willing to discuss a mental health issue with your coworkers?'.
Around 62% of the employees said that they might be comfortable discussing some types of mental health problems with their coworkers, and around 50% of them actually sought medical help.
20% of the employees believed that discussing mental health with their coworkers wasn't a good option for them.
The next category is supervisor. Let's find out whether the employees are comfortable sharing their mental health with their supervisor.
This is the respondent's answer to the question, 'Would you be willing to discuss a mental health issue with your direct supervisor(s)?'.
This graph is quite different from the one for coworkers. Here, around 40% of the workers are willing to share their mental health with their supervisors. This may have something to do with their performance etc.
Looking at the second graph, the share of employees who actually sought help for their mental health was more or less similar for all three categories.
This has become really tiring now! Anyway, just 2-3 categories more left for analysis. Let's move forward with our next variable, which is 'mental_health_interview'
This is the respondent's answer to the question, 'Would you bring up a mental health issue with a potential employer in an interview?'.
As our intuition might suggest, 80% of the respondents believe that it is a good option to discuss your mental health with a future employer. This is actually a good thing! This might not have been the case 15 years ago.
While around 15% of the candidates seem unsure about whether they should discuss their mental condition with a future employer, less than 5% think that discussing it may not be a good option.
The next category is physical_health_interview. Let's see if there's any difference in the respondent's answer for this one with the previous one.
This is the respondent's answer to the question, 'Would you bring up a physical health issue with a potential employer in an interview?'.
A majority of the people are still dubious about discussing their physical health condition with a future employer, but close to 17% believe that there is no issue in discussing it.
Around 50% of the people still remain confused about whether it is a good option to discuss their condition or not.
Coming to the last but one, mental_vs_physical. Let's see what insights can be drawn from this category.
This was the respondent's answer to the question, 'Do you feel that your employer takes mental health as seriously as physical health?'.
While close to 50% of people said that they didn't know, the numbers of people who answered Yes and No were equal.
For the people who answered Yes as well as the ones who answered No, more than 50% sought medical help for their mental health, whereas this was not the case for the ones in the 'Don't know' category.
Coming to the last column, we have finally reached obs_consequence.
This was the respondent's answer to the question, 'Have you heard of or observed negative consequences for coworkers with mental health conditions in your workplace?'.
The majority (85%) of the people answered No to this question. It is important to note that IT, being an organised sector, follows strict guidelines on employee satisfaction etc. Thus, we didn't come across any major issue regarding employer behavior as such!
Data Preparation
We have only two columns left that contain null values - work_interfere and self_employed. Let us try to fill these null values and make our data ready for further processing.
Since only about 20% of the work_interfere values are missing, we will replace NaN with "Don't know".
Only 1.4% of the self_employed values are missing, so we will replace NaN there with 'No', i.e. not self-employed.
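A minimal sketch of this imputation on made-up rows is shown below; the exact replacement labels ("Don't know" and "No") are taken from the text above.

```python
import numpy as np
import pandas as pd

# Made-up rows with the two kinds of gaps described above.
df = pd.DataFrame({
    "work_interfere": ["Often", np.nan, "Sometimes", np.nan, "Never"],
    "self_employed":  ["No", "Yes", np.nan, "No", "No"],
})

# Missing work_interfere answers become their own category.
df["work_interfere"] = df["work_interfere"].fillna("Don't know")

# Missing self_employed answers are treated as not self-employed.
df["self_employed"] = df["self_employed"].fillna("No")

print(df.isnull().sum())
```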
We can clearly see that all the columns except 'Age' consist of object type values. We also notice that most of the columns consist of values such as 'Yes', 'No', 'Maybe' etc., which can be easily encoded. So the next step will be encoding.
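The text does not say which encoder is used, so the sketch below assumes simple integer label encoding through pandas categoricals. Numeric columns such as 'Age' are left untouched, and the code-to-label mapping is kept so the encoded values can be read back later.

```python
import pandas as pd

# Made-up rows standing in for the cleaned survey data.
df = pd.DataFrame({
    "Age": [37, 44, 32],
    "benefits": ["Yes", "Don't know", "No"],
    "treatment": ["Yes", "No", "No"],
})

encoded = df.copy()
mappings = {}  # column name -> {integer code: original label}

# Encode every non-numeric column; categories are ordered alphabetically.
for col in encoded.select_dtypes(exclude="number").columns:
    categories = encoded[col].astype("category")
    mappings[col] = dict(enumerate(categories.cat.categories))
    encoded[col] = categories.cat.codes

print(encoded)
print(mappings)
```

Integer codes impose an arbitrary order on answers like 'Yes', 'No' and "Don't know"; one-hot encoding is a common alternative when that order could mislead a model.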
We can see that the target column, i.e 'treatment' has almost equal values for both the categories. This means that we do not have to perform undersampling or oversampling. Now let us make a heatmap and try to understand the correlation of various features with the target variable.
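Both checks can be sketched on an already encoded DataFrame as below. The values are made up and the column names are assumed.

```python
import pandas as pd

# Made-up, already encoded rows (1 = Yes, 0 = No).
encoded = pd.DataFrame({
    "family_history": [1, 1, 0, 0, 1, 0],
    "remote_work":    [0, 1, 0, 1, 1, 0],
    "treatment":      [1, 1, 0, 0, 1, 0],
})

# Class balance of the target: shares close to 0.5 mean no resampling is needed.
balance = encoded["treatment"].value_counts(normalize=True)
print(balance)

# Pearson correlation matrix, then the correlation of each feature with the target.
corr = encoded.corr()
target_corr = corr["treatment"].drop("treatment").sort_values(ascending=False)
print(target_corr)
```

The `corr` matrix is what a heatmap would display, for example with `sns.heatmap(corr, annot=True)` from the seaborn import at the top of the notebook.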
Evaluating Model