Report on Stress detection using Machine Learning, Thesis of Machine Learning

Its a full project report covering all important features and detailed about the project.

Typology: Thesis

2016/2017

Available from 10/18/2021

sidrah-kaleem
sidrah-kaleem 🇵🇰

3

(3)

10 documents

1 / 45

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
INDEX
S.NO CONTENTS PAGE NUMBER’s
1Abstract 8
2Chapter-1
Introduction 9
31.1-Purpose of the
Project
9
4Chapter-2
Literature Survey
10-13
52.1- Related Research 10
62.2-Existing System 11-12
72.3-Limitations of
the Existing System
13
8Chapter-3
Proposed System
14-15
9Chapter-4
Proposed Work
16-21
1 | P a g e
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d

Partial preview of the text

Download Report on Stress detection using Machine Learning and more Thesis Machine Learning in PDF only on Docsity!

INDEX

S.NO CONTENTS PAGE NUMBER’s

1 Abstract 8

Chapter-

Introduction

1.1-Purpose of the

Project

4 Chapter-

Literature Survey

5 2.1- Related Research 10

6 2.2-Existing System 11-

7 2.3-Limitations of

the Existing System

8 Chapter-

Proposed System

9 Chapter-

Proposed Work

10 4.1-Dataset 16

11 4.2-Data analyzing

and data mining

12 4.3-Visualizations 17-

13 Chapter-

Model Building

14 5.1-Bit of Theory 22

15 5.2-Bit of Math 22-

16 5.3-Data Cleaning 23

17 5.4-Converting

Categorical Values

18 5.5-Splitting the

Dataset

19 5.6-Accuracy 24

CHAPTER -

INTRODUCTION

DEFINATION:

Stress is a body’s method for reacting to a challenge.Human Stress can have an impact on a person’s mental and physical well-being. Stress can lead to a change in behaviour and in physiology. Many people suffer from stress in everyday life. Stress is related to human work in one way or other. Originates of stress have different sources such as time pressure while working in company,responsibility,economic problem or physical factors such as noise. Signs of stress are human fell tension,anxious,angry,frustrated or irritated by things over which he has no control. 1.1 PURPOSE OF THE PROJECT: Stress detection is an on-going research topic among both psychologists and engineers.Wearable sensors and bio signal technologies are developed for detecting the human stress. There are various bio signal processing technologies use for human stress detection such as Electroencephalography(EEG), Electrocardiography(ECG), Electromyography (EMG), Blood Pressure(BP) ,Blood Volume Pulses(BVP), Galvanic Skin Resistance(GSR), Respiration and Skin Temperature(ST) etc.. Also to measure the stress level human physiological features are used. Person physiological features changes when he/she response to stressful events. By considering various physiological features occur in human while he/she is in stress the estimation is stress are done by using cluster based analysis method. In proposed method Machine Learning technology which is trending now-a-days is used.Since Machine Learning leads to building of model for prediction of outcome which in turn for building a model it should require some data for training which in turn requires collection of dataset. Here in the proposed method we used the dataset which consist of 1600 records for training the model. The attributes in the dataset are upto 12 which include some common factors which are responsible for stress. These

attributes includes hours of work, family problems, insufficient income ,age,carrier options etc..This trained model is connected to a web application which is written using HTML and CSS. So the user can easily know whether the person need treatment or not..High level and average level stress values people should get treatment and low level stress values people no need of taking treatment according to the model we built.

CHAPTER-

LITERATURE SURVEY

A literature review is conducted over analysis of stress and detection of stress level which are done by previously by our senior citizens. By this we came to know that everyone who are worked on the detection of stress used the parameters provided by some electronic devices like ECG,EEG etc… Some of the previous methods are shown below :

1. 2.1-Related Research

  1. Previous work on stress measurement has been focused on the collection and analysis of physiological data and the identification of the correlation between perceived stress and multiple physiological features. Qianli Xu, Tin Lay Nwe, and Cuntai Guan(2015) : Proposes a novel Cluster-Based Analysis method to measure stress using physiological signals, which accounts for intersubject differences. This research uses the clustering process that assigns the subject into subgroups, So as to exploit the inherent homogeneity of subject’s stress response within the cluster. Thus the intersubject differences are automatically accommodated and the overall accuracy of stress evaluation is improved. Chee-Keong Alfred Lim and Wai Chong Chia(2015): Focuses on evaluating to what extent a single electrode EEG headset - NeuroSky Mindwave is able to classify brainwave in terms of subject’s stressor level. In this study they use the MATLAB environment for processing the EEG signals. By reducing the

**3. 2.2Existing System

  1. Existing system proposes a novel cluster based analysis method to measure perceived stress using physiological signals, which accounts for the inter-subject differences.** The system contains three stages Data Gathering, Data Preprocessing and Data Classification. Data is collected by using wireless sensors. Wireless sensor was used to collect a subject’s physiological signals. Data preproccesing discover the useful info. This is responsible for collecting raw data and converting it into info useful for decision making by users. It does the feature extraction on data means redundancy in the dataset is reduced by this process. In Data classification mode feature extracted data are used to classified subjects into different classes.

5. 2.3-Limitations of the Existing System

  1. There are various researches are going on human stress detection and estimation of stress. But everyone is worked and still working on the data that is giving by the ECG machines or some mechanical instruments. But we know that every individual is not capable or offrdable for those ECG testing which are not provided freely. And the values given by the signals are not understandable.. So it is necessary to bring a machine or application that asks for the user’s convinient data like understandable data based on that data it should predict the stress. So, in this paper we gonna elaborate about how to satisfy this limitations and come up with a application by which everyone can access through their smart phones and can detect the stress with simple data.

In the proposed approach we predict the stress and provide suggestion whether to take treatment or not by building a MLmodel using Naive Bayes algorithm..The model works with an accuracy of 85%.The following are the steps involved in building the model and deploying it.  Importing dataset  Important feature extraction using matplotlib library by knowing relation between parameters  Filling null values with mean or mode.  Building model using Naive Bayes Algorithm  Connecting model to python program and executing it through we application on local host sever.

8. Softwares Required :

  1. Any python IDE like jupyter notebook etc..
  2. Any python we application development environment using flask like Spyder.
  3. Any search engine for web application.

CHAPTER-

PROPOSED WORK

4.1 DATASET

Since we know that dataset is the main part for building any model that predicts an outcome. We collected the dataset from the internet which consist of parameters as which are the questions asked to the citizens in different countries. The parameters include Timestamp( Time the survey was submitted ) , Age( Respondent age ),Gender( Respondent gender), Country( Respondent country ),state( If you live in the United States, which state or territory do you live in? ), self employed( Are you self- employed? ),family history( Do you have a family history of mental illness? ),treatment( Have you sought treatment for a mental stress condition? ),work interfere( If you have a mental stress condition, do you feel that it interferes with your work? ),no employees( How many employees does your company or organization have? ),remote work( Do you work remotely (outside of an office) at least 50% of the time? ),tech company( Is your employer primarily a tech company/organization? ),benefits( Does your employer provide mental stress benefits? ),care options( Do you know the options for mental health care your employer provides? ),wellness program( Has your employer ever discussed mental stress as part of an employee wellness program? ),seek help( Does your employer provide resources to learn more about mental stress issues and how to seek help? ),anonymity( Is your anonymity protected if you choose to take advantage of mental stress or substance abuse treatment resources? ),leave (How easy is it for you to take medical leave for a mental health condition?), mental health consequence( Do you think that discussing a mental stress issue with your employer would have negative consequences? ),phys health consequence( Do you think that discussing a physical health stress with your employer would have negative consequences? ),coworkers( Would you be willing to discuss a mental stress issue with your coworkers? ),supervisor( Would you be willing to discuss a mental stress issue with your direct supervisor(s)? ) etc.. Like this there are about 27 columns for training our model. The labelled data in this or the one we predict from the dataset is whether he need treatment or not means treatment column is taken as labeled data..If the predicted outpur is yes them stress level according to the given data is lies between Average- HighIf it is no then stress level is low.

The Second graph explains about the gender of the people who answered to the question from the above graph it is clear that most of the people who answered are women with frequency of 991 and then men with frequency of 247 and last trans with the frequency of 19 and after the encoding categorical one’s into numerical the representations are : Female - Male - 1 Trans - Note : Users should give input data in the user interface in the above mentioned form only.

The below graph shows the relation between the question whether there work causes any pressure or stress to them or not and their respective frequency..Many of them answered that it causes sometimes and the least one’s is oftenly it causes and others like rarely and Never and Don’t Know. Here in the graph : Don’t Know- Never - 1 Often – 2 Rarely - 3 Sometimes – 4

Here ,the another one shows about the how many of these individuals have that stress from their family history like whether did anyone faced that stress problem in there family .From the graph it is clear that only few of them are facing it from there family and many of them are not..Here, No- Yes -

There are many more relations that can we easily understandable. For now we will continue with some other remaining features values what users should give in the user interface..  Care options : User values should be - No - 0 Not_Sure - 1 Yes – 2  Benefits : User values should be - Don’t know - 0 No - 1 Yes – 2  Anonymity : User values should be : Don’t Know - 0 No - 1 Yes – 2  Leave : User values should be : Don’t know - 0 Somewhat difficult - 1 Somewhat easy - 2 Very difficult - 3 Very easy – 4

treatment( Have you sought treatment for a mental stress) benefits( Does your employer provide mental stress benefits? ) anonymity( Is your anonymity protected if you choose to take advantage of mental stress or substance abuse treatment resources? ) care_options( Do you know the options for mental health care your employer provides? ) leave (How easy is it for you to take medical leave for a mental health condition?) There are the main variables to find a person is affected stress or not. 5.3 DATA CLEANING Before algorithm application we need to prepare data. We have dropped time stamp ,country, state and comments from data set because they are not important .we have checked whether the data contains null values or not.now we had checked dataset for knowing what type of datatypes are they. we have counted different types of answers given by public for gender question. we assigned default values for each datatype like default integer should be 0. we differntiated intfeatures and string features and cleaned NaN values.now we checked the dataset have any null values or not, there are no null values in dataset. Now we will find variety of answers given by public for gender without number of times. we will convert all gender values in lower case. After changing into lower case, we changed all different types on answers in male list to male, female list to female and trans list trans by iterating through each row.now we filled missing age with median. We have filled <18 and >120 age values with median. 5.4 CONVERTING CATEGORICAL VALUES Converting categorical age range into understandable type like 0-20 age is given by 0.21- 30 is given by 20.in this data there are only 0.014% of self employed so we have changed NaN to NOT self_employed. There are only 0.20% of self work_interface so we have changed NaN to don’t know. Now for each categorial column we replaced each category with the class. If there are 3 diff categories column is given a 0,1,2. Encoding of data is done.

5.5 SPLITTING THE DATASET We define X=feature colums= Age, Gender, family_history, benefits, care_options, anonymity, leave, work_interfere. And Y= treatement. Now we splitted X,Y into training and testing sets. 5 .6 ACCURACY After splitting those variables into train and test data, we will generate naïve bayes model on training et and perform prediction on test features. After model generation we can check accuracy using actual and predicted values.