Journal of Applied Technology and Innovation (e-ISSN: 2600-7304), vol. 5, no. 1, (2021)
Personality Prediction using Machine Learning
Classifiers
Xin Yee Chin
School of Computing
Asia Pacific University of Technology
and Innovation (APU)
Kuala Lumpur, Malaysia
Han Yang Lau
School of Computing
Asia Pacific University of Technology
and Innovation (APU)
Kuala Lumpur, Malaysia
Zhi Xin Chong
School of Computing
Asia Pacific University of Technology
and Innovation (APU)
Kuala Lumpur, Malaysia
Man Pan Chow
School of Computing
Asia Pacific University of Technology
and Innovation (APU)
Kuala Lumpur, Malaysia
Zailan Arabee Abdul Salam
School of Computing
Asia Pacific University of Technology
and Innovation (APU)
Kuala Lumpur, Malaysia
Abstract- Personality is a fundamental basis of human behaviour. At its most basic, personality comprises the patterns of thought, feeling and behaviour that make an individual unique. Personality directly or indirectly influences a person's interactions and preferences. This research uses different learning algorithms and data mining concepts to mine the data features and learn from their patterns. The aim of the experiment is to explore alternative algorithms by modifying an existing personality prediction source code, which uses the logistic regression algorithm, and to find whether the accuracy of the classification can be improved. The dataset used for training stores the five characteristics known as the Big Five: openness, neuroticism, conscientiousness, agreeableness and extraversion. An overview and comparison is then provided of the different measures taken to reduce the issues faced by researchers in this field. The classification methods implemented are Support Vector Machine, Ridge Algorithm, Naive Bayes, Logistic Regression and Voting Classifier. Testing results showed that Logistic Regression still outperformed the other methods.
Key words- machine learning; personality prediction; Big Five Personality; regression
I. INTRODUCTION
Personality refers to the characteristic patterns of an individual's feeling, behaving and thinking. It embraces a person's moods, opinions and attitudes, and it is expressed most clearly when the person interacts with others. Personality makes it possible to distinguish one person from another, for example in the workplace. Although there are many ways to explain what personality is, from a psychological perspective there are two main explanations. The first pertains to consistent differences between people; in this view, the study of personality focuses on classifying and identifying human psychological patterns. The second emphasises the qualities that make people alike and that distinguish psychological man from other species. Personality theorists are accordingly directed to research the regularities among people that usefully define the nature of man, and the other factors that influence the course of life. Understanding personality is important and useful. Personality gives an idea of how leading, influencing and communicating can take place in certain conditions. For example, people with traits such as agreeableness and extraversion are more likely to communicate, whereas people with traits such as high self-esteem are most likely to remain silent at the workplace.
Personality is therefore useful in many situations. In this experiment, machine learning is applied to judge and classify personality. In the previous source code, logistic regression was used to classify the Big Five personality traits: openness, neuroticism, conscientiousness, agreeableness and extraversion. Classifying these traits is useful in many ways; one reason is to check the suitability of an employee. An employee's personality is often tested in real time to determine which position he or she fits best.
In this research, different algorithms are added to explore the dataset further and to test whether higher accuracy can be achieved. The classification methods added to the original code are Support Vector Machine, Ridge Algorithm, Naïve Bayes and Voting Classifier, alongside Logistic Regression, which is the default algorithm in the source code.
A critical analysis was performed on similar projects and papers that used different methods. [1] used a multiclass Support Vector Machine (SVM) to classify personality from handwriting. The personality classes are Optimistic, Extrovert, Introvert, Sloppy and Energetic. A histogram of oriented gradients performs feature extraction on the handwriting data, and noise removal is performed using adaptive thresholding

followed by resizing to reorient the image. Multiclass SVM classification is then applied with a polynomial kernel that maps the feature space to a higher dimension, where a hyperplane is created that separates the handwriting features into different classes. Even with a limited dataset and feature extraction, it achieves 80% accuracy.

The study in [2] used the position of the user's iris, based on Eye Accessing Cues in Neuro-Linguistic Programming, to predict personality. The iris position is an indicator of the mind's internal representational system, that is, of which brain sections are currently active. A Support Vector Machine takes a rectangular crop of the eye with 9000 pixels as input. The visual, auditory and kinaesthetic (VAK) learning style is used because it best conveys the personality of the person. A total of 215 eye images (features) are pre-processed with an eye detection procedure called Cascade Object Detector, resized to a smaller size and classified with SVM. The results show that the Radial Basis Function kernel (standard Gaussian kernel) has the best accuracy at 84.9%, followed by the linear kernel at 83.7%, and that a train-test split of 75:25 gives the best accuracy at 84.9%, followed by 70:30 at 82.8%.

Allouch, Azaria and Azoulay [5] used a voting classifier to assist children with special needs in social encounters by recognising possibly insulting or harmful sentences. The dataset consists of interviews with parents of children with ASD, categorised into five categories. Audio is transcribed to text for classification by the voting classifier (an ensemble method), which applies a voting protocol and chooses the result that the majority of algorithms suggest. The algorithms used in the voting classifier include random forest, SVM, ridge classifier, extra trees, a Bayesian inference method, MLP and K-nearest neighbours. The voting classifier achieved 71.2% accuracy, ahead of random forest at 71% and an embedded convolutional neural network at 69.6% [5]. A combined set of neural networks did slightly better at 71.4% accuracy, at the cost of a significantly longer training period and a need for more training data.

It can be concluded that SVMs with polynomial and RBF kernels are accurate algorithms that have been used successfully in other projects. The voting classifier is a promising method that can achieve the highest accuracy, at the cost of longer training and more computational resources, while still being faster than typical convolutional neural networks. It is therefore recommended that these algorithms and methods be incorporated into this project to improve and iterate on the results.

II. MATERIALS AND METHODS
A. Hardware
For this experiment, the system was implemented on an Acer Swift 3 laptop. The technical specifications are listed in Table I.

TABLE I. SPECIFICATION OF ACER SWIFT 3

Specification | Description
Processor | 2.30GHz Hexa-core Intel Core i5 (8th Gen) processor
Storage | 512GB Solid State Drive (SSD)
Memory | 8GB of LPDDR4 onboard memory
Operating System | Windows 10

B. Programming Language
Python is used in this research because it is easy to understand, with an English-like syntax. It is also free and open source: the source code can be downloaded and modified because Python is distributed under an OSI-approved open-source licence. It can improve productivity, since it is an interpreted language in which developers can execute code line by line. Furthermore, almost any library can be imported, and about 200,000 packages are available.

C. Software
In this research, Anaconda Navigator is used to run the packages, as it allows conda packages, environments and channels to be managed. Data scientists often work with different versions of packages and environments, and Anaconda Navigator helps keep these versions separate. For instance, the navigator can be used to import libraries or packages without typing conda commands. The parameters of the source code can also be modified and run from the navigator. The dataset is described in Table II.

TABLE II. DATASET DESCRIPTION

S.NO | ATTRIBUTE | TYPE | RANGE
1 | Gender | nominal | Male/Female
2 | Age | numeric | 17-
3 | Openness | numeric | 1-
4 | Neuroticism | numeric | 1-
5 | Conscientiousness | numeric | 1-
6 | Agreeableness | numeric | 1-
7 | Extraversion | numeric | 1-

Class label description:
No. of class labels: 5
Type: Nominal
Values:
● Extraverted
● Serious
● Responsible
● Lively
● Dependable

D. Original system
The steps involved in the system are:
● Import all the libraries needed.
● Load the training dataset from the training dataset file.
● Pre-process the training dataset:
(a) Data transformation is performed by encoding the nominal data (Male and Female) as binary numbers (0 and 1).
(b) Of the columns in the dataset, the first seven are used as training inputs and the remaining column, the class label, is the target to be predicted.
● Train the Logistic Regression classifier on the training dataset.
● Load the test dataset from the test dataset file.
● Pre-process the test dataset.
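The pre-processing step of the original system can be sketched in plain Python. This is only an illustration under stated assumptions: the rows and their values are invented, the column order is assumed to follow Table II with the class label last, and the function name `preprocess` is hypothetical rather than taken from the original source code. The gender coding (1 for male, 0 for female) follows the paper.

```python
# Hypothetical rows in the assumed column order of Table II:
# gender, age, openness, neuroticism, conscientiousness,
# agreeableness, extraversion, class label.
# All values are invented for illustration only.
rows = [
    ["Male", 20, 7, 4, 6, 5, 7, "extraverted"],
    ["Female", 22, 5, 6, 7, 6, 3, "serious"],
    ["Female", 19, 4, 3, 8, 7, 5, "responsible"],
]

# Binary coding of the nominal gender attribute, as stated in the paper.
GENDER_CODES = {"Male": 1, "Female": 0}


def preprocess(rows):
    """Encode gender as 0/1 and split each row into 7 features and a label."""
    features, labels = [], []
    for row in rows:
        encoded = [GENDER_CODES[row[0]]] + list(row[1:7])
        features.append(encoded)
        labels.append(row[7])
    return features, labels


X, y = preprocess(rows)
print(X[0], y[0])
```

The resulting feature lists and labels are the shape of input that a classifier such as logistic regression would then be trained on.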

In addition, f(s_i) denotes the activation, sigmoid or softmax, that is applied to the scores before the cross-entropy (CE) loss is calculated.

B. Support Vector Machine
Support vector machines are supervised learning algorithms used mainly for classification and regression problems, where they are known as support vector classification (SVC) and support vector regression (SVR) respectively [6]. An SVM produces a hyperplane that splits the features into different domains in a higher-dimensional space. The kernel implemented here is the polynomial kernel. Figure 1 shows the formula of the polynomial kernel.

$$K(X_1, X_2) = (a + X_1^T X_2)^b \tag{6}$$

Here X_1 and X_2 are points in the input space, a is a constant and b sets the degree of the polynomial. Applying this formula corresponds to mapping the inputs into a higher-dimensional space Z, as in Fig. 2. For two-dimensional inputs X_a = (a_1, a_2) and X_b = (b_1, b_2), with the kernel constant set to 1 and the degree set to 2, the mapping is

$$Z_a = \phi(X_a) = (1, \sqrt{2}\,a_1, \sqrt{2}\,a_2, a_1^2, a_2^2, \sqrt{2}\,a_1 a_2)$$

$$Z_b = \phi(X_b) = (1, \sqrt{2}\,b_1, \sqrt{2}\,b_2, b_1^2, b_2^2, \sqrt{2}\,b_1 b_2)$$

The factors of √2 are the scaling under which the dot product of the mapped points equals the kernel exactly.

Solving the SVM requires the dot product between every pair of mapped data points. With the kernel trick, each of these dot products is obtained directly in the input space, simply by raising 1 + X_a^T X_b to the required power, without computing the mapping itself [6].

$$Z_a^T Z_b = K(X_a, X_b) = (1 + X_a^T X_b)^2$$

$$Z_a^T Z_b = 1 + 2 a_1 b_1 + 2 a_2 b_2 + a_1^2 b_1^2 + a_2^2 b_2^2 + 2 a_1 a_2 b_1 b_2$$
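The kernel trick can be checked numerically. The sketch below is an illustration rather than the paper's code: it assumes two-dimensional inputs, a kernel constant of 1 and degree 2, and a degree-2 feature map scaled with √2 factors so that the identity holds exactly. The function names and the two sample points are hypothetical.

```python
import math


def poly_kernel(x1, x2, a=1.0, b=2):
    """Polynomial kernel (a + x1 . x2) ** b evaluated in the input space."""
    dot = sum(p * q for p, q in zip(x1, x2))
    return (a + dot) ** b


def feature_map(x):
    """Explicit degree-2 map for a 2-D point, scaled with sqrt(2) so that
    feature_map(u) . feature_map(v) equals (1 + u . v) ** 2."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return (1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)


# Two invented sample points.
xa, xb = (1.0, 2.0), (3.0, -1.0)

# Dot product computed in the 6-dimensional mapped space.
explicit = sum(p * q for p, q in zip(feature_map(xa), feature_map(xb)))
# Same quantity computed directly from the kernel in the 2-D input space.
implicit = poly_kernel(xa, xb)

print(explicit, implicit)
```

Both values agree up to floating-point rounding, which is why the SVM never needs to build the higher-dimensional vectors explicitly.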

A two-dimensional relationship can be separated by a support vector classifier, while in a higher-dimensional relationship a hyperplane is needed to split the features into different domains. If there are m dimensions, the equation of the hyperplane is shown in Equation 11.

$$
\begin{aligned}
y &= w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + \dots \\
  &= w_0 + \sum_{i=1}^{m} w_i x_i \\
  &= w_0 + w^T X \\
  &= b + w^T X
\end{aligned}
\tag{11}
$$
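A hyperplane of the form in Equation 11 assigns a class by the sign of w^T x + b. The following minimal sketch shows that rule; the weights, bias and test points are hypothetical values chosen for illustration, and points lying exactly on the hyperplane are assigned to the +1 class by assumption.

```python
def decision_value(w, b, x):
    """Return b + w . x for weight vector w, bias b and input x."""
    return b + sum(wi * xi for wi, xi in zip(w, x))


def classify(w, b, x):
    """Assign +1 on or above the hyperplane and -1 below it."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical hyperplane x1 + x2 - 3 = 0 in two dimensions.
w, b = (1.0, 1.0), -3.0

print(classify(w, b, (2.0, 2.0)), classify(w, b, (0.5, 1.0)))
```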

Here w is the weight vector (w_1, w_2, w_3, ..., w_m), b is the bias term, which equals w_0, and X is the vector of variables. Thus w^T x + b >= 0 for points with d_i = +1 and w^T x + b < 0 for points with d_i = -1.

Fig. 3. Hyperplane

The input vectors that touch the margin in Fig. 3 are picked as the "tips" of the vectors, that is, the support vectors.

Fig. 4. Hyperplane with actual support vectors

Finally, the algorithm creates a hyperplane, or a line, that separates the data into classes, and this hyperplane is used to classify the test dataset. SVM is particularly effective in high-dimensional spaces.

C. Ridge Classification
Ridge classification is simply ridge regression with the response values converted into -1 and +1 and then treated as a normal regression task. It is a traditional machine learning algorithm that improves on least squares regression and is often used to handle multicollinearity. In the presence of multicollinear data, the regression coefficients have very large standard errors, which reduces the accuracy of the coefficient estimates. A small constant λ is added to the diagonal of X'X in the original least squares estimator,

$$\beta = (X'X)^{-1} X'Y$$

forming:

$$\beta^{ridge} = (X'X + \lambda I_p)^{-1} X'Y$$
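The effect of the added constant can be seen in a small closed-form sketch. It is an illustration under simplifying assumptions: exactly two features, no intercept, invented and nearly collinear data, and targets coded as -1 and +1 as in ridge classification. The function name `ridge_2d` is hypothetical.

```python
def ridge_2d(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^(-1) X'y for two features."""
    # Entries of the symmetric 2x2 matrix A = X'X + lam*I.
    a11 = sum(r[0] * r[0] for r in X) + lam
    a12 = sum(r[0] * r[1] for r in X)
    a22 = sum(r[1] * r[1] for r in X) + lam
    # Entries of the vector c = X'y.
    c1 = sum(r[0] * t for r, t in zip(X, y))
    c2 = sum(r[1] * t for r, t in zip(X, y))
    # Solve A * beta = c with the explicit 2x2 inverse.
    det = a11 * a22 - a12 * a12
    return ((a22 * c1 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det)


# Invented data: the two feature columns are almost identical.
X = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
y = [-1.0, -1.0, 1.0, 1.0]

for lam in (0.0, 1.0, 10.0):
    b1, b2 = ridge_2d(X, y, lam)
    print(lam, round(b1, 4), round(b2, 4), round(b1 * b1 + b2 * b2, 4))
```

With lam set to 0 this reduces to ordinary least squares, whose coefficients are large and of opposite sign because the columns are nearly collinear; increasing lam shrinks the squared norm of the coefficients.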

Ridge regression performs L2 regularization by penalizing the squared magnitude of the features' coefficients while reducing the error between actual and predicted observations. β^ridge is selected to minimize the penalized sum of squares:

$$\sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$

This is equivalent to minimization of

$$\sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2$$

subject to, for some c > 0,

$$\sum_{j=1}^{p} \beta_j^2 < c \tag{15}$$

Therefore, by placing a constraint, or penalty term, on the parameters, ridge regression minimizes the residual sum of squares while keeping the coefficients small. A constant is chosen as the penalty weight and multiplied by the squared norm of the coefficient vector; the larger the coefficients, the more the optimization function is penalized.

D. Naive Bayes
Naïve Bayes is a statistical classification algorithm based on Bayes' theorem. For example, a fruit may be considered an orange if it is orange in colour and 3 inches in diameter. Even though these features may depend on each other, each is treated as contributing independently to the probability that the fruit is an orange, which is why the algorithm is called "naïve" [3]. The Naïve Bayes algorithm is commonly applied to large amounts of data and suits various classification applications, for example text classification, recommender systems and spam filtering. It is fast and can provide high prediction accuracy. Several Naïve Bayes models have been created to perform different tasks based on suitability. The equation below shows the formula for calculating the probability that a document occurs in a class using multinomial Naïve Bayes classification.

$$P(c \mid d) \propto P(c) \times P(t_1 \mid c) \times P(t_2 \mid c) \times P(t_3 \mid c) \times \dots \times P(t_n \mid c)$$

To calculate the probability that a document occurs in a class, the prior probability and the probability of the nth word are needed [7]. Equation 16 shows the formula for the prior probability P(C). Equation 17 shows the formula for the probability of the nth word.

$$P(C) = \frac{N_c}{N} \tag{16}$$

$$P(t_n \mid C) = \frac{\mathrm{count}(t_n, c) + 1}{\mathrm{count}(c) + |V|} \tag{17}$$
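Equations 16 and 17 can be put together in a few lines of plain Python. This is a minimal sketch, not the paper's implementation: the three-document corpus, its tokens and its two class labels are invented, the helper names are hypothetical, and logarithms are summed instead of multiplying probabilities, which is a common way to avoid numerical underflow and is an addition to the formulas above.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (tokens, class label), invented for illustration.
docs = [
    (["happy", "party", "friends"], "extraversion"),
    (["party", "music", "friends"], "extraversion"),
    (["worry", "stress", "alone"], "neuroticism"),
]

vocab = {t for tokens, _ in docs for t in tokens}
class_docs = Counter(label for _, label in docs)
term_counts = {c: Counter() for c in class_docs}
for tokens, label in docs:
    term_counts[label].update(tokens)


def prior(c):
    """Equation 16: fraction of training documents that belong to class c."""
    return class_docs[c] / len(docs)


def likelihood(t, c):
    """Equation 17: add-one smoothed probability of term t in class c."""
    return (term_counts[c][t] + 1) / (sum(term_counts[c].values()) + len(vocab))


def predict(tokens):
    """Pick the class with the largest log prior plus summed log likelihoods."""
    scores = {
        c: math.log(prior(c)) + sum(math.log(likelihood(t, c)) for t in tokens)
        for c in class_docs
    }
    return max(scores, key=scores.get)


print(predict(["party", "friends"]), predict(["stress"]))
```

The added 1 in the numerator keeps a term that never occurred in a class from forcing the whole product to zero.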

These probabilities are specific to a particular category within a set of documents. Equation 18 shows the formula for calculating TF-IDF (term frequency-inverse document frequency), a statistical measure of how relevant a word is to its document [7].

$$\mathrm{tfidf}_t = f_{t,d} \times \log \frac{N}{df_t} \tag{18}$$
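As a worked illustration of Equation 18, the sketch below computes the weight for a few invented documents. The documents, the function name `tf_idf` and the choice of the natural logarithm are assumptions; the paper does not state the base of the logarithm.

```python
import math

# Hypothetical tokenised documents, invented for illustration.
documents = [
    ["party", "friends", "party"],
    ["friends", "work"],
    ["work", "stress", "work"],
]


def tf_idf(term, doc, documents):
    """Equation 18: term frequency in doc times log(N / document frequency)."""
    tf = doc.count(term)
    df = sum(1 for d in documents if term in d)
    if df == 0:
        return 0.0  # term never appears in the corpus, so it carries no weight
    return tf * math.log(len(documents) / df)


print(tf_idf("party", documents[0], documents))
print(tf_idf("friends", documents[0], documents))
```

The term "party" appears twice in the first document and in no other, so it receives a higher weight there than "friends", which is shared with a second document.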

The equation below shows the formula for calculating the conditional probability, known as the likelihood. It is the conditional probability of a word appearing in a document given that the document belongs to a class.

$$P(t_n \mid C) = \frac{W_{ct} + 1}{\sum_{W' \in V} W'_{ct} + B'} \tag{20}$$

E. Voting Classifier
Also known as an ensemble method, a voting classifier is a wrapper for a set of different models that are trained in parallel. It then predicts the output class based on the highest vote count or probability. Ensemble methods tend to produce lower error and less over-fitting. There are two basic types of voting classifiers:

Majority voting: in the simplest case, the output class is the class that receives the most votes from the individual models.

$$y = \mathrm{mode}\{ C_1(x), C_2(x), \dots, C_m(x) \}$$
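A minimal sketch of majority voting, together with the weighted soft-voting rule that this section turns to next, is given here in plain Python. The three classifiers' probability outputs are invented, the function names are hypothetical, and ties in the majority vote are broken by first occurrence, which is an assumption of this sketch.

```python
from collections import Counter


def majority_vote(predictions):
    """Hard voting: return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities, weights=None):
    """Soft voting: weighted average of per-class probabilities, then argmax.

    probabilities[j][label] is the probability classifier j gives to label.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    averaged = {
        label: sum(w * p[label] for w, p in zip(weights, probabilities)) / total
        for label in probabilities[0]
    }
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities from three classifiers for one sample.
soft = [
    {"serious": 0.40, "lively": 0.60},
    {"serious": 0.45, "lively": 0.55},
    {"serious": 0.90, "lively": 0.10},
]
# Each classifier's hard prediction is its most probable class.
hard = [max(p, key=p.get) for p in soft]

print(majority_vote(hard), soft_vote(soft))
```

In this invented case the two rules disagree: two of the three classifiers lean weakly towards "lively", while the averaged probabilities favour "serious" because the third classifier is far more confident.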

Soft voting: the probability vectors for the predicted classes from all the classifiers are summed and averaged, and the class with the highest result is the output class.

$$y = \arg\max_{i} \sum_{j=1}^{m} w_j p_{ij} \tag{22}$$

where w_j is the weight that can be assigned to the jth classifier and p_{ij} is the probability that the jth classifier assigns to class i. Soft voting is recommended only if each classifier is well calibrated.

A voting classifier makes the most of the different algorithms and, if done right, should yield better performance than any single model. It is important that the set of classifiers is diverse so that errors do not aggregate.

IV. DISCUSSION ON IMPLEMENTATION
After the pre-processing of data is done, five types of algorithms are implemented. Nominal data were encoded into binary, where 1 represents male and 0 represents female. Hence, there are 7 types of prediction models were

gains come from so-called unstable models such as decision trees, where each observation usually has an impact on the decision boundary. More stable models such as SVMs do not gain as much, because resampling usually does not affect the support vectors much, though it tends to improve performance on average.

VI. CONCLUSION
In this research, five machine learning classifiers were studied for classifying human personality. The main purpose was to study new algorithms and to improve on the accuracy of the original algorithm, Logistic Regression. The new algorithms added are Support Vector Machine, Naive Bayes Classifier, Ridge Classifier and Voting Classifier. Precision, recall, accuracy and F1 score are used to measure the performance of all the classifiers. The overall results show that Logistic Regression still outperformed all the new algorithms. Moreover, personality prediction is an abstract model, meaning that it describes phenomena, whereas a concrete model has a direct analogue result in machine learning. Abstract models therefore cannot reach very high accuracy compared with concrete models.

REFERENCES
[1] A. Chitlangia, G. Malathi, "Handwriting Analysis based on Histogram of Oriented Gradient for Predicting Personality traits using SVM," Procedia Computer Science, 165, pp. 384-390, 2019.
[2] S. Ramli, S. Nordin, "Personality Prediction Based on Iris Position Classification Using Support Vector Machines," Indonesian Journal of Electrical Engineering and Computer Science, 9(3), pp. 667-672,
[3] A. P. Wibawa, A. C. Kurniawan, D. M. P. Murti, R. P. Adiperkasa, S. M. Putra, S. A. Kurniawan, Y. R. Nugraha, "Naïve Bayes Classifier for Journal Quartile Classification," International Journal of Recent Contributions from Engineering, Science & IT (iJES), 7(2), pp. 91-99,
[4] Y. Ren et al., "Robust softmax regression for multi-class classification with self-paced learning," Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2641-2647, 2017.
[5] M. Allouch, A. Azaria, R. Azoulay, "Detecting Sentences that May be Harmful to Children with Special Needs," 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1209-1213, 2019.
[6] N. Guenther, M. Schonlau, "Support vector machines," The Stata Journal, 16(4), pp. 917-937, 2016.
[7] Y. Artissa, I. Asror, S. A. Faraby, "Personality Classification based on Facebook status text using Multinomial Naïve Bayes method," Journal of Physics: Conference Series, vol. 1192, no. 1, p. 012003, IOP Publishing, 2019.