







Xin Yee Chin, School of Computing, Asia Pacific University of Technology and Innovation (APU), Kuala Lumpur, Malaysia, [email protected]
Han Yang Lau, School of Computing, Asia Pacific University of Technology and Innovation (APU), Kuala Lumpur, Malaysia, [email protected]
Zhi Xin Chong, School of Computing, Asia Pacific University of Technology and Innovation (APU), Kuala Lumpur, Malaysia, [email protected]
Man Pan Chow, School of Computing, Asia Pacific University of Technology and Innovation (APU), Kuala Lumpur, Malaysia, [email protected]
Zailan Arabee Abdul Salam, School of Computing, Asia Pacific University of Technology and Innovation (APU), Kuala Lumpur, Malaysia, [email protected]

Abstract: Personality is a fundamental basis of human behaviour. At its most basic, personality comprises the patterns of thought, feeling and behaviour that make an individual unique. Personality directly or indirectly influences a person's interactions and preferences. This research uses different learning algorithms and data mining concepts to mine the data features and learn from their patterns. The aim of this experiment is to explore alternative algorithms by modifying personality prediction source code that uses the logistic regression algorithm, and to find whether the accuracy of the classification can be improved. The dataset used for training stores five characteristics of different people, known as the Big Five: openness, neuroticism, conscientiousness, agreeableness and extraversion. An overview and comparison is then provided of the different measures taken to reduce the issues faced by researchers in this field. The classification methods implemented are Support Vector Machine, Ridge Algorithm, Naive Bayes, Logistic Regression and Voting Classifier. Testing results showed that Logistic Regression still outperformed the other methods.
Keywords: machine learning; personality prediction; Big Five personality; regression

I. INTRODUCTION

Personality describes the characteristic patterns of an individual's feeling, behaving and thinking. It embraces a person's moods, opinions and attitudes, and it is expressed most clearly when interacting with other people. Personality makes it possible to distinguish one person from another, which can be observed in settings such as the workplace. Although there are many ways to explain what personality is, from the psychological perspective there are two main explanations. The first pertains to the consistent differences between people; in this view, the study of personality focuses on classifying and identifying human psychological patterns. The second emphasises the qualities that make people alike and that distinguish psychological man from other species. Personality theorists are then directed to study the regularities among people that define the nature of man, and the other factors that influence the course of life.

Understanding personality is important and useful. Personality gives an idea of how leadership, influence and communication can take place under certain conditions. For example, personality traits such as agreeableness and extraversion are likely to improve the chance of communication, whereas people with traits such as high self-esteem are more likely to remain silent at the workplace. Personality is therefore useful in many situations.

In this experiment, machine learning is applied to judge and classify personality. In the original source code, logistic regression was used to classify the Big Five personality traits: openness, neuroticism, conscientiousness, agreeableness and extraversion.
Classifying these personality traits is useful in many ways. One reason is to check the suitability of an employee: an employee's personality is often tested in real time to determine which position he or she fits best. In this research, different algorithms are added to explore the dataset further and to test whether higher accuracy can be achieved. The classification methods used are Support Vector Machine, Ridge Algorithm, Naïve Bayes, Logistic Regression and Voting Classifier, with Logistic Regression being the default algorithm in the source code.

A critical analysis was performed on similar projects and papers that used different methods. Chitlangia and Malathi [1] used a multiclass Support Vector Machine (SVM) to classify personality from handwriting into five classes: Optimistic, Extrovert, Introvert, Sloppy and Energetic. A histogram of oriented gradients performs feature extraction on the handwriting data, and noise is removed using adaptive thresholding followed by resizing to reorient the image. Multiclass SVM classification is then applied with a polynomial kernel to map the feature space to a higher dimension, where a hyperplane is created that separates the handwriting features into different classes. Even with a limited dataset and feature extraction, it achieves 80% accuracy.

Ramli and Nordin [2] used the position of the user's iris, based on Eye Accessing Cues in Neuro-Linguistic Programming, to predict personality. The iris position indicates the mind's internal representational system, that is, which sections of the brain are currently active. An SVM takes a rectangular crop of the eye of 9000 pixels as input. The visual, auditory and kinaesthetic (VAK) learning style is used because it best conveys the personality of the person. 215 images of eyes (features) are pre-processed with an eye detection procedure called Cascade Object Detector, resized to a smaller size and classified with the SVM. The results show that the Radial Basis Function kernel (standard Gaussian kernel) has the best accuracy at 84.9%, followed by the linear kernel at 83.7%, and that a train-test split of 75:25 gives the best accuracy at 84.9%, followed by 70:30 at 82.8%.

Allouch, Azaria and Azoulay [5] used a voting classifier to assist children with special needs in social encounters by recognising possibly insulting or harmful sentences. The dataset consists of interviews with parents of ASD children, sorted into five categories. Audio is transcribed to text for classification by the voting classifier (an ensemble method), which performs a voting protocol and chooses the result that the majority of algorithms suggest. The algorithms used in the voting classifier included random forest, SVM, ridge classifier, extra trees, a Bayesian inference method, MLP and K-nearest neighbours. The voting classifier achieved the best accuracy at 71.2%, followed by random forest at 71% and an Embedded Convolutional Neural Network at 69.6% [5].
Using a combined set of neural networks achieved even better results at 71.4% accuracy, the downside being a significantly longer training period and a need for more training data.

It can be concluded that SVMs with polynomial and RBF kernels are accurate algorithms that have been used successfully in other projects. The voting classifier is a promising method that can achieve the highest accuracy at the cost of longer training and more computational resources, while still being faster than typical convolutional neural networks. It is therefore recommended that these algorithms and methods be incorporated into our project to improve and iterate on our results.

II. MATERIALS AND METHODS

A. Hardware

For this experiment, the system was implemented on an Acer Swift 3 laptop. The technical specifications are listed in Table I.
TABLE I. TECHNICAL SPECIFICATIONS

Specification      Description
Processor          2.30GHz Hexa-core Intel Core i5 (8th Gen) processor
Storage            512GB Solid State Drive (SSD) capacity
Memory             8GB of LPDDR4 onboard memory
Operating System   Windows 10

B. Programming Language

Python is used in this research because it is easy to understand, with an English-like syntax. It is also free and open source: the source code can be downloaded and modified because Python is distributed under an OSI-approved open-source license. It can improve productivity because developers can execute the code line by line. Furthermore, a library can be imported for almost any task, with about 200,000 packages available.

C. Software

In this research, Anaconda Navigator is used to run the packages, as it allows us to manage conda packages, environments and channels. Data scientists often use different versions of packages and environments, and Anaconda Navigator helps keep these versions separate. For instance, we can use the Navigator to import libraries or packages without typing conda commands. We can also modify the parameters of the source code and run it in the Navigator, as shown in Table II.
Class label description:
No. of class labels: 5
Type: Nominal
Values:
● Extraverted
● Serious
● Responsible
● Lively
● Dependable

D. Original system

The steps involved in the system are:
1. Import all the libraries needed.
2. Load the train dataset from the train dataset file.
3. Pre-process the train dataset:
   (a) Data transformation is performed by encoding the nominal data (Male and Female) into binary numbers (0 and 1).
   (b) Among the columns of the dataset, the first 7 are used for training and the remaining one for testing.
4. Train the Logistic Regression classifier using the train dataset.
5. Load the test dataset from the test dataset file.
6. Pre-process the test dataset.

S.NO   ATTRIBUTE           TYPE      RANGE
1      Gender              nominal   Male/Female
2      Age                 numeric   17-
3      Openness            numeric   1-
4      Neuroticism         numeric   1-
5      Conscientiousness   numeric   1-
6      Agreeableness       numeric   1-
7      Extraversion        numeric   1-
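The pre-processing steps of the original system can be sketched roughly as follows. This is an illustrative sketch, not the original source code: the column names, the sample rows and the helper `preprocess` are assumptions made for the example, and only the encoding (1 for male, 0 for female) and the 7-plus-1 column layout come from the text. Training is omitted because the text does not name the library used.

```python
import csv
import io

# Hypothetical sample in the layout described above: 7 input columns
# followed by the class label. The values are made up for illustration.
SAMPLE_CSV = """Gender,Age,openness,neuroticism,conscientiousness,agreeableness,extraversion,Personality
Male,17,7,4,7,3,2,extraverted
Female,19,4,5,4,6,6,serious
Female,18,7,6,4,5,5,dependable
"""

# Binary encoding of the nominal Gender attribute (1 = male, 0 = female).
GENDER_CODES = {"Male": 1, "Female": 0}


def preprocess(csv_text):
    """Encode Gender as 0/1 and split each row into 7 features and 1 label."""
    features, labels = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if row[0] == "Gender":  # skip the header row
            continue
        encoded = [GENDER_CODES[row[0]]] + [int(value) for value in row[1:7]]
        features.append(encoded)
        labels.append(row[7])
    return features, labels


X, y = preprocess(SAMPLE_CSV)
print(X[0], y[0])
```

The same function would be applied to the test file, so that both datasets share one encoding before the classifier is trained and evaluated.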
Besides that, $f(s_i)$ refers to the activation, which is applied to the scores when the sigmoid or softmax is calculated, before the cross-entropy (CE) loss is computed.

B. Support Vector Machine

Support vector machines are supervised learning algorithms mainly used for classification and regression problems, where they are known as support vector classification (SVC) and support vector regression (SVR) respectively [6]. An SVM produces a hyperplane that splits the features into different domains in a higher-dimensional space. The kernel implemented here is the polynomial kernel. Figure 1 shows the formula of the polynomial kernel:

$$K(X_1, X_2) = (a + X_1^T X_2)^b \tag{6}$$

$X_1$ and $X_2$ are vectors in the input space, $a$ is a constant and $b$ sets the degree of the polynomial. After applying this formula, $X_1$ and $X_2$ are mapped into a higher dimension $Z$ that may look like Fig. 2.
To solve the SVM directly, we would have to map every data point into the higher dimension and compute the dot product of each pair of mapped points. With the kernel trick, the same dot product is obtained simply by raising the dot product in the input space to the required power [6].
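The kernel trick can be checked numerically with a small sketch. It assumes the simplest case of the polynomial kernel, with $a = 0$, $b = 2$ and two-dimensional inputs, for which the explicit feature map is $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$; the function names and sample points are chosen for illustration only.

```python
import math


def poly_kernel(x, y, a=0.0, b=2):
    """Polynomial kernel K(x, y) = (a + x.y)^b, computed in the input space."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (a + dot) ** b


def phi(x):
    """Explicit degree-2 feature map for a 2-D input when a = 0."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, y = (1.0, 2.0), (3.0, 4.0)

# Dot product computed after mapping both points into the higher dimension.
explicit = sum(p * q for p, q in zip(phi(x), phi(y)))

# Same quantity computed without ever building the mapped vectors.
kernel = poly_kernel(x, y)

print(explicit, kernel)
```

Both values agree up to floating-point rounding, which is why the SVM never needs to construct the higher-dimensional vectors explicitly.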
In two dimensions, a line is enough to act as a support vector classifier, while in higher dimensions a hyperplane is needed to split the features into different domains. If there are $m$ dimensions, the equation of the hyperplane is shown in Equation 11.

$$y = w_0 + \sum_{i=1}^{m} w_i x_i = w_0 + w^T x = b + w^T x \tag{11}$$
$w$ is the vector $(w_1, w_2, w_3, \dots, w_m)$, $b$ is the bias term, which is $w_0$, and $x$ is the vector of variables. Thus we can conclude that $w^T x + b \ge 0$ for points with $d_i = +1$ and $w^T x + b < 0$ for points with $d_i = -1$.

Fig. 3. Hyperplane

The input vectors that touch the margin in Fig. 3 are picked as the "tips" of the vectors, known as the support vectors.

Fig. 4. Hyperplane with actual support vectors

Finally, the algorithm creates a hyperplane (or a line) that separates the data into classes, and this hyperplane is used to classify the test dataset. SVM is particularly effective in high-dimensional spaces.

C. Ridge Classification

Ridge classification is simply ridge regression with the response values converted into -1 and +1 and then treated as a normal regression task. It is a traditional machine learning algorithm that improves on least squares regression and is often used to handle multicollinearity. In the presence of multicollinear data, the regression coefficients have very large standard errors, which reduces the accuracy of the estimated coefficients. A small constant value $\lambda$ is added to the original least squares estimator:

$$\hat{\beta}^{ridge} = (X^T X + \lambda I)^{-1} X^T y$$
Ridge regression performs L2 regularization by penalizing the squared magnitude of the features' coefficients to reduce the error between actual and predicted observations. $\hat{\beta}^{ridge}$ is selected to minimize the penalized sum of squares:

$$\sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$

This is equivalent to minimization of

$$\sum_{i=1}^{n} \Big( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2$$

subject to, for some $C > 0$,

$$\sum_{j=1}^{p} \beta_j^2 < C$$
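The effect of the penalty can be illustrated with a minimal sketch. It assumes a single feature and no intercept, in which case setting the derivative of the penalized sum of squares to zero gives $\beta = \sum_i x_i y_i / (\sum_i x_i^2 + \lambda)$. The data values and the function name `ridge_coefficient` are made up for the example.

```python
def ridge_coefficient(xs, ys, lam):
    """Minimizer of sum((y - x * beta)^2) + lam * beta^2 for one feature."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Made-up data that roughly follows y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# lam = 0 reproduces ordinary least squares; larger values shrink the coefficient.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, round(ridge_coefficient(xs, ys, lam), 4))
```

As the penalty constant grows, the fitted coefficient moves towards zero, which is the shrinkage behaviour that the constraint on the squared coefficients expresses.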
Therefore, by placing a constraint or penalty term on the parameters, ridge regression minimizes the residual sum of squares while keeping the coefficients small. A constant is chosen as the penalty term, which is multiplied by the squared norm of the coefficient vector. The larger the coefficients, the more the optimization function is penalized.

D. Naive Bayes

Naïve Bayes is a statistical algorithm used for classification, based on Bayes' theorem. For example, a fruit may be considered an orange if it is orange in colour and 3 inches in diameter. Even if these features depend on each other, each property is treated as contributing independently to the probability that the fruit is an orange, which is why the method is known as "Naïve" [3]. The Naïve Bayes algorithm is commonly applied to large amounts of data and suits a variety of classification applications, such as text classification, recommender systems and spam filtering. It is fast and can provide high prediction accuracy. Several Naïve Bayes models have been created to perform different tasks based on suitability. The equation below shows the formula for calculating the probability that a document occurs in a class using multinomial Naïve Bayes classification.
To calculate the probability that a document occurs in a class, the prior probability and the probability of the nth word are needed [7]. Equation 16 shows the formula for the prior probability P(C). Equation 17 shows the formula for the probability of the nth word.
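A minimal sketch of these two quantities is given here, assuming the standard multinomial Naïve Bayes estimates: the class prior is the fraction of training documents in the class, and word probabilities use Laplace (add-one) smoothing. The paper's own Equations 16 and 17 may differ in detail. The toy documents, labels and function names are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """docs: list of (words, label). Returns priors, per-class word counts, vocabulary."""
    class_docs = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    priors = {c: n / len(docs) for c, n in class_docs.items()}
    return priors, word_counts, vocab


def word_prob(word, label, word_counts, vocab):
    """Laplace-smoothed estimate of P(word | label)."""
    total = sum(word_counts[label].values())
    return (word_counts[label][word] + 1) / (total + len(vocab))


def predict(words, priors, word_counts, vocab):
    """Return the class maximizing log P(c) + sum of log P(w_n | c)."""
    scores = {}
    for c in priors:
        scores[c] = math.log(priors[c]) + sum(
            math.log(word_prob(w, c, word_counts, vocab)) for w in words
        )
    return max(scores, key=scores.get)


docs = [
    ("love meeting new people".split(), "extraverted"),
    ("party with friends tonight".split(), "extraverted"),
    ("finished my work on time".split(), "responsible"),
    ("planning my work schedule".split(), "responsible"),
]
priors, word_counts, vocab = train(docs)
print(priors)
print(predict("work on my schedule".split(), priors, word_counts, vocab))
```

Working with log probabilities avoids numerical underflow when many small word probabilities are multiplied together.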
These probabilities are specific to a particular category within a set of documents. Equation 18 shows the formula for calculating TF-IDF (term frequency-inverse document frequency), a statistical measure of how relevant a word is to its document [7].
The equation above shows the formula for calculating the conditional probability, known as the likelihood: the probability of a word appearing in a document given that the document belongs to a class.
$$P(w_n \mid c) = \frac{W_{ct} + 1}{\sum_{W' \in V} W'_{ct} + B'} \tag{20}$$

E. Voting Classifier

Also known as an ensemble method, a voting classifier is a wrapper for a set of different models that are trained in parallel. It then predicts the output class based on the highest probability. Ensemble methods tend to produce lower error and less over-fitting. There are 2 basic types of voting classifiers:
Majority voting: the simplest case, in which the output class is the one that receives the majority of the votes predicted by the individual models. With weights, the predicted class is

$$\hat{y} = \arg\max_{i} \sum_{j=1}^{m} w_j \, \mathbb{1}\big[C_j(x) = i\big]$$

where $w_j$ is the weight that can be assigned to the $j$th classifier, $C_j(x)$ is the class predicted by the $j$th classifier and $\mathbb{1}[\cdot]$ is the indicator function.

Soft voting: the probability vectors for each predicted class from all the classifiers are summed up and averaged. Soft voting is recommended only if each classifier is well calibrated.

A voting classifier makes the most of the different algorithms and, if done right, should yield better performance than any single model. It is important that the set of classifiers is diverse so that errors do not aggregate.

IV. DISCUSSION ON IMPLEMENTATION

After the pre-processing of data is done, five types of algorithms are implemented. Nominal data were encoded into binary, where 1 represents male and 0 represents female. Hence, there are 7 types of prediction models were
gains come from so-called unstable models such as decision trees, where each observation usually has an impact on the decision boundary. More stable ones like SVMs do not gain as much because resampling usually does not affect the support vectors much, though it tends to improve performance on average.

VI. CONCLUSION

In this research, we have studied 5 machine learning classifiers to classify human personality. The main purpose of this research is to study new algorithms and to improve on the accuracy of the original algorithm, which is Logistic Regression. The new algorithms added are Support Vector Machine, Naive Bayes Classifier, Ridge Classifier and Voting Classifier. Precision, recall, accuracy and F1 score are used to measure the performance of all the classifiers. The overall results show that Logistic Regression still outperformed all the new algorithms added. Moreover, personality prediction is an abstract model, meaning that it describes a phenomenon, whereas a concrete model has a direct analogue result in machine learning. Therefore, abstract models cannot obtain as high an accuracy as concrete models.

REFERENCES

[1] A. Chitlangia, G. Malathi, "Handwriting Analysis based on Histogram of Oriented Gradient for Predicting Personality traits using SVM," Procedia Computer Science 165, pp. 384-390, 2019.
[2] S. Ramli, S. Nordin, "Personality Prediction Based on Iris Position Classification Using Support Vector Machines," Indonesian Journal of Electrical Engineering and Computer Science, 9(3), pp. 667-672,
[3] A. P. Wibawa, A. C. Kurniawan, D. M. P. Murti, R. P. Adiperkasa, S. M. Putra, S. A. Kurniawan, Y. R. Nugraha, "Naïve Bayes Classifier for Journal Quartile Classification," International Journal of Recent Contributions from Engineering, Science & IT (iJES), 7(2), pp. 91-99,
[4] Y. Ren et al., "Robust softmax regression for multi-class classification with self-paced learning," Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2641-2647, 2017.
[5] M. Allouch, A. Azaria, R. Azoulay, "Detecting Sentences that May be Harmful to Children with Special Needs," 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1209-1213, 2019.
[6] N. Guenther, M. Schonlau, "Support vector machines," The Stata Journal, 16(4), pp. 917-937, 2016.
[7] Y. Artissa, I. Asror, S. A. Faraby, "Personality Classification based on Facebook status text using Multinomial Naïve Bayes method," Journal of Physics: Conference Series, vol. 1192, no. 1, p. 012003, IOP Publishing, 2019.