







This document covers the concept of social context in sentiment analysis, its importance, and how it can be applied to improve sentiment analysis. The authors discuss the definition of social context, its elements, and its relationship with sentiment analysis. They also provide an overview of current trends and a framework for comparing sentiment analysis approaches that use social context. The document gives examples of social context in sentiment analysis tasks on Facebook and discusses the types of users, content, relations, and interactions found in online social networks.








Keywords: Sentiment analysis; Social context; Social network analysis; Online social networks
Sentiment analysis in social media is harder than in other types of text due to limitations such as abbreviations, jargon, and references to existing content or concepts. Nevertheless, social media provides more information beyond text, such as linked media, user reactions, and relations between users. We refer to this information as social context. Recent works have successfully leveraged the fusion of text with social context for sentiment analysis tasks. However, these works are usually limited to specific aspects of social context, and there have not been any attempts to analyze and apply social context systematically. This work aims to bridge this gap by providing three main contributions: 1) a formal definition of social context; 2) a framework for classifying and comparing approaches that use social context; 3) a review of existing works based on the defined framework.
1. Introduction
Recent years have witnessed the rise of social media. Platforms such as Twitter or Facebook have become the de facto way to share thoughts and opinions with a wide audience [41]. Studies of Twitter usage show that about 19% of tweets contain a reference to a brand or product, 20% of which also show some expression of brand sentiment [39]. As a consequence, companies and researchers have grown interested in social media as a way to monitor public opinion. The sheer amount of social media content makes it impractical or impossible to manually process it. Hence, automatic sentiment analysis has grown very popular.
Sentiment analysis has been applied for many years in other types of opinionated content, such as online reviews or news articles. However, social media content poses several unique challenges to natural language processing in general, and to sentiment analysis in particular [64]. Some of these challenges are imposed by the very nature of social media platforms, such as limited length and relying on associated media. Other difficulties are caused by the characteristics of human interaction in these types of media, e.g., short attention span, need for immediacy, and use of specialized language. The result is a type of text that is short, full of jargon or abbreviations, ephemeral, and rife with references to contextual information.
There are different approaches to sentiment analysis in social media [3,14,71]. Most techniques are content-centric. They exploit specific linguistic characteristics of social media, just like previous research has done for other media (e.g., news articles) and domains (e.g., movie reviews). Some works try to overcome abbreviations and short texts in social media by finding external sources to link text to, such as news articles [32] or Wikipedia pages [29]. Other works leverage the specific language in these media by finding cues for sentiment (e.g., smileys and hashtags) [21]. When the textual content is also accompanied by multimedia, such as images or videos, the sentiment information in these media obtained with multimodal analysis [69] may also be exploited.
Nevertheless, these approaches fail to use the fact that information shared on social networks is not isolated. The meaning of a particular piece of content (e.g., a Tweet, a Facebook status or a blog post) may only be understood when its context is taken into consideration. This context includes visible information such as previous content that belongs to the same conversation, previous interactions between users, or people that interacted with the content (e.g., by liking it). It also includes seemingly unrelated social features. For instance, some demographic factors such as age and gender have been shown to correlate with sentiment and vocabulary [89], and they have been used to improve sentiment classification [37].
New sentiment analysis techniques are starting to incorporate the fusion of information from text and social context. Social context has also been introduced in other fields related to sentiment analysis, such as spam detection, where clues to identify spammers are usually hidden in multiple aspects of context, such as previous content, behavior, relationship, and interaction [15]. Unfortunately, the definition of social features, the methods employed to extract them, and how they are applied to sentiment analysis tasks vary greatly from work to work. These differences in notation and approach make it hard to compare different works.
Thus, further research is needed to delve more deeply into the notion of social context and the fusion of social context with traditional textual sentiment analysis. This work seeks to answer the following questions:
As a result, the contributions herein are threefold. First, this work proposes a formal and general definition of social context. Secondly, a framework to compare existing works in the field is proposed. In this framework, each work is described using a multi-level taxonomy that classifies each approach in terms of the proposed definition of social context, and other factors such as the machine learning techniques applied. Thirdly, the state of the art in sentiment analysis using social context is organized and compared using the defined framework. Moreover, the results reported by each work in the analysis have been aggregated and analyzed, to simplify the comparison of approaches.
The remainder of this paper is structured as follows. Section 2 presents an overview of the state of the art in sentiment analysis prior to social context, and an introduction to social network analysis; Section 3 introduces a formal definition of social context; Section 4 presents the framework for comparison of approaches to sentiment analysis using social context; Section 5 provides an overview of the state of the art, using the framework presented in the previous section; lastly, Section 6 discusses the main conclusions drawn from this work and future lines of research.
2. Related work
This section is an overview of relevant work in the fields of sentiment analysis and social network analysis. Each field is discussed in a separate section. The first discusses different approaches in sentiment analysis, including deep learning and ensemble techniques. The second introduces Social Network Analysis (SNA), and it focuses on community detection due to its importance in several of the works reviewed.
2.1. Sentiment analysis
Although sentiment analysis has been an active research topic for decades, it has grown in popularity with the advent of online opinion-rich resources [64]. In turn, these resources have also added their own set of limitations and challenges. Over the last two decades, numerous works have explored sentiment analysis in different applications and using different approaches. These approaches can be grouped into machine learning, lexicon based, and hybrid [71]. Of the three, machine learning techniques and hybrid approaches seem to be dominant [3,65,90], and lexicon techniques are typically incorporated into machine learning approaches to improve their results. Machine learning approaches apply a predictor (a classifier, or an estimator) on a set of features that represent the input. The set of predictors is not very different from those used in other areas. Instead, the complexity in these approaches lies in extracting complex features from the text, filtering only relevant features, and selecting a good predictor [78].
One of the most straightforward features is the Bag Of Words (BOW) model. In BOW, each document is represented by the multiset (bag) of its constituent words. Word order is disrupted, and syntactic structures are broken. As a result, a great deal of information from natural language is lost [94]. Therefore, various types of features have been exploited, such as higher order n-grams [63]. A more sophisticated feature is Part of Speech (POS) tagging [30]. In it, a syntactic analysis process is run, and each word is labeled (tagged) with its syntactic function (e.g., noun). Additionally, syntactic trees can be calculated. Using these trees, the words in the input can be rearranged to a more convenient position while still conveying the same meaning. Note how these two types of features only rely on lexical and syntactical information. For this reason, they are sometimes referred to as surface forms.
Surface forms can also be combined with other prior information, such as word sentiment polarity [11,28,44,54,57]. This prior knowledge usually takes the form of sentiment lexicons, i.e., dictionaries that associate words in a domain or language with a sentiment. Some lexicons also include non-words such as emoticons [36,40] and emoji [60]. These alternative forms of writing have been shown to be very useful, as they can dominate textual cues and form a good proxy for text polarity [36].
The use of lexicon-based techniques has many advantages [82], most of which stem from their combination with other methods. For instance, it is possible to generate lexicons that are domain dependent or that incorporate language-dependent characteristics. Lexicons and syntactic information can also be combined with linguistic context to shift valence [68]. On the other hand, there are several disadvantages to lexicon approaches. First, creating lexicons is an arduous task, as it needs to be consistent and reliable [82]. It also needs to account for valence variability across domains, contexts, and languages. These dependencies make it hard to maintain domain-independent lexicons. An alternative to retain independence while encoding domain, language, and context variability is through semantic representation of the lexical resources in the form of ontologies. An ontology can encode both lexical [52] and affective [81] nuances, both in the lexicons and in the automatic annotations [9]. This is especially useful for aspect-based sentiment analysis, as the differences between aspects can be incorporated into the ontology [91].
In recent years, new approaches based on deep learning have shown excellent performance in Sentiment Analysis [5,19].
In contrast with traditional techniques, deep learning techniques learn complex features from data with minimum human interaction. These algorithms do not need to be passed manually crafted features: they automatically learn new complex features. The downside is that the quality of the features heavily depends on the size of the training data set. Hence, they often require large amounts of data, which are not always available. They also raise other concerns such as interpretability [49,51] or their inability to adapt to edge cases [51]. In the realm of Natural Language Processing (NLP), most of the focus is on learning fixed-length word vector representations using neural language models [42]. These representations, also known as word embeddings, can then be fed into a deep learning classifier, or used with more traditional methods. One of the most popular approaches in this area is word2vec [55]. The downside of these methods is that they require enormous amounts of training data. Luckily, several researchers have already applied these methods to large corpora such as Wikipedia and released the resulting embeddings.
Lastly, it is also possible to combine independent predictors to achieve a more accurate and reliable model than any of the predictors on their own. This approach is known as ensemble learning. Many ensemble methods have been previously used for sentiment analysis. According to Rokach [73], ensemble methods can be classified along two main dimensions: how predictions are combined (rule-based and meta-learning), and how the learning process is done (concurrent and sequential). A new application of ensemble methods is the combination of traditional classifiers based on feature selection and deep learning approaches [3].
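The surface features and lexicon priors discussed in this section can be illustrated with a minimal Python sketch. It is not taken from any of the cited works: the tokenizer is deliberately naive, and `TOY_LEXICON` and all function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for this sketch); real sentiment
# lexicons are much larger and usually domain- or language-specific.
TOY_LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "awful": -1.0,
               ":)": 1.0, ":(": -1.0}


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately naive)."""
    return text.lower().split()


def bag_of_words(tokens):
    """Multiset of tokens: word order is discarded."""
    return Counter(tokens)


def ngrams(tokens, n=2):
    """Counts of contiguous n-grams, which keep some local word order."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


def lexicon_score(tokens, lexicon=TOY_LEXICON):
    """Sum of the prior polarities of the tokens found in the lexicon."""
    return sum(lexicon.get(token, 0.0) for token in tokens)


tokens = tokenize("not good , not good at all :(")
bow = bag_of_words(tokens)
bigrams = ngrams(tokens, 2)
score = lexicon_score(tokens)
```

Note that the bag of words for "dog bites man" and "man bites dog" is identical, which is the loss of information mentioned above, whereas their bigrams differ. The lexicon score also ignores the negation in the example, which is one reason why lexicons are usually combined with syntactic information or with a learned predictor.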
2.2. Social network analysis and community detection
Social Network Analysis (SNA) is the investigation of social structures [62]. It provides techniques to characterize and study the connections between people, and their interactions. SNA is not limited to Online Social Networks (OSN); it applies to any kind of social structure. Other examples of social networks would be a network of citations in publications or a network of relatives. Through SNA techniques, it is possible to extract information from a social network that may be useful for sentiment analysis, such as chains of influence between users, groups of like-minded users, or metrics of user importance.
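As a rough illustration of the kind of information SNA can extract, the sketch below computes a very simple importance metric (number of followers) and a crude notion of groups (connected components). Both are stand-ins chosen for brevity, not the community detection methods used in the reviewed works, and the function names and the edge format are assumptions.

```python
from collections import defaultdict


def in_degree(follow_edges):
    """Count followers per user from (follower, followee) pairs."""
    counts = defaultdict(int)
    for follower, followee in follow_edges:
        counts[followee] += 1
        counts[follower] += 0  # make sure users without followers appear
    return dict(counts)


def connected_groups(edges):
    """Group users into connected components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in neighbours:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first traversal from `start`
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


edges = [("ann", "bob"), ("carl", "bob"), ("dave", "eve")]
followers = in_degree(edges)
groups = connected_groups(edges)
```

In practice, importance is usually measured with richer centrality metrics, and groups of like-minded users are found with dedicated community detection algorithms, but the inputs and outputs have the same shape as in this sketch.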
social context and to time-independent social context as static social context.
To illustrate the definitions, we will model an example of social context for a sentiment analysis task on Facebook content. For this analysis, we only need access to status updates by some users, and photos uploaded to a set of Facebook pages (groups). The first element in social context is content:
Definition 2. The collection of content is defined as:
$$C = \{\, c^t_i \mid t \in T_c \,\} \qquad (1)$$
Where $T_c$ are all the types of content available, and each $c^t_i$ is a piece of content of a certain type $t$. Each piece of content should be unambiguously identified by its type and an identifier ($i$).
Our example context only includes two types of contents: status updates and photos. Each type of content may be given some attributes. Some of these attributes are common, such as the creation date. Others are specific for that type, such as the keywords for status updates, and the link to the image file for photos. Additionally, each photo and each status has to be given an identifier, which may also be the one given by the Facebook API. So far, the context defined is not very useful, as it would only allow us to analyze the sentiment of the status updates and the photos (using other modalities). The next element in Social Context is the collection of users in the network.
Definition 3. Let the set of users be:
$$U = \{u_1, u_2, \ldots, u_n\} \qquad (2)$$
Where each $u_i$ is a specific user that is unambiguously identified by its user identifier $i$. Each user may have one or more roles. The set of roles for a user is:
$$P(u_i) = \{\, t \mid p_t(u_i) = 1,\ u_i \in U,\ t \in T_p \,\} \qquad (3)$$
Where $T_p$ are all possible roles in a context, and $p_t(u_i)$ is a function that determines whether user $u_i$ has been assigned role $t$.
Roles define the function of users within the network. They usually restrict the type of interactions and relations a user may have, and with what content and users. For example, online fora have the role of topic moderators, in addition to regular users. The aim of moderators is to decide what content should be allowed, to edit it, and to manage users that misbehave. Hence, new relations (e.g., edited-by) and interactions (e.g., ban) are available to this specific role. If the user is a moderator of more than one topic, several roles will apply.
Our example context will include the profiles of the users in our study and their attributes. Since we are only interested in age and location, users will just have those attributes. Our users may also have roles. In our case, we will be interested in page administrators. At this point, the lack of connection between users and content hampers other types of analysis.
The categorization of connections in Social Context is based on the concept of social ties in the social sciences, i.e., dyadic relations [8]. Social ties are grouped into one of four categories: similarities, such as co-location or being the same gender; social relations, such as kinship (e.g., family ties), role (e.g., friendship), or affection (e.g., liking); interactions, such as having talked to each other, or harming one another; and flows, such as sharing information, beliefs, or resources. For the sake of simplicity, and based on the use of context in the state of the art, only two types of connections are modeled as part of Social Context: relations (Definition 4) and interactions (Definition 5). The remaining social ties (similarities and flows) can be modeled as an equivalent relation or interaction, depending on the case. Similarities are not typically considered as ties in themselves but rather as conditions or states that increase the probability of forming other kinds of ties. Flows are typically inferred from interactional and relational data [8] so, for the sake of simplicity, they can be thought of as another type of relation or interaction.
Hence, relations are connections such as friendship, kinship, group membership or liking each other, whereas interactions are connections such as getting in touch, re-sharing each other's content, etc. There are two main differences between relations and interactions that motivate their distinction. First, relations are few and slow-changing, whereas interactions are plentiful and short-lived. Secondly, content can be related to other content (e.g., a reply and the original content), while interactions are always performed by a user agent. Formally, relations and interactions are defined as follows:
Definition 4. Given a set of content C and a set of users U, relations are the connections between users (Ru), between users and content (Ruc) and between different content (Rc). Formally:
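$$R = \{\, r_t \mid t \in T_r \,\} = R^{u} \cup R^{uc} \cup R^{c} \qquad (4)$$
$$R^{u} = \{\, r^t_{u_i,u_j} \mid u_i, u_j \in U,\ u_i \neq u_j,\ t \in T_r^{u} \,\} \qquad (5)$$
$$R^{uc} = \{\, r^t_{u_i,c_j} \mid u_i \in U,\ c_j \in C,\ t \in T_r^{uc} \,\} \qquad (6)$$
$$R^{c} = \{\, r^t_{c_i,c_j} \mid c_i, c_j \in C,\ c_i \neq c_j,\ t \in T_r^{c} \,\} \qquad (7)$$
Where $T_r^{c}$ are the types of relations between two pieces of content, $T_r^{uc}$ are the types of relations between users and content, and $T_r^{u}$ are the types of relations between users.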
Definition 5. Given a set of content C and a set of users U, interactions are the activities carried out by a user that involve either another user ($I^{u}$) or a piece of content ($I^{uc}$). Formally:
$$I = \{\, i_t \mid t \in T_i \,\} = I^{u} \cup I^{uc}$$
$$I^{u} = \{\, i^{t,l}_{u_i,u_j} \mid u_i, u_j \in U,\ t \in T_i^{u} \,\}$$
$$I^{uc} = \{\, i^{t,l}_{u_i,c_j} \mid u_i \in U,\ c_j \in C,\ t \in T_i^{uc} \,\}$$
Where $T_i^{uc}$ are the types of interactions between user and content, $T_i^{u}$ are the types of interactions between users, and $l$ is an identifier for the interactions, as multiple interactions of the same type are possible.
With all elements defined, we can go back to the previous example of Social Context on Facebook. From the possible types of relations between users ($R^{u}$), we may add two: user friendship and kinship. These two relations would allow us to group users that are closely related. To link users with content, we will choose two types of user-content relations ($R^{uc}$): authorship, and mentions (i.e., the link between the content and the users it mentions). As for relations between content ($R^{c}$), we may choose replies (i.e., the link between two pieces of content when one mentions the other). Lastly, we will only have access to interactions between users and content ($I^{uc}$) in the form of likes, reactions, and replies. Due to technical limitations, we will not have access to user interactions, such as direct messages.
The resulting example context would allow for richer analyses that exploit information such as inferred groups of people based on how often they interact with each other or appear in photos together. Sentiment analysis may exploit prior knowledge about the sentiment of the user (via the authorship relation), or even knowledge about the sentiment of friends and acquaintances (through either relations or interactions between users). It may even be possible to find people within the group that have changed the opinion of the people with whom they interact.
Table 1 shows other types of user, content, relations and interactions found in popular OSN. It includes common elements in the OSN analyzed in the state of the art: Twitter, Weibo, Reddit, Facebook, blogging platforms and Wikipedia.
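The four elements of social context (content, users, relations and interactions) map naturally onto a small data model. The following Python sketch is one possible encoding of the Facebook example, under the assumption that each element is a plain record. The class names, field names and identifiers are hypothetical and are not part of the formal definition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Content:
    type: str   # a content type t, e.g. "status" or "photo"
    id: str     # identifier i, unique within its type


@dataclass(frozen=True)
class User:
    id: str
    roles: frozenset = frozenset()   # roles assigned to this user


@dataclass(frozen=True)
class Relation:
    type: str       # e.g. "friend", "author", "reply"
    source: object  # a User or a Content
    target: object  # a User or a Content


@dataclass(frozen=True)
class Interaction:
    type: str       # e.g. "like"
    id: int         # tells apart repeated interactions of the same type
    user: User      # interactions are always performed by a user
    target: object  # a User or a Content


@dataclass
class SocialContext:
    content: set = field(default_factory=set)
    users: set = field(default_factory=set)
    relations: set = field(default_factory=set)
    interactions: set = field(default_factory=set)

    def authored_by(self, user):
        """Content linked to `user` through the authorship relation."""
        return {r.target for r in self.relations
                if r.type == "author" and r.source == user}


# Toy instance of the Facebook example (all identifiers are made up).
ann = User("ann", frozenset({"page_admin"}))
bob = User("bob")
status = Content("status", "1")
photo = Content("photo", "1")
context = SocialContext(
    content={status, photo},
    users={ann, bob},
    relations={
        Relation("friend", ann, bob),
        Relation("author", ann, status),
        Relation("author", bob, photo),
    },
    interactions={Interaction("like", 1, bob, status)},
)
```

Identifying content by the pair (type, identifier) mirrors Definition 2: the status and the photo above share the identifier "1" but remain distinct. A helper such as `authored_by` is the kind of lookup a classifier would need in order to use prior knowledge about an author's sentiment.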
Table 1. Types of Social Context elements in different OSN.
Content ($T_c$): Tweet; Post, Comment; Status, Page, Comment, Photo, Event; Post, Comment; Page, Comment
User roles ($T_p$): User; User; User, Admin; User, Page admin; Author, Reader; Editor, Reviewer
Relations ($T_r$)
User-User ($T_r^{u}$): Follow, Friend; Follow, Friend; Follow; Friend, Relative; Follow
User-Content ($T_r^{uc}$): Author, Mentioned, Favorite; Author, Mentioned, Favorite; Author, Mentioned; Author, Admin, Fan, Own, Tagged, Attend, Like, React; Author, Like; Author, Edit, Review
Content-Content ($T_r^{c}$): Reply, Retweet; Reply, Reshare; Link, Reply; Link, Reply, Contain; Link, Reply; Link, Parent, Reply
Interactions ($T_i$)
User-User ($T_i^{u}$): Mention, Reply; Mention, Reply; Mention, Reply; Mention, Reply, Tag; Mention, Reply
User-Content ($T_i^{uc}$): Reply, Retweet, Mention; Reply, Reshare; Vote, Gild, Reply, Mention; Comment, Re-share; Reshare, Comment; Edit
The tabular format does not capture how different types of relations or interactions are unique to certain types of content and/or user roles. We will exemplify this fact using Facebook since it has different types of content and user roles. In Facebook, we may consider four main types of content. There are statuses, which are posts by users that are shown on their own profile (i.e., user feed). Statuses are very rich: they may mention other users, include location information, link to other content, or even express the mood of the author. The visibility of the status is governed by the user's privacy settings, and the relationship of the user to others. For instance, privacy-minded users may make their statuses only available to their close friends, while other users may make theirs public. Similarly, users can create pages, which are public profiles created around a specific topic, such as a business, a brand, or a cause. Pages are similar to user profiles, but they can be administered by one or more users. Another type of content is photos, which may be linked to a user profile or to a page. Photos can include information about the users that appear in them, which creates a relation between the photo and the users. Events are a different type of content that is used to organize gatherings and to give information about them. Users may indicate whether they will attend, comment on the event, and invite other users to join.
Users may interact with content to which they have access in different ways: by liking it; by commenting on it, which creates new content that other users may interact with; or by expressing their reaction or emotion to it, such as surprise. These types of interaction are common for all types of content. Some types of content provide other means of interaction, such as re-sharing of posts, which allows users to share a post by another user in their own profiles. The primary means for interaction between users is through content, either by interacting with the content (e.g., users may reply to each other's content) or by including other users in their content (e.g., by adding a mention in a comment or a tag in a photo). Lastly, they may interact through special actions such as poking each other, or through private instant messages. Since these interactions are private, they have not been included in the table.
Some researchers are concerned that the typical follower-friend relation might not be enough to capture the richness of relations in online media [20]. They also propose researching new multifaceted approaches that take into consideration more aspects of the network simultaneously. Social context has been intentionally defined with those approaches in mind. The definition of Social Context can be interpreted in the form of sets, or in its equivalent graph form, where users and content are vertices, and both relations and interactions are edges. The graph form can be combined with different types of links ($T_c$, $T_p$, $T_r$, $T_i$) to generate multiplex networks [27] (i.e., a multilayered network of users and content), which can be exploited in multifaceted approaches.
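The graph interpretation can be sketched in a few lines of Python. The example below assumes that relations and interactions are available as (type, source, target) triples and builds one adjacency map per edge type, which is one simple way to represent a multiplex network. The triple format, the "u:" and "c:" identifier prefixes and the function names are assumptions made for this sketch.

```python
from collections import defaultdict


def to_multiplex(edges):
    """Build one adjacency map (layer) per edge type.

    edges: iterable of (edge_type, source, target) triples, where sources and
    targets are vertex identifiers for users or pieces of content.
    """
    layers = defaultdict(lambda: defaultdict(set))
    for edge_type, source, target in edges:
        layers[edge_type][source].add(target)
    return layers


def flatten(layers):
    """Collapse all layers into a single graph, losing the edge types."""
    graph = defaultdict(set)
    for adjacency in layers.values():
        for source, targets in adjacency.items():
            graph[source] |= targets
    return graph


edges = [
    ("friend", "u:ann", "u:bob"),        # relation between users
    ("author", "u:ann", "c:status:1"),   # relation between user and content
    ("like", "u:bob", "c:status:1"),     # interaction between user and content
]
layers = to_multiplex(edges)
graph = flatten(layers)
```

Keeping the layers separate lets a multifaceted approach weigh each type of link differently, whereas the flattened graph is what a single-layer method would see.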
To conclude, the usage of the social network [43] and the effect of the social network on user behaviour [18] depend on other aspects such as cultural differences, factual information and events. This type
of information falls outside the scope of social context, and will need to be encoded through other means such as a knowledge graph, or a description of events. However, social context will capture information such as language of a user or creation time of content, which can be used to link the user or content to that external information. This concept will be further explained in Section 4.2.
4. Framework for research on social context in sentiment analysis
This section defines a novel framework to compare sentiment analysis approaches that exploit social context. The framework is centered around a multi-levelled taxonomy for structuring research in the field. The first level refers to the dataset used. The second level covers the scope of Social Context built from the dataset. The third level covers machine learning methods applied. The fourth level covers the type of social context used (static and dynamic). Each level is further explained in a separate section.
4.1. Dataset
The datasets used for analyzing social context can be identified by several characteristics. The first of them is the online social network from which the data was gathered. Twitter predominates in this area, due to its relatively open API and abundance of content. The second characteristic is the type of annotation on content. Likewise, the third characteristic is the type of annotation on users. In this work, we focus on sentiment (polarity), but other annotations such as stance, emotion, and quality of the content are often used. In the case of polarity, the classes used may also differ, e.g., positive (+), negative (-) and neutral (0). The fourth, fifth, and sixth characteristics are the type of link between users, between pieces of content, and between users and content. These links can stem either from a relation or from an interaction, as mentioned in the definition of social context.
4.2. Context scope
Researchers have to choose what information from their datasets to select for the social context in their work. They may also complement the original data with information from external sources. As a consequence, every work employs a different context. Nonetheless, a closer inspection reveals some patterns: some elements are commonly used together (e.g., users and friendships), and some elements are harder to obtain or rarer than others (e.g., follower-followee relations are more common than retweets or favorites). As contexts get more and more complex, they start including more unusual elements in addition to the more basic ones. Hence, we propose a classification of works based on the complexity or scope of their context. Our proposal is inspired by the micro, meso and macro levels of analysis typically used in social sciences [7]. The
user's sentiment (and their content's) by using the sentiment of the content to which she is being exposed. On the other hand, studies of social media activity regarding grassroots movements have shown that social integration, as measured through social network metrics, increases with their level of engagement and of expression of negativity [2]. This suggests a connection between the groups to which a user belongs, and the sentiment the user expresses. The connection could be exploited for user classification and, in turn, for classification of the content created by them.
4.4. Analysis methods and social theories
Lastly, works differ in the type of classification performed. The options here range from traditional classification algorithms (e.g., random forest, SVM) or neural networks, to network-based approaches such as label propagation. However, two types of algorithms stand out from those of contextless analysis: models that directly benefit from the networked nature of context, and deep learning approaches. Several works also use a hybrid approach, where traditional techniques are combined with network techniques, either via multiple processing steps or by combining the techniques into one.
There are several ways in which algorithms could leverage the networks in social context. Firstly, some algorithms are already network-oriented. Label propagation, in particular, has shown promising results [80], and it can be made to treat lexical resources and the subject of the analysis equally. Secondly, the structure of the network can be directly incorporated into the learning process through modified cost functions [38,92]. Thirdly, the output of a classifier can be later complemented with a network-based algorithm. For example, Li et al. [48] apply standard classification, then tweets or users are clustered, and within each cluster, every piece of content or every user is given the same label according to different criteria (i.e., most confident result, majority label, and weighted majority). Fourthly, a multi-step or ensemble classification strategy can be used, where the structure of the network and social theories are used to combine the results of different classifiers.
On the deep learning front, recent works are incorporating different types of neural networks that have been used for contextless analysis and subjectivity analysis [14], such as convolutional neural networks (CNN).
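To make the idea of a network-oriented algorithm more concrete, the following Python sketch shows a heavily simplified form of label propagation. It is not the method of any cited work: it assumes an undirected graph given as an adjacency dictionary in which every node appears as a key, keeps the seed polarities fixed, and repeatedly sets each unlabeled node to the mean score of its neighbours. The function name and parameters are hypothetical.

```python
def propagate_labels(neighbours, seeds, iterations=20):
    """Spread polarity scores from seed nodes to the rest of the graph.

    neighbours: dict mapping each node to a list of adjacent nodes; every
                node must appear as a key (assumption of this sketch).
    seeds:      dict mapping some nodes to a polarity in [-1, 1]; seed
                scores are kept fixed during propagation.
    """
    scores = {node: seeds.get(node, 0.0) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]
            elif adjacent:
                # Unlabeled node: average the scores of its neighbours.
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
            else:
                updated[node] = 0.0
        scores = updated
    return scores


# Two disconnected groups, each with one labeled node.
neighbours = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
}
scores = propagate_labels(neighbours, seeds={"a": 1.0, "e": -1.0})
```

In this toy graph, the nodes connected to the positive seed drift towards a positive score and the node next to the negative seed becomes negative. Nodes could equally represent users, pieces of content or lexicon entries, which is what allows lexical resources and the subject of the analysis to be treated alike.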
At the same time, concepts such as word embeddings have inspired network embedding as an alternative way of including features from social context in the analysis [97]. The range of features that can be captured through network embeddings is vast, including several types of relations [13]. Moreover, new research is complementing and extending node embedding (i.e., nodes are represented as vectors) with other methods such as edge and community embedding [10]. In particular, community embedding has shown promising results in community prediction and node classification [12].
In general, network approaches usually follow well-known social theories. Social theories usually model how users with different views or status arrange themselves in the network. In other words, they are rules of attachment. They may also model how users behave. Some examples of social theories or attributes include homophily, consistency, social balance, and status theory. Homophily [53] is one of the commonly used theories in the works we have examined and in the social sciences. In simple terms, homophily means a connection between two people is more likely when they are similar in some aspects (i.e., birds of a feather flock together). Under the hypothesis of homophily, when two users are connected, certain features can be propagated. Consistency [50] usually means that users tend to maintain their views over time. So, two pieces of content shared by the same user in a short period are likely to express a similar sentiment or opinion if they are about the same topic. The social status theory [47] models the balance of power in social networks. It states that, if three nodes A, B and C form a clique, and the status relation between A and B is the same as between B and C, it must also hold between A and C. In other words, the superior of your superior is your superior, and the inferior of your inferior is your inferior. Social balance models the balance of opinions in cliques. The rules in social balance translate to: a friend of a friend is a friend, and an enemy of my enemy is my friend. Tang et al. [84] present a more detailed explanation of social theories that can be used to mine social media.
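The rules of social balance and status theory can be expressed as simple checks on triads. The Python sketch below uses a common formulation of structural balance, in which a triad is balanced when the product of its three edge signs is positive, and a transitivity check for status. The data representation (signed undirected edges and ordered status pairs) and the function names are assumptions made for illustration.

```python
def is_balanced(signs, a, b, c):
    """Check social balance for the triad (a, b, c).

    signs: dict mapping frozenset({x, y}) to +1 (friends) or -1 (enemies).
    The triad is balanced when the product of the three signs is positive,
    e.g. "a friend of a friend is a friend" (+, +, +) or
    "an enemy of my enemy is my friend" (-, -, +).
    """
    def sign(x, y):
        return signs[frozenset((x, y))]
    return sign(a, b) * sign(b, c) * sign(a, c) > 0


def status_consistent(higher, a, b, c):
    """Check status theory for the triad (a, b, c).

    higher: set of ordered pairs (x, y) meaning "x has higher status than y".
    If a is above b and b is above c, then a must also be above c.
    """
    if (a, b) in higher and (b, c) in higher:
        return (a, c) in higher
    return True


signs = {
    frozenset(("ann", "bob")): 1,    # ann and bob are friends
    frozenset(("bob", "carl")): -1,  # bob and carl are enemies
    frozenset(("ann", "carl")): -1,  # ann and carl are enemies
}
balanced = is_balanced(signs, "ann", "bob", "carl")

higher = {("boss", "lead"), ("lead", "dev"), ("boss", "dev")}
consistent = status_consistent(higher, "boss", "lead", "dev")
```

Checks like these can be turned into constraints or regularization terms: a predicted sign or label that leaves a triad unbalanced, or that breaks the status ordering, is treated as less likely.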
5. Review of social context and sentiment analysis works
This section is the result of reviewing the state of the art in using social context for sentiment analysis. The review is composed of five subsections. The first one presents and compares the different works that have been reviewed. The second subsection describes and compares the datasets that have been used in these works. The third subsection covers common social context features that are useful for sentiment analysis. The fourth one presents a performance comparison of the works on different datasets. The last subsection discusses ways in which sentiment analysis has been used to improve social network analysis.
5.1. Works
This section introduces recent works in the area of sentiment analysis that use social context. The aim is to compare how social context is defined and exploited in each of them. The main features of each of the works are summarized in Table 2. The table shows the gradual introduction of interactions to complement relations, as works evolve from mesor to mesoi and mesoe approaches. It also highlights the most commonly used types of elements and social theories.
To the best of our knowledge, the first work to make explicit mention of social context in relation to sentiment analysis is Lu et al. [50]. Their goal was to predict the quality of reviews, rather than their sentiment, but the work is worth mentioning for three reasons. First of all, they provide the first formal mention of social context in the sense covered in this work. Secondly, their novelty is that they merge traditional features (text) with what they call Social Network Features. They provide a categorization of features, including author and social network features, which are calculated with social network analysis. Lastly, the network is used to extract constraints based on several hypotheses of consistency (of authors, links, citations, and trust).
On a related note, Pennacchiotti and Popescu [67] leverage replies, retweets and friendship relations to infer user attributes, such as ethnicity and political orientation. Their definition of political orientation can be considered stance detection. Although their work is implicitly motivated by a hypothesis of homophily, they do not make any mention of specific social theories, and no constraints or rules based on them are constructed. Instead, classification is achieved via Gradient Boosted Decision Trees.
Speriosu et al. [80] introduce an alternative approach to infer polarity that exploits the networked nature of social context. They compare three different approaches: a lexicon-based classifier (baseline), a maximum entropy classifier and Label Propagation (LPROP). The best results were achieved with LPROP, which is also appealing because it yields annotations for resources (e.g., lexicon), content and users alike. Similarly, Tan et al. [83] use a network approach based on SampleRank with a Markovian model. The model assumes that the sentiment of a given user is only influenced by the sentiment label of tweets generated by that user (consistency), and the sentiment of neighboring users (homophily).
Li et al. [48] compare an approach based on linguistic features with a combination of linguistic features and social features (referred to as global social evidence). The goal is sentiment analysis about political figures (targets) on Twitter and fora. In their hybrid approach, users, targets and issues (topics targets are vocal about) form a network. Three different hypotheses are then exploited on the data: 1) global consistency on indicative target-issue pairs, 2) global consistency on indicative target-target pairs, and 3) social balance. The results are slightly better than the baseline in the case of Twitter and considerably better for forum data. A similar comparison of linguistic and social features is made
Table 2. Comparison of works using sentiment analysis and social context. The number of polarity labels is shown in parentheses.
[Table body not recoverable as rows. Works compared: Pennacchiotti and Popescu [67], Speriosu et al. [80], Tan et al. [83], Li et al. [48], Aisopos et al. [1], Hu et al. [38], Pozzi et al. [70], Ren and Wu [72], Deng et al. [23], West et al. [92], Yang and Eisenstein [97], Cheng et al. [16], Sixto et al. [79] and Xiaomei et al. [95]. Sources: Twitter, fora and Reddit. Levels: micro, mesor, mesoi and macro. Outputs: polarity (2, 3 or 5 labels), stance (targets), political orientation and ethnicity. Interactions: mention, mutual mention, retweet, reply, favorite and votes. Relations: authorship, friends, inferred friends, follower, mutual follower and follow. Social theories: consistency, homophily, language homophily, social balance, social status, contagion and emotion contagion.]
by Aisopos et al. [1]. In their work, several classification algorithms are compared using different feature models, some of which include social context features.
Hu et al. [38] are the first in our review to include a classification algorithm specially tuned to incorporate social context. Their work is also interesting because they overcome the fact that most existing datasets only contain texts, which makes them unsuitable for social context analysis. They do so by combining text datasets with the friendship graph extracted from Kwak et al. [46].
Other works focus on user classification, such as Pozzi et al. [70]. They leverage connections in the network to infer user polarity, with highly positive results. User connections can also be exploited for content polarity classification. Ren and Wu [72] use both friendship and user-topic relations (calculated from user tweets) to calculate user-topic polarity. In addition to friendship, Deng et al. [23] use reply-to relations in online fora, as well as inferred friendship. West et al. [92] showed that the assumption of homophily in networks can improve polarity detection from short texts. They use social ties to infer the stance of users in Wikipedia. In particular, they exploit the social balance and social status theories. They also point out the effect that the selection strategy of training and testing nodes has on accuracy. Tang et al. [84] use similar social theories to improve sentiment analysis on Twitter.
Lately, some works have introduced novel approaches such as convolutional networks [97]. In doing so, they add new types of features such as network embeddings, i.e., a vector representation of the network of a user, which can be fed into a classifier. The motivation behind these embeddings is to leverage language homophily in the analysis. Cheng et al. [16] follow in these steps, with a similar premise, using content from a different social network (Reddit). In this case, the analysis also exploits the fact that comments are nested at different levels.
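The label propagation idea mentioned earlier in this section for Speriosu et al. [80] can be illustrated with a small, self-contained sketch. It is a simplified iterative scheme under stated assumptions (an undirected, unweighted graph; scores in [-1, 1]; seed nodes clamped to their known labels), not the LPROP implementation evaluated in that work. The graph, the node names and the `propagate` function are hypothetical.

```python
from collections import defaultdict


def propagate(edges, seeds, iterations=20):
    """Spread sentiment scores from labeled seed nodes to their neighbors.

    edges: iterable of (node, node) pairs, treated as undirected.
    seeds: dict mapping a node to a known score in [-1, 1].
    Returns a dict with one score per node in the graph.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Unlabeled nodes start as neutral (0.0).
    scores = {node: seeds.get(node, 0.0) for node in neighbors}

    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbors.items():
            if node in seeds:
                # Seeds keep their known label at every step.
                updated[node] = seeds[node]
            else:
                # Every other node takes the mean score of its neighbors.
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores


# Hypothetical mixed network: users connected to their tweets (authorship)
# and to each other (a follower relation).
edges = [
    ("user_a", "tweet_1"),
    ("user_a", "tweet_2"),
    ("user_a", "user_b"),
    ("user_b", "tweet_3"),
]
seeds = {"tweet_1": 1.0, "tweet_3": -1.0}  # one positive and one negative tweet
scores = propagate(edges, seeds)
```

Because users, content and any other resources are all nodes of the same graph, a single run yields scores for every type of node, which is the property highlighted above for LPROP.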
5.2. Datasets
The usual drawback with sentiment analysis datasets is that they rarely incorporate social context. This is either because social context was not taken into consideration when the dataset was collected or because of data protection policies and terms of use of the original OSN.
Table 3. Datasets used in the experiments.
Dataset              Source     Users     Entries
RT Mind [70]         Twitter    62        159
OMD [77]             Twitter    679       1261
HCR-DEV [80]         Twitter    806       1434
HCR-TEST [80]        Twitter    806       1434
STS [31]             Twitter    498       490
PF1901 [23]          Forum      412       1901
MF1560 [23]          Forum      320       1560
SemEval 2013 [56]    Twitter    3813      3813
SemEval 2014 [76]    Twitter    5749      5749
SemEval 2015 [75]    Twitter    2379      2379
Ciao [85]            Ciao       257,      10,
TASS [74]            Twitter    158       68,
YANG2011 [96]        Twitter    20 M      476 M
Li-Twitter [48]      Twitter    ?         4646
Li-Forum [48]        Forum      ?         762
AskMen [16]          Reddit     ?         1057 K
AskWomen [16]        Reddit     ?         814 K
Politics [16]        Reddit     ?         2180 K
The latter is usually easier to circumvent, as these datasets usually have IDs or pointers to the original resources, so that the necessary data can be recovered with the appropriate credentials and access to the OSN. This process is known as hydration, and it can be used to recover more data than was initially considered, i.e., it enables the expansion of the social context. The limitation is the fact that resources can be removed or made private before hydration. Table 3 shows basic statistics of the datasets used in the works reviewed. RT Mind [70] contains a set of 62 users and 159 tweets, with positive or negative annotations. To collect this dataset, Pozzi et al. [70] crawled 2500 Twitter users who tweeted about Obama during two days in May
are more likely to express, and they have been shown to help in sentiment analysis [88].
Content may also be linked to features such as:
Additionally, it is also possible to generate user and topic-specific models or to embed the topical context of the content [16,23]. Network-based algorithms such as label propagation and algorithms that take arbitrary input sizes, such as recurrent neural networks, are not constrained by a fixed input space. As a result, they can incorporate features of the context without resorting to aggregation, such as averaging.
5.3.2. Mesor features
At this level, a network of users and content also starts to form. Connections in this network may be directed or undirected. Some examples of relations that can give rise to a network are:
5.3.3. Mesoi features
Interactions can also be used to create a network. For instance:
The ability to relate an author to other users enables the propagation of micro features over the meso network, which yields a new set of features, such as:
Lastly, some techniques allow embedding large information networks (be it content, user or mixed networks) into low-dimensional vector spaces. These types of techniques are increasingly popular in contextless analysis due to their excellent performance [3]. The components of the embedding can then be used as features, either on their own or combined with other features. One example of network embedding is the LINE method [86], which is used in one of the works reviewed [16]. However, LINE does not take different types of nodes or relationships into account. The heterogeneous network embedding model [13] is an alternative. Although it was conceived to embed networks of text and images, it could be adapted to encode mixed networks of content and users.
5.3.4. Mesoe features and enrichment through social network analysis
Social Network Analysis provides several methods to process, examine and describe a social network. These methods use the network topology and its attributes to infer information that could be useful for sentiment analysis tasks. For instance, there are several ways to measure user popularity and influence in a social network, according to different criteria. As a result, the impact of each user in the sentiment prediction can be weighted. Similarly, the importance of user connections (relations and interactions) can be measured. Thus, the granularity can be set at the connection level, where sentiment prediction is influenced not only by neighboring users, but also by the strength of the connection to those neighbors. Another example is community detection, which could help segment the user base into smaller groups that exhibit similar behavior.
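One way to picture connection-level granularity is a weighted average of neighbor sentiment, where the contribution of each neighbor is scaled by the strength of the tie. The sketch below assumes that tie strength is available as a non-negative number (for example, an interaction count); the data structures and the function name are hypothetical and do not come from the works reviewed.

```python
def weighted_neighbor_sentiment(user, ties, sentiment):
    """Estimate a user's sentiment from neighbors, weighted by tie strength.

    ties: dict mapping a user to {neighbor: strength}, strength >= 0.
    sentiment: dict mapping a user to a known score in [-1, 1].
    Returns 0.0 (neutral) when no neighbor with a known score exists.
    """
    total = 0.0
    weight_sum = 0.0
    for neighbor, strength in ties.get(user, {}).items():
        if neighbor in sentiment:
            total += strength * sentiment[neighbor]
            weight_sum += strength
    return total / weight_sum if weight_sum else 0.0


# Hypothetical data: alice interacts five times more with bob than with carol.
ties = {"alice": {"bob": 5, "carol": 1}}
sentiment = {"bob": 1.0, "carol": -1.0}
estimate = weighted_neighbor_sentiment("alice", ties, sentiment)
```

With equal weights the two neighbors would cancel out; with the weights above the estimate leans towards the stronger connection. The same pattern would apply if the weights came from a popularity or influence measure instead of interaction counts.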
5.3.5. Macro features
Macro features include any type of information that is outside of the realm of the OSN. Hence, the possibilities for features in this category are unlimited. Of all the works we have reviewed, only one [48] uses macro features. In particular, it uses known enmity or opposition between politicians, together with social theories about user and target consistency. Other possibilities include the analysis of links to external sources or attachments.
Table 4. Maximum Accuracy score reported in each work, per level of analysis and dataset.
Columns: Work, Dataset, Metric, Baseline, micro, mesor, mesoi, mesoe, macro. [The per-level scores are missing from the source; only the row labels are listed.]
Work    Dataset       Metric
[1]     YANG          Acc.
[23]    MF            Acc.
[23]    PF            Acc.
[48]    Li-Forum      Acc.
[48]    Li-Twitter    Acc.
[79]    TASS          Acc.
[80]    HCR-DEV       Acc.
[80]    HCR-TEST      Acc.
[80]    OMD           Acc.
[80]    STS           Acc.
[95]    HCR           Acc.
[95]    OMD           Acc.
[16]    AskMen        F1
[16]    AskWomen      F1
[16]    Politics      F1
[79]    TASS          F1
[97]    Ciao          F1
[97]    SE 2013       F1
[97]    SE 2014       F1
[97]    SE 2015       F1
Fig. 4. Difference in accuracy with respect to a contextless approach in all works analyzed, per dataset. The results for [1] have been removed due to their unusually high accuracy (Table 4). [Plot omitted: x-axis, level of analysis; one series per dataset (STS, OMD, HCR-DEV, HCR-TEST, Li-Twitter, Li-Forum, YANG, MF, PF, TASS 2015, HCR).]
5.4. Performance
Having described these works, it is also important to compare their performance. Few works use the same dataset in the same conditions. Instead of providing that comparison, Table 4 summarizes the best results for content-level classification in every work surveyed, at every level of analysis identified in the taxonomy in Section 4. The table shows results for both F1-score and accuracy, when available. As expected, the results show that social context improves the performance over the contextless baseline.
For completeness, Figs. 4 and 5 show all the results reported in these works, grouped by the level of analysis. The performance is shown relative to the contextless baseline in every dataset.
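The relative view used in these figures amounts to subtracting the contextless score from the score at each level, dataset by dataset. A minimal sketch follows; the scores in it are made-up placeholders for illustration only, not values from Table 4, and the function name is hypothetical.

```python
def gain_over_baseline(scores, baseline_level="contextless"):
    """Return the difference between each level's score and the baseline.

    scores: dict mapping a level of analysis to the score (accuracy or F1)
    reported for one dataset. The baseline level itself is left out.
    """
    baseline = scores[baseline_level]
    return {
        level: value - baseline
        for level, value in scores.items()
        if level != baseline_level
    }


# Placeholder scores for a single hypothetical dataset (not real results).
example = {"contextless": 0.70, "micro": 0.72, "mesor": 0.75}
gains = gain_over_baseline(example)
```

Working with differences rather than raw scores makes results on datasets of very different difficulty comparable on the same axis.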
5.5. Other approaches
Although this paper focuses on using social context to improve sentiment analysis, there are other ways in which sentiment information can
Fig. 5. Difference in F1 score with respect to a contextless approach in all works analyzed, per dataset. [Plot omitted: x-axis, level of analysis; one series per dataset (TASS 2015, SemEval 2013, SemEval 2014, SemEval 2015, AskWomen, AskMen, Politics).]
be fused with other sources or types of information [4]. For instance, sentiment information can be incorporated into existing social network analysis. This can be done to characterize or explain a given phenomenon. When adding sentiment information, some patterns and trends emerge, which would otherwise be lost in the global aggregate. For instance, sentiment information can be used to analyze different Twitter communities separately instead of aggregating their results [22]. Sentiment and social network analysis can also be combined to find potentially radicalized users [6], or to highlight emotionally charged content [24]. Additionally, sentiment information alone has proved to yield very high precision and a low recall in some user classification tasks [67]. This suggests that sentiment information could be crucial in positively identifying members of specific groups.
6. Conclusions and future work
The question that motivated this work was whether there is valuable information in social networks that has the potential to improve sentiment analysis in specific scenarios. We refer to this information as social context. To answer this question, three related questions need to be answered: "what is social context?" (Q1), "can social context improve
[36] A. Hogenboom, D. Bal, F. Frasincar, M. Bal, F. De Jong, U. Kaymak, Exploiting emoticons in polarity classification of text, J. Web Eng. 14 (1&2) (2015) 22-40.
[37] D. Hovy, Demographic factors improve classification performance, in: ACL (1), 2015, pp. 752-762.
[38] X. Hu, L. Tang, J. Tang, H. Liu, Exploiting social relations for sentiment analysis in microblogging, in: Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, in: WSDM '13, ACM, New York, NY, USA, 2013, pp. 537-546.
[39] B.J. Jansen, M. Zhang, K. Sobel, A. Chowdury, Twitter power: tweets as electronic word of mouth, J. Am. Soc. Inf. Sci. 60 (11) (2009) 2169-2188.
[40] F. Jiang, Y.-Q. Liu, H.-B. Luan, J.-S. Sun, X. Zhu, M. Zhang, S.-P. Ma, Microblog sentiment analysis with emoticon space model, J. Comput. Sci. Technol. 30 (5) (2015) 1120-1129.
[41] A.M. Kaplan, M. Haenlein, Users of the world, unite! the challenges and opportunities of social media, Bus. Horizons 53 (1) (2010) 59-68.
[42] Y. Kim, Convolutional neural networks for sentence classification, in: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Doha, Qatar, 2014, pp. 1746-1751.
[43] Y. Kim, D. Sohn, S.M. Choi, Cultural difference in motivations for using social network sites: a comparative study of American and Korean college students, Comput. Human Behav. 27 (1) (2011) 365-372.
[44] S. Kiritchenko, X. Zhu, S.M. Mohammad, Sentiment analysis of short informal texts, J. Artif. Intell. Res. 50 (2014) 723-762.
[45] A.D. Kramer, J.E. Guillory, J.T. Hancock, Experimental evidence of massive-scale emotional contagion through social networks, Proc. Natl. Acad. Sci. (2014).
[46] H. Kwak, C. Lee, H. Park, S. Moon, What is twitter, a social network or a news media? in: Proceedings of the 19th International Conference on World Wide Web, in: WWW '10, ACM, New York, NY, USA, 2010, pp. 591-600.
[47] J. Leskovec, D. Huttenlocher, J. Kleinberg, Signed networks in social media, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, 2010, pp. 1361-1370.
[48] H. Li, Y. Chen, H. Ji, S. Muresan, D. Zheng, Combining social cognitive theories with linguistic features for multi-genre sentiment analysis, in: PACLIC, 2012, pp. 127-136.
[49] Z.C. Lipton, The mythos of model interpretability, 2016 arXiv:1606.03490.
[50] Y. Lu, P. Tsaparas, A. Ntoulas, L. Polanyi, Exploiting social context for review quality prediction, in: Proceedings of the 19th International Conference on World Wide Web, ACM, 2010, pp. 691-700.
[51] G. Marcus, Deep learning: a critical appraisal, 2018 arXiv:1801.00631.
[52] J. McCrae, D. Spohr, P. Cimiano, Linking lexical resources and ontologies on the semantic web with lemon, in: Extended Semantic Web Conference, Springer, 2011, pp. 245-259.
[53] M. McPherson, L. Smith-Lovin, J.M. Cook, Birds of a feather: homophily in social networks, Ann. Rev. Sociol. 27 (1) (2001) 415-444.
[54] P. Melville, W. Gryc, R.D. Lawrence, Sentiment analysis of blogs by combining lexical knowledge with text classification, in: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, in: KDD '09, ACM, New York, NY, USA, 2009, pp. 1275-1284.
[55] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, 2013 arXiv:1301.3781.
[56] P. Nakov, S. Rosenthal, Z. Kozareva, V. Stoyanov, A. Ritter, T. Wilson, SemEval-2013 task 2: sentiment analysis in twitter, in: Proceedings of the 7th International Workshop on Semantic Evaluation, SemEval '13, 7, Atlanta, Georgia, USA, 2013, pp. 312-320.
[57] T. Nasukawa, J. Yi, Sentiment analysis: capturing favorability using natural language processing, in: Proceedings of the 2nd International Conference on Knowledge Capture, in: K-CAP '03, ACM, New York, NY, USA, 2003, pp. 70-77.
[58] M.-T. Nguyen, D.-V. Tran, L.-M. Nguyen, Social context summarization using user-generated content and third-party sources, Know.-Based Syst. (2017).
[59] T. Noro, T. Tokuda, Searching for relevant tweets based on topic-related user activities, J. Web Eng. 15 (3-4) (2016) 249-276.
[60] P.K. Novak, J. Smailovic, B. Sluban, I. Mozetic, Sentiment of emojis, PloS One 10 (12) (2015) e0144296.
[61] G.K. Orman, V. Labatut, H. Cherifi, Qualitative comparison of community detection algorithms, in: International Conference on Digital Information and Communication Technology and Its Applications, Springer, 2011, pp. 265-279.
[62] E. Otte, R. Rousseau, Social network analysis: a powerful strategy, also for the information sciences, J. Inf. Sci. 28 (6) (2002) 441-453.
[63] A. Pak, P. Paroubek, Twitter as a corpus for sentiment analysis and opinion mining, in: LREC, 10, 2010, pp. 1320-1326.
[64] B. Pang, L. Lee, Opinion mining and sentiment analysis, Found. Trends® Inf. Retr. 2 (1-2) (2008) 1-135.
[65] B. Pang, L. Lee, S. Vaithyanathan, Thumbs Up?: sentiment classification using machine learning techniques, in: Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing - Volume 10, in: EMNLP '02, Association for Computational Linguistics, Stroudsburg, PA, USA, 2002, pp. 79-86.
[66] S. Papadopoulos, Y. Kompatsiaris, A. Vakali, P. Spyridonos, Community detection in social media, Data Min. Know. Discov. 24 (3) (2012) 515-554.
[67] M. Pennacchiotti, A.-M. Popescu, A machine learning approach to twitter user classification, ICWSM 11 (1) (2011) 281-288.
[68] L. Polanyi, A. Zaenen, Contextual valence shifters, in: Computing Attitude and Affect in Text: Theory and Applications, Springer, 2006, pp. 1-10.
[69] S. Poria, E. Cambria, R. Bajpai, A. Hussain, A review of affective computing: from unimodal analysis to multimodal fusion, Inf. Fusion 37 (2017) 98-125.
[70] F.A. Pozzi, D. Maccagnola, E. Fersini, E. Messina, Enhance user-level sentiment analysis on microblogs with approval relations, in: Congress of the Italian Association for Artificial Intelligence, Springer, 2013, pp. 133-144.
[71] K. Ravi, V. Ravi, A survey on opinion mining and sentiment analysis: tasks, approaches and applications, Know.-Based Syst. 89 (Supplement C) (2015) 14-
[72] F. Ren, Y. Wu, Predicting user-topic opinions in twitter with social and topical context, IEEE Trans. Affect. Comput. 4 (4) (2013) 412-424.
[73] L. Rokach, Ensemble-based classifiers, Artif. Intell. Rev. 33 (1-2) (2010) 1-39.
[74] J.V. Román, E.M. Cámara, J.G. Morera, S.M.J. Zafra, TASS 2014 - The challenge of aspect-based sentiment analysis, Procesamiento del Lenguaje Natural 54 (2015) 61-68.
[75] S. Rosenthal, P. Nakov, S. Kiritchenko, S. Mohammad, A. Ritter, V. Stoyanov, SemEval-2015 task 10: sentiment analysis in twitter, in: Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), 2015, pp. 451-
[76] S. Rosenthal, A. Ritter, P. Nakov, V. Stoyanov, SemEval-2014 task 9: sentiment analysis in twitter, in: Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland, 2014, pp. 73-80.
[77] D.A. Shamma, L. Kennedy, E.F. Churchill, Tweet the debates: understanding community annotation of uncollected sources, in: Proceedings of the First SIGMM Workshop on Social Media, in: WSM '09, ACM, New York, NY, USA, 2009, pp. 3-10.
[78] A. Sharma, S. Dey, A comparative study of feature selection and machine learning techniques for sentiment analysis, in: Proceedings of the 2012 ACM Research in Applied Computation Symposium, ACM, 2012, pp. 1-7.
[79] J. Sixto, A. Almeida, D. López-de Ipiña, Analysis of the structured information for subjectivity detection in twitter, Trans. Comput. Collect. Intell. XXIX (2018) 163-181.
[80] M. Speriosu, N. Sudan, S. Upadhyay, J. Baldridge, Twitter polarity classification with label propagation over lexical links and the follower graph, in: Proceedings of the Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2011, pp. 53-56.
[81] J.F. Sánchez-Rada, C.A. Iglesias, Onyx: a linked data approach to emotion representation, Inf. Process. Manag. 52 (1) (2016) 99-114.
[82] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, M. Stede, Lexicon-based methods for sentiment analysis, Comput. Linguist. 37 (2) (2011) 267-307.
[83] C. Tan, L. Lee, J. Tang, L. Jiang, M. Zhou, P. Li, User-level sentiment analysis incorporating social networks, in: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, in: KDD '11, ACM, New York, NY, USA, 2011, pp. 1397-1405.
[84] J. Tang, Y. Chang, H. Liu, Mining social media with social theories: a survey, SIGKDD Explor. Newsl. 15 (2) (2014) 20-29.
[85] J. Tang, H. Gao, H. Liu, mTrust: discerning multi-faceted trust in a connected world, in: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining - WSDM '12, ACM Press, Seattle, Washington, USA, 2012, p. 93.
[86] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, Q. Mei, Line: Large-scale information network embedding, in: Proceedings of the 24th International Conference on World Wide Web, International World Wide Web Conferences Steering Committee, 2015, pp. 1067-1077.
[87] A. Tommasel, D. Godoy, A social-aware online short-text feature selection technique for social media, Inf. Fusion 40 (2018) 1-17.
[88] S. Volkova, T. Wilson, D. Yarowsky, Exploring demographic language variations to improve multilingual sentiment analysis in social media, in: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013, pp. 1815-1827.
[89] S. Volkova, Predicting Demographics and Affect in Social Networks, Johns Hopkins University, Baltimore, Maryland, 2015 Ph.D. thesis.
[90] S. Wang, C.D. Manning, Baselines and bigrams: simple, good sentiment and topic classification, in: Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2, in: ACL '12, Association for Computational Linguistics, Stroudsburg, PA, USA, 2012, pp. 90-94.
[91] W. Wei, J.A. Gulla, Sentiment learning on product reviews via sentiment ontology tree, in: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, 2010, pp. 404-413.
[92] R. West, H.S. Paskov, J. Leskovec, C. Potts, Exploiting social network structure for person-to-person sentiment analysis, CoRR (2014) abs/1409.2450.
[93] F. Wu, J. Shu, Y. Huang, Z. Yuan, Co-detecting social spammers and spam messages in microblogging via exploiting social contexts, Neurocomputing 201 (2016) 51-
[94] R. Xia, C. Zong, Exploring the use of word relation features for sentiment classification, in: Proceedings of the 23rd International Conference on Computational Linguistics: Posters, in: COLING '10, Association for Computational Linguistics, Stroudsburg, PA, USA, 2010, pp. 1336-1344.
[95] Z. Xiaomei, Y. Jing, Z. Jianpei, H. Hongyu, Microblog sentiment analysis with weak dependency connections, Know.-Based Syst. 142 (2018) 170-180.
[96] J. Yang, J. Leskovec, Patterns of temporal variation in online media, in: Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, ACM, 2011, pp. 177-186.
[97] Y. Yang, J. Eisenstein, Overcoming language variation in sentiment analysis with social attention, Trans. Assoc. Comput. Linguist. 5 (2017) 295-307.