Deep Unordered Composition Rivals Syntactic Methods for Text Classification

Mohit Iyyer,^1 Varun Manjunatha,^1 Jordan Boyd-Graber,^2 Hal Daumé III^1

^1 University of Maryland, Department of Computer Science and UMIACS

^2 University of Colorado, Department of Computer Science

{miyyer,varunm,hal}@umiacs.umd.edu,[email protected]

Abstract

Many existing deep learning models for natural language processing tasks focus on learning the compositionality of their inputs, which requires many expensive computations. We present a simple deep neural network that competes with and, in some cases, outperforms such models on sentiment analysis and factoid question answering tasks while taking only a fraction of the training time. While our model is syntactically-ignorant, we show significant improvements over previous bag-of-words models by deepening our network and applying a novel variant of dropout. Moreover, our model performs better than syntactic models on datasets with high syntactic variance. We show that our model makes similar errors to syntactically-aware models, indicating that for the tasks we consider, nonlinearly transforming the input is more important than tailoring a network to incorporate word order and syntax.

1 Introduction

Vector space models for natural language processing (NLP) represent words using low-dimensional vectors called embeddings. To apply vector space models to sentences or documents, one must first select an appropriate composition function, which is a mathematical process for combining multiple words into a single vector.

Composition functions fall into two classes: unordered and syntactic. Unordered functions treat input texts as bags of word embeddings, while syntactic functions take word order and sentence structure into account. Previously published experimental results have shown that syntactic functions outperform unordered functions on many tasks (Socher et al., 2013b; Kalchbrenner and Blunsom, 2013). However, there is a tradeoff: syntactic functions require more training time than unordered composition functions and are prohibitively expensive in the case of huge datasets or limited computing resources. For example, the recursive neural network (Section 2) computes costly matrix/tensor products and nonlinearities at every node of a syntactic parse tree, which limits it to smaller datasets that can be reliably parsed.
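To make the "bag of word embeddings" idea concrete, here is a minimal sketch of one unordered composition function, the element-wise average. The toy vocabulary, its 3-dimensional vectors, and the names `average_composition` and `embed` are made up for illustration and do not come from the paper.

```python
def average_composition(vectors):
    """Unordered composition: element-wise mean of the word vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical embeddings for a toy vocabulary (values are invented).
TOY_EMBEDDINGS = {
    "dog": [1.0, 0.0, 2.0],
    "bites": [0.0, 3.0, 1.0],
    "man": [2.0, 0.0, 0.0],
}

def embed(tokens):
    """Look up the embedding of each token, preserving input order."""
    return [TOY_EMBEDDINGS[t] for t in tokens]

v1 = average_composition(embed(["dog", "bites", "man"]))
v2 = average_composition(embed(["man", "bites", "dog"]))

# The two sentences differ only in word order, so an unordered
# function maps them to the same vector.
print(v1 == v2)  # prints True
```

A syntactic composition function would, by contrast, be free to assign these two sentences different vectors, at the cost of needing a parse and more computation per input.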

We introduce a deep unordered model that obtains near state-of-the-art accuracies on a variety of sentence- and document-level tasks with just minutes of training time on an average laptop computer. This model, the deep averaging network (DAN), works in three simple steps:

1. take the vector average of the embeddings associated with an input sequence of tokens

2. pass that average through one or more feed-forward layers

3. perform (linear) classification on the final layer’s representation
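The three steps can be sketched as a forward pass in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the `tanh` nonlinearity, the softmax on top of the linear classifier, the layer sizes, and the random weights are all choices made here for the example.

```python
import math
import random

def average(vectors):
    """Step 1: element-wise mean of the token embeddings."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def affine(W, b, x):
    """Compute W x + b for a matrix W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def dan_forward(token_vectors, hidden_layers, W_out, b_out):
    h = average(token_vectors)                       # step 1
    for W, b in hidden_layers:                       # step 2
        h = [math.tanh(v) for v in affine(W, b, h)]  # tanh is an assumption
    return softmax(affine(W_out, b_out, h))          # step 3 (softmax assumed)

def random_matrix(rows, cols, rng):
    """Untrained placeholder weights, for demonstration only."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, hidden, k = 4, 5, 2  # embedding size, hidden size, number of classes
tokens = random_matrix(6, d, rng)  # six hypothetical token embeddings
layers = [
    (random_matrix(hidden, d, rng), [0.0] * hidden),
    (random_matrix(hidden, hidden, rng), [0.0] * hidden),
]
W_out, b_out = random_matrix(k, hidden, rng), [0.0] * k

probs = dan_forward(tokens, layers, W_out, b_out)
print(probs)  # a probability distribution over the k classes
```

Because the only operation that touches individual tokens is the average, the cost of the deeper layers does not grow with the length of the input.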

The model can be improved by applying a novel dropout-inspired regularizer: for each training instance, randomly drop some of the tokens’ embeddings before computing the average.
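A minimal sketch of this regularizer follows. It assumes each token is dropped independently with probability `drop_prob`; that parameter name, the independent-draw scheme, and the fallback used when every token happens to be dropped are assumptions of this example rather than details stated above.

```python
import random

def word_dropout_average(token_vectors, drop_prob, rng):
    """Average the embeddings that survive random token-level dropout.

    Intended for training instances only; at test time one would
    average all tokens.
    """
    # Assumption: each token is kept independently with prob. 1 - drop_prob.
    kept = [v for v in token_vectors if rng.random() >= drop_prob]
    if not kept:
        # Assumption: if every token was dropped, fall back to all of them
        # so the average is still defined.
        kept = token_vectors
    n = len(kept)
    return [sum(col) / n for col in zip(*kept)]

rng = random.Random(0)
sentence = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0], [9.0, 10.0, 11.0]]
noisy_average = word_dropout_average(sentence, 0.3, rng)
```

Each call draws a fresh random subset, so the same training instance yields slightly different averaged inputs across epochs.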

We evaluate DANs on sentiment analysis and factoid question answering tasks at both the sentence and document level in Section 4. Our model’s successes demonstrate that for these tasks, the choice of composition function is not as important as initializing with pretrained embeddings and using a deep network. Furthermore, DANs, unlike more complex composition functions, can be effectively trained on data that have high syntactic variance. A

Abstract

1 Introduction

2 Unordered vs. Syntactic Composition

4 Experiments

5 How Do DANs Work?

References