Natural Language Processing - Lecture Slides | IST 664, Assignments of Information Technology

Material Type: Assignment; Class: Natural Language Processing; Subject: Information Studies; University: Syracuse University; Term: Unknown 2001;


Uploaded on 08/09/2009

Natural Language Processing
IST 664 / CIS 563
Nancy McCracken
using materials developed in previous courses
by Liz Liddy and others

Natural Language Processing (NLP)

  • A range of computational techniques
      - for analyzing and representing naturally occurring texts
      - at one or more levels of linguistic analysis
      - for the purpose of achieving human-like language processing
      - for a range of particular tasks or applications
  • Computational Linguistics: doing linguistics on computers
      - Closely related, often treated as synonymous with NLP

Where is NLP now?

  • Goals can be far-reaching
      - True text understanding
      - Reasoning about knowledge in text
      - Real-time participation in spoken dialogs
  • Or very down-to-earth
      - Finding the price of products on the web
      - Context-sensitive spell-checking
      - Analyzing authorship or opinions statistically
      - Extracting facts or relations from documents
  • Currently, NLP is providing these practical applications (yet still dreaming of the AI goals)


Need for NLP

  • Huge amounts of data
      - Internet = at least 20 billion pages
      - Intranet
  • Applications for processing large amounts of text require NLP expertise
      - Classify text into categories
      - Index and search large texts
      - Automatic translation of web documents in different languages
      - Speech understanding
          Understand phone conversations
      - Information extraction
          Extract useful information from resumes
      - Automatic summarization
          Condense 1 book into 1 page
      - Question answering
      - Knowledge acquisition
      - Text generation / dialogues

Natural Language Processing’s Mixed Lineage

  • Linguistics
      - concerned with formal, structural models of language
      - goal is the discovery of language universals
      - not concerned with computational effectiveness of their models
  • Computer Science
      - concerned with developing internal representations of data
      - emphasis on efficient processing of these structures

Natural Language Processing’s Mixed Lineage

  • Cognitive Psychology
      - concerned with modeling the use of language in a psychologically plausible way
      - language as a vehicle for studying human cognition
  • Artificial Intelligence
      - interested in development of a computational theory of human language capacity and processing
  • Statistics
      - frequencies, probabilities for detecting linguistic patterns

Synchronic Model of Language

Pragmatic
Discourse
Semantic
Syntactic
Lexical
Morphological
Phonetic

Synchronic Model of Language

  • The more exterior the level of language processing:
      - The larger the unit of analysis
          phoneme -> morpheme -> word -> sentence -> text -> world
      - The less precise the language phenomena
      - The more free choice & variability
          less rule-oriented, more exceptions
          just regularities
      - The more levels it presumes a knowledge of or reliance on
      - Theories used to explain the data move more into the areas of cognitive psychology and AI
  • Lower levels of the model have been more thoroughly investigated and incorporated into NLP systems

Morphological Analysis

  • deals with the componential nature of lexical entities:

        pre    -   registra   -   tion
      prefix      stem/root      suffix

  • What features do inflections reveal?
      - Verbs: tense & number
      - Nouns: singular/plural
      - Adjectives: comparison features
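
The prefix / stem / suffix split of "preregistration" can be illustrated with a small affix-stripping sketch. This is only a toy under stated assumptions: the affix lists `PREFIXES` and `SUFFIXES`, the function name `segment`, and the minimum stem length are all hypothetical and not taken from the slides. Real morphological analyzers rely on a lexicon and spelling rules.

```python
# Hypothetical toy affix lists; a real analyzer would use a full lexicon.
PREFIXES = ("pre", "un", "re")
SUFFIXES = ("tion", "ing", "ed", "s")


def segment(word, min_stem=3):
    """Split a word into (prefix, stem, suffix) by stripping known affixes.

    An affix is stripped only if at least `min_stem` characters remain.
    """
    prefix = suffix = ""
    stem = word
    # Try longer prefixes first so the longest match wins.
    for p in sorted(PREFIXES, key=len, reverse=True):
        if stem.startswith(p) and len(stem) - len(p) >= min_stem:
            prefix, stem = p, stem[len(p):]
            break
    # Same idea for suffixes.
    for s in sorted(SUFFIXES, key=len, reverse=True):
        if stem.endswith(s) and len(stem) - len(s) >= min_stem:
            suffix, stem = s, stem[:-len(s)]
            break
    return prefix, stem, suffix


print(segment("preregistration"))  # ('pre', 'registra', 'tion')
```

Blind stripping also over-segments (for example, it would peel "re" off "registration"), which is one reason morphological analysis needs knowledge of valid stems.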

Lexical 1.

Part-of-speech (POS) tagging

03/14/1999 (AFP) … the extremist Harkatul Jihad group, reportedly backed by Saudi dissident Osama bin Laden...

… the|DT extremist|JJ Harkatul_Jihad|NP group|NN reportedly|RB backed|VBD by|IN Saudi|NP dissident|NN Osama_bin_Laden|NP
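
The `word|TAG` notation in the tagged example is easy to read back into a program. The sketch below assumes that format exactly as it appears on the slide (tokens separated by spaces, underscores joining multi-word names); the function name `parse_tagged` is hypothetical.

```python
def parse_tagged(text):
    """Turn 'word|TAG word|TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for item in text.split():
        word, sep, tag = item.rpartition("|")
        if not sep:
            continue  # skip items that carry no tag, such as a leading "…"
        # Underscores join multi-word names in the slide's notation.
        pairs.append((word.replace("_", " "), tag))
    return pairs


tagged = "the|DT extremist|JJ Harkatul_Jihad|NP group|NN reportedly|RB backed|VBD"
for word, tag in parse_tagged(tagged):
    print(f"{word:15} {tag}")
```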

Productive rules which explain how new words are formed

highchair
egghead
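
One productive rule behind words like these is compounding: two existing words joined into a new one. A minimal sketch of recognizing such compounds follows, assuming a hypothetical toy vocabulary and a hypothetical helper `split_compound`; it only checks whether a word divides into two known words and says nothing about the meaning of the result.

```python
def split_compound(word, vocabulary):
    """Return (left, right) if the word is two known words joined, else None."""
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if left in vocabulary and right in vocabulary:
            return left, right
    return None


vocab = {"high", "chair", "egg", "head"}  # hypothetical toy vocabulary
print(split_compound("highchair", vocab))
print(split_compound("egghead", vocab))
```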

Lexical (Lexico-Semantics)

Word Level Meaning

--------------------------------------------------------------------------
(launch)
(a large, usually motor-driven boat used for carrying people on rivers, lakes, harbors, etc.)

((CLASS BOAT) (PROPERTIES (LARGE) (PURPOSE (PREDICATION (CLASS CARRY) (OBJECT PEOPLE)))))
---------------------------------------------------------------------------

requires a large, well-organized lexicon
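
The parenthesized meaning representation for "launch" is a nested list structure, so a program can read it with a short recursive parser. The sketch below assumes balanced parentheses (one closing parenthesis is added to the slide's text to balance it) and uses hypothetical function names.

```python
def tokenize(text):
    """Split a parenthesized expression into '(' , ')' and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume tokens from the front and return one nested list or atom."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    return token


def read_entry(text):
    return parse(tokenize(text))


launch = read_entry(
    "((CLASS BOAT) (PROPERTIES (LARGE) "
    "(PURPOSE (PREDICATION (CLASS CARRY) (OBJECT PEOPLE)))))"
)
print(launch[0])  # ['CLASS', 'BOAT']
```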

Sample Lexical Entry Data Elements

Headword
Pronunciation
Part of Speech
Inflection
Inflectional Pronunciation
Definition - Sense 1
Usage
Etymology
Grammatical Code
Idiom Definition
National Varieties (Irish English)
Region (South)
Level & Attitude (Derogatory)
Time & Frequency (Obsolete)
Semantic Restrictions (Animate, Liquid)
    On subject of verb
    On object of verb
Subject Field (Agriculture, Economics)
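
A lexical entry with data elements like these maps naturally onto a record type. The sketch below is one possible layout, not the format of any particular dictionary: the class name, field names, and the choice of which elements to include are assumptions. The sample values reuse the "launch" definition from the lexico-semantics slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LexicalEntry:
    """Hypothetical record holding a subset of the listed data elements."""
    headword: str
    pronunciation: Optional[str] = None
    part_of_speech: Optional[str] = None
    inflections: List[str] = field(default_factory=list)
    definitions: List[str] = field(default_factory=list)  # one string per sense
    usage: Optional[str] = None
    etymology: Optional[str] = None
    grammatical_code: Optional[str] = None
    region: Optional[str] = None              # e.g. "South"
    level_and_attitude: Optional[str] = None  # e.g. "Derogatory"
    time_and_frequency: Optional[str] = None  # e.g. "Obsolete"
    subject_field: Optional[str] = None       # e.g. "Agriculture"


entry = LexicalEntry(
    headword="launch",
    part_of_speech="noun",
    definitions=[
        "a large, usually motor-driven boat used for carrying people "
        "on rivers, lakes, harbors, etc."
    ],
)
print(entry.headword, entry.part_of_speech)
```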

Bracketed text

[S [NP the [NP glorious sun]]
   [VP [VP will shine]
       [PP in [NP the [NP winter]]]]]
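
Bracketed text is convenient because a program can rebuild the tree from it directly. The sketch below assumes the notation shown on this slide (a label right after each opening bracket, then words or nested brackets); the function names `parse_brackets` and `leaves` are hypothetical.

```python
import re


def parse_brackets(text):
    """Parse '[LABEL child child ...]' into nested (label, children) tuples."""
    tokens = re.findall(r"\[|\]|[^\s\[\]]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                  # skip "["
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())      # nested constituent
            else:
                children.append(tokens[pos])  # plain word
                pos += 1
        pos += 1                  # skip "]"
        return (label, children)

    return node()


def leaves(tree):
    """Collect the words of the tree from left to right."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


tree = parse_brackets(
    "[S [NP the [NP glorious sun]] "
    "[VP [VP will shine] [PP in [NP the [NP winter]]]]]"
)
print(tree[0])                # S
print(" ".join(leaves(tree)))  # the glorious sun will shine in the winter
```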

Nested Boxes

[Figure: the sentence "the glorious sun will shine in the winter" drawn as nested boxes labeled S, NP, VP and PP]

Tree Structure

[Figure: parse tree for "the glorious sun will shine in the winter" with nodes S, NP, VP, PP, Determiner, Adjective, Noun, Aux, Verb and Prep]

Sentence
    Noun Phrase
        Determiner    The
        Noun          cat
    Verb Phrase
        Verb          ate
        Noun Phrase
            Determiner    the
            Noun          mouse

Parsing a sentence using simple phrase structure rules

The phrase structure rules underlying this analysis are as follows:

Sentence -> Noun Phrase  Verb Phrase
Noun Phrase -> Determiner  Noun
Verb Phrase -> Verb  Noun Phrase

Determiner = The
Noun = cat
Noun = mouse
Verb = ate
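
These rules are small enough to run as a program. The following is a minimal top-down parser sketch, assuming each phrase has exactly one expansion (as in the rules above), so no backtracking is needed. The names `GRAMMAR`, `LEXICON`, `parse` and `parse_sentence` are hypothetical, and word lookup ignores case so that "The" and "the" both count as determiners.

```python
# Phrase structure rules: each phrase expands to a fixed sequence of parts.
GRAMMAR = {
    "Sentence": ["Noun Phrase", "Verb Phrase"],
    "Noun Phrase": ["Determiner", "Noun"],
    "Verb Phrase": ["Verb", "Noun Phrase"],
}

# Lexical rules: which words belong to each word class (lowercased).
LEXICON = {
    "Determiner": {"the"},
    "Noun": {"cat", "mouse"},
    "Verb": {"ate"},
}


def parse(symbol, words, start=0):
    """Try to parse `symbol` at words[start:].

    Returns (tree, next_position) on success, or None on failure.
    """
    if symbol in LEXICON:
        if start < len(words) and words[start].lower() in LEXICON[symbol]:
            return (symbol, words[start]), start + 1
        return None
    children = []
    pos = start
    for part in GRAMMAR[symbol]:
        result = parse(part, words, pos)
        if result is None:
            return None
        subtree, pos = result
        children.append(subtree)
    return (symbol, children), pos


def parse_sentence(text):
    """Parse a whole sentence; every word must be consumed."""
    words = text.split()
    result = parse("Sentence", words)
    if result is None or result[1] != len(words):
        return None
    return result[0]


print(parse_sentence("The cat ate the mouse"))
```

A grammar with alternative expansions for the same phrase would need backtracking or a chart parser instead of this single-pass descent.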