Feature Extraction and Inner Product Representation in Tree Structures - Prof. Dan Roth, Study notes of Computer Science

These notes cover methods for feature extraction and representation in tree structures, including feature vectors, the all-subtrees representation, and inner products. They introduce subtrees, non-terminal and terminal symbols, and infinite sets of features and sub-fragments, and they explain why inner products should be computed efficiently using dynamic programming.

Typology: Study notes

Pre 2010

Uploaded on 03/16/2009


Features

A “feature” is a function on a structure, e.g.,

$h(x)$ = number of times the fragment A → B C (a node A with children B and C) is seen in $x$

$T_1$ = (A (B (D d) (E e)) (C (F f) (G g)))

$T_2$ = (A (B (D d) (E e)) (C (F h) (A (B b) (C c))))

$h(T_1) = 1$

$h(T_2) = 2$
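The counting feature above can be sketched in a few lines of Python. This is only an illustration: the nested-tuple encoding of trees and the helper names `label` and `count_fragment` are assumptions made for this sketch, not part of the notes.

```python
# Trees as nested tuples: (label, child, child, ...); a bare string is a terminal.
T1 = ("A", ("B", ("D", "d"), ("E", "e")), ("C", ("F", "f"), ("G", "g")))
T2 = ("A", ("B", ("D", "d"), ("E", "e")),
           ("C", ("F", "h"), ("A", ("B", "b"), ("C", "c"))))

def label(node):
    """Return the symbol at a node (terminals are plain strings)."""
    return node if isinstance(node, str) else node[0]

def count_fragment(tree, parent, children):
    """Count nodes labelled `parent` whose child labels equal `children`."""
    if isinstance(tree, str):  # terminals have no children
        return 0
    here = int(tree[0] == parent and
               tuple(label(c) for c in tree[1:]) == tuple(children))
    return here + sum(count_fragment(c, parent, children) for c in tree[1:])

def h(x):
    """The feature from the notes: occurrences of A -> B C in x."""
    return count_fragment(x, "A", ("B", "C"))

print(h(T1), h(T2))  # 1 2
```

The second tree scores 2 because A → B C occurs once at the root and once more below C.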


Feature Vectors

A set of functions $h_1, \ldots, h_d$ define a feature vector

$\Phi(x) = \langle h_1(x), h_2(x), \ldots, h_d(x) \rangle$

$T_1$ = (A (B (D d) (E e)) (C (F f) (G g)))

$T_2$ = (A (B (D d) (E e)) (C (F h) (A (B b) (C c))))

$\Phi(T_1) = \langle \ldots \rangle$

$\Phi(T_2) = \langle \ldots \rangle$
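As a rough sketch of how a list of feature functions yields a feature vector, the Python below stacks three counting features into `phi`. The three features chosen here are hypothetical and serve only as an illustration; they are not the feature set used on the slide, so the printed vectors are not the slide's values. The tuple encoding of trees is likewise an assumption.

```python
# Trees as nested tuples: (label, child, child, ...); a bare string is a terminal.
T1 = ("A", ("B", ("D", "d"), ("E", "e")), ("C", ("F", "f"), ("G", "g")))
T2 = ("A", ("B", ("D", "d"), ("E", "e")),
           ("C", ("F", "h"), ("A", ("B", "b"), ("C", "c"))))

def productions(tree):
    """Yield (parent, child_labels) for every internal node of the tree."""
    if isinstance(tree, str):
        return
    yield tree[0], tuple(c if isinstance(c, str) else c[0] for c in tree[1:])
    for child in tree[1:]:
        yield from productions(child)

def make_feature(parent, children):
    """Build one h_i: the number of times parent -> children occurs in x."""
    target = (parent, tuple(children))
    return lambda x: sum(1 for p in productions(x) if p == target)

# Hypothetical feature set h_1, h_2, h_3 (an assumption for illustration).
FEATURES = [
    make_feature("A", ("B", "C")),
    make_feature("B", ("D", "E")),
    make_feature("C", ("F", "G")),
]

def phi(x):
    """Feature vector <h_1(x), ..., h_d(x)> as a plain list."""
    return [h(x) for h in FEATURES]

print(phi(T1))  # [1, 1, 1]
print(phi(T2))  # [2, 1, 0]
```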

All Sub-fragments for Tagged Sequences

Given: State symbols $\{S, C, N\}$

Terminal symbols $\{a, b, c, \ldots\}$

An infinite set of sub-fragments

[Figure: example sub-fragments, including a state S with terminal a below it, the state pair S-C, and S-C with terminal b attached]

An infinite set of features, e.g.,

$h_3(x)$ = number of times the sub-fragment S-C with terminal b attached is seen in $x$
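A sub-fragment feature on a tagged sequence can be sketched as a sliding-window match. Everything concrete here is an assumption made for illustration: the encoding of a tagged sequence as `(state, terminal)` pairs, the example sequence `x`, and in particular the choice to hang the terminal b under C in the last fragment.

```python
def count_fragment(tagged, fragment):
    """Count positions where `fragment` matches a contiguous run of `tagged`.

    tagged:   list of (state, terminal) pairs
    fragment: list of (state, terminal_or_None); None leaves the terminal open
    """
    n, k = len(tagged), len(fragment)
    total = 0
    for start in range(n - k + 1):
        window = tagged[start:start + k]
        if all(s == fs and (ft is None or t == ft)
               for (s, t), (fs, ft) in zip(window, fragment)):
            total += 1
    return total

# Hypothetical tagged sequence over states {S, C, N} and terminals {a, b, c}.
x = [("S", "a"), ("C", "b"), ("N", "c"), ("S", "b"), ("C", "c")]

print(count_fragment(x, [("S", "a")]))                # 1  (S with terminal a)
print(count_fragment(x, [("S", None), ("C", None)]))  # 2  (S-C, terminals open)
print(count_fragment(x, [("S", None), ("C", "b")]))   # 1  (S-C with b under C)
```

Because fragments may span any number of consecutive states and fix any subset of the terminals, the set of such features is unbounded, which matches the "infinite set" described above.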

Inner Products

$\Phi(x) = \langle h_1(x), h_2(x), \ldots, h_d(x) \rangle$

Inner product (“Kernel”) between two structures $T_1$ and $T_2$:

$\Phi(T_1) \cdot \Phi(T_2) = \sum_{i=1}^{d} h_i(T_1)\, h_i(T_2)$

$T_1$ = (A (B (D d) (E e)) (C (F f) (G g)))

$T_2$ = (A (B (D d) (E e)) (C (F h) (A (B b) (C c))))

$\Phi(T_1) = \langle \ldots \rangle$

$\Phi(T_2) = \langle \ldots \rangle$

$\Phi(T_1) \cdot \Phi(T_2) = \ldots$
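The sum over features can be sketched with sparse vectors. As an assumption for illustration, the feature set here has one counting feature per distinct production (parent with its child labels); the slide's own feature set is not recoverable, so the numbers printed below belong to this sketch only.

```python
from collections import Counter

# Trees as nested tuples: (label, child, child, ...); a bare string is a terminal.
T1 = ("A", ("B", ("D", "d"), ("E", "e")), ("C", ("F", "f"), ("G", "g")))
T2 = ("A", ("B", ("D", "d"), ("E", "e")),
           ("C", ("F", "h"), ("A", ("B", "b"), ("C", "c"))))

def productions(tree):
    """Yield (parent, child_labels) for every internal node of the tree."""
    if isinstance(tree, str):
        return
    yield tree[0], tuple(c if isinstance(c, str) else c[0] for c in tree[1:])
    for child in tree[1:]:
        yield from productions(child)

def phi(x):
    """Sparse feature vector: one count per distinct production in x."""
    return Counter(productions(x))

def kernel(x1, x2):
    """Inner product sum_i h_i(x1) * h_i(x2); absent features count as 0."""
    p1, p2 = phi(x1), phi(x2)
    return sum(count * p2[feature] for feature, count in p1.items())

print(kernel(T1, T2))  # 5
print(kernel(T1, T1))  # 7
```

Only features that fire in both trees contribute, so the sum never has to enumerate all $d$ dimensions explicitly.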

All Subtrees Representation

$\Phi$ is now huge.

But the inner product $\Phi(T_1) \cdot \Phi(T_2)$ can be computed efficiently using dynamic programming.
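The notes do not spell out the recursion, so the sketch below uses one well-known formulation, usually attributed to Collins and Duffy, as a stand-in. It assumes that a "subtree" is a fragment in which every included node keeps either all or none of its children, and that terminal and non-terminal symbols are disjoint. The tuple encoding and the names `common` and `tree_kernel` are choices made for this sketch.

```python
from functools import lru_cache

# Trees as nested tuples: (label, child, child, ...); a bare string is a terminal.
T1 = ("A", ("B", ("D", "d"), ("E", "e")), ("C", ("F", "f"), ("G", "g")))
T2 = ("A", ("B", ("D", "d"), ("E", "e")),
           ("C", ("F", "h"), ("A", ("B", "b"), ("C", "c"))))

def internal_nodes(tree):
    """List every non-terminal node of the tree."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes

def production(node):
    """The node's label together with the labels of its children."""
    return node[0], tuple(c if isinstance(c, str) else c[0] for c in node[1:])

@lru_cache(maxsize=None)  # memo table: each node pair is solved once
def common(n1, n2):
    """Number of common subtrees rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0
    result = 1
    for c1, c2 in zip(n1[1:], n2[1:]):
        # A non-terminal child is either left unexpanded (the 1) or expanded
        # into any subtree shared by the two children.
        if not isinstance(c1, str) and not isinstance(c2, str):
            result *= 1 + common(c1, c2)
    return result

def tree_kernel(t1, t2):
    """Inner product in the all-subtrees space, without building Phi."""
    return sum(common(a, b)
               for a in internal_nodes(t1) for b in internal_nodes(t2))

print(tree_kernel(T1, T2))  # 12
print(tree_kernel(T1, T1))  # 37
```

With memoization the work is bounded by the number of node pairs rather than by the number of distinct subtrees, which is what makes the huge $\Phi$ manageable under these assumptions.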