


This document gives the revised pseudo code for the non-probabilistic CKY algorithm, a parsing algorithm used in natural language processing to identify the constituents of a sentence. It also works through an example, applying the algorithm to the sentence 'Snow in Oslo snores' with the provided grammar. The chart built during the algorithm's execution is shown step by step, illustrating how new constituents are built from existing ones.
Typology: Lab Reports



Ling 472 Lab, November 5, 2004
Revised pseudo code for the (non-probabilistic) CKY algorithm:
Create and clear chart[#words, #words]
Step through the (non-probabilistic) CKY algorithm, using this grammar:
S → NP VP        S → Aux S
VP → V S         VP → V NP        VP → VP PP
NP → Det N       NP → NP PP
NP → Waikiki     NP → Oslo        NP → Kim        NP → snow
PP → P NP        PP → P S
V → adores       VP → snores
Aux → does       Aux → can        Aux → is
P → in           P → on           P → before
Det → this       Det → these      Det → the
Use this sentence:
Snow  in  Oslo  snores
1     2   3     4
First, start out with a chart with the appropriate cells. Each one corresponds to a substring of the input string:
The first loop:
for i ← 1 to #words
    chart[i, i] ← {α | α → input_i}
This fills in the chart with pre-terminals.
So we go through i = 1 to i = 4. For each i, we add to cell i,i every pre-terminal that expands to word i of the input. We end up with a chart that looks like this:
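The first loop can be sketched in Python as follows. This is a minimal sketch, not the lab's own code: the names `LEXICON` and `init_chart` are hypothetical, the chart is stored as a dictionary keyed by 1-based (begin, end) pairs, and input words are lowercased before lookup (an assumption, so that "Snow" and "Oslo" match the grammar's entries).

```python
# Lexical rules from the lab grammar, stored as word -> set of categories.
# Keys are lowercased (an assumption made for this sketch).
LEXICON = {
    "waikiki": {"NP"}, "oslo": {"NP"}, "kim": {"NP"}, "snow": {"NP"},
    "adores": {"V"}, "snores": {"VP"},
    "does": {"Aux"}, "can": {"Aux"}, "is": {"Aux"},
    "in": {"P"}, "on": {"P"}, "before": {"P"},
    "this": {"Det"}, "these": {"Det"}, "the": {"Det"},
}


def init_chart(words):
    """Fill the diagonal cells (i, i) with every category that expands to word i."""
    chart = {}
    for i, word in enumerate(words, start=1):
        # Copy the set so later updates to the chart never touch LEXICON.
        chart[(i, i)] = set(LEXICON.get(word.lower(), set()))
    return chart


chart = init_chart("Snow in Oslo snores".split())
for i in range(1, 5):
    print((i, i), sorted(chart[(i, i)]))
# (1, 1) ['NP']
# (2, 2) ['P']
# (3, 3) ['NP']
# (4, 4) ['VP']
```

Note that "snores" goes in as VP rather than V, because the grammar lists VP → snores directly.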
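In the next set of nested loops, we build new constituents out of existing ones. Each time we execute the innermost loop, we are looking at two potential daughters and seeing if they form a constituent. If they do, we add that constituent to the appropriate place in the chart. The loops use the variables span (the length of the constituent being built), begin and end (the first and last word it covers), and m (the last word of the left daughter, so the right daughter starts at m + 1).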
In the final iteration, we're building constituents of length 4, so span will be 4. begin can only be 1. end can only be 4. m can range from 1 to 3.
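The ranges of these variables can be checked with a short Python sketch. The loop bounds below are an assumption reconstructed from the description above (the helper name `split_points` is hypothetical): the left daughter covers begin..m and the right daughter covers m+1..end.

```python
def split_points(n_words):
    """Yield (span, begin, end, m) for every pair of daughters the loops inspect."""
    for span in range(2, n_words + 1):               # length of the new constituent
        for begin in range(1, n_words - span + 2):   # first word it covers
            end = begin + span - 1                   # last word it covers
            for m in range(begin, end):              # last word of the left daughter
                yield span, begin, end, m


# Keep only the final iteration for a four-word sentence.
final = [t for t in split_points(4) if t[0] == 4]
print(final)  # [(4, 1, 4, 1), (4, 1, 4, 2), (4, 1, 4, 3)]
```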
m = 1: we look at 1,1 (NP) and 2,4 (PP) and add NP to 1,4
m = 2: we look at 1,2 and don't find anything
m = 3: we look at 1,3 (NP) and 4,4 (VP) and add S to 1,4
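The same three checks can be replayed in Python. This is a sketch under assumptions: `BINARY` is a hypothetical lookup table built from the lab grammar's two-daughter rules, and the contents of cell 3,4 are worked out here from S → NP VP rather than taken from the trace.

```python
# Binary rules from the lab grammar, indexed by (left daughter, right daughter).
BINARY = {
    ("NP", "VP"): {"S"}, ("Aux", "S"): {"S"},
    ("V", "S"): {"VP"}, ("V", "NP"): {"VP"}, ("VP", "PP"): {"VP"},
    ("Det", "N"): {"NP"}, ("NP", "PP"): {"NP"},
    ("P", "NP"): {"PP"}, ("P", "S"): {"PP"},
}

# Cells needed for the final iteration (1-based, inclusive).
chart = {
    (1, 1): {"NP"}, (1, 2): set(), (1, 3): {"NP"},
    (2, 4): {"PP"}, (4, 4): {"VP"},
    (3, 4): {"S"},  # derived from S -> NP VP; not stated in the trace
}

begin, end = 1, 4
cell = set()
for m in range(begin, end):
    left, right = chart[(begin, m)], chart[(m + 1, end)]
    for b in left:
        for c in right:
            # Add every parent category that has daughters (b, c).
            cell |= BINARY.get((b, c), set())

print(sorted(cell))  # ['NP', 'S']
```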
We end up with this table:
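The whole chart can be recomputed with a compact recognizer. This is a minimal sketch of non-probabilistic CKY under stated assumptions, not the lab's own pseudo code: the names `LEXICON`, `BINARY`, and `cky` are hypothetical, cells are keyed by 1-based (begin, end) pairs, and words are lowercased before lookup.

```python
from collections import defaultdict

# Lexical rules of the lab grammar: word -> categories (keys lowercased).
LEXICON = {
    "waikiki": {"NP"}, "oslo": {"NP"}, "kim": {"NP"}, "snow": {"NP"},
    "adores": {"V"}, "snores": {"VP"},
    "does": {"Aux"}, "can": {"Aux"}, "is": {"Aux"},
    "in": {"P"}, "on": {"P"}, "before": {"P"},
    "this": {"Det"}, "these": {"Det"}, "the": {"Det"},
}

# Binary rules of the lab grammar: (left daughter, right daughter) -> parents.
BINARY = {
    ("NP", "VP"): {"S"}, ("Aux", "S"): {"S"},
    ("V", "S"): {"VP"}, ("V", "NP"): {"VP"}, ("VP", "PP"): {"VP"},
    ("Det", "N"): {"NP"}, ("NP", "PP"): {"NP"},
    ("P", "NP"): {"PP"}, ("P", "S"): {"PP"},
}


def cky(words):
    """Return the filled chart for a list of words."""
    n = len(words)
    chart = defaultdict(set)
    # First loop: pre-terminals on the diagonal.
    for i, word in enumerate(words, start=1):
        chart[(i, i)] |= LEXICON.get(word.lower(), set())
    # Nested loops: build longer constituents from pairs of shorter ones.
    for span in range(2, n + 1):
        for begin in range(1, n - span + 2):
            end = begin + span - 1
            for m in range(begin, end):
                for b in chart[(begin, m)]:
                    for c in chart[(m + 1, end)]:
                        chart[(begin, end)] |= BINARY.get((b, c), set())
    return chart


words = "Snow in Oslo snores".split()
chart = cky(words)
for begin in range(1, len(words) + 1):
    for end in range(begin, len(words) + 1):
        print(f"[{begin},{end}]", sorted(chart[(begin, end)]))
```

With this grammar, cell 1,4 comes out as NP and S. The S covering the whole input means the sentence is recognized, and the NP is the reading in which "in Oslo snores" is a PP built with PP → P S.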