Re: CYK & Context-Free Expressions
- From: Tegiri Nenashi <TegiriNenashi@xxxxxxxxx>
- Date: Mon, 28 Apr 2008 13:50:25 -0700 (PDT)
On Apr 28, 9:58 am, Tegiri Nenashi <TegiriNena...@xxxxxxxxx> wrote:
There is one CYK puzzle that I fail see a satisfactory answer to.
Transitive closure could be algebraically written many ways:
lim (1+T)^n
1 + T + T^2 + T^3 + ...
(1+T) (1+T)^2 (1+T)^4 ...
Correction:
(1+T) (1+T^2) (1+T^4) ...
corresponding to naive, seminaive, and other evaluation methods.
Applied to CYK, the fancier methods (like the third one in the above
list) fail. If we just evaluate binary powers of the reflective CYK
matrix and multiply them, we'll miss some derivations!
Here is a little more detailed exposure of the problem. Consider the
following grammar:
X = aa
Y = bX
Z = cY
Then given the word "cbaa", the CYK method starts with the following
matrix T:
[0 0 0 0 0]
[c 0 0 0 0]
[0 b 0 0 0]
[0 0 a 0 0]
[0 0 0 a 0]
where 0 denotes the empty set. Now the T^2 is:
[0 0 0 0 0]
[0 0 0 0 0]
[X 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
What is the T^4? Well, assuming T^4 = T^2 T^2 it must evaluate to
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
Therefore, the expression
(1+T) (1+T^2) (1+T^4) ...
evaluates to
[1 0 0 0 0]
[c 1 0 0 0]
[X b 1 0 0]
[Y 0 a 1 0]
[0 0 0 a 1]
while the standard CYK procedure should derive
[0 0 0 0 0]
[c 0 0 0 0]
[X b 0 0 0]
[Y 0 a 0 0]
[Z 0 0 a 0]
The absence of the empty strings at the main diagonal is not
essential. The standard CYK method focuses on the values below the
diagonal only. What is important is the failure of the evaluation
(1+T) (1+T^2) (1+T^4) to derive the nonterminal Z for "cbaa".
Now given that optimization of the transitive closure fails to works,
it casts a doubt on faster than cube algorithms, particularly the
mentioned O(n^log(7)).
In the essence the problem reduces to the following. For ordinary
transitive closure we can split the path into any proportion. For
example path of length 4 could always be thought of as a concatenation
of the two paths of length 2. Contrast it to transitive closure in a
grammar where the only way to derive a word of length 4 may be
concatenating symbols of lengths 3 and 1.
It seems that in the grammar one have to evaluate higher power of the
CYK matrix like this:
A^4 = A A^3 + A^2 A^2 + A A^3
Needless to say that this is counter intuitive, and it even looks like
the matrix associativity law is broken!
.
- References:
- CYK & Context-Free Expressions
- From: Rock Brentwood
- Re: CYK & Context-Free Expressions
- From: Tegiri Nenashi
- CYK & Context-Free Expressions
- Prev by Date: Re: CYK & Context-Free Expressions
- Next by Date: Re: Exercise 2.1.2 from Aho&Ulman The theory of Parsing textbook?
- Previous by thread: Re: CYK & Context-Free Expressions
- Next by thread: A question about a LL parser
- Index(es):
Relevant Pages
|