Re: CYK & Context-Free Expressions



On Apr 28, 9:58 am, Tegiri Nenashi <TegiriNena...@xxxxxxxxx> wrote:
There is one CYK puzzle that I fail see a satisfactory answer to.
Transitive closure could be algebraically written many ways:

lim (1+T)^n

1 + T + T^2 + T^3 + ...

(1+T) (1+T)^2 (1+T)^4 ...

Correction:

(1+T) (1+T^2) (1+T^4) ...

corresponding to naive, seminaive, and other evaluation methods.
Applied to CYK, the fancier methods (like the third one in the above
list) fail. If we just evaluate binary powers of the reflective CYK
matrix and multiply them, we'll miss some derivations!

Here is a little more detailed exposure of the problem. Consider the
following grammar:

X = aa
Y = bX
Z = cY

Then given the word "cbaa", the CYK method starts with the following
matrix T:

[0 0 0 0 0]
[c 0 0 0 0]
[0 b 0 0 0]
[0 0 a 0 0]
[0 0 0 a 0]

where 0 denotes the empty set. Now the T^2 is:

[0 0 0 0 0]
[0 0 0 0 0]
[X 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]

What is the T^4? Well, assuming T^4 = T^2 T^2 it must evaluate to

[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]
[0 0 0 0 0]

Therefore, the expression

(1+T) (1+T^2) (1+T^4) ...

evaluates to

[1 0 0 0 0]
[c 1 0 0 0]
[X b 1 0 0]
[Y 0 a 1 0]
[0 0 0 a 1]

while the standard CYK procedure should derive

[0 0 0 0 0]
[c 0 0 0 0]
[X b 0 0 0]
[Y 0 a 0 0]
[Z 0 0 a 0]

The absence of the empty strings at the main diagonal is not
essential. The standard CYK method focuses on the values below the
diagonal only. What is important is the failure of the evaluation
(1+T) (1+T^2) (1+T^4) to derive the nonterminal Z for "cbaa".

Now given that optimization of the transitive closure fails to works,
it casts a doubt on faster than cube algorithms, particularly the
mentioned O(n^log(7)).

In the essence the problem reduces to the following. For ordinary
transitive closure we can split the path into any proportion. For
example path of length 4 could always be thought of as a concatenation
of the two paths of length 2. Contrast it to transitive closure in a
grammar where the only way to derive a word of length 4 may be
concatenating symbols of lengths 3 and 1.

It seems that in the grammar one have to evaluate higher power of the
CYK matrix like this:

A^4 = A A^3 + A^2 A^2 + A A^3

Needless to say that this is counter intuitive, and it even looks like
the matrix associativity law is broken!

.



Relevant Pages

  • Re: Does Valiant reduction make sense?
    ... CYK was married to LR-based technology by Tomita. ... already known as a context-free grammar. ... The corresponding context-free expression is ... the output for uv is an infinite set of parses whose ...
    (comp.theory)
  • Re: CYK & Context-Free Expressions
    ... CYK and Earley are, respectively, the dynamic programming algorithms ... I fail to inderstand the significance of Earley algorithm. ... and it doesn't introduce parse tree. ... Algebraic grammar methods school of thought emphasizes linear systems, ...
    (comp.theory)
  • Re: CYK & Context-Free Expressions
    ... going to Earley, however, let's stop at the CYK method for a moment. ... For each grammar a kind of universal CYK grammar formed along these ... Its complexity is O. ...
    (comp.theory)