Re: New Logic
- From: "pemo" <usenetmeister@xxxxxxxxx>
- Date: Sun, 27 Nov 2005 10:31:11 -0000
"Spidey" <amalhashim@xxxxxxxxx> wrote in message
news:1133071695.210774.263050@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> What is the best solution for finding "the longest repeatedly occurring
> subsequence in a given paragraph"?
>
> ex:
> input:
> hello world is a good hello world. Is that the way it is
> a way.
> Output:
> hello world
>
It's reasonably difficult, because what you're really after is n-gram
frequency: you have to split the text into words, then count how often each
run of consecutive words occurs. In C that means a fair amount of tokenizing
(strtok and friends).
Have a look around some Computational Linguistics sites. Hint: there are
tools on the web that do this (not necessarily in C, or with source code,
though).
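
For what it's worth, here is a minimal brute-force sketch of the idea in C. It is not taken from any of those tools. The function name longest_repeated_words, the MAX_WORDS cap and the separator list are all made up for illustration. It assumes "sequence" means a run of whole words, compares words case-sensitively, treats punctuation purely as a separator, and does not let the two occurrences overlap. On a tie, the earliest run wins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORDS 1024              /* arbitrary cap chosen for this sketch */
#define SEPARATORS " \t\r\n.,;:!?\"()"  /* assumed word separators */

/* Hypothetical helper: finds the longest run of words that occurs at least
 * twice (non-overlapping) in text. Writes the run, space-separated, into out
 * (truncated at a word boundary if out is too small) and returns its length
 * in words, or 0 if nothing repeats. */
size_t longest_repeated_words(const char *text, char *out, size_t outsz)
{
    assert(text != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    /* strtok modifies its input, so tokenize a private copy. */
    size_t len = strlen(text);
    char *buf = malloc(len + 1);
    if (buf == NULL)
        return 0;
    memcpy(buf, text, len + 1);

    /* Step 1: split the text into words. */
    const char *words[MAX_WORDS];
    size_t n = 0;
    for (char *tok = strtok(buf, SEPARATORS);
         tok != NULL && n < MAX_WORDS;
         tok = strtok(NULL, SEPARATORS))
        words[n++] = tok;

    /* Step 2: for every pair of start positions i < j, count how many
     * consecutive words match. "i + k < j" keeps the two runs from
     * overlapping. */
    size_t best_len = 0, best_start = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            size_t k = 0;
            while (j + k < n && i + k < j &&
                   strcmp(words[i + k], words[j + k]) == 0)
                k++;
            if (k > best_len) {     /* strict '>' keeps the earliest run */
                best_len = k;
                best_start = i;
            }
        }
    }

    /* Step 3: join the winning words with single spaces. */
    size_t used = 0;
    for (size_t k = 0; k < best_len; k++) {
        size_t wl = strlen(words[best_start + k]);
        if (used + wl + (k > 0) + 1 > outsz)
            break;                  /* no room for another whole word */
        if (k > 0)
            out[used++] = ' ';
        memcpy(out + used, words[best_start + k], wl);
        used += wl;
    }
    out[used] = '\0';

    free(buf);
    return best_len;
}
```

Called on the paragraph from the question with a 64-byte output buffer, this returns 2 and leaves "hello world" in the buffer. Note that the result depends on the assumptions above: if you fold case first, "hello world. Is" and "hello world is" become the same three-word run. The pairwise comparison is quadratic in the number of words (worse in the worst case), which is fine for a paragraph; for large texts a suffix array or a hash table of n-grams would be the usual next step.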