Re: Automatic text tagging



On Apr 8, 10:05 pm, Bruno Barberi Gnecco
<brunobgDELETET...@xxxxxxxxxxxxxxxxxxxxx> wrote:
        I need to implement an automatic text tagging system. Any suggestions
of algorithms? I've used Bayesian classification with great success when the
categories are fixed and in small number, but in the case of tags I believe
it won't work very well (too few items per tag to train well). I'm also looking
for something more sophisticated than simply finding tags in text.

        Any pointers to papers, books or code are appreciated. Thanks a lot.

Do you mean part-of-speech tags (noun, verb, etc.)?

But these *are* a "fixed and small number of categories", aren't
they?

For a small training set, a very successful technique is to take the
context into account, namely the few words to the left and to the right
of the word being tagged. Work with probabilities. Once the context is
included, there are often choices with probability 1 (e.g. the words
"the" and "at"). These "ground" choices help choose the others.
