Re: search engine problems...

From: Nikolai Chuvakhin (nc_at_iname.com)
Date: 01/26/04


Date: 26 Jan 2004 10:38:34 -0800


"Frank" <frank.sonck@pandora.be> wrote in message
   news:<Yy6Rb.436$k_2.43025551@hebe.telenet-ops.be>...
>
> I'm running a site with more than 20,000 articles. The articles (HTML
> files) are saved on the server as txt files. All other data (author, date,
> category and so on) are in a MySQL db. Previously we stored the articles in
> the db as well and ran SQL queries against it for the search engine. But
> this is no longer feasible since there are too many articles and the db has
> gotten too big. The search engine scans the whole db and the server CPU
> maxes out.

???

Put the articles back into the database and index the database properly.
In particular, consider using FULLTEXT indexes:

http://www.mysql.com/doc/en/Fulltext_Search.html
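
A minimal sketch of what that could look like, assuming a hypothetical
`articles` table with `id`, `title` and `body` columns (these names are
illustrative and do not come from the original post). At the time of this
thread, FULLTEXT indexes were available only on MyISAM tables:

```sql
-- Assumed schema: articles(id, title, body, ...), stored as MyISAM.
-- Build one FULLTEXT index covering the searchable text columns.
ALTER TABLE articles
  ADD FULLTEXT INDEX ft_title_body (title, body);

-- Natural-language search, best matches first.
-- The column list in MATCH() must be the same as in the index definition.
SELECT id,
       title,
       MATCH (title, body) AGAINST ('search engine') AS score
FROM articles
WHERE MATCH (title, body) AGAINST ('search engine')
ORDER BY score DESC
LIMIT 20;
```

MySQL's full-text search skips words on its built-in stopword list and
words shorter than the `ft_min_word_len` setting, which probably covers
much of the requested "blacklist" without any extra code.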

Also, are you sure your hardware is adequate?

> I'm looking for a PHP-based search engine that automatically indexes the
> txt files and produces one index file with all indexed words plus the IDs
> of the articles containing those words. That way the search script doesn't
> have to query all the articles (the whole db) anymore, just this one index
> file. It would also be nice to have a blacklist of words (the, a, ...) and
> other admin features.

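The performance of this "solution" is likely to be even worse than
the performance of the problematic database: every search would have
to read and parse that index file in PHP, which is the job a properly
built database index already does for you.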

Cheers,
NC


