Re: search engine problems...

From: Nikolai Chuvakhin (nc_at_iname.com)
Date: 01/26/04


Date: 26 Jan 2004 10:38:34 -0800


"Frank" <frank.sonck@pandora.be> wrote in message
   news:<Yy6Rb.436$k_2.43025551@hebe.telenet-ops.be>...
>
> I'm running a site with 20,000+ articles. The articles (HTML files) are
> saved on the server as txt files. All other data (author, date, category
> and so on) are in a MySQL db. Before, we had the articles in the db as well
> and performed SQL queries for the search engine. But this is no longer
> feasible since there are too many articles and the db has gotten too big.
> The search engine scans the whole db and the server CPU maxes out.

???

Put the articles back into the database and index the database properly.
In particular, consider using FULLTEXT indexes:

http://www.mysql.com/doc/en/Fulltext_Search.html
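
A minimal sketch of what that could look like, assuming a hypothetical
`articles` table with `title` and `body` columns (these names are
illustrative and do not come from the original post) stored as MyISAM,
which FULLTEXT indexes required in MySQL at the time:

```sql
-- Hypothetical schema: articles(id, title, body, ...) using MyISAM.
-- Build one FULLTEXT index covering the columns to be searched.
ALTER TABLE articles ADD FULLTEXT INDEX ft_articles (title, body);

-- Natural-language search. The column list in MATCH() must be the
-- same list the index was built on. The search phrase is a placeholder.
SELECT id, title,
       MATCH (title, body) AGAINST ('search engine') AS score
FROM articles
WHERE MATCH (title, body) AGAINST ('search engine')
ORDER BY score DESC
LIMIT 20;
```

The query reads the index instead of scanning every article. MySQL's
full-text search also skips a built-in list of stopwords and any word
shorter than the `ft_min_word_len` setting, so very common words such
as "the" and "a" are already left out of the index.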

Also, are you sure your hardware is adequate?

> I'm looking for a PHP search engine that automatically indexes the txt
> files and produces one index file with all indexed words plus the IDs of
> the articles containing those words. That way the search script doesn't
> have to query all the articles (the whole db) anymore, just this one index
> file. It would also be nice to have a blacklist of words (the, a, ...) and
> other admin features.

The performance of this "solution" is going to be even worse than
the performance of the problematic database.

Cheers,
NC


