Re: Fwd: NUCULAR fielded text searchable indexing



aaron.watters@xxxxxxxxx writes:
...but it looks a little more akin to Solr than to Lucene. ...

I'm not sure, but I think nucular has aspects of both, since
it implements the search engine itself and also provides
XML and HTTP interfaces.

That sounds reasonable.

As a test I built an index with tens of millions of entries
using nucular, and most queries through CGI processes clocked
in at hundreds of milliseconds or better -- which is quite
acceptable for many purposes.

How many items did each query return? When I refer to large result
sets, I mean you often get queries that return 10k items or more (a
pretty small number: typing "python" into Google gets almost 30
million hits) and you need to actually examine each item, as opposed
to displaying ten at a time or something like that (e.g. you want to
present faceted results).
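
To make the faceting point concrete, here is a rough sketch in plain
Python. It is not nucular's API: facet_counts, the hits list, and the
field names are made up for illustration, with dicts standing in for
index entries. The point is only that the counts depend on every hit,
so the work grows with the size of the result set rather than with
the page size.

```python
from collections import Counter


def facet_counts(results, fields):
    """Count how often each value occurs per field across ALL results."""
    counts = {field: Counter() for field in fields}
    for item in results:  # every hit is touched, not just the first page
        for field in fields:
            value = item.get(field)
            if value is not None:
                counts[field][value] += 1
    return counts


# Hypothetical result set: plain dicts standing in for index entries.
hits = [
    {"lang": "python", "year": "2007"},
    {"lang": "python", "year": "2006"},
    {"lang": "perl", "year": "2007"},
]
facets = facet_counts(hits, ["lang", "year"])
```

With 10k or more hits per query, that inner loop is where the time
goes, which is why the per-item cost matters more than the lookup.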

So we're back to the perennial topic of parallelism in Python...

...Which is not such a big problem if you rely on disk caching
to provide the RAM access and use multiple processes to access
the indices.
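
A minimal sketch of that multi-process idea, under loud assumptions:
the index is split into plain text shard files (one entry per line),
and search_shard / parallel_search are hypothetical helpers, not
anything nucular provides. Each worker process just reads its shard
and leaves it to the OS page cache to keep hot shards in RAM.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def search_shard(term, path):
    """Scan one shard file and return the lines containing `term`."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if term in line]


def parallel_search(term, shard_paths, workers=4):
    """Search every shard in its own process and merge the hits."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Results come back in shard order; each shard is read by
        # whichever worker process picks it up.
        per_shard = pool.map(partial(search_shard, term), shard_paths)
        return [hit for hits in per_shard for hit in hits]
```

Call parallel_search from under an `if __name__ == "__main__":` guard,
since some platforms start worker processes by re-importing the main
module. Separate processes sidestep the interpreter lock, at the cost
of pickling each shard's hits back to the parent.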

Right, another helpful strategy might be to use a solid state disk:

http://www.newegg.com/Product/Product.aspx?Item=N82E16820147021