Initial Thoughts on Montezuma



Well, I saw the announcement of Montezuma on Planet Lisp (it's a
full-text search engine, a Lisp port of the Ruby port of Lucene), and
tried to install it; unfortunately due to the CLiki outage I wasn't able
to until this morning. But once CLiki was back up & running I very
quickly had it installed.

I've a PostgreSQL database of tasting notes on the beers I have
drunk; a Python web interface is available at
<http://latakia.dyndns.org/tasting-notes>. One of my rainy-day projects
is porting this over to Lisp, and so I figured that Montezuma might be
cool to evaluate as a search engine.

The API is mostly well thought-out, although I have a few quibbles.
You have one or more indices, each of which indexes a set of documents;
a document is a set of fields, and a field consists of two strings: a
name and a value. In effect, a document is just a hash table from field
names to values. When you search an index, you can constrain the search
to certain fields or search all of them.
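To make that data model concrete, here is a toy sketch in Python. It is
not Montezuma's API or implementation; the ToyIndex class, its methods,
and the sample beers are all made up for illustration. It only shows the
shape of the idea: documents are field-name/value maps, and a search can
be limited to some fields or run across all of them.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index (hypothetical, not Montezuma's design)."""

    def __init__(self):
        self.docs = []                    # document number -> dict of fields
        self.postings = defaultdict(set)  # (field, term) -> document numbers

    def add(self, doc):
        """Store a document (a dict of field name -> text); return its number."""
        doc_number = len(self.docs)
        self.docs.append(doc)
        for field, value in doc.items():
            # Naive tokenizer: lowercase and split on whitespace.
            for term in value.lower().split():
                self.postings[(field, term)].add(doc_number)
        return doc_number

    def search(self, term, fields=None):
        """Return matching document numbers; fields=None searches every field."""
        term = term.lower()
        hits = set()
        for (field, indexed_term), numbers in self.postings.items():
            if indexed_term == term and (fields is None or field in fields):
                hits |= numbers
        return sorted(hits)


index = ToyIndex()
index.add({"beer": "Sample Stout", "notes": "roasty imperial stout"})
index.add({"beer": "Imperial Pilsner", "notes": "crisp and hoppy"})

all_fields = index.search("imperial")                    # both documents
beer_only = index.search("imperial", fields={"beer"})    # only the pilsner
notes_only = index.search("imperial", fields={"notes"})  # only the stout
```

A real engine keeps far more per term (positions, frequencies, stored
field values), but the field-scoped lookup works along the same lines.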

It's pretty simple to set up; once it's installed you make an index by
instantiating montezuma:index. There are some keyword arguments, but
it's unclear which of them are actually required for a given task and
which merely improve efficiency (for example, what effect does
specifying :fields have?). Indices can be persistent, which is pretty
cool.

Adding items ('documents,' in Montezuma's parlance) is fairly easy; you
can create them as a simple list of conses (representing field-text
pairs), or you can go whole-hog and create a document, add fields to it,
then add the document to the index.

Searching returns a score and the number of each matching document;
given the number you can retrieve the fields you want. It's not clear
how the score is computed--it's not in a 0..1 range, but higher scores
are better.
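One plausible reason for scores outside 0..1, offered as a guess rather
than a description of Montezuma's internals: Lucene-family engines have
traditionally ranked with a TF-IDF-style formula, and raw scores of that
kind are not normalized. The Python sketch below is a simplified toy
(the function name, the exact formula, and the sample notes are all
assumptions) showing how such a score grows with term frequency and can
easily exceed 1.

```python
import math


def tf_idf_scores(docs, term):
    """Return (score, doc_number) pairs, best first, for a one-term query.

    Toy formula: score = tf * (1 + ln(N / df)), with no normalization.
    """
    term = term.lower()
    tokenized = [text.lower().split() for text in docs]
    containing = sum(1 for tokens in tokenized if term in tokens)
    if containing == 0:
        return []
    idf = 1.0 + math.log(len(docs) / containing)
    results = []
    for number, tokens in enumerate(tokenized):
        tf = tokens.count(term)
        if tf:
            results.append((tf * idf, number))
    return sorted(results, reverse=True)


notes = [
    "hoppy hoppy hoppy finish",
    "malty with a hoppy nose",
    "sweet and malty",
    "dry and roasty",
]
ranked = tf_idf_scores(notes, "hoppy")
best_score, best_doc = ranked[0]
```

Here the document that repeats the term ranks first, and its score is
well above 1; only the ordering of scores is meaningful, not their
absolute size.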

My sample size was not over-large, but searches seemed to be quite
speedy.

Sub-word searching (e.g. returning 'eggdrop' on a search for 'egg')
doesn't appear to be implemented, although it's possible that I'm
missing an option.
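This behaviour is what you'd expect from a term-based index: text is
split into whole terms, and a query term has to equal an indexed term.
Lucene's query syntax does have a trailing-wildcard form (egg*) for
prefix matches; whether Montezuma 0.1.1 accepts it is untested here. The
Python toy below (hypothetical names, made-up terms, not Montezuma code)
contrasts exact term lookup with a prefix scan over a sorted term list.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return every term starting with prefix, scanning a sorted term list."""
    start = bisect_left(sorted_terms, prefix)  # first term >= prefix
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can share the prefix
        matches.append(term)
    return matches


terms = sorted(["keg", "eggdrop", "egg", "eggnog", "dregs"])

exact = [t for t in terms if t == "egg"]  # whole-term match only
prefixed = prefix_matches(terms, "egg")   # also finds 'eggdrop', 'eggnog'
```

Note that a prefix scan still won't find 'egg' in the middle of a word;
true substring search usually needs a different index structure, such
as n-grams.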

All in all, for a 0.1.1 release Montezuma is pretty cool; there's a lot
of potential there. If your project needs this type of search
capability, it's worth taking a look.

--
Robert Uhl <http://public.xdi.org/=ruhl>
I believe in life, and I also believe in love, but the world in which
I live in keeps trying to prove me wrong. --P. Weller