Re: What's a good strategy for testing keywords for smart code editor?
From: Raptor (bogus_at_none.com)
Date: 02/14/05
- Next message: Bruce Roberts: "Re: Why use custom component for class?"
- Previous message: Raptor: "Re: Freeware and source code"
- In reply to: Bruce Roberts: "Re: What's a good strategy for testing keywords for smart code editor?"
- Next in thread: Martin Harvey (Demon account): "Re: What's a good strategy for testing keywords for smart code editor?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Mon, 14 Feb 2005 14:05:20 -0800
Bruce Roberts <ber@bounceitattcanada.xnet> wrote in message
news:Km4Qd.1864$dZ.132548@news20.bellglobal.com...
>
> "Raptor" <bogus@none.com> wrote in message
> news:cukkl902e5g@news2.newsguy.com...
>
> > function KeywordLookup02(Word: string): boolean;
> > const
> > KeywVector: array [65..90] of integer = (
> > 0, 9, 13, 25, 34, 42, 47, 49, 50, 62, 63, 64, 67, 68, 72,
> > 78, 108, 80, 89, 92, 99, 103, 106, 108, 108, 108);
>
> KeywVector : array ['A' .. 'Z'] of integer . . .
>
> > var
> > i: integer;
> > begin
> > result := false;
> > i := KeywVector[ord(Word[1])];
>
> i := KeywVector [Word [1]];
>
> > while (i <= high(KeywArr)) and (Word[1] = KeywArr[i][1]) do
> >
> > No real difference on the timing, though. Wouldn't surprise me if
Delphi's
> > case statement was generating much the same machine code as this
vectored
> > thing.
> >
> > My timings for various routines, and bypassing them, suggests that there
> is
> > not much time to be saved in tweaking the lookup algorithm. Scanning
~1000
> > words:
>
> Not really surprising. If one is going to be working with 1K or so words
and
> look up tokens in this list, a sorted tStringList is likely the optimal
> coding choice. Binary search is quite fast, and almost all the code is
> already written. On average, if my math is correct, a lookup is going to
> take 5 probes of the list. Efficiency of the above approach depends a
great
> deal on the distribution of 1st letters, but assuming an even
distribution,
> average sub-list length is going to be around 38 (using 1000 words). A
> linear scan of 38 entries is going to average 17 probes. So any savings by
> jumping to a shorter list is probably going to be lost by the sub-search.
> Ordering the sub-list based on frequency may help, but only if the
> distribution is truly heavily weighted in favor of a few entries.
Otherwise
> things will tend to the average case. If the sub-lists were sorted, then a
> binary search would, on average require 2.5 probes. So a real gain might
be
> realized.
>
> J French's suggestion of using the first two characters of each word is
> essentially a hash and could result in significant improvements. Assuming
> even distribution and a linear search of sub-lists, one would expect about
1
> probe per word. If this approach is to be used, one might consider writing
a
> little application to processes a word list and produces a unit or include
> file with the required declarations.
I saw someone do that recently for C code. Worthwhile if someone did it
often.
> I suspect that the only way to improve the speed beyond this would be to
> handle word recognition at an earlier stage of the lexical scan - as each
> character is processed. (Assuming a well formed lexical structure this is
> easily done with a state machine. I don't know if there are any tools for
> generating such machines in Pascal or Delphi, but certainly they exist for
> C.) The savings here are garnered simply by reducing the amount of code to
> be executed and the number of temporary strings allocated (which is a
> relatively expensive Delphi operation in this context).
Thanks for your take on this. It turns out that keyword lookup was only a
small fraction of the overhead, though.
Raptor
- Next message: Bruce Roberts: "Re: Why use custom component for class?"
- Previous message: Raptor: "Re: Freeware and source code"
- In reply to: Bruce Roberts: "Re: What's a good strategy for testing keywords for smart code editor?"
- Next in thread: Martin Harvey (Demon account): "Re: What's a good strategy for testing keywords for smart code editor?"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|