Re: Searching in byte buffer




"J French" <erewhon@xxxxxxxxxx> wrote in message
news:43631ae4.88288747@xxxxxxxxxxxxxxxxxxxxxxx
> On Fri, 28 Oct 2005 10:35:57 -0400, "Bruce Roberts"
> <dontsendtober@xxxxxxxxxxxxxxxxxxxx> wrote:

> A text search is not strictly speaking a byte search, because of those
> confounded MBCS 'things', which is why it is always wise digging down
> into the library to see whether ASCIIZ string routines are used or an
> API ANSI comparison routine.

Doing a byte-buffer search makes this concern moot.

> The other wrinkle is that Boyer-Moore is good for repeated searches for
> the same thing, but because it has considerably more 'setup' it might
> not be quicker for non-repetitive searches.

Very true. However, if one is searching large buffers, or many buffers, for
the same byte-string, this one-time cost is small relative to the search
itself, especially since the algorithm can skip ahead on a mismatch and so
need not examine every byte in each buffer.
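
To make the skipping concrete, here is a rough sketch in Python (my choice
of language, nothing in the thread specifies one). It is the Horspool
simplification of Boyer-Moore, which uses only a bad-character shift table,
not the full algorithm. The class and method names are made up for
illustration.

```python
class HorspoolSearcher:
    """Boyer-Moore-Horspool byte search (bad-character shifts only).

    Hypothetical helper for illustration; not the full Boyer-Moore.
    """

    def __init__(self, pattern: bytes):
        if not pattern:
            raise ValueError("pattern must not be empty")
        self.pattern = pattern
        m = len(pattern)
        # The 'setup': one shift entry per possible byte value.
        # A byte that does not occur in the pattern allows a full-length skip.
        self.shift = [m] * 256
        for i in range(m - 1):
            self.shift[pattern[i]] = m - 1 - i

    def find(self, buf: bytes, start: int = 0) -> int:
        """Return the index of the first match at or after start, or -1."""
        pat = self.pattern
        m, n = len(pat), len(buf)
        pos = start
        while pos + m <= n:
            # Compare right to left within the current window.
            j = m - 1
            while j >= 0 and buf[pos + j] == pat[j]:
                j -= 1
            if j < 0:
                return pos
            # Skip ahead based on the last byte of the window, so the
            # bytes in between are never looked at.
            pos += self.shift[buf[pos + m - 1]]
        return -1


# Build once, then reuse for every buffer.
searcher = HorspoolSearcher(b"\x00\xffEND")
first = searcher.find(b"abc\x00\xffENDxyz")
second = searcher.find(b"nothing")
```

The table is built in the constructor, so the same searcher object can be
applied to any number of buffers without repeating the setup.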

Although the OP didn't indicate the nature of the byte-strings being
sought, if every incoming buffer has to be searched for an occurrence of
the same invariant pattern, the setup cost, IIRC, is paid only once at
startup and not for each search. If there are multiple patterns that have
to be identified, a state machine might be a better choice.
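
For the multiple-pattern case, one well-known state machine is the
Aho-Corasick automaton. That is my pick for illustration, not something
the OP asked for. A minimal Python sketch follows, with made-up names. It
builds the machine once and then reads each buffer a single time, reporting
every pattern that occurs.

```python
from collections import deque


class MultiPatternMatcher:
    """Aho-Corasick automaton over bytes (hypothetical illustration)."""

    def __init__(self, patterns):
        self.goto = [{}]   # per-state transitions: byte value -> next state
        self.fail = [0]    # per-state fallback state on a mismatch
        self.out = [[]]    # per-state list of patterns that end here
        # Phase 1: build a trie of all patterns.
        for pat in patterns:
            if not pat:
                raise ValueError("patterns must not be empty")
            state = 0
            for b in pat:
                nxt = self.goto[state].get(b)
                if nxt is None:
                    nxt = len(self.goto)
                    self.goto.append({})
                    self.fail.append(0)
                    self.out.append([])
                    self.goto[state][b] = nxt
                state = nxt
            self.out[state].append(pat)
        # Phase 2: breadth-first pass to fill in the failure links.
        queue = deque(self.goto[0].values())
        while queue:
            s = queue.popleft()
            for b, t in self.goto[s].items():
                queue.append(t)
                f = self.fail[s]
                while f and b not in self.goto[f]:
                    f = self.fail[f]
                self.fail[t] = self.goto[f].get(b, 0)
                # Inherit matches that end at the fallback state.
                self.out[t] = self.out[t] + self.out[self.fail[t]]

    def search(self, buf: bytes):
        """Return a list of (start_index, pattern) for every occurrence."""
        hits = []
        state = 0
        for i, b in enumerate(buf):
            while state and b not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(b, 0)
            for pat in self.out[state]:
                hits.append((i - len(pat) + 1, pat))
        return hits


# Build once at startup, then feed it each incoming buffer.
matcher = MultiPatternMatcher([b"he", b"she", b"his", b"hers"])
hits = matcher.search(b"ushers")
```

Unlike the single-pattern search, this looks at every byte, but it does so
only once however many patterns are being sought.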

