Re: asm grep



Rod Pemberton wrote:
"Frank Kotler" <fbkotler@xxxxxxxxxxx> wrote in message
news:cKhcj.6973$yC6.4200@xxxxxxxxxxx

;very dumb but short strstr
;
;esi - haystack
;edi - needle

strstr:
push esi
push edi

xor eax,eax
cmp [esi],byte 0
jz .rets

astrlen ecx, edi

Is the overhead of astrlen really needed here?

Well... it was intended to reduce the overhead of this code:

push esi
mov esi,edi
call strlen
mov ecx,eax
pop esi

He just needs the length of the string at edi, correct? If the string at edi is zero-terminated,

Right. If it *isn't* 0-terminated, we're in deep *** either way. Lemme see... that's the "pattern" we got off the command line - all 0-terminated for us. Surely no need to calculate its length more than once!!!

I'd try a repz scasb.

Good thought. In *this* instance, since we want the length in ecx, and since the string is in edi, and since al is already zero, it might be a win. I may give that a shot... if the spirit moves me...
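For what it's worth, here is roughly what that might look like: a sketch only, untested in this program. It uses repne (same thing as repnz), since with al = 0 a repz/repe scan would stop at the first *nonzero* byte instead of at the terminator. It assumes al is still zero from the earlier "xor eax,eax", that the direction flag is clear, and that the needle really is 0-terminated.

```nasm
    ; Sketch: would stand in for "astrlen ecx, edi" in the routine above.
    ; Assumes al = 0, DF = 0, and a 0-terminated string at edi.
    push edi            ; scasb advances edi, so save the needle pointer
    or ecx, -1          ; ecx = 0FFFFFFFFh, i.e. effectively "no limit"
    repne scasb         ; scan forward until [edi] == al (the terminator)
    not ecx             ; ecx = bytes scanned, including the terminator
    dec ecx             ; ecx = string length, excluding the terminator
    pop edi             ; restore the needle pointer
```

The not/dec pair works because ecx starts at -1 and is decremented once per byte scanned, terminator included.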

or ecx,ecx
jz .return

.next:
xor eax,eax

push ecx
push edi
repz cmpsb


It seems like this is doing extra work because it's checking a small section
at a time for the edi string... esi only gets incremented by the few
characters that aren't in edi, the increment is limited to ecx, so the rep
loop never gains any "momentum" since it keeps terminating.
Stop-go-stop-go-stop...i.e., it's not checking a potentially large section
of characters before it terminates and repeats. Wouldn't you want to use
repne scas to find the first (or last) char of edi - i.e., check and
potentially eliminate very large sections of text, and then use repz cmpsb
to check that the entire string is there? Repeat if not.
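The two-phase search described above might be sketched like this. It is not the asmutils code, and the name find_sub and its register interface are made up for illustration: unlike the routine under discussion, it assumes the caller already knows both lengths (the line buffer's length ought to be available anyway). Direction flag assumed clear.

```nasm
; find_sub (hypothetical name and calling convention)
; in:  esi = haystack, ecx = haystack length in bytes
;      edi = needle,   edx = needle length in bytes (must be > 0)
; out: eax = address of first match, or 0 if none
; assumes DF = 0; clobbers ecx
find_sub:
    push ebx
    push esi
    push edi
    mov ebx, edi        ; ebx = needle
    mov edi, esi        ; scasb scans through edi, so the haystack goes there
    sub ecx, edx
    jb .none            ; needle is longer than the haystack
    inc ecx             ; ecx = number of candidate start positions
    mov al, [ebx]       ; first byte of the needle
.scan:
    repne scasb         ; skip ahead to the next occurrence of al
    jne .none           ; candidates exhausted without finding it
    push ecx            ; edi is now one past a matching first byte
    push edi
    lea esi, [ebx+1]    ; rest of the needle
    mov ecx, edx
    dec ecx             ; bytes left to verify (ZF = 1 if there are none)
    repe cmpsb          ; with ecx = 0 this does nothing and ZF stays set
    pop edi
    pop ecx
    je .found
    test ecx, ecx       ; any candidate positions left?
    jnz .scan
.none:
    xor eax, eax
    jmp .done
.found:
    lea eax, [edi-1]    ; back up to the start of the match
.done:
    pop edi
    pop esi
    pop ebx
    ret
```

Limiting the scan to (haystack length - needle length + 1) start positions keeps the verifying cmpsb from ever reading past the end of the haystack.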

Okay. Are you optimizing for size or for speed? The comment assures us that this is "very dumb but short". We *may* be able to squeeze a few more bytes of size out of it. We can *surely* make it go faster! This code reads the file one byte at a time, and copies it into a buffer until a "line" is accumulated. After *that* abomination, I don't think it makes much difference *how* slow our search is...
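To illustrate the alternative to one-byte reads, a minimal sketch of filling a buffer with a single system call. This assumes i386 Linux and the raw int 80h interface rather than the asmutils macros; fill_buf, iobuf and BUFSIZE are hypothetical names, and the line-splitting logic that would have to consume the buffer is not shown.

```nasm
BUFSIZE equ 4096

section .bss
iobuf   resb BUFSIZE

section .text
; fill_buf (hypothetical helper)
; in:  ebx = file descriptor
; out: eax = bytes read (0 at end of file, negative errno on error)
fill_buf:
    push ecx
    push edx
    mov eax, 3          ; sys_read on i386 Linux
    mov ecx, iobuf      ; destination buffer
    mov edx, BUFSIZE    ; read up to a whole buffer per call
    int 0x80
    pop edx
    pop ecx
    ret
```

That is one kernel round trip per 4096 bytes instead of one per byte, at the cost of the extra bookkeeping needed to find line ends inside the buffer.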

I suspect that we can make this thing go a "lot" faster with only a "little" more size. What's our goal, here? Fast? Small? Full-featured? The asmutils are "obsessively small". That's "what it is". Some other tradeoff might be "better". If we want regular expressions, like a "real grep"... say goodbye to "small"...

Oh, where does the direction flag get set?

It doesn't. We ASSume that it is clear on program startup (have you ever seen this *not* be true?). We aren't going to tolerate all that bloat!!! :)
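For anyone who would rather not rely on that assumption, the cautious version is a single one-byte instruction at the top of the routine. A sketch:

```nasm
strstr:
    cld                 ; DF = 0: scasb/cmpsb step esi/edi upward
    push esi
    push edi
```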

Thanks for the feedback, Rod. I don't know if it's worth "massaging" this code - for small, fast, or full-featured - but it's interesting to think about... and serves as an example that asm isn't always faster!

Best,
Frank