Re: asm grep



Charles Crayne wrote:
On Sat, 29 Dec 2007 18:33:27 GMT
Frank Kotler <fbkotler@xxxxxxxxxxx> wrote:


Seems to me this isn't going to work if we want to print non-matching lines... *or* report line-number where a match is found,


If nothing else, at least you have given the group a good case study of
the pitfalls of premature optimization.

Yep. If all else fails, I can serve as a bad example!

But... Betov once described me as "the guy who turns around". True, I like to see all sides of an issue. :) I don't like to contradict the pioneers, but "another side" might be "It's never too early to start thinking about optimization!" I don't mean cycle-counting or byte-counting, but what do and don't you have to do to meet the specs? What's the best algorithm for *these* specs? How will you arrange the data for optimum access?...

In this case, I have the luxury - or the misfortune - of not really knowing what the "specs" are. I have the liberty to leave out functionality if it conflicts too much with size/speed. (at some point it ain't grep...)

So what we might call a "pitfall", I might prefer to see as an "experiment". I didn't have a lot of hope for scasb/cmpsb. To be honest, I had a lot of trouble getting it going 'cause I haven't used 'em much. Yeah, they're small, but they're "inflexible" and typically need regs to be saved/restored or need some "adjustment"... I'm more familiar with "discrete instructions" comparisons. Rod initially mentioned that "repnz scasb" would beat my "astrlen" macro - which it did, but only by one byte (due to push/pop edi, adjust ecx) (that code's gone now, anyway). And he mentioned that the "strstr" was dumber than it needed to be. I thought I'd take a shot at it, to see what the "short" instructions *would* do for me, knowing that it was going to eliminate any possibility of the "-i" switch (one of the couple I've actually used). The asmutils version doesn't support "-i", so it isn't necessarily part of the "spec", but I'd like to have it... if it isn't "too" costly...

I won't know how costly "too" costly is until I've decided the spec, and I'm not done optimizin' yet! :)

But... doesn't the "re" in "grep" stand for "regular expression"? I'm pretty sure regular expressions exceed my ambition level (maybe my ability) at the moment...


A full regex implementation would, indeed, be a major project. However,
adding a few of the more common pattern matching capabilities would
seem to be worthwhile. For example, '^' to match the beginning of a
line, and '$' to match the end of a line (before the new line char).

Okay... I'm really reluctant to break the "barrier" of treating the "needle" as literal text, for fear of what will come flooding out! But it gives me an idea... how 'bout if we store each byte of the "needle" as a dword? One byte for the character, one byte for the character "flipped" if we're doing "-i" and it's alpha - or the character duplicated, if not. That leaves us 16 bits for "flags" such as "must match start of line", "must match end of line"... maybe "match any alpha", "match any number". "Match multiple characters" might be a problem. Dunno... might be worth fooling with. "KISS" would suggest leaving it as literal text, case sensitive, and be done with it. There's such a thing as too simple to be useful, though. One of the things I use grep for is "How do they spell that damn CamelCapsThingie?"...
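
Just to make the dword idea concrete, here's a rough sketch in C rather than asm (easier to fool with). It is only one possible reading of the layout: low byte is the character, next byte is the "flipped" (or duplicated) character, upper 16 bits are flags. The names NF_BOL, NF_EOL, pack_entry, encode_needle, match_at and line_matches are all made up for illustration, the case flip is ASCII-only, and only the two "start of line" / "end of line" flags are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits living in the upper 16 bits of each entry. */
#define NF_BOL (1u << 16)   /* this char must sit at the start of the line */
#define NF_EOL (1u << 17)   /* this char must be the last one on the line  */

/* bits 0-7: the char; bits 8-15: case-flipped twin or a duplicate;
   bits 16-31: flags. The case flip assumes plain ASCII letters. */
static uint32_t pack_entry(unsigned char c, int ignore_case, uint32_t flags)
{
    unsigned char alt = c;
    int is_alpha = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    if (ignore_case && is_alpha)
        alt = (unsigned char)(c ^ 0x20);
    return (uint32_t)c | ((uint32_t)alt << 8) | flags;
}

/* Expand a literal needle into dword entries; returns the entry count.
   bol/eol put the anchor flags on the first/last entry. */
static size_t encode_needle(const char *text, int ignore_case,
                            int bol, int eol, uint32_t *out, size_t cap)
{
    size_t n = 0;
    assert(text != NULL && out != NULL);
    while (text[n] != '\0' && n < cap) {
        out[n] = pack_entry((unsigned char)text[n], ignore_case, 0);
        n++;
    }
    if (n > 0 && bol) out[0] |= NF_BOL;
    if (n > 0 && eol) out[n - 1] |= NF_EOL;
    return n;
}

/* Does the needle match the line (newline already stripped) at pos? */
static int match_at(const char *line, size_t len, size_t pos,
                    const uint32_t *needle, size_t n)
{
    if (pos > len || n > len - pos)
        return 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t e = needle[i];
        unsigned char h = (unsigned char)line[pos + i];
        /* One compare per stored byte: exact char, then its twin. */
        if (h != (e & 0xFFu) && h != ((e >> 8) & 0xFFu))
            return 0;
        if ((e & NF_BOL) && pos + i != 0)
            return 0;
        if ((e & NF_EOL) && pos + i != len - 1)
            return 0;
    }
    return 1;
}

/* Plain "try every position" scan over one line. */
static int line_matches(const char *line, size_t len,
                        const uint32_t *needle, size_t n)
{
    for (size_t pos = 0; pos + n <= len; pos++)
        if (match_at(line, len, pos, needle, n))
            return 1;
    return 0;
}
```

With "-i" on, a needle of "camel" would find "CamelCaps"; with it off, both stored bytes are the same and the second compare is just wasted. The scan is still the "dumb" try-every-position kind, and "match any alpha" / "match multiple characters" are left out entirely.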

So many bits...

Best,
Frank