Re: Implementing strstr
From: Roger Willcocks (rkww_at_rops.org)
Date: 04/24/04
- Next message: Reza Ganjavi \(www.rezamusic.com\): "Pascal Compilers"
- Previous message: Reza Ganjavi \(www.rezamusic.com\): "Re: AN OLD PROGRAMMER NEEDS ADVICE"
- In reply to: CBFalconer: "Re: Implementing strstr"
- Next in thread: CBFalconer: "Re: Implementing strstr"
- Reply: CBFalconer: "Re: Implementing strstr"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sat, 24 Apr 2004 10:17:08 +0100
"CBFalconer" <cbfalconer@yahoo.com> wrote in message
news:4089F35A.D7B3484D@yahoo.com...
> Roger Willcocks wrote:
> > "Tron Thomas" <tron.thomas@verizon.net> wrote in message
> >
> >> As part of an interview for a programming position, I was asked
> >> to write an implementation for the strstr C runtime library
> >> function which determines if a substring is contained within
> >> another text string.
> >>
> >> I was able to develop an implementation that worked, and the
> >> intviewer commented that the design was more complicated than it
> >> needed to be.
> >>
> ... snip convoluted code ...
> >>
> >> Any comments on the code would be appreciated.
> >
> > It's simpler to use an additional variable than to rewind
> > pszHaystack. And by combining tests and using the comma operator
> > you can eliminate most of the squiggly brackets (although this
> > could start style wars). Consider:
> >
> > const char* strstr(const char* haystack, const char* needle)
> > {
> > for (;;) {
> > const char* p = needle;
> > const char* q = haystack++;
> > while (*p && *p == *q)
> > p++, q++;
> > if (*p == 0)
> > return haystack - 1;
> > if (*q == 0)
> > return 0;
> > }
> > }
> >
> > -- or equivalently --
> >
> > const char* strstr(const char* haystack, const char* needle)
> > {
> > for (;;) {
> > const char *p, *q;
> > for (p = needle, q = haystack++; *p && *p == *q; p++, q++)
> > ;
> > if (*p == 0)
> > return haystack - 1;
> > if (*q == 0)
> > return 0;
> > }
> > }
>
> Let the style wars begin :-) I dislike comma operators and for
> (::), thus my take on your quite elegant coding is:
>
> const char* strstr(const char* haystack, const char* needle)
> {
> const char *p, *q;
>
> do {
> p = needle; q = haystack++;
> while (*p && (*p++ == *q++)) continue;
> } while (*p && *q);
> if (!(*p)) return --haystack;
> return 0;
> }
>
> This seems to return haystack for (*needle == '\0'). I may have
> introduced insects, not tested.
This falls at the first hurdle, I'm afraid. Using a test program:
#include <stdio.h>
int main(int argc, char* argv[])
{
const char* q = strstr(argv[1], argv[2]);
if (q)
printf("%s\n", q);
return 0;
}
a call to (your) strstr("thisisatest", "isa") wrongly returns "isisatest",
because when the inner 'while' exits, p and q are one beyond the mismatched
character. (It returns the same answer if you search for "isx" even though
that string doesn't appear at all).
Also strstr("thisisatest", "tests") wrongly returns "test" for the same
reason, and strstr("thisisatest", teststr") wrongly returns "teststr" (a
pointer into the wrong string!) because (ummm) haystack++ walks right off
argv[1] and into argv[2].
Here's a (working) version in your style:
const char* strstr(const char* haystack, const char* needle)
{
const char *p, *q;
do {
p = needle; q = haystack++;
while (*p && *p == *q) { p++; q++; }
if (*p == 0) return --haystack;
} while (*q);
return 0;
}
but personally I'd stick with my original forever/while version because it's
much clearer how it works: compare needle with each start position in
haystack; find the first mismatched character; if it's the zero at the end
of 'needle' we've found a match; if it's the zero at the end of 'haystack'
there are no more start positions. Otherwise just keep on going.
The generated code in all my versions is likely to be identical, so the
'best' version is (presumably) the one that's easiest to understand.
However this approach is very inefficient for long strings. The Boyer-Moore
algorithm actually gets faster as the 'needle' string gets longer
(http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/).
-- Roger
- Next message: Reza Ganjavi \(www.rezamusic.com\): "Pascal Compilers"
- Previous message: Reza Ganjavi \(www.rezamusic.com\): "Re: AN OLD PROGRAMMER NEEDS ADVICE"
- In reply to: CBFalconer: "Re: Implementing strstr"
- Next in thread: CBFalconer: "Re: Implementing strstr"
- Reply: CBFalconer: "Re: Implementing strstr"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]