Re: Removing duplicates from an array of pointers
From: Rufus V. Smith (nospam_at_nospam.com)
Date: 06/29/04
- Next message: Anand Hariharan: "Re: no match for complain"
- Previous message: Arthur J. O'Dwyer: "[OT] Re: no match for complain"
- In reply to: Bert: "Removing duplicates from an array of pointers"
- Next in thread: Bert: "Re: Removing duplicates from an array of pointers"
- Reply: Bert: "Re: Removing duplicates from an array of pointers"
Date: Tue, 29 Jun 2004 13:25:17 GMT
"Bert" <maatjesharing@gmx.de> wrote in message
news:a65a4043.0406290307.382bd757@posting.google.com...
> I want to remove duplicate strings from an array of pointers to
> strings.
>
> Assume we have an array of pointers called "parray" of variable
> length. The pointers point to the contents of a file which is read
> into memory ("sfile"). Strings are created by replacing end of line
> characters with nuls. An array of pointers (parray) points to the
> first character of each line.
>
> Suppose sfile contains this (can be another length or other content):
> parray[0] AA
> parray[1] DDD
> parray[2] CC
> parray[3] DDD
> parray[4] EEEEEE
> parray[5] FFFF
> parray[6] DDD
>
> If viewed as a flat memory area, it will look like this:
> AA-DDD-CC-DDD-EEEEEE-FFFF-DDD- (- = '\0')
> p0.p1..p2.p3..p4.....p5...p6..
>
> The easy solution is to set the first character to NUL.
>
> That would result in this memory area:
> AA-DDD-CC--DD-EEEEEE-FFFF--DD-
> p0.p1..p2.p3..p4.....p5...p6..
>
> Two pointers (p3 and p6) now point to zero length strings. However,
> zero length strings are unusable for later operations.
>
> What I'd like to get is this:
> AA-DDD-CC--DD-EEEEEE-FFFF--DD-
> p0.p1..p2.....p3.....p4.......
>
> The order of the pointers isn't important.
>
> I'm having trouble getting this done in real C code. Can anyone help?
>
> Thanks!
>
> Bert
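For reference, the setup described in the question can be sketched roughly
as follows. This is only an illustration under assumptions: the function
name split_lines and its parameters are made up here, and it assumes the
file contents are already in memory as one NUL-terminated buffer with
'\n' line endings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): splits buf in place
   by replacing each '\n' with '\0' and stores a pointer to the first
   character of each line in parray, up to max entries.
   Returns the number of pointers stored. */
size_t split_lines(char *buf, char **parray, size_t max)
{
    size_t count = 0;
    char *p = buf;

    while (*p != '\0' && count < max) {
        parray[count++] = p;      /* first character of this line */
        p += strcspn(p, "\n");    /* advance to '\n' or to the final NUL */
        if (*p == '\n')
            *p++ = '\0';          /* terminate the line and step past it */
    }
    return count;
}
```

Because every pointer refers into the same buffer, the individual strings
must not be passed to free(); only the buffer itself is ever released.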
As you come across duplicates, move the pointer in the last available
element into the duplicate's position, and decrement your last-element
index. This reduces the pointer count while keeping pointers to the
unique or not-yet-checked strings. If you wanted to maintain order, you
could instead slide all the following array elements down one position,
but you said that wasn't a requirement.
e.g.

    for (uniqueindex = 0; uniqueindex <= lastelement; uniqueindex++) {
        for (searchindex = uniqueindex + 1; searchindex <= lastelement;
             searchindex++) {
            while ((searchindex <= lastelement) &&
                   (strcmp(parray[uniqueindex], parray[searchindex]) == 0)) {
                /* Call free(parray[searchindex]) here only if each string
                   was allocated separately. Pointers into one shared
                   buffer such as sfile must not be freed individually. */
                parray[searchindex] = parray[lastelement]; /* bring down last element */
                parray[lastelement] = NULL; /* not strictly necessary, but I
                                               like to clean up my pointers */
                lastelement--; /* decrement element count */
            } /* while duplicate string present at searchindex */
        } /* comparing strings after reference string */
    } /* for each string */

Note that as you pull an element down from the end of the array, you must
check it against the reference string again, as it may be a duplicate as
well. Hence the while statement.
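
The same idea can be packaged as a self-contained function. This is a
minimal sketch, not a definitive implementation: the name
remove_duplicates and the use of an element count instead of a
last-element index are choices made for this example, and nothing is
freed because the strings are assumed to live inside one shared buffer.

```c
#include <stddef.h>
#include <string.h>

/* Removes duplicate strings from parray[0..count-1] without preserving
   order, by overwriting each duplicate with the last element.
   Returns the new number of elements. */
size_t remove_duplicates(char **parray, size_t count)
{
    size_t i, j;

    for (i = 0; i < count; i++) {
        j = i + 1;
        while (j < count) {
            if (strcmp(parray[i], parray[j]) == 0) {
                parray[j] = parray[count - 1]; /* bring down last element */
                parray[count - 1] = NULL;      /* optional tidy-up */
                count--;
                /* j is not advanced: the element just moved into
                   position j has not been compared yet */
            } else {
                j++;
            }
        }
    }
    return count;
}
```

With the seven strings from the question, this leaves five pointers, in
the order AA, DDD, CC, FFFF, EEEEEE.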
Rufus
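
If order did matter, the "slide everything down" alternative mentioned
above might look like the sketch below. The function name
remove_duplicates_ordered is made up for this example, and memmove is
used here to do the sliding. As before, nothing is freed.

```c
#include <stddef.h>
#include <string.h>

/* Removes duplicate strings from parray[0..count-1], keeping the first
   occurrence of each string and preserving the original order.
   Returns the new number of elements. */
size_t remove_duplicates_ordered(char **parray, size_t count)
{
    size_t i, j;

    for (i = 0; i < count; i++) {
        j = i + 1;
        while (j < count) {
            if (strcmp(parray[i], parray[j]) == 0) {
                /* slide the elements after j down by one position */
                memmove(&parray[j], &parray[j + 1],
                        (count - j - 1) * sizeof parray[0]);
                count--;
                /* j is not advanced: a new element now sits at j */
            } else {
                j++;
            }
        }
    }
    return count;
}
```

This costs an extra pass over the tail of the array for every duplicate
removed, which is why the swap-with-last approach is the cheaper choice
when order is irrelevant.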