Re: Binary Search



John Dough wrote:
On Sun, 17 Jun 2007 22:07:02 -0500, Rob Kennedy <me3@xxxxxxxxxxx>
wrote:

Make your data structure store arrays instead of single items.

My data structure is an array of strings with 100 elements [1..100]

Yeah. Who says you can't change it to better suit your needs? So far, your stated needs are to sort the list and to find duplicates. Why use an array for that?

For testing purposes, I've divided the array up into 4 groups of 25
strings. There's 25 instances of "test", along with 25 instances of
"aaa"..... "bbb"....."ccc".

I sort the array before searching (as is required with binary search).

Binary search requires that the elements be in sorted order. It doesn't require that the search be preceded by an explicit sort operation. (So, for instance, if you're inserting things in the right place already, you don't need a separate sorting step.)

The reason you're not getting examples is because an example would require laying out a data structure and then writing the binary search into which the solution would be added. That's a lot for a one-off example. If you'd shown what you already have, an example might have been more forthcoming.

I already did that.

Right, sorry. I missed that message.

function TForm1.BinarySearch(const searchstring: string; const
Left,Right: integer): integer;

OK, since you want to find a range, rather than just one element, let's start by changing the return value:

type
TArrayRange = record
First, Last: Integer;
end;

This will represent a half-open interval [First,Last). So the first element in the array will be at index First and the final element in the range will be at index Last-1. That has the nice property that any range where First=Last is empty. It also means that the size of the range is Last-First.

So here's the new function signature:

function TForm1.BinarySearchForDuplicates(const searchstring: string; const Left, Right: Integer): TArrayRange;

var L,R,M,CompareResult: integer;
begin
L := Left;
R := Right;
while (L <= R) do
begin
M := (L + R) div 2;
CompareResult := CompareText(datalist[M],searchstring);
if (CompareResult = 0) then
begin

Result.Left := M;
Result.Right := M + 1;
// Now do two more binary searches. First, on the left side for
// the greatest L such that datalist[L] <> searchstring. It might
// not exist.
L := Left;
R := Result.Left;
while L < R do begin
M := (L + R) div 2;
Assert(M < R); // L < R and "div" rounds down.
CompareResult := CompareText(datalist[M], searchstring);
Assert(CompareResult <= 0); // We've already found one match on
// the far right of our subrange.
if CompareResult < 0 then begin
// We jumped too far to the left. We're in the non-match zone.
// M is not a match, so go to the next one.
L := M + 1;
end else begin
// We jumped too far to the right. We're in duplicate zone.
R := M;
end;
end;
Assert(L = R);
Result.Left := L;
// Now for the second search, finding the least R such that
// searchstring < datalist[R]. R might not exist.
L := Result.Right + 1;
R := Right;
while L < R do begin
M := (L + R) div 2;
Assert(M < R); // L < R and "div" rounds down.
CompareResult := CompareText(datalist[M], searchstring);
Assert(CompareResult >= 0); // We've already found one match on
// the far left of our subrange.
if CompareResult > 0 then begin
// We jumped too far to the right. We're in the non-match zone.
// M is not a match, so go to the next one.
R := M;
end else begin
// We jumped too far to the left. We're in duplicate zone.
L := M + 1;
end;
end;
Assert(L = R);
Result.Right := R;

Result := M;
foundlist.Add(inttostr(m)+' - '+datalist[m]);
L := M + 1; // this is the only line I've played with (it
used to be just "exit")

A call to "exit" is sufficient at this point.

end
else if (CompareResult < 0) then L := M + 1
else R := M - 1;
end;
Result := -1;
end;

--
Rob
.



Relevant Pages

  • Re: A faster way of finding historical highs/lows
    ... >>using sort(), then a binary search. ... By binary search, ... If you have this in a relational database, ... database can answer the question quicker for you, ...
    (comp.lang.python)
  • Re: A faster way of finding historical highs/lows
    ... some preprocessing will likely make things ... > using sort(), then a binary search. ... By binary search, ...
    (comp.lang.python)
  • Re: Efficient Java Programming
    ... If you're using a binary search on a large data set and adding missing ... faster than using a sort. ... martin@ | Martin Gregorie ...
    (comp.lang.java.programmer)
  • Re: Confusing Arraylist BinarySearch problem
    ... this whole time or something else was always sorting my data prior to ... When I run this code directly after the array creation: ... Items must be sorted in order to search an item using binary search. ... can call the arraylist's 'Sort' method to sort it. ...
    (microsoft.public.dotnet.languages.vb)