Re: Need efficient search strategy in list of time intervals



On Jul 31, 4:18 am, lemmi <dlemmerm...@xxxxxxxxx> wrote:
Hi,

I was wondering if someone of you could help me out with an
algorithmic problem I have. This is not a homework exercise but
something I need for a UI framework that I have developed
(www.dlsc.com).

I have a list of time intervals ([long,long]). The list is sorted
based on the start times of the intervals. I now need a fast
algorithm, which takes a time span TS as input and returns the index
of the first time interval A in the list that is contained in the
given time span. The rendering engine of my UI framework will then
iterate over the list starting at the calculated index.

A time interval A is "contained" when A does not end before TS begins
and also does not start after TS ends (!(A.end < TS.start || A.start >
TS.end).

The algorithm gets constantly called as part of a paint() method so it
needs to be extremely fast (linear is not good enough, logarithmic
would be ideal).

To make things complicated. The intervals can overlap each other. An
interval A that starts earlier than B might very well end after B.
A.start < B.start && A.end > B.end.

Note: a standard binary search does not work because it might find a
time interval that ends before the input time span TS and then
interrupts its search even though another time interval is located
before it that ends after it.

Dirk

I've had to do this myself recently. What I did was created a class
that represented an Instant (either start or end), and a collection of
all spans that were in the instant (it took some time to build this
index for large datasets), the collection of Instants and be sorted,
and the you can do a binary search for the Instant. This worked
pretty well, but it did take some extra time to build the indexes.

I had a few other ideas that I haven't tried yet. One was to use
Lucene to build an index of my data.

The other was to keep track of the maximum interval size, sort the
intervals by start, binary search for a value, and then move backward
until the start + maxInterval < searchStart; This approach is
dependent on the data a little more, one long interval can cause a big
slow down.

Depending on your dataset size, it might actually be fast enough just
to do a linear search.

.



Relevant Pages