Re: Choice of data structures

From: Mark McConnell (mmcconnell17704_at_yahoo.com)
Date: 09/27/04


Date: 27 Sep 2004 04:31:03 -0700

Jon Boone <ipmonger@comcast.net> wrote in message news:<BD7C6DA9.371A8%ipmonger@comcast.net>...
> Folks,
>
> I'm looking for some insight into choosing appropriate data structures for
> a program that I'm working on. Right now it's in the "design" stage where I
> am working out the different parts of the program and how they will work
> together.
>
> The fundamental data that I will work with is a series of variables that
> are entered on a periodic basis (perhaps as often as once a day). Not all
> variables will receive data daily.
>
> There are fundamentally two things I will use the data for:
>
> 1. Displaying textual/graphical output to the user to provide a means for
> them to easily determine the data trends. Add on modules will color
> code the data outputs to indicate important analytical properties.
>
> In typical usage, I imagine that not all variables will be graphed
> simultaneously, although I intend to support that.
>
> 2. Perform statistical testing on the data to highlight correlations that
> that exist (and that the user might not think to display graphically).
>
> If I were programming this in C (or perl), I'd create a hash to store each
> variable, keyed by the date that the data was collected. So, for 15
> variables, I'd have 15 different hashes, all keyed the same.
>
> If I were trying to do this in an OOP style (with Java or C++), I'd
> probably use some generic sequence data type (and implement it as a hash),
> but otherwise still use 15 different sequences for 15 different variables.

I'm wondering if you need the hashing. If you need to do some
random-access operation, then sure, hashing is necessary. But
random-access operations are not mentioned directly in 1 and 2.

You could store the time in each piece of data, then use binary search
to find ranges of data defined by time. Put the data into the generic
sequences (Java's ArrayList, Lisp's adjustable-array) in the order it
comes in, i.e., sorted by time. To find all the data from Tuesday
9/21 to Thursday 9/23, use a binary search to find 9/21, then another
binary search (starting from the 9/21 point) to find 9/23.



Relevant Pages

  • Re: How to best extract a list of identical keys in a sorted ArrayList with BinarySearch ?
    ... where that entry contains all of the items that match that key. ... If the indexer on key has to do a binary search then ... I presume that SortedList doesn't do hashing, because someone who wanted hashing would just use the Hashtable class instead, or possibly would use the two class together to achieve hashing with sorting. ... Even if you assume that the binary search is slower than a lookup by hash value, it's not going to be *much* slower. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: How to best extract a list of identical keys in a sorted ArrayList with BinarySearch ?
    ... where that entry contains all of the items that match that key. ... If the indexer on key has to do a binary search then ... hashing, because someone who wanted hashing would just use the Hashtable ... Hashing and sorting are so fundamentally different ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: fast dictionary search algorithm
    ... > I have a text file with words and meanings. ... The algorithm ... > I have considered hashing and multiple level hashing as an option ... and use binary search on them from there. ...
    (comp.programming)
  • Re: fast dictionary search algorithm
    ... >> I have considered hashing and multiple level hashing as an option ... > don't go the route of rehashing. ... and use binary search on them from there. ... A reasonable system will never ...
    (comp.programming)