Re: Algorithm help for unique string searching/counting within an array.
- From: Paul Van Delst <Paul.vanDelst@xxxxxxxx>
- Date: Tue, 30 May 2006 16:48:50 -0400
Thomas Koenig wrote:
Paul Van Delst <Paul.vanDelst@xxxxxxxx> wrote:
I want to be able to search an array of strings (with many repeated elements) so I can count the unique elements.
Are you sure you want to use Fortran for this?
Yes.
Languages like Perl or
awk have built-in arrays with strings as subscripts. Perl calls
them "hashes".
As do many other languages. Or "directories".
Here's a short Perl program (for illustration), which assumes that
all the strings are in separate lines.
#! /usr/bin/perl
while(<>) { # Loop over all entries in the array, putting values into $_ .
chomp; # Eat newlines
$string{$_}++; # Increment the hash element with the subscript $_ by
# one, creating the element if necessary.
}
print scalar(keys(%string)),"\n"; # Count the number of keys to the hash.
I can also do the above in Ruby in one line. Doesn't help me in a Fortran executable though. :o) To be fair to folks who think doing this in Fortran is not entirely appropriate, this capability of sensor counting is an existing function (say, method) associated with a particular derived type (object) so there's a Fortran-centric history in my requirements. We're moving from representing sensors with an arbitrary integer to a people-readable string. My request to this group was due to me not being very smart when it comes to working with strings.
Hash tables for my original little application are a bit of overkill I reckon. I figure doing a sort and then a count using CSHIFT, e.g.:
! Define some sensor strings
cArray = (/ 'abi_gr ','imgr_g13 ','windsat_coriolis',&
'amsua_metop-c ','amsub_n15 ','amsua_n18 ',&
'amsua_n18 ','hirs3_n17 ','hirs3_n15 ',&
'modis_terra ','mhs_metop-a ','amsua_metop-b ',&
'amsua_metop-c ','amsub_n15 ','amsua_n18 ',&
'amsua_n18 ','hirs3_n17 ','hirs3_n15 ',&
'modis_terra ','mhs_metop-a ','amsua_metop-b ',&
'abi_gr ','imgr_g13 ','windsat_coriolis'/)
! Sort them
CALL InsertionSort( cArray )
! Count the unique entries
nSensors = MAX(COUNT(cArray /= CSHIFT(cArray,1)),1)
This seems to work pretty well. I'm also implementing a version something like so:
! Sort them
CALL InsertionSort( cArray, Idx ) ! Return a sorted index
! Count the unique entries
nSensors = MAX(COUNT(cArray(Idx) /= CSHIFT(cArray(Idx),1)),1)
for those cases where I don't want to change the order of the original array.
As *** Hendrickson pointed out in his post, once the array is sorted, the uniqueness determination is pretty trivial. I find the above pretty easy to understand. It sure is a *lot* tidier than the dog's dinner of a program I originally posted! :o)
cheers,
paulv
--
Paul van Delst Ride lots.
CIMSS @ NOAA/NCEP/EMC Eddy Merckx
Ph: (301)763-8000 x7748
Fax:(301)763-8545
.
- Follow-Ups:
- Re: Algorithm help for unique string searching/counting within an array.
- From: Arjen Markus
- Re: Algorithm help for unique string searching/counting within an array.
- References:
- Algorithm help for unique string searching/counting within an array.
- From: Paul Van Delst
- Re: Algorithm help for unique string searching/counting within an array.
- From: Thomas Koenig
- Algorithm help for unique string searching/counting within an array.
- Prev by Date: Test send
- Next by Date: Re: NaN in intel fortran 9.0 for windows integrated with VS.Net
- Previous by thread: Re: Algorithm help for unique string searching/counting within an array.
- Next by thread: Re: Algorithm help for unique string searching/counting within an array.
- Index(es):