Re: help with statistics library
- From: Richard Weeks <rweeks@xxxxxxxxxx>
- Date: Thu, 26 Apr 2007 16:16:16 GMT
Jens Thoms Toerring wrote:
Richard Weeks <rweeks@xxxxxxxxxx> wrote:
http://members.shaw.ca/bystander/statsource.html
My first impression after a short look is that it looks rather
well-written, so I'll simply start with nit-picking:
1) You're always using an int for the sizes of the arrays you pass
to the functions. But since there is a size_t type introduced
for exactly this purpose why not use that? The same holds for
the loop variables used to iterate over arrays.
2) a_mean() could easily be rewritten as
double a_mean(double *datalist, size_t listsize) {
    return sum(datalist, listsize) / listsize;
}
A rule I've learned to appreciate more and more is "Don't repeat
yourself" and by having the same code in a_mean() and sum()
that rule is violated.
Good point.
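Taken together, points 1 and 2 might look something like the sketch below. The function bodies are assumptions, since the library's actual sum() is not reproduced in this thread; the const qualifiers and the empty-list guard are likewise additions for illustration only.

```c
#include <stddef.h>  /* size_t */

/* Sum of the first listsize elements; size_t is used for both
   the element count and the loop index. */
double sum(const double *datalist, size_t listsize)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < listsize; i++)
        total += datalist[i];
    return total;
}

/* Arithmetic mean, delegating to sum() instead of repeating the loop.
   Returning 0.0 for an empty list is an assumption made here,
   not necessarily what the library does. */
double a_mean(const double *datalist, size_t listsize)
{
    if (listsize == 0)
        return 0.0;
    return sum(datalist, listsize) / listsize;
}
```

The division needs no cast: listsize is converted to double automatically because the other operand is a double.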
3) The DBL_ISEQUAL macro defined in the header file looks fishy. You
define it as
#define DBL_ISEQUAL(a,b) ((-DBL_EPSILON)<((a)-(b))&&((a)-(b))<(DBL_EPSILON))
DBL_EPSILON is the difference between 1.0 and the next larger
representable double, so your macro will work for a and b having a
similar order of magnitude to 1, but if DBL_EPSILON is e.g. 1e-9
(the largest value the standard allows; about 2.2e-16 is typical)
and 'a' is e.g. 1.0e-23 and 'b' 1.1e-23, it will falsely flag them
as equal even though 'a' and 'b' differ by 10% and probably
shouldn't be treated as being equal. I guess you need to scale
DBL_EPSILON to match the values of 'a' and 'b'.
This is a dicey issue that never seems to get resolved to everyone's satisfaction. Some sources simply say "avoid comparing doubles for equality." I guess it's necessary to deal with it on a case by case basis.
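One common way to scale the tolerance, as suggested in point 3, is a relative comparison. The sketch below is only one possible approach, not the library's method; the name dbl_isclose and the rel_tol parameter are hypothetical.

```c
#include <float.h>  /* DBL_EPSILON */
#include <math.h>   /* fabs */

/* Hypothetical helper: returns nonzero if a and b differ by no more
   than rel_tol times the larger of their magnitudes. */
int dbl_isclose(double a, double b, double rel_tol)
{
    double diff = fabs(a - b);
    double scale = fabs(a) > fabs(b) ? fabs(a) : fabs(b);

    return diff <= rel_tol * scale;
}
```

With a tolerance of a few multiples of DBL_EPSILON, 1.0e-23 and 1.1e-23 compare as different, while 1.0 and 1.0 + DBL_EPSILON compare as equal. A purely relative test has its own weak spot: it treats zero as different from any nonzero value, however tiny, so code that compares against zero may need an additional absolute threshold. That is one reason the case-by-case view above is reasonable.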
4) There are a few instances where you cast where it isn't necessary.
E.g. if you divide a double by an int you don't have to cast the
int to double, the compiler will do that for you. The same holds
for instances where you assign an int to a double. Of course, the
cast doesn't hurt, but as a rule I avoid casts unless they are
really necessary and thus see casts as a "red flag" that says
"here's something happening that may need re-examination".
5) In std_norm_ptile() you use the comma operator too much for my
liking. E.g.
if(k >= .01 && k <= .1) x = -2.5, limit = -1.25;
is probably better written as
if ( k >= 0.01 && k <= 0.1 )
{
x = -2.5;
limit = -1.25;
}
Obviously, it will behave exactly the same but it's probably
going to be simpler to parse for someone reading the code (and
the compiler will probably emit exactly the same machine code).
std_norm_ptile() is a kludge I've never been happy with (even though it works). I need to rework the whole thing.
6) I guess you shouldn't export the compare() function. As far as I
can see it's only for use within your code, so it probably would
be better to define it as static and not to include it in the
library's header file.
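A minimal sketch of point 6, assuming compare() is a qsort() callback for doubles (the thread does not show its body). The wrapper rwsp_sort() is a hypothetical public function added only to show where the static helper would be used.

```c
#include <stdlib.h>  /* qsort, size_t */

/* Internal linkage: the name is invisible outside this source file,
   so it cannot clash with a compare() defined by the library's user. */
static int compare(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);  /* -1, 0 or 1 */
}

/* Hypothetical public function that uses the helper internally;
   only this declaration would go into the header file. */
void rwsp_sort(double *datalist, size_t listsize)
{
    qsort(datalist, listsize, sizeof *datalist, compare);
}
```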
7) Defining macros with names like 'PI' and 'E' can be problematic, at
least if you expect this library to be used widely. Besides not
forcing those names into the namespace of the user, I would also
recommend using a few more digits; on many systems a double has
at least 14 to 16 significant digits and there may be many where
there are a lot more. Finding more precise values for these
constants shouldn't be hard.
8) Again, if you expect your library to get a wider user base you
probably should avoid using overly simple names for your functions.
Expect your users to come up with simple function or variable
names like "mean", "mode", "min_max", "agm" etc. and help them
avoid clashes with the names from your library by e.g. prepending
the names of all your functions with e.g. "rwsp_" (short
for "Richard Weeks Statistics Package") so that they don't
have to worry whether a simple function or variable name is already
used by your library.
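Points 7 and 8 combined might look like the sketch below. The RWSP_ and rwsp_ names follow the prefix suggested above but are otherwise hypothetical, and the function body is an assumption since the library's own code is not shown in this thread.

```c
#include <stddef.h>  /* size_t */

/* Prefixed constants, written with more digits than a typical
   double can hold so that no precision is lost on any platform. */
#define RWSP_PI 3.14159265358979323846
#define RWSP_E  2.71828182845904523536

/* Prefixed function name: a user's own mean() or a_mean() can now
   coexist with the library's version. Returning 0.0 for an empty
   list is an assumption made for this sketch. */
double rwsp_a_mean(const double *datalist, size_t listsize)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < listsize; i++)
        total += datalist[i];
    return listsize ? total / listsize : 0.0;
}
```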
Take all these points with a grain of salt. With the exception of
the problem with the DBL_ISEQUAL macro it's probably mostly not
much more than nit-picking. And I didn't go through all functions
very carefully, nor did I read up on those where I don't have any
idea what they are used for (I am not an expert on statistics!).
All in all it was a pleasure to see an example of rather
well-written code - I would love to see examples of that more
often ;-)
Thanks very much for your helpful suggestions and kind remarks.
Richard
Regards, Jens
--
\ Jens Thoms Toerring ___ jt@xxxxxxxxxxx
\__________________________ http://toerring.de