Re: matrix stuff (solving b = A*x) --> using numerical recipes



Martin Jørgensen schrieb:
Michael Mair wrote:
Martin Jørgensen schrieb:

-snip-

Ok, I added it now (actually I also had it at some earlier moment, but then I removed it since I couldn't see any compiler warnings/errors without stdio.h).

Then
- invoke your compiler in strict C89 or C99 mode
- turn the warning level up to the maximum
If this does not help or if the former is not possible, I'd
feel a certain amount of distrust towards the compiler.

Hmmm.. Strange. I tried it again and it doesn't complain. Which functions are needed by stdio.h?

I don't know how to turn up the warning levels or put the compiler in strict C89 or C99 mode... The compiler is Microsoft visual studio 2005 c/c++. I don't even know how to invoke it with the command-line or what the file-name for the compiler is. I assume I could just type (compiler-name.exe) /help in a dos cmd-prompt window to learn about the options...

<OT>Right-click on your project, go to Properties.
Within the properties window, click on "C/C++"/General, turn up the
warning level to W4; consider switching on Treat Warnings As Errors
as long as you are still learning.
Go on to "C/C++"/Language. Disable language extensions. Consider
making char defaulting to unsigned char.
Go on to "C/C++"/Advanced. Set to Compile As C.

Note:<stdio.h> is often included by "stdafx.h".
Try to keep your includes where they belong.

I did not actually try it out because I do not have VC 2005 at hand.
</OT>

This is really weird. My program now compiles but stops execution because of an "unhandled win32 exception":

Unhandled exception at 0x004155a3 in band_diag_system_(numerical_recipes).exe: 0xC0000005: Access violation reading location 0xabababaf.

And this location is part of ??

I think (at least hope) you should be able to get the same message with the new code I will be posting now...

-snip-

Compiler error; I do not have nrutil.h -- it obviously is not
used here.

It is public domain: http://www.nr.com/pubdom/nrutil_nr.h.txt

But you won't need it so far I think...

/* definitions */
#define number_of_rows 7
#define TINY 1.0e-20
#define SWAP(a,b) {dum=(a);(a)=(b);(b)=dum;}

Note that this swap is not exactly typesafe.

Why?

Let us make it a game:
- If dum is of your largest integer type, I swap double variables
containing values too large to be represented by dum.
- If dum is of your largest floating point type and this type has
less mantissa bits than your largest integer type has bits, I
use variables of that type containing a sufficiently large number
with enough bits set to be not representable.
<OT>With MS VC++, you have only 64 bit long doubles, so I can
leave it at using 64 bit integers</OT>.

If the above conversions can take place implicitly, you do not
get the least warning that your swap throws away everything.
In addition: Where do we get "dum"? Is it a global variable?

-snip-

When I debug and get to the line in banmul that says: k=i-left-1, then i is 1, left is 2 and from that we subtract 1. The total sum is zero so k should be = 0. But it doesn't become 0 - it becomes 4294967294 according to my debugger. I don't understand that... Any explanation?

Yes. 1-2-1 is -2 which wraps around to 4294967296-2.
Everything fine there.

Ok, but actually there's something wrong with the "numerical recipes" code, I think then - it seems to try to access a pointer with index k which is far away from the malloc'ed memory... Explanation follows later...

Okay. Throw in a check that your indices are in the valid range.
Provide enough context and let the programme die gracefully.


I assume I just need to cast, like writing: "k = (unsigned long)i-left-1"; but I would like to know why this exception error happens...

Won't help.

Are you sure that the above is the faulty code?
I cannot see any problem with it.

Q: Does the error also occur for the following variant?

Yep, but you didn't have a chance to see that before now (I hoped it wasn't necessary to make it more complicated but I now post about 3 extra lines to the banmul function) + 2 definitions from the (public domain) header file which is included to make things easy for you...

I took your proposal and modified my program. I made a small error with the matrix I think - but it's corrected now (I think it should be a "compact matrix" form for the bandmul function to work properly).

Okay.

Try this program (just copy/paste) and I hope you'll see my error (I also included the original matrix outcommented for clarity reasons):
----------

/* Including header files */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
//#include "nrutil.h"

/* definitions */
#define number_of_rows 7
#define number_of_cols 4
#define TINY 1.0e-20
#define SWAP(a,b) {dum=(a);(a)=(b);(b)=dum;}

static long lminarg1,lminarg2;
#define LMIN(a,b) (lminarg1=(a),lminarg2=(b),(lminarg1) < (lminarg2) ?\
(lminarg1) : (lminarg2))

static long lmaxarg1,lmaxarg2;
#define LMAX(a,b) (lmaxarg1=(a),lmaxarg2=(b),(lmaxarg1) > (lmaxarg2) ?\
(lmaxarg1) : (lmaxarg2))

/* prototypes */
void banmul(float **a, unsigned long n, int left, int right, float x[], float b[]);


/* Start of main program */
int main(void)
{
/* declaration */
float **a;
int i;

/* remember that we're solving "b = A * x" */
float x[number_of_rows] = {1., 2., 3., 7., 6., 5., 4.};
float b[number_of_rows] = {0.f};
float testmatrix[][number_of_rows] =
{
{0.,0.,3.,1.},
{0.,4.,1.,5.},
{9.,2.,6.,5.},
{3.,5.,8.,9.},
{7.,9.,3.,2.},
{3.,8.,4.,6.},
{2.,4.,4.,0.}
};

Are you sure that this is the memory layout you want to have?

//{
// {3.,1.,0.,0.,0.,0.,0.},
// {4.,1.,5.,0.,0.,0.,0.},
// {9.,2.,6.,5.,0.,0.,0.},
// {0.,3.,5.,8.,9.,0.,0.},
// {0.,0.,7.,9.,3.,2.,0.},
// {0.,0.,0.,3.,8.,4.,6.},
// {0.,0.,0.,0.,2.,4.,4.}
//};

a = malloc(number_of_rows * sizeof *a);
if (a == NULL)
{
fprintf(stderr, "%s, %d: You're running low on memory -"
" you should do something.\n",
__FILE__, __LINE__);
exit(EXIT_FAILURE);
}

a[0] = malloc(number_of_rows * number_of_rows * sizeof *a[0]);

Okay, this is more or less my mistake: If you change the size of
testmatrix, this will foul up the whole thing.
As you are only copying testmatrix, replace the line by
a[0] = malloc(sizeof testmatrix);
if (a[0] == NULL)
{
fprintf(stderr, "%s, %d: You're running low on memory -"
" you should do something.\n",
__FILE__, __LINE__);
exit(EXIT_FAILURE);
}
memcpy(&a[0][0], &testmatrix[0][0], sizeof testmatrix);

for(i = 1; i < number_of_rows; i++)
{
a[i] = a[0] + i*number_of_rows;

Depending on what kind of matrix you are trying to arrive at,
this is not necessarily correct. Replace number_of_rows by the actual
number of columns (which is
sizeof testmatrix[0]/ sizeof testmatrix[0][0]).

If you do not want to keep testmatrix as it is at initialization,
then you also can skip the second allocation and the memcpy() and
just set up a[i] to point to the start of the "i'th row".

}



banmul(a, number_of_rows, 2, 1, x, b);
/* the number 2 = number of rows to the left of diagonal (see matrix)
and the number 1 = number rows to the right of the diagonal (see matrix) */

printf("\nFinishing program now.\n\n");

free(a[0]);
free(a);
return(0);
}

////////////////////////////////////////////////////////////////////////////////////////////////////////////

// m1= left, m2= right.
////////////////////////////////////////////////////////////////////////////////////////////////////////////



void banmul(float **a, unsigned long n, int left, int right, float x[], float b[])
{
unsigned long i,j,k,tmploop;
for (i=1;i<=n;i++) {
k=i-left-1;
tmploop=LMIN(left+right+1,n-k);
b[i]=0.0;
for (j=LMAX(1,1-k); j<=tmploop;j++) b[i] += a[i][j]*x[j+k];
}

Question: Why don't you just output the value of i, k, tmploop
within the outer loop and the value of j and b[i] within the
inner loop?
Consider calculating and outputting the values as signed values and
unsigned values.
<OT>As you have a 64 bit integer type, you can introduce auxiliary
variables of that type, do the computations in these variables,
assign the value to the intended variables and output the auxiliary
values and the actual values.</OT>
}


----------

It put a breakpoint in the banmul function. I get that k = 4294967294 (first time the computer enters the function, so i=1). tmploop = 4. b[1] = 0 (ofcourse). Now I wanted to try my visual studio compiler function which prints something like the following out if I change something and continue debugging:

-------- Edit and Continue build started --------

--------------------- Done ----------------------

My problem is that I wanted to insert lines that calculates 1-k = 1-4294967294 such as:

tmploop = 1-k; // = 1-4294967294

With the debugger I then want to see tmploop (I don't know what the result is because I guess I would have to think "binary" then). Then I would like to continue and put tmploop back to its original value:

tmploop=LMIN(left+right+1,n-k);

Is that possible with this "edit and continue build"-stuff with my compiler?

Sorry, this is a compiler/debugger specific thing. I do not
work with the VC debugger that much, so I cannot help you there.
An appropriate newsgroup might be better for that.

Ok, the real problem AFAICS is last line: b[i] += a[i][j]*x[j+k], because at this moment k is 4294967294 ! So it's trying to multiply a[i][j] with x[a very big number outside the real data]...

What do you think?

Not necessarily. The authors may rely on defined overflow
behavior. If j is large enough that the unsigned long representation
of j+k is >= 0, everything is alright. This seems to be the case
on first glance.
If your compiler implicitly promotes the result to __int64 or
something like that, you are in trouble; then you may want to
cast j+k to unsigned long to be on the safe side. However, first
try to find out what really happens before introducing casts.

If you or somebody else want there's a brief and okay description at http://www.library.cornell.edu/nr/bookcpdf/c2-4.pdf (banmul described on page 3 (p.52) ). I hope I didn't miss anything from there...

I hope there are some clever guys in this forum that can solve the problem :-)

Work with the above. If I have the time later on, I will have
a closer look at what actually happens (if you did not report
any results by then).


Cheers
Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.
.



Relevant Pages

  • Re: cpu type idea
    ... compiler related recently. ... int main ... float a; ... just to parallelize the vectror and tensor operations. ...
    (alt.lang.asm)
  • Re: Parameter Name Warning?
    ... the names on the declarations do not matter, it should perhaps give a warning regarding different names being used. ... > there is more to those names than just what the compiler does with them. ... > You know only what you get in the header. ... >>> int main ...
    (microsoft.public.vc.language)
  • Re: is order urgent doubt
    ... Which *compiler* are you using? ... layout of the warning messages.) ... i = sizeof(long int); ... about the code in the editor vs. the code in the source file. ...
    (comp.lang.c)
  • Re: Whats the deal with size_t?
    ... ES> allowed to emit diagnostics that C doesn't require. ... in warning about a signed to unsigned conversion. ... a size_t to an int it is quite justified in warning about the reverse, ... The compiler is also quite justified in warning if you use a signed ...
    (comp.lang.c)
  • Re: Compendiums of compiler warning/error messages mapped to actual code problems?
    ... The general problem is of course that the compiler messages must be short, and that tends to make them so cryptic that it isn't always immediately obvious what the problem is. ... int this; ... Warning twarn.c: ...
    (comp.lang.c)