Re: newbie: capabilities of c++ in numerical computation
From: Benoit Mathieu (benoit.mathieu2_at_NOSPAM.free.fr.invalid)
Date: 04/18/04
- Next message: Ahmed S. Badran: "Re: Some C/C++ books"
- Previous message: lokman: "Re: simple pointer question"
- In reply to: Cy Edmunds: "Re: newbie: capabilities of c++ in numerical computation"
- Next in thread: Mark Ng: "Re: newbie: capabilities of c++ in numerical computation"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Sun, 18 Apr 2004 12:12:03 +0200
>>1) Does object oriented programming introduce significant overhead as
>>opposed to using just arrays. I know this is a very subjective issue and
>>depends on operating system, compiler... What I have in mind is
>>
>>class vector
>> double x[3]
>>
>> some functions here
>>end class
>>
>>class node
>> vector position, velocity
>> double temp, pres, density
>>
>> some functions here
>>end class
>>
>>Now say if I am having
>>
>> node domain[128][128][128] // 3 dimensional array
>>
>>In a given function usually, I will be accessing just one quantity (say
>>velocity) at all nodes in the domain.
Doing this, you change the way you represent data at the
lower level. In Fortran, you have to choose at the beginning
of the project how you represent an array of coordinates:
COORD[n][3]
or COORD[3][n]
The most efficient representation depends on the way data is
accessed in the program, cache memory considerations,
whether you work on a vector computer or not...
C++ allows you to hide these details so that you can change
the representation afterwards. The higher level routines in
your code will access data through
domain.position(node, coord);
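To make the accessor idea concrete, here is a minimal sketch. The class name Domain, the member names and the chosen layout are my own assumptions for illustration, not code from an actual project. Only the private index() function knows whether the storage is COORD[3][n] or COORD[n][3].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Domain class: names and layout are illustrative assumptions.
class Domain {
public:
    static constexpr std::size_t dim = 3;

    explicit Domain(std::size_t n_nodes)
        : n_nodes_(n_nodes), position_(n_nodes * dim, 0.0) {}

    std::size_t n_nodes() const { return n_nodes_; }

    // Higher level code calls position(node, coord) and never sees the layout.
    double& position(std::size_t node, std::size_t coord) {
        return position_[index(node, coord)];
    }
    double position(std::size_t node, std::size_t coord) const {
        return position_[index(node, coord)];
    }

private:
    // The only place that knows the layout. Here it is COORD[3][n];
    // switching to COORD[n][3] means returning node * dim + coord instead.
    std::size_t index(std::size_t node, std::size_t coord) const {
        assert(node < n_nodes_ && coord < dim);
        return coord * n_nodes_ + node;
    }

    std::size_t n_nodes_;
    std::vector<double> position_;  // one contiguous array of doubles
};
```

Because the accessors are non-virtual and defined in the class body, a compiler can normally inline them, so the abstraction should cost little in optimized builds.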
I personally think that arrays of doubles are better at low
level than arrays of structures. This allows you to call
fortran or c library routines from your code. Take care to
encapsulate those calls in the lower level routines, so that
you don't have to care about the following in the higher
level routines:
- index translation (Fortran indices begin at 1, C++ indices at 0),
- type checking,
- const and non-const casts (when the library function
prototypes do not reflect the logical constness of the
parameters),
- aliasing (you must check for it before calling Fortran routines,
because Fortran forbids aliased arguments when one of them is modified).
Also, before calling library routines, performing
prerequisite checks on array sizes and so on with assert()
is great.
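A rough sketch of such a low level wrapper follows. The two legacy_ routines are hypothetical stand-ins that I define here so the example is self-contained; they only imitate the calling style of a Fortran library (arguments by pointer, no const, 1-based indices) and are not a real library API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Fortran-style library routines.
// y := y + alpha * x; x is read-only but not declared const.
void legacy_axpy(int* n, double* alpha, double* x, double* y) {
    for (int i = 0; i < *n; ++i) y[i] += *alpha * x[i];
}
// Returns the 1-based index of the largest element.
int legacy_imax(int* n, double* x) {
    int best = 1;
    for (int i = 2; i <= *n; ++i)
        if (x[i - 1] > x[best - 1]) best = i;
    return best;
}

// Wrapper: size check, aliasing check and const cast live here,
// so higher level code never deals with them.
void axpy(double alpha, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());  // prerequisite check on array sizes
    assert(&x != &y);              // the routine must not see aliased arguments
    int n = static_cast<int>(x.size());
    legacy_axpy(&n, &alpha, const_cast<double*>(x.data()), y.data());
}

// Wrapper: translates the 1-based result to a 0-based C++ index.
std::size_t index_of_max(const std::vector<double>& x) {
    assert(!x.empty());
    int n = static_cast<int>(x.size());
    int one_based = legacy_imax(&n, const_cast<double*>(x.data()));
    return static_cast<std::size_t>(one_based - 1);
}
```

The const_cast is only legitimate because the routine is known not to write to x; that knowledge belongs in the wrapper, not in the callers.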
C++ can have a huge performance impact in some cases:
- if you create classes that are too fine-grained, like one class to
represent a 3D vector {x,y,z}, and put some virtual
functions inside (the hidden virtual table pointer gives 32 bytes
per element instead of 24; well, in that particular case, it might
in some cases be faster with 16 bytes because of cache alignment...)
- if you use virtual functions to access data at fine grain:
    double sum = 0.;
    for (i = 0; i < n_nodes; i++)
        for (j = 0; j < dimension; j++)
            sum += domain.velocity(i,j) * domain.velocity(i,j);
if velocity(i,j) is a virtual function, you can multiply
your execution time by 2, 3 or more, because calls dispatched
through the virtual table usually cannot be inlined.
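The two cases above can be illustrated with a small sketch. The type names Vec3Plain and Vec3Virtual and the helper sum_of_squares are my own, chosen for illustration; the exact size of the virtual version depends on the platform, so the code only checks that it is larger than the plain one.

```cpp
#include <cstddef>

// Plain 3D vector: exactly three doubles, accessor can be inlined.
struct Vec3Plain {
    double x[3];
    double get(std::size_t i) const { return x[i]; }
};

// Same data plus virtual functions: each object also carries a hidden
// pointer to the virtual table, and get() is dispatched at run time
// unless the compiler can prove the dynamic type.
struct Vec3Virtual {
    double x[3];
    virtual ~Vec3Virtual() {}
    virtual double get(std::size_t i) const { return x[i]; }
};

static_assert(sizeof(Vec3Virtual) > sizeof(Vec3Plain),
              "the virtual version carries extra per-object data");

// Fine-grained access pattern from the loop above, written once for both types.
template <typename Vec>
double sum_of_squares(const Vec* v, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            sum += v[i].get(j) * v[i].get(j);
    return sum;
}
```

Both versions compute the same result; the difference is in memory footprint and in whether the inner call can be inlined, which you would have to measure on your own compiler and machine.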
Computing the norm of a vector is typically something you do
with a dedicated routine:
norm = domain.velocity().norm();
or norm(domain.velocity()) as you wish...
And this routine will use optimized low-level routines (BLAS
or similar).
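As a sketch of what such a coarse-grained routine might look like, assuming a hypothetical Field class that owns one contiguous array of doubles (the class and its members are my invention for illustration): the loop runs inside the routine over raw data, so there is one call per field instead of one per element. In a real code the loop body is where a BLAS norm routine could be substituted.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical Field class: one contiguous array of doubles.
class Field {
public:
    explicit Field(std::size_t n) : data_(n, 0.0) {}

    std::size_t size() const { return data_.size(); }
    double& operator[](std::size_t i) { return data_[i]; }

    // Raw pointer for handing the data to C or Fortran libraries.
    const double* data() const { return data_.data(); }

    // Euclidean norm over the whole array, computed in one tight loop.
    // An optimized library routine could replace this loop.
    double norm() const {
        const double* p = data_.data();
        double sum = 0.0;
        for (std::size_t i = 0; i < data_.size(); ++i) sum += p[i] * p[i];
        return std::sqrt(sum);
    }

private:
    std::vector<double> data_;
};

// Free-function spelling, for those who prefer norm(v) to v.norm().
inline double norm(const Field& f) { return f.norm(); }
```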
>>2) Does object oriented programming significantly affect parallelising
>>the code? Is MPI support in C++ good enough or is there any better
>>parallel programming language?
MPI has a C and a Fortran interface. We encapsulate MPI
calls in communication classes, so we can perform
prerequisite checking, toggle debug information, switch to
PVM easily...
Encapsulation means more function calls, but with a typical
network latency of 5 microseconds, this is not a
significant overhead. On SMP machines, where communication
latency is much lower, this might, perhaps, become a concern...
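To give an idea of the shape of such a communication class, here is a minimal sketch. All names (Communicator, sum_all, SerialCommunicator) are hypothetical and this is not our actual code. To keep the example runnable without an MPI installation, the only backend shown is a single-process one; an MPI backend would typically implement do_sum_all with MPI_Allreduce.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical communication interface: checks and debug output live here,
// the actual transport (MPI, PVM, none) lives in a derived class.
class Communicator {
public:
    virtual ~Communicator() {}

    void set_debug(bool on) { debug_ = on; }

    // Replace each value by its sum over all processes.
    void sum_all(std::vector<double>& values) {
        assert(!values.empty());  // prerequisite check
        if (debug_) std::clog << "sum_all: " << values.size() << " values\n";
        do_sum_all(values.data(), values.size());
    }

private:
    // One virtual call per message, not per element, so the dispatch cost
    // is small compared with the communication itself.
    virtual void do_sum_all(double* values, std::size_t n) = 0;

    bool debug_ = false;
};

// Single-process backend: the sum over one process leaves values unchanged.
class SerialCommunicator : public Communicator {
private:
    void do_sum_all(double*, std::size_t) override {}
};
```

Here a virtual function is harmless, in contrast to per-element access, because it is called once per communication.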
>>3) Does OOP overhead significantly affect the performance of code when
>>calling fftw(3.0) libraries?
When working on arrays of more than 32 elements, and provided
that you don't have to convert the data structure because the
data is stored differently in your code, the overhead is
less than a few percent. It is important that your data
structures match the data structure of the most commonly
used library functions, those where you spend the most CPU
time. Doing this, you also have a chance of vectorizing
the code on vector machines...
>>As you probably would have guessed by now, I am a fortran programmer
>>trying to see if it is worth shifting to OOP for numerical computation
>>purposes. Should I stick to Fortran 90 or is it worth shifting to c++?
> I would say you would do better as a good Fortran programmer than you would
> as a bad C++ programmer. If you are going to use C++ be prepared to take
> some time to learn it. It's kind of complicated and although it has great
> power it also has a lot of traps. Read the best textbooks you can find and
> work on it for a few months. If you do you won't be tempted by Fortran
> again.
>
> Study the way the standard library (particularly the part formerly known as
> the Standard Template Library or STL) works. Use existing containers rather
> than writing your own. For high performance use highly optimized C library
> functions (fftw being a perfect example) using pointers obtained from the
> C++ containers. This way the dreaded "OOP overhead" can be pretty much
> eliminated.
>
> That's what I do and I was a Fortran programmer for many years.
I agree with that.
One advantage of C++ is that there exists g++, which is a
good, free and widely used compiler. Vendor C++ compilers,
especially on supercomputers, might be less efficient than
the corresponding vendor Fortran compiler, especially on
vector computers...
C++ makes sense for large projects, but it takes years to
write an efficient core that is easy to use. For us, the
most important thing is that people new to the project can quickly
write code. It requires both solid experience in C++ to
write the core classes (so that you encapsulate things
correctly, with correct constness checking, asserts where
they are needed, and safeguards that prevent people from doing
weird things) and a good knowledge of the "big picture" of your
computing program so that your classes match what you need.
One thing which is very difficult to achieve is to hide the
data representation enough so you can change your mind
afterwards without changing all the code (this is one
advantage of c++), but not too much because in this case you
increase data access overhead and you globally increase the
complexity of the code. Moreover, you never know in advance
how you will change your mind later, and it is impossible to
design the code so that *any* change is easy.
One last thing: be sure to always have a skilled c++
programmer maintain the code...
Hope this helps,
Benoit