Re: Rounding errors
From: Richard (riplin_at_Azonic.co.nz)
Date: 08/25/04
- Next message: Robert Wagner: "Re: Revisiting an Old Prejudice: READ INTO/WRITE FROM"
- Previous message: antoniobastiao_at_pop.com.br: "Re: Microfocus Printing in Columns"
- In reply to: Robert Wagner: "Re: Rounding errors"
- Next in thread: Robert Wagner: "Re: Rounding errors"
- Reply: Robert Wagner: "Re: Rounding errors"
Date: 25 Aug 2004 00:59:08 -0700
Robert Wagner <robert@wagner.net.yourmammaharvests> wrote
>>> Intuition tells us half the numbers will round up and half will
>>> round down. Given a very large sample, the average will remain
>>> .500 .. we think.
>
> >It may be what you had thought, but why do you assume that everyone
> >else thought that ?
>
> Because it's in the Cobol standard.
No. Wrong. Your comment was 'the average will remain .500.. we
think'.
I don't think that because the average of your example wasn't .500 in
the first place. You created a fantasy situation.
> >> Suppose we have a million random numbers formatted v999, adding up to
> >> 500,000.
It is fantasy because a million values spread evenly over all possible
values of a v999 won't add up to 500,000; they add up to 499,500. If
your set does add up to 500,000 it isn't random.
If the random numbers carried enough digits then the total would be
500,000 (or close).
> >> Let's divide them into three groups: one containing rightmost
> >> digit of zero, a second containing 1-4 and a third containing 5-9.
> >> Let's round each group the Cobol way and sum the rounded numbers.
> >>
> >> Digit   Population   Sum
> >> 0       100,000       50,000
> >> 1-4     400,000      200,000 - 1,000 (-.0025 * 400,000)
> >> 5-9     500,000      250,000 + 1,500 (+.003 * 500,000)
> >> Total                500,500
> >
> >Your methodology is flawed. In fact you have made a gross statistical
> >error.
> >
> >The fault is that you have claimed they were 'random' when you have
> >contrived to truncate after the third digit. If they had not been
> >truncated and you had left the remaining additional digits in, say, a
> >v99999999 and then added them up you would have got very close to the
> >total of 500,500.
> You're all wet. It is intuitively obvious that collections of random
> numbers formatted v99, v999 or v9(infinity) will average .50000.
That may be 'intuitively obvious' to _you_, but it is 'intuitively
obvious' to those of us who are numerate that it is quite wrong.
If you have an even distribution of the numbers 0.000 to 0.999 with 3
digits then the average is _NOT_ 0.500. It is in fact 0.4995.
This is very easy to prove. Just add up the 1000 numbers from 000 to
999 and the total is 499500.
A short cut to this is that there are 500 pairs: the first is 000 +
999 -> 999, the second is 001 + 998 -> 999, ... the last pair is 499 +
500 -> 999.
500 * 999 -> 499500
Similarly the average of:
  v9      0 - 9          is 0.45
  v99     00 - 99        is 0.495
  v999    000 - 999      is 0.4995
  v9999   0000 - 9999    is 0.49995
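The averages in this table can be checked mechanically. What follows is only a sketch in fixed-format COBOL, assuming a COBOL 85 or later compiler. The program name and every WS- data name are made up for illustration and appear nowhere in this thread. For each width from one to four digits it adds up every possible value as a whole number, then divides by the count and by the scale.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FRACAVG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names below are hypothetical, chosen for this sketch.
       01  WS-DIGITS     PIC 9        VALUE 0.
       01  WS-SCALE      PIC 9(5)     VALUE 1.
       01  WS-N          PIC 9(5)     VALUE 0.
       01  WS-TOTAL      PIC 9(9)     VALUE 0.
       01  WS-AVERAGE    PIC 9.9(5).
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM VARYING WS-DIGITS FROM 1 BY 1
                   UNTIL WS-DIGITS > 4
      *        WS-SCALE becomes 10, 100, 1000, 10000 in turn.
               MULTIPLY 10 BY WS-SCALE
               MOVE ZERO TO WS-TOTAL
      *        Add every value 0 .. WS-SCALE - 1 as a whole number.
               PERFORM VARYING WS-N FROM 0 BY 1
                       UNTIL WS-N >= WS-SCALE
                   ADD WS-N TO WS-TOTAL
               END-PERFORM
      *        Divide by the count and again by the scale to get
      *        the average as a fraction.
               COMPUTE WS-AVERAGE =
                   WS-TOTAL / (WS-SCALE * WS-SCALE)
               DISPLAY "v9(" WS-DIGITS ") average " WS-AVERAGE
           END-PERFORM
           STOP RUN.
```

If the pairing argument holds, the four averages reported should be 0.45, 0.495, 0.4995 and 0.49995, each one short of 0.5 by half a unit in the last digit.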
Now it is true that, given an arbitrary number of digits, the average
will approach 0.50000 very closely.
Here is the mistake you made. A random distribution of fractions with
arbitrary precision will indeed average .500000 or close to it.
But you then truncated that to 3 digits and lost an average of .0005
per value, reducing the average to the .4995 that the v999 numbers now
actually have.
Rounding the original set of long numbers, or the truncated set (since
rounding to two digits needs only the third digit), restores the
average to .5000.
> >So the flaw is not that the rounding at the third digit increased the
> >rounded total, but your truncation of the 4th and later digits
> >decreased the original total.
>
> You are wrong. Support this with an example.
To show that an even distribution of all fractions with 3 digits does
not average .5 :
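    *> Assumed declarations: Frac needs an integer digit (for example
    *> PIC 9V999) so that it can reach 1.000 and end the loop, and
    *> FracAverage needs at least four decimal places.
    MOVE ZERO TO FracTotal
    MOVE ZERO TO FracCount
    *> Visit every three-digit fraction from 0.000 to 0.999 once.
    PERFORM VARYING Frac FROM 0.000 BY 0.001 UNTIL Frac > 0.999
        ADD Frac TO FracTotal
        ADD 1 TO FracCount
    END-PERFORM
    *> 499.500 / 1000
    COMPUTE FracAverage = FracTotal / FracCount
    DISPLAY FracAverage
-> 0.4995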
If you had started with a realistic set of random numbers between
0.000000000 and 0.999999999999999999999 then the average would indeed
be close to 0.5. Truncating after the 3rd digit would indeed affect
the total of all these numbers by an average of 0.0005 per number.
The truncation should leave an even distribution of the 3 digit values
between 0.000 and 0.999, which _provably_ averages 0.4995 - exactly
matching the loss of data.
When you round the numbers to two digits what existed beyond the 3rd
digit is irrelevant, so the truncation doesn't matter. The rounding
will recover the original average of .50.
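The chain of long numbers, truncation and rounding can be tried out directly. The sketch below is an illustration under stated assumptions, not anyone's production code: instead of truly random input it uses every six-digit fraction from 0.000000 to 0.999999 exactly once as a stand-in for an even distribution, and the program name and WS- names are invented for the example. It relies on MOVE truncating low-order digits and on the default behaviour of ROUNDED (round half away from zero).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TRUNCAVG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names below are hypothetical, chosen for this sketch.
       01  WS-N          PIC 9(7)       VALUE 0.
       01  WS-LONG       PIC 9V9(6)     VALUE 0.
       01  WS-TRUNC      PIC 9V999      VALUE 0.
       01  WS-ROUND      PIC 9V99       VALUE 0.
       01  WS-SUM-LONG   PIC 9(7)V9(6)  VALUE 0.
       01  WS-SUM-TRUNC  PIC 9(7)V999   VALUE 0.
       01  WS-SUM-ROUND  PIC 9(7)V99    VALUE 0.
       01  WS-AVG        PIC 9.9(7).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Stand-in for the long random numbers: every six-digit
      *    fraction from 0.000000 to 0.999999 exactly once.
           PERFORM VARYING WS-N FROM 0 BY 1
                   UNTIL WS-N > 999999
               COMPUTE WS-LONG = WS-N / 1000000
      *        A MOVE to a shorter field drops the extra digits.
               MOVE WS-LONG TO WS-TRUNC
      *        Rounding to two digits looks only at the third.
               COMPUTE WS-ROUND ROUNDED = WS-TRUNC
               ADD WS-LONG  TO WS-SUM-LONG
               ADD WS-TRUNC TO WS-SUM-TRUNC
               ADD WS-ROUND TO WS-SUM-ROUND
           END-PERFORM
           COMPUTE WS-AVG = WS-SUM-LONG / 1000000
           DISPLAY "long      " WS-AVG
           COMPUTE WS-AVG = WS-SUM-TRUNC / 1000000
           DISPLAY "truncated " WS-AVG
           COMPUTE WS-AVG = WS-SUM-ROUND / 1000000
           DISPLAY "rounded   " WS-AVG
           STOP RUN.
```

On this input the long average works out to 0.4999995, the truncated average to 0.4995 and the rounded average to 0.5, so the rounding gives back what the truncation removed rather than inflating the total.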
> >> Here's another way of looking at what we did. We discarded the
> >> rightmost digit, producing numbers that look like v99. Then we left
> >> half of them unchanged and added .01 to the other half. By doing so,
> >> we increased the total by (.01 * 500,000) = 500.
>
> The rounded answer is incorrect.
No. The rounded answer is correct for a set of random numbers of
arbitrary length. The 3 digit truncated set of numbers doesn't
represent that set accurately, thus there are 3 averages:
  arbitrary length random       -> .500000
  truncated 3 digit (your set)  -> .4995
  rounded 2 digit set           -> .500000
> There are two types of ignorance -- simple and volitional. The former
> simply doesn't know; the latter doesn't WANT to know and becomes
> hostile when you attempt to educate him. Most of us have encountered
> the latter type in our daily lives.
As, in fact, we often encounter exactly this in your messages.
> >I have done systems that carry the rounding forward. That is when the
> >first number is rounded the difference is added to the next number
> >before that is rounded (or truncated, as preferred). This ensures
> >that the total is always correct rather than being randomly incorrect
> >by a small amount.
>
> Rounding intermediate results is THE classic beginner's mistake. It
> doesn't surprise me that you advocate it.
You obviously didn't read or didn't understand the mechanism I used
which does _not_ round intermediate results.
Given a set of dollar and cent values that add up to a certain total
it may be required to show this as dollars only (for example for total
sales by branch). If the values are truncated to dollars they won't
add up to the total. If they are rounded individually they may, or
may not, add up to the rounded exact total.
The way to correct this is to round each value, take the difference
between the pre-rounded value and the rounded one, and add that
difference to the next number before rounding it in turn.
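A minimal sketch of that carry-forward mechanism follows, as one reading of the description above and not necessarily how the original systems were written. The four amounts, the program name and all WS- names are hypothetical. It rounds dollar-and-cent amounts to whole dollars twice, once value by value and once with the difference carried into the next value, and compares both totals with the rounded exact total.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CARRYRND.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and amounts are hypothetical, for this sketch.
       01  WS-AMOUNT-VALUES.
           05  FILLER        PIC 9(3)V99  VALUE 10.40.
           05  FILLER        PIC 9(3)V99  VALUE 20.40.
           05  FILLER        PIC 9(3)V99  VALUE 30.40.
           05  FILLER        PIC 9(3)V99  VALUE 40.40.
       01  WS-AMOUNT-TABLE REDEFINES WS-AMOUNT-VALUES.
           05  WS-AMOUNT     PIC 9(3)V99  OCCURS 4 TIMES.
       01  WS-IDX            PIC 9        VALUE 0.
       01  WS-CARRY          PIC S9V99    VALUE 0.
       01  WS-ADJUSTED       PIC S9(4)V99 VALUE 0.
       01  WS-ROUNDED        PIC S9(4)    VALUE 0.
       01  WS-SUM-EXACT      PIC 9(6)V99  VALUE 0.
       01  WS-SUM-PLAIN      PIC 9(6)     VALUE 0.
       01  WS-SUM-CARRIED    PIC 9(6)     VALUE 0.
       01  WS-OUT            PIC Z(5)9.
       PROCEDURE DIVISION.
       MAIN-PARA.
           PERFORM VARYING WS-IDX FROM 1 BY 1 UNTIL WS-IDX > 4
               ADD WS-AMOUNT (WS-IDX) TO WS-SUM-EXACT
      *        Plain approach: round each value on its own.
               COMPUTE WS-ROUNDED ROUNDED = WS-AMOUNT (WS-IDX)
               ADD WS-ROUNDED TO WS-SUM-PLAIN
      *        Carry approach: add the previous difference first,
      *        round, then keep the new difference for next time.
               COMPUTE WS-ADJUSTED = WS-AMOUNT (WS-IDX) + WS-CARRY
               COMPUTE WS-ROUNDED ROUNDED = WS-ADJUSTED
               COMPUTE WS-CARRY = WS-ADJUSTED - WS-ROUNDED
               ADD WS-ROUNDED TO WS-SUM-CARRIED
           END-PERFORM
           COMPUTE WS-OUT ROUNDED = WS-SUM-EXACT
           DISPLAY "exact total, rounded  " WS-OUT
           MOVE WS-SUM-PLAIN TO WS-OUT
           DISPLAY "each value rounded    " WS-OUT
           MOVE WS-SUM-CARRIED TO WS-OUT
           DISPLAY "difference carried    " WS-OUT
           STOP RUN.
```

With these amounts the exact total is 101.60, which rounds to 102. Rounding each value on its own gives 10 + 20 + 30 + 40 = 100, while carrying the difference gives 10 + 21 + 30 + 41 = 102. The carried difference never exceeds half a dollar in either direction, which is why the displayed values keep adding up to the rounded exact total.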
There is _no_ 'rounding intermediate result', there is _no_
advocating. Your criticism is based on not reading my message.
> The right way is to carry intermediate results to say six digits right
> of decimal and round them only when going to a report. The wrong way
> is to add rounded numbers into a total. I'm an autodidact but assume
> they used to teach this in Programming 101.
You are so sure that you are right that it never even occurs to you
that you might check your claims.
> >No. Wrong. "We" don't criticise floating-point for 'rounding errors'
> >at all. We criticise floating-point for not being able to represent
> >numbers exactly.
>
> Same thing. The error is in rounding a to .99999999 rather than 1.0.
No. Not the same thing at all. A binary floating point number cannot
represent most decimal fractions (0.1, for instance) exactly, so a
result that should be 1.0 can come out as .99999999. It isn't a
'rounding error', it is a matter of precision. Rounding is the
correction for this problem.
> Because I'm addressing more than one Cobol programmer.
I haven't seen any other Cobol programmer making these same errors.
> ---------------------------------------------------
> If someone else can show an error in my logic, without rancor, I'd be
> delighted to address his or her argument. Flammage offers neither
> information nor entertainment .. unless it's artful. The level of art
> in evidence here doesn't support the effort to respond.
There is no 'rancor', no flammage, in saying that you are wrong. You
obviously feel that you are being personally attacked by the mere
suggestion that you could ever make an error.
However, it was your claim that everyone else was wrong too, and that
we all made the assumptions that you did that started the rancor.
While I criticised your _methodology_ and said that your _conclusions_
were wrong, you responded with personal insults such as "You're all
wet", implied that I am ignorant by volition, and that I am a
beginner, and generally made ad hominem attacks.
> Succinctly, it's like 'pissing into the wind.'
I have noticed that attempting to educate you is exactly that.