Re: Rounding errors

From: Robert Wagner (robert_at_wagner.net.yourmammaharvests)
Date: 08/26/04

    Date: Wed, 25 Aug 2004 23:22:42 GMT
    
    

    On 25 Aug 2004 00:59:08 -0700, riplin@Azonic.co.nz (Richard) wrote:

    >Robert Wagner <robert@wagner.net.yourmammaharvests> wrote

    >If you have an even distribution of the numbers 0.000 to 0.999 with 3
    >digits then the average is _NOT_ 0.500. It is in fact 0.4995.
    >
    >This is very easy to prove. Just add up the 1000 numbers from 000 to
    >999 and the total is 499500.
    >
    >A short cut to this is that there are 500 pairs: the first is 000 +
    >999 -> 999, the second is 001 + 998 -> 999, ... the last pair is 499 +
    >500 -> 999.
    >
    > 500 * 999 -> 499500

    The spoiler is the inclusion of zero. If we take 999 numbers between
    .001 and .999, the average is .500. See below.
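
    Both averages are easy to check mechanically. The following is a
    minimal sketch in Python rather than Cobol, using the decimal module
    so that the three-digit fractions are held exactly. The helper name
    average_of_thousandths is made up for this illustration.

```python
from decimal import Decimal

def average_of_thousandths(first: int, last: int) -> Decimal:
    """Exact average of first/1000, ..., last/1000 (both ends included)."""
    values = [Decimal(n) / 1000 for n in range(first, last + 1)]
    return sum(values) / len(values)

# 1000 values, 0.000 through 0.999: the sum is 499.5
with_zero = average_of_thousandths(0, 999)

# 999 values, 0.001 through 0.999: the sum is still 499.5
without_zero = average_of_thousandths(1, 999)

print(with_zero, without_zero)
```

    The sum is 499.5 either way; only the divisor changes, from 1000 to
    999.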

    >The mistake you made is that if you have a random distribution of
    >fractions with an arbitrary precision then the average will indeed be
    >.500000 or close to it.

    That's because a fraction will not produce zero. Zero isn't a rational
    number, it's a limit. As x approaches infinity, 1/x approaches zero.

    You can find documents on the Web saying zero is not a natural number
    but is a rational number. My counter is that every rational number can
    be expressed as a fraction of natural numbers. Show me a fraction that
    produces zero.
             
    >You then truncated that to 3 digits and lost an average of .0005 per
    >value reducing the average to the .4995 that the v999 numbers now
    >actually add up to form.

    Alternatively, if I eliminated zeros (post-truncation), the average of
    the other 99.9% would be .500.

    >Rounding of the original set of long numbers, or the truncated set as
    >it only requires the third digit to do rounding, restores the average
    >back to .5000.
    >
    >> >So the flaw is not that the rounding at the third digit increased the
    >> >rounded total, but your truncation of the 4th and later digits
    >> >decreased the original total.

    Assume my test set contained no zeros, the sum was 500,000, the
    average was .500. After rounding was applied, the sum became 500,500.
    This demonstrates that rounding introduced an error, an upward bias.

    Suppose I added a million zeros to that set. The sum would still be
    500,000 but the average would be .250.
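
    The dilution arithmetic can be sketched as follows (Python, as an
    illustration only). The count of one million values is inferred from
    a sum of 500,000 and an average of .500; it is not stated above.

```python
from decimal import Decimal

total = Decimal(500_000)        # sum of the hypothetical zero-free set
count = 1_000_000               # inferred: 500,000 / .500
zeros_added = 1_000_000         # the million zeros appended afterwards

average_before = total / count
average_after = total / (count + zeros_added)   # zeros leave the sum alone

print(average_before, average_after)
```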

    >> You are wrong. Support this with an example.
    >
    >To show that an even distribution of all fractions with 3 digits does
    >not average .5 :
    >
    > MOVE ZERO TO FracTotal
    > MOVE ZERO TO FracCount
    > PERFORM VARYING Frac FROM 0.000 BY 0.001 UNTIL Frac > 0.999
    > ADD Frac TO FracTotal
    > ADD 1 TO FracCount
    > END-PERFORM
    > COMPUTE FracAverage = FracTotal / FracCount
    > DISPLAY FracAverage
    >
    > -> 0.4995

    If PERFORM VARYING Frac FROM .001 BY .001 UNTIL Frac > .999,
    the answer is 499.5/999 = .500.

    Thank you for providing a framework that demonstrates the error in
    Cobol rounding without the issue of truncating random numbers.
    This program rounds an evenly distributed set of numbers and displays
    the average with and without rounding. The upward bias caused by
    rounding is as predicted.

     identification division.
     program-id. test27.
    *> author. Robert Wagner.
    *> Test rounding error
    *> To ensure the same number of rounds up and down, 499 each,
    *> do not round .999.
    *> Findings: .5000 .5004
     data division.
     working-storage section.
     01 unqualified-variables.
         05 Frac pic 9v999.
         05 FracRounded pic 9v99.
         05 FracTotal-1 value zero pic 99999v999.
         05 FracTotal-2 value zero pic 99999v999.
         05 FracCount value zero pic 9999.
         05 FracAverage-1 pic zz.9999.
         05 FracAverage-2 pic zz.9999.

     procedure division.
     main.
         PERFORM VARYING Frac FROM .001 BY 0.001 UNTIL Frac > .999
              COMPUTE FracRounded ROUNDED = Frac
              ADD Frac TO FracTotal-1
              IF Frac = .999
                 ADD Frac TO FracTotal-2
              ELSE
                 ADD FracRounded TO FracTotal-2
              END-IF
              ADD 1 TO FracCount
         END-PERFORM
         COMPUTE FracAverage-1 = FracTotal-1 / FracCount
         COMPUTE FracAverage-2 = FracTotal-2 / FracCount
         DISPLAY FracAverage-1 FracAverage-2.

    Result: .5000 .5004
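
    For readers without a Cobol compiler at hand, here is a rough Python
    equivalent of test27. It is a sketch under two assumptions: that
    ROUNDED rounds halves away from zero (ROUND_HALF_UP in the decimal
    module), and that COMPUTE without ROUNDED truncates the quotient to
    the four decimal places of the receiving field (ROUND_DOWN). The
    function and variable names are invented for the sketch.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_DOWN

HUNDREDTH = Decimal("0.01")          # stands in for pic 9v99
TEN_THOUSANDTH = Decimal("0.0001")   # stands in for pic zz.9999

def run_test27():
    total_plain = Decimal(0)     # FracTotal-1
    total_rounded = Decimal(0)   # FracTotal-2
    count = 0                    # FracCount
    for n in range(1, 1000):     # Frac = .001 through .999
        frac = Decimal(n) / 1000
        rounded = frac.quantize(HUNDREDTH, rounding=ROUND_HALF_UP)
        total_plain += frac
        # As in the Cobol program, .999 is added unrounded.
        total_rounded += frac if n == 999 else rounded
        count += 1
    # Truncate both averages to four decimals, as assumed above.
    avg_plain = (total_plain / count).quantize(
        TEN_THOUSANDTH, rounding=ROUND_DOWN)
    avg_rounded = (total_rounded / count).quantize(
        TEN_THOUSANDTH, rounding=ROUND_DOWN)
    return total_plain, total_rounded, avg_plain, avg_rounded

total_plain, total_rounded, avg_plain, avg_rounded = run_test27()
print(total_plain, total_rounded, avg_plain, avg_rounded)
```

    Under those assumptions the two totals are 499.5 and 499.999, and
    the truncated averages agree with the .5000 and .5004 shown above.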

    Since the above does not use truncated random numbers, I clipped the
    remainder of that debate. It was clouding the issue of Cobol rounding.

    >> >I have done systems that carry the rounding forward. That is when the
    >> >first number is rounded the difference is added to the next number
    >> >before that is rounded (or truncated, as preferred). This ensures
    >> >that the total is always correct rather than being randomly incorrect
    >> >by a small amount.
    >>
    >> Rounding intermediate results is THE classic beginner's mistake. It
    >> doesn't surprise me that you advocate it.
    >
    >You obviously didn't read or didn't understand the mechanism I used
    >which does _not_ round intermediate results.

    I understand. You are propagating rounding backward into detail so
    that the details sum to the (rounded) total. You can still be off by 1
    .. unless you add the last difference into the total.

    >Given a set of dollar and cent values that add up to a certain total
    >it may be required to show this as dollars only (for example for total
    >sales by branch). If the values are truncated to dollars they won't
    >add up to the total. If they are rounded individually they may, or
    >may not, add up to the rounded exact total.
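
    The carry-forward mechanism described in this exchange might look
    roughly like the Python sketch below. It is one reading of the
    description, not the poster's actual code: the names round_dollars
    and carry_forward and the three sample amounts are invented for the
    illustration, and halves are assumed to round away from zero.

```python
from decimal import Decimal, ROUND_HALF_UP

ONE = Decimal("1")

def round_dollars(value: Decimal) -> Decimal:
    """Round to whole dollars, halves away from zero."""
    return value.quantize(ONE, rounding=ROUND_HALF_UP)

def carry_forward(values):
    """Round each value after adding the rounding difference left over
    from the previous one. Returns the rounded values and the final
    unapplied difference."""
    carry = Decimal(0)
    rounded_values = []
    for value in values:
        adjusted = value + carry
        rounded = round_dollars(adjusted)
        carry = adjusted - rounded      # difference carried to the next line
        rounded_values.append(rounded)
    return rounded_values, carry

amounts = [Decimal("10.40"), Decimal("20.40"), Decimal("30.40")]  # sample data

individually = [round_dollars(a) for a in amounts]   # 10, 20, 30
carried, leftover = carry_forward(amounts)           # 10, 21, 30
rounded_total = round_dollars(sum(amounts))          # 61.20 rounds to 61

print(sum(individually), sum(carried), rounded_total, leftover)
```

    With this sample data the individually rounded lines sum to 60, the
    carried lines sum to 61, matching the rounded exact total, and a
    difference of 0.20 is left over after the last line.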

    In the financial data warehouse industry, we deal with this all the
    time -- reports that don't quite add up to the total. The worst-case
    error is plus or minus the number of detail lines. It's not a big
    deal. Our numbers are in thousands rather than currency units. For
    Brazil, they're in millions.

    Your system would screw us up. On one report you might say you are
    holding 1,000 (thousands of Euros) in XYZ. On the next report the same
    holding might be reported as 1,001. We would think you bought one.
    Some analytical methods are based on the number of 'decisions'. They
    don't care about quantity bought or sold. Your rounding correction
    would falsely count as a decision.
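
    The effect can be illustrated with a small Python sketch. Everything
    in it is hypothetical: the carry_forward helper is one reading of
    the scheme under discussion, and the holding values are made up. The
    same XYZ position of 1000.40 is reported as 1000 or 1001 depending
    only on the line that precedes it.

```python
from decimal import Decimal, ROUND_HALF_UP

def carry_forward(values):
    """Round each value to a whole number after adding the rounding
    difference left over from the previous value."""
    carry = Decimal(0)
    rounded_values = []
    for value in values:
        adjusted = value + carry
        rounded = adjusted.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
        carry = adjusted - rounded
        rounded_values.append(rounded)
    return rounded_values

xyz = Decimal("1000.40")   # unchanged holding, in thousands

# Report A: the preceding line rounds with no difference to carry.
report_a = carry_forward([Decimal("500.00"), xyz])

# Report B: the preceding line leaves 0.30 to carry into XYZ.
report_b = carry_forward([Decimal("500.30"), xyz])

print(report_a[1], report_b[1])
```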

    Moreover, we measure turnover rate in a portfolio -- the number of
    changes since the previous report. Your corrections would make it look
    like you are 'churning' i.e. trading more than your peers. That would
    make you look bad to potential investors.

    >> >No. Wrong. "We" don't criticise floating-point for 'rounding errors'
    >> >at all. We criticise floating-point for not being able to represent
    >> >numbers exactly.
    >>
    >> Same thing. The error is in rounding a to .99999999 rather than 1.0.
    >
    >No. Not the same thing at all. A binary floating point number cannot
    >represent 1 accurately. It isn't a 'rounding error' it is a matter of
    >precision. Rounding is the correction for this problem.

    No system of numeration can store all real numbers exactly. Integers
    cannot represent pi, the square root of 2 and other irrational
    numbers. Nor can they express many rational fractions as a single
    number, for example 1/3.
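
    A small Python sketch of the representation point, offered as an
    illustration only. It relies on the standard fractions and decimal
    modules: a binary float cannot hold 0.1 exactly, a decimal number
    can, and neither a decimal nor a binary fraction of finite length
    can hold 1/3.

```python
from decimal import Decimal
from fractions import Fraction

one_tenth = Fraction(1, 10)

# Fraction(x) recovers the exact value actually stored in x.
float_is_exact = Fraction(0.1) == one_tenth               # binary float
decimal_is_exact = Fraction(Decimal("0.1")) == one_tenth  # decimal

# 1/3 computed to the decimal module's default 28 digits
third = Decimal(1) / Decimal(3)
third_is_exact = third * 3 == 1

# A ratio of two integers holds 1/3 exactly.
ratio_is_exact = Fraction(1, 3) * 3 == 1

print(float_is_exact, decimal_is_exact, third_is_exact, ratio_is_exact)
```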

    This discussion is about rounding. The function of rounding is to
    create a LESS precise representation.

    >> Because I'm addressing more than one Cobol programmer.
    >
    >I haven't seen any other Cobol programmer making these same errors.

    Every time they say ROUNDED they're creating an error of 500 parts per
    million. They've been doing it for 45 years.

    >> ---------------------------------------------------
    >> If someone else can show an error in my logic, without rancor, I'd be
    >> delighted to address his or her argument. Flammage offers neither
    >> information nor entertainment .. unless it's artful. The level of art
    >> in evidence here doesn't support the effort to respond.
    >
    >There is no 'rancor', no flammage, in saying that you are wrong. You
    >obviously feel that you are being personally attacked by the mere
    >suggestion that you could ever make an error.

    I have no problem with being found in error. Nor even name-calling
    when I deserve it. What I object to is criticism when my conclusion
    is, in fact, correct .. which it is in this case.

    >While I criticised your _methodology_ and said that your _conclusions_
    >were wrong, you responded with personal insults such as "You're all
    >wet", implied that I am ignorant by volition, and that I am a
    >beginner, and generally made ad hominem attacks.

    I apologize for those remarks.

