Feature suggestion: sum() ought to use a compensated summation algorithm




I did the following calculation: Generated a list of a million random numbers between 0 and 1, constructed a new list by subtracting the mean value from each number, and then calculated the mean again.

The result should be 0, but of course it will differ from 0 slightly because of rounding errors.

However, I noticed that the simple Python program below gives a result of ~ 10^-14, while an equivalent Mathematica program (also using double precision) gives a result of ~ 10^-17, i.e. three orders of magnitude more precise.

Here's the program (pardon my style, I'm a newbie/occasional user):

from random import random

data = [random() for x in xrange(1000000)]

mean = sum(data)/len(data)
print sum(x - mean for x in data)/len(data)

A little research shows that Mathematica uses a "compensated summation" algorithm. Indeed, using the algorithm described at
http://en.wikipedia.org/wiki/Kahan_summation_algorithm
gives us a result around ~ 10^-17:

def compSum(arr):
s = 0.0
c = 0.0
for x in arr:
y = x-c
t = s+y
c = (t-s) - y
s = t
return s

mean = compSum(data)/len(data)
print compSum(x - mean for x in data)/len(data)


I thought that it would be very nice if the built-in sum() function used this algorithm by default. Has this been brought up before? Would this have any disadvantages (apart from a slight performance impact, but Python is a high-level language anyway ...)?

Szabolcs Horvát
.



Relevant Pages

  • Salamin-Brent algorithm
    ... page of that title and learned of the Salamin-Brent algorithm (I'm new ... etc. (neglecting any benefits the binary calculation ... "floating point epsilon" problem, i.e., when the precision of the ...
    (sci.math)
  • Re: c vs java floating point errors under windows XP
    ... >>algorithm in java and in C to accomplish this calculation. ... would get a max precision of 15-16 digits using double. ...
    (comp.programming)
  • Re: demonic numbers !
    ... > corresponding calculation of error intervals. ... Then they're either not doing enough math for rounding errors to ... accumulate or they're not using a wide enough range of magnitudes. ... They don't need more than 53 bits of precision to store those values. ...
    (comp.lang.lisp)
  • Re: General Opinion on a how to?
    ... So different prize ... not have to change the database as you add new algorithm to the system. ... The way i see my calculation working with the aid of Jameys tips is this: ... Load into a prize pool object instance for that game ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: Kresss Probability Trilogy Qs
    ... We know our mathematics is true by direct insight, ... Any calculation has an experimental aspect to it (albeit one that is ... objections and answers which I'm ... Our "algorithm" is an ...
    (rec.arts.sf.science)