Re: Linear regression in NumPy

nikie wrote:
I'm a little bit stuck with NumPy here, and neither the docs nor
trial&error seems to lead me anywhere:
I've got a set of data points (x/y-coordinates) and want to fit a
straight line through them, using LMSE linear regression. Simple
enough. I thought instead of looking up the formulas I'd just see if
there isn't a NumPy function that does exactly this. What I found was
"linear_least_squares", but I can't figure out what kind of parameters
it expects: I tried passing it my array of X-coordinates and the array
of Y-coordinates, but it complains that the first parameter should be
two-dimensional. But well, my data is 1d. I guess I could pack the X/Y
coordinates into one 2d-array, but then, what do I do with the second
parameter?

More generally: Is there any kind of documentation that tells me what
the functions in NumPy do, and what parameters they expect, how to call
them, etc. All I found was:
"This function returns the least-squares solution of an overdetermined
system of linear equations. An optional third argument indicates the
cutoff for the range of singular values (defaults to 10^-10). There are
four return values: the least-squares solution itself, the sum of the
squared residuals (i.e. the quantity minimized by the solution), the
rank of the matrix a, and the singular values of a in descending
order."
It doesn't even mention what the parameters "a" and "b" are for...

Look at the docstring. (Note: I am using the current version of numpy from SVN;
you may be using an older version of Numeric.)

In [171]: numpy.linalg.lstsq?
Type: function
Base Class: <type 'function'>
String Form: <function linear_least_squares at 0x1677630>
Namespace: Interactive
Definition: numpy.linalg.lstsq(a, b, rcond=1e-10)
returns x,resids,rank,s
where x minimizes 2-norm(|b - Ax|)
resids is the sum square residuals
rank is the rank of A
s is the rank of the singular values of A in descending order

If b is a matrix then x is also a matrix with corresponding columns.
If the rank of A is less than the number of columns of A or greater than
the number of rows, then residuals will be returned as an empty array
otherwise resids = sum((b-dot(A,x)**2).
Singular values less than s[0]*rcond are treated as zero.
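
To connect the docstring back to the straight-line question: "a" is the 2-d matrix of coefficients (one row per data point, one column per unknown) and "b" is the 1-d array of observed values. The following is only a minimal sketch, not taken from the thread. The sample data is made up, and it assumes a recent NumPy release where rcond=None is accepted and selects the machine-precision cutoff; the older version quoted above used rcond=1e-10 as its default.

```python
import numpy as np

# Hypothetical sample data: points scattered around y = 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# lstsq needs a 2-d "a". For the model y = m*x + c the unknowns are
# (m, c), so each row of A is [x_i, 1].
A = np.column_stack([x, np.ones_like(x)])

# "b" is simply the 1-d array of y values.
# rcond=None is an assumption about the installed NumPy version.
(m, c), resids, rank, s = np.linalg.lstsq(A, y, rcond=None)

print("slope:", m, "intercept:", c)
print("sum of squared residuals:", resids)
```

If all you want is the line itself, np.polyfit(x, y, 1) should return the same slope and intercept without building A by hand.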

Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

