Re: statistics::linear-model question
- From: claird@xxxxxxxxx (Cameron Laird)
- Date: Wed, 5 Sep 2007 10:50:54 +0000
In article <1188974953.743153.131960@xxxxxxxxxxxxxxxxxxxxxxxxxxx>,
Alexandre Ferrieux <alexandre.ferrieux@xxxxxxxxx> wrote:
> On Sep 5, 7:59 am, Luc Moulinier <mou...@xxxxxxxxxxxxxxxxxx> wrote:
>> To clarify: taking a new X which is not in the set of points used to
>> define the line, and which lies outside the range of those points
>> (extrapolation), the regression line gives you an estimate of Y. I know
>> it is possible to calculate the error associated with Y; this error
>> depends on X (it grows as you move away along the line) and also on
>> the confidence level you want. But I don't know how to
>> compute it ....
> Then what you're after is the Y std deviation.
> The best you can do with the tools in that library is a 2nd order
> approximation, hence for your extrapolated Xe, the Y distribution is
> N(BXe+A,Sy)
> Then to aim for a given confidence level, you take the inverse of
> the erf() function. Since erf(1.4)~0.95, your answer is roughly
> 1.4*Sy.
> -Alex
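
For readers who want to try the calculation discussed above, here is a minimal sketch in plain Python (standard library only; Python is used purely for illustration and is not the Tcl library the thread is about). Everything in it is an assumption on my part, not the library's API: the function names normal_multiplier, fit_line and prediction_interval are hypothetical, the errors in Y are assumed normal with constant variance, and the normal quantile is used where a Student t quantile with n-2 degrees of freedom would be the exact choice for small samples. Two points it tries to illustrate: if Sy is read as a standard deviation, the usual relation is P(|Y - mean| < k*Sy) = erf(k/sqrt(2)), which puts the 95% multiplier near 1.96 rather than 1.4; and the textbook standard error of a new observation grows with the distance of the new X from the mean of the fitted X values, which is the dependence on X the original question mentions.

```python
import math
from statistics import NormalDist, fmean


def normal_multiplier(confidence):
    """Two-sided multiplier k with P(|Z| < k) = confidence for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf((1.0 + confidence) / 2.0)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x.

    Returns (a, b, s, x_mean, sxx), where s is the residual standard
    deviation computed with n - 2 degrees of freedom.
    """
    n = len(xs)
    x_mean = fmean(xs)
    y_mean = fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (n - 2))
    return a, b, s, x_mean, sxx


def prediction_interval(xs, ys, x_new, confidence=0.95):
    """Return (y_hat, half_width) for a new observation at x_new.

    Assumes normal errors; uses the normal quantile as an approximation
    to the Student t quantile.
    """
    a, b, s, x_mean, sxx = fit_line(xs, ys)
    n = len(xs)
    y_hat = a + b * x_new
    # Standard error of a single new observation: it grows as x_new
    # moves away from the mean of the fitted x values.
    se = s * math.sqrt(1.0 + 1.0 / n + (x_new - x_mean) ** 2 / sxx)
    return y_hat, normal_multiplier(confidence) * se


if __name__ == "__main__":
    # Made-up illustrative data, not taken from the thread.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]
    for x_new in (3.0, 10.0):
        y_hat, half = prediction_interval(xs, ys, x_new)
        print(f"x={x_new}: y = {y_hat:.2f} +/- {half:.2f}")
```

With this sketch, normal_multiplier(0.95) comes out at about 1.96, and the half-width reported at x=10 is wider than the one at x=3, the mean of the sample X values.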
I applaud the precision with which Alexandre, in particular, has
followed up in this thread.
As we're speculating about what the original questioner *really*
is after, I perceive a few important qualitative differences that
deserve mentioning. Given a collection of (X,Y) data, it's indeed
reasonable to experiment with a least-square regression as already
described. Very mild assumptions on the distribution of (formal)
errors in X and Y make the regression a relatively robust artifact
of analysis.
The error bounds are a more sensitive matter. Alexandre has of
course reported the pertinent calculation correctly. Note, though,
before reporting with confidence to your constituents, "I'm 83%
sure the new Y will be in *this* range" that those bound intervals
depend rather delicately on the details of the distributions of X
and Y. Prior knowledge that those will be exactly normal is ...
well, it's rarer than the managers of billion-dollar private equity
funds, for example, seem to recognize (catastrophic video available
on request).
I summarize: I'm somewhat more comfortable putting a best-fit curve
in the hands of naive consumers than I am doing the same with
confidence intervals.