When one variable of a distribution is estimated or predicted with the help of another variable of the same distribution, the equation through which this result is obtained is called a regression equation. For two variables X and Y there are two such equations, one for each choice of dependent variable.
Now we come to the regression lines. First we'll see the regression of X on Y:
Here X = c + dY,
Suppose the line is fitted by the method of least squares to the observations (X1, Y1), (X2, Y2), (X3, Y3), … Solving for the values of both c and d, we arrive at:
X – X̅ = r (σx/ σy)(Y – Y̅)
X – X̅ = bxy(Y – Y̅)
bxy = r(σx/σy) = regression coefficient of X on Y.
Second we'll come to the regression of Y on X, where Y = a + bX:
Y – Y̅ = r (σy/σx)(X – X̅)
Y – Y̅ = byx(X – X̅)
byx = r(σy/σx) = regression coefficient of Y on X.
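As a rough illustration of the two coefficients, the sketch below computes r, σx and σy from paired data and then forms byx and bxy from them. The function name and the sample numbers are made up for illustration, and the standard deviations are assumed to be population values (divided by n).

```python
import math

def regression_coefficients(xs, ys):
    """Return (byx, bxy, r) for paired observations, using b = r * (ratio of SDs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n); an assumption of this sketch.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    byx = r * sd_y / sd_x  # regression coefficient of Y on X
    bxy = r * sd_x / sd_y  # regression coefficient of X on Y
    return byx, bxy, r

# Hypothetical sample data, not taken from the text.
byx, bxy, r = regression_coefficients([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(byx, bxy, r)
```

Multiplying the two coefficients gives r², which is why the correlation coefficient can be recovered as the square root of their product.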
The regression equations lead to a useful result. If θ is the angle between the two regression lines, then:
tan θ = {(1 – r²)/r} · {σxσy/(σx² + σy²)}
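The angle formula can be checked numerically. The sketch below evaluates it and compares the result with the angle obtained directly from the slopes of the two lines; the function names are hypothetical, and taking |r| so that the acute angle is returned is an assumption of this sketch.

```python
import math

def angle_between_regression_lines(r, sd_x, sd_y):
    """Acute angle in radians between the two regression lines (requires r != 0)."""
    tan_theta = ((1 - r ** 2) / abs(r)) * (sd_x * sd_y) / (sd_x ** 2 + sd_y ** 2)
    return math.atan(tan_theta)

def angle_from_slopes(r, sd_x, sd_y):
    """Same angle computed from the slopes of the lines in the (X, Y) plane."""
    m1 = r * sd_y / sd_x        # slope of the line of Y on X
    m2 = sd_y / (r * sd_x)      # slope of the line of X on Y
    return math.atan(abs((m2 - m1) / (1 + m1 * m2)))

print(angle_between_regression_lines(0.5, 1.0, 1.0))
print(angle_from_slopes(0.5, 1.0, 1.0))
```

When r = ±1 the angle is zero, so the two lines coincide; as r approaches 0 the angle approaches a right angle.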
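The regression coefficients can be calculated with any of the following formulas.
With X and Y as deviations from the means:
bxy = ∑XY/∑Y², byx = ∑XY/∑X²
With the actual values x and y:
bxy = {∑xy – (∑x·∑y)/n} / {∑y² – (∑y)²/n}, byx = {∑xy – (∑x·∑y)/n} / {∑x² – (∑x)²/n}
With dx and dy as deviations from assumed means:
bxy = {∑dxdy – (∑dx·∑dy)/n} / {∑dy² – (∑dy)²/n}
byx = {∑dxdy – (∑dx·∑dy)/n} / {∑dx² – (∑dx)²/n}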
Here n = number of observations.
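A short sketch can show that the sum-based formulas give the same coefficient whichever assumed means are chosen. The function name, its parameters and the sample data are hypothetical; with both assumed means left at 0 the calculation reduces to the formula in the actual values.

```python
def byx_from_sums(xs, ys, assumed_x=0.0, assumed_y=0.0):
    """byx from sums of deviations dx = x - assumed_x and dy = y - assumed_y."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    numerator = sum_dxdy - sum(dx) * sum(dy) / n
    denominator = sum(a * a for a in dx) - sum(dx) ** 2 / n
    return numerator / denominator

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]
print(byx_from_sums(xs, ys))           # actual values
print(byx_from_sums(xs, ys, 2, 5))     # deviations from assumed means 2 and 5
```

Swapping the roles of the two lists gives bxy in the same way.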
If the data are grouped, with frequencies f and N = ∑f, then we'll use:
bxy = {∑fdxdy – (∑fdx·∑fdy)/N} / {∑fdy² – (∑fdy)²/N}
byx = {∑fdxdy – (∑fdx·∑fdy)/N} / {∑fdx² – (∑fdx)²/N}
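For grouped data the same calculation is weighted by the frequencies. The sketch below assumes each cell of the frequency table is given as a (dx, dy, f) triple; this representation, the function name and the sample cells are illustrative choices, not something prescribed by the text.

```python
def grouped_coefficients(cells):
    """Return (byx, bxy) from (dx, dy, f) triples of a bivariate frequency table."""
    cells = list(cells)
    N = sum(f for _, _, f in cells)
    sum_fdx = sum(f * dx for dx, _, f in cells)
    sum_fdy = sum(f * dy for _, dy, f in cells)
    sum_fdxdy = sum(f * dx * dy for dx, dy, f in cells)
    sum_fdx2 = sum(f * dx * dx for dx, _, f in cells)
    sum_fdy2 = sum(f * dy * dy for _, dy, f in cells)
    numerator = sum_fdxdy - sum_fdx * sum_fdy / N
    byx = numerator / (sum_fdx2 - sum_fdx ** 2 / N)
    bxy = numerator / (sum_fdy2 - sum_fdy ** 2 / N)
    return byx, bxy

# Hypothetical table: deviations dx, dy and the frequency f of each cell.
cells = [(-1, -1, 3), (0, 0, 4), (1, 1, 2), (1, 0, 1)]
print(grouped_coefficients(cells))
```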
What is Standard Error of Estimation?
The standard error of estimate arises naturally when studying regression equations. It is the root mean square deviation of the observed values from the regression line of either X on Y or Y on X. In the first case we'll use the equation,
Sx = σx√(1 – r²)
In the second case we'll use,
Sy = σy√(1 – r²)
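To see that the formula really is the root mean square deviation from the line, the sketch below computes Sy both ways for the regression of Y on X. The function name and data are hypothetical, and σy is taken as the population standard deviation (divided by n).

```python
import math

def standard_error_y_on_x(xs, ys):
    """Return Sy computed from the formula and from the residuals of the line of Y on X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    sd_y = math.sqrt(syy / n)
    by_formula = sd_y * math.sqrt(1 - r ** 2)
    # Root mean square of the deviations of y from the fitted line.
    byx = sxy / sxx
    residuals = [y - (mean_y + byx * (x - mean_x)) for x, y in zip(xs, ys)]
    by_residuals = math.sqrt(sum(e * e for e in residuals) / n)
    return by_formula, by_residuals

print(standard_error_y_on_x([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

The two values agree up to floating-point rounding; exchanging the roles of the lists gives Sx.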
The necessities of this standard error are:
Next is the coefficient of determination, which measures the proportion of the variation in the dependent variable that is accounted for by the independent variable:
R² = Explained variance / Total variance
For example, if we suppose that X is the independent variable, Y is the dependent one and R² = 0.85, then 85% of the variation in Y is explained by its regression on X.
If ŷi = a0 + a1xi (i = 1, 2, …, n) are the fitted values for the observations (xi, yi), i = 1, 2, …, n, then
R² = {n∑ŷi² – (∑ŷi)²} / {n∑yi² – (∑yi)²}, 0 ≤ R² ≤ 1
R² = 1 ⇒ all n observations lie on the fitted regression line.
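The ratio of explained to total variance can be sketched in a few lines. The function below fits the least squares line of y on x and compares the variation of the fitted values with the variation of the observations; its name and the sample data are hypothetical.

```python
def r_squared(xs, ys):
    """Coefficient of determination for the least squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx               # slope
    a0 = mean_y - a1 * mean_x    # intercept
    fitted = [a0 + a1 * x for x in xs]
    explained = sum((f - mean_y) ** 2 for f in fitted)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total

print(r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
print(r_squared([1, 2, 3], [3, 5, 7]))   # points lying exactly on a line
```

For a single independent variable this ratio equals the square of the correlation coefficient.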
Let us now find the regression lines and the correlation coefficient from the data given below:
Sales (x) | 100 | 98 | 78 | 85 | 110 | 93 | 80 |
Purchase (y) | 85 | 90 | 70 | 72 | 95 | 81 | 74 |
Find out the value of y when x = 82.
Solution:
Here ∑x = 644, ∑y = 567, n = 7
x̅ = ∑x/n = 92
y̅ = ∑y/n = 81
x | X = x – x̅ | X² | y | Y = y – y̅ | Y² | XY |
100 | 8 | 64 | 85 | 4 | 16 | 32 |
98 | 6 | 36 | 90 | 9 | 81 | 54 |
78 | -14 | 196 | 70 | -11 | 121 | 154 |
85 | -7 | 49 | 72 | -9 | 81 | 63 |
110 | 18 | 324 | 95 | 14 | 196 | 252 |
93 | 1 | 1 | 81 | 0 | 0 | 0 |
80 | -12 | 144 | 74 | -7 | 49 | 84 |
∑ = 644 | 0 | 814 | 567 | 0 | 544 | 639 |
Here the regression coefficients are:
byx = ∑XY/∑X² = 639/814 = 0.79
bxy = ∑XY/∑Y² = 639/544 = 1.17
Regression equation of y on x:
y – y̅ = byx(x – x̅)
y – 81 = 0.79(x – 92)
y = 0.79x + 8.32
Regression equation of x on y:
x – x̅ = bxy(y – y̅)
x – 92 = 1.17(y – 81)
x = 1.17y – 2.77
The coefficient of correlation:
r = √(bxy · byx)
= √((1.17)(0.79)) = 0.96
For x = 82, the value of y is obtained from the regression equation of y on x. Hence
y = (0.79)(82) + 8.32 = 73.1
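The worked example can be re-run as a check. The sketch below takes the sales and purchase figures from the data table and repeats the deviation-from-mean calculation without intermediate rounding; the variable names are illustrative.

```python
import math

sales = [100, 98, 78, 85, 110, 93, 80]      # x, from the data table
purchases = [85, 90, 70, 72, 95, 81, 74]    # y, from the data table

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(purchases) / n

# Deviations from the means and their sums of squares and products.
dev_x = [x - mean_x for x in sales]
dev_y = [y - mean_y for y in purchases]
sum_xx = sum(d * d for d in dev_x)
sum_yy = sum(d * d for d in dev_y)
sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))

byx = sum_xy / sum_xx            # regression coefficient of y on x
bxy = sum_xy / sum_yy            # regression coefficient of x on y
r = math.sqrt(byx * bxy)         # positive root, since both coefficients are positive
y_at_82 = mean_y + byx * (82 - mean_x)

print(f"byx = {byx:.4f}, bxy = {bxy:.4f}, r = {r:.4f}")
print(f"estimated y at x = 82: {y_at_82:.2f}")
```

Because the script carries full precision, its last digits can differ slightly from a hand calculation that rounds the coefficients to two decimals first.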