When one variable of a bivariate distribution is estimated or predicted from the other variable, the equation used to obtain that estimate is called a regression equation. In this situation you'll find:
- Two variables are involved in the equation.
- There are two regression equations in total, each predicting one variable from the other. In the regression of X on Y we estimate the value of X from Y, and in the regression of Y on X we estimate Y from X.
- The variable used for prediction is called the predictor variable, and the variable being estimated is called the regressed variable.
- In linear regression, each regression equation represents a straight line, called a regression line.
Now we come to the regression lines. First we'll see X on Y:
Here $X = c + dY$.
Suppose the observations are (X1, Y1), (X2, Y2), (X3, Y3), … Fitting the line by the method of least squares gives the values of c and d, and the equation becomes:
$X - \bar{X} = r \frac{\sigma_x}{\sigma_y} (Y - \bar{Y})$
$X - \bar{X} = b_{xy} (Y - \bar{Y})$
$b_{xy} = r \frac{\sigma_x}{\sigma_y}$ = regression coefficient of x on y.
Second we come to Y on X, where $Y = a + bX$:
$Y - \bar{Y} = r \frac{\sigma_y}{\sigma_x} (X - \bar{X})$
$Y - \bar{Y} = b_{yx} (X - \bar{X})$
$b_{yx} = r \frac{\sigma_y}{\sigma_x}$ = regression coefficient of y on x.
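As a rough illustration, both regression coefficients can be computed from r and the two standard deviations. The Python sketch below assumes population standard deviations (dividing by n). The function name `regression_coefficients` and the sample data are made up for illustration.

```python
import math

def regression_coefficients(xs, ys):
    """Return (b_xy, b_yx, r) for paired observations xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviations (divide by n): an assumption of this sketch.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov / (sd_x * sd_y)
    b_xy = r * sd_x / sd_y  # regression coefficient of X on Y
    b_yx = r * sd_y / sd_x  # regression coefficient of Y on X
    return b_xy, b_yx, r

# Hypothetical data, chosen only to keep the arithmetic small.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b_xy, b_yx, r = regression_coefficients(xs, ys)
print(f"b_xy = {b_xy:.3f}, b_yx = {b_yx:.3f}, r = {r:.3f}")
```

With these coefficients, the line of X on Y is X = mean_x + b_xy (Y - mean_y), and the line of Y on X is Y = mean_y + b_yx (X - mean_x).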
Working through the regression equations leads to a few points:
- Both regression lines pass through the point of means $(\bar{X}, \bar{Y})$.
- $b_{xy} \cdot b_{yx} = r^2$
- $r = \pm\sqrt{b_{xy} \cdot b_{yx}}$
- r takes the same sign as the two regression coefficients.
- The two regression lines are in general different from each other; they coincide only when r = ±1. The angle θ between them is found from:
$\tan\theta = \frac{1 - r^2}{r} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$
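The properties above can be sketched in Python. This is only an illustration: the function names are hypothetical, and the angle function uses |r| so that it always returns the acute angle, which is an assumption on top of the formula as written.

```python
import math

def correlation_from_coefficients(b_xy, b_yx):
    """Recover r from the two regression coefficients.

    r = +/- sqrt(b_xy * b_yx), taking the sign shared by both coefficients.
    """
    if b_xy * b_yx < 0:
        raise ValueError("regression coefficients must have the same sign")
    r = math.sqrt(b_xy * b_yx)
    return -r if b_xy < 0 else r

def angle_between_regression_lines(r, sd_x, sd_y):
    """Acute angle in degrees between the two regression lines."""
    if r == 0:
        # The lines are X = mean_x and Y = mean_y, which are perpendicular.
        return 90.0
    tan_theta = (1 - r ** 2) / abs(r) * (sd_x * sd_y) / (sd_x ** 2 + sd_y ** 2)
    return math.degrees(math.atan(tan_theta))

print(correlation_from_coefficients(-0.5, -0.5))  # negative coefficients give negative r
print(angle_between_regression_lines(1.0, 2.0, 3.0))  # perfect correlation: lines coincide
```

When r = ±1 the factor (1 - r²) is zero, so the angle is zero and the two lines coincide, which matches the last point in the list.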
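The regression coefficients can be computed with any of the following formulas.
With X and Y measured as deviations from their means:
$b_{xy} = \frac{\sum XY}{\sum Y^2}$, $b_{yx} = \frac{\sum XY}{\sum X^2}$
From the raw values x and y:
$b_{xy} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum y^2 - \frac{(\sum y)^2}{n}}$, $b_{yx} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$
With $d_x$ and $d_y$ measured as deviations from assumed means:
$b_{xy} = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sum d_y^2 - \frac{(\sum d_y)^2}{n}}$
$b_{yx} = \frac{\sum d_x d_y - \frac{\sum d_x \sum d_y}{n}}{\sum d_x^2 - \frac{(\sum d_x)^2}{n}}$
Here n = number of observations.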
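The three ways of computing a regression coefficient give the same number. A minimal Python sketch for the coefficient of y on x is shown below. The function names, the assumed means (2 and 3) and the data are all hypothetical.

```python
def b_yx_deviations(xs, ys):
    """b_yx = sum(XY) / sum(X^2), with deviations taken from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return num / sum((x - mean_x) ** 2 for x in xs)

def b_yx_raw(xs, ys):
    """b_yx from raw sums: (Sxy - Sx*Sy/n) / (Sxx - Sx^2/n)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)

def b_yx_assumed_means(xs, ys, a_x, a_y):
    """b_yx from deviations about assumed means a_x and a_y.

    The formula has the same shape as the raw-sum formula, applied to dx and dy.
    """
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    return b_yx_raw(dx, dy)

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
results = (
    b_yx_deviations(xs, ys),
    b_yx_raw(xs, ys),
    b_yx_assumed_means(xs, ys, a_x=2, a_y=3),
)
print(results)
```

Swapping the two arguments (passing ys first and xs second) gives the coefficient of x on y in the same way.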
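If the data are grouped, with frequencies f and N = ∑f, then we use:
$b_{xy} = \frac{\sum f d_x d_y - \frac{(\sum f d_x)(\sum f d_y)}{N}}{\sum f d_y^2 - \frac{(\sum f d_y)^2}{N}}$
$b_{yx} = \frac{\sum f d_x d_y - \frac{(\sum f d_x)(\sum f d_y)}{N}}{\sum f d_x^2 - \frac{(\sum f d_x)^2}{N}}$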
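For grouped data the same calculation can be sketched in Python. The sketch assumes each cell of the frequency table is given as a hypothetical tuple (dx, dy, f), where dx and dy are actual deviations from the assumed means (not step deviations divided by a class width). The cell values are made up.

```python
def grouped_regression_coefficients(cells):
    """Return (b_xy, b_yx) from cells of the form (dx, dy, f)."""
    big_n = sum(f for _, _, f in cells)  # N = sum of frequencies
    sum_fdx = sum(f * dx for dx, _, f in cells)
    sum_fdy = sum(f * dy for _, dy, f in cells)
    sum_fdxdy = sum(f * dx * dy for dx, dy, f in cells)
    sum_fdx2 = sum(f * dx * dx for dx, _, f in cells)
    sum_fdy2 = sum(f * dy * dy for _, dy, f in cells)
    numerator = sum_fdxdy - sum_fdx * sum_fdy / big_n
    b_xy = numerator / (sum_fdy2 - sum_fdy ** 2 / big_n)
    b_yx = numerator / (sum_fdx2 - sum_fdx ** 2 / big_n)
    return b_xy, b_yx

# Hypothetical frequency table: (dx, dy, frequency).
cells = [(-1, -1, 2), (0, 0, 3), (1, 1, 2), (1, 0, 1)]
b_xy, b_yx = grouped_regression_coefficients(cells)
print(f"b_xy = {b_xy:.4f}, b_yx = {b_yx:.4f}")
```

Summing over the cells is equivalent to listing every observation f times and applying the ungrouped formula.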
What is the Standard Error of Estimate?
In studying regression equations we also meet the standard error of estimate. It is the root mean square deviation of the observed values from the regression line, for either X on Y or Y on X. In the first case we use the equation:
$S_x = \sigma_x \sqrt{1 - r^2}$
In the second case we use:
$S_y = \sigma_y \sqrt{1 - r^2}$
The standard error of estimate serves these purposes:
- It is the standard deviation of the observed values about their predicted values, whether X is predicted from the equation of X on Y or Y from the equation of Y on X.
- It indicates the typical size of the prediction error.
- It indicates how well the fitted model describes the data: the smaller the standard error, the closer the points lie to the regression line.
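To see that the formula for the standard error agrees with its definition, the Python sketch below computes the standard error for the line of Y on X in two ways: from σy√(1 - r²), and directly as the root mean square of the residuals. It assumes population formulas (dividing by n). The function name and the data are hypothetical.

```python
import math

def standard_error_y_on_x(xs, ys):
    """Return (formula_value, rms_residual) for the regression of Y on X."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sum_xy / math.sqrt(sum_xx * sum_yy)
    sd_y = math.sqrt(sum_yy / n)  # population standard deviation of y
    formula_value = sd_y * math.sqrt(1 - r ** 2)
    # Direct route: residuals about the fitted line of Y on X.
    b_yx = sum_xy / sum_xx
    residuals = [y - (mean_y + b_yx * (x - mean_x)) for x, y in zip(xs, ys)]
    rms_residual = math.sqrt(sum(e * e for e in residuals) / n)
    return formula_value, rms_residual

# Hypothetical data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
formula_value, rms_residual = standard_error_y_on_x(xs, ys)
print(formula_value, rms_residual)
```

The two values agree up to floating-point rounding.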
Next is the coefficient of determination. With one independent and one dependent variable, it measures the proportion of the variation in the dependent variable that is explained by the regression:
$R^2 = \frac{\text{Explained variance}}{\text{Total variance}}$
For example, if X is the independent variable, Y is the dependent one and $R^2 = 0.85$, then 85% of the variation in Y is explained by the regression of Y on X.
If $\hat{y}_i = a_0 + a_1 x_i$ (i = 1, 2, …, n) are the fitted values for the observations $(x_i, y_i)$, i = 1, 2, …, n, then
$R^2 = \frac{n \sum \hat{y}_i^2 - (\sum \hat{y}_i)^2}{n \sum y_i^2 - (\sum y_i)^2}$, $0 \le R^2 \le 1$
$R^2 = 1$ implies that all n observations lie on the fitted regression line.
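A short Python sketch of this ratio follows. It fits the least-squares line of y on x and divides the variation of the fitted values by the variation of the observed values. The function name `coefficient_of_determination` and the data are hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """R^2 = explained variation / total variation for the line of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sum_xy / sum_xx
    fitted = [mean_y + slope * (x - mean_x) for x in xs]
    explained = sum((f - mean_y) ** 2 for f in fitted)
    total = sum((y - mean_y) ** 2 for y in ys)
    return explained / total

# Hypothetical data for illustration only.
scattered = coefficient_of_determination([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
on_a_line = coefficient_of_determination([1, 2, 3, 4], [3, 5, 7, 9])
print(scattered, on_a_line)
```

The second data set lies exactly on y = 2x + 1, so its ratio equals 1, in line with the statement that R² = 1 when every observation is on the fitted line.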
Let us now find the regression lines and the correlation coefficient from the data given below:
Sales (x) | 100 | 98 | 78 | 85 | 110 | 93 | 80 |
Purchase (y) | 85 | 90 | 70 | 72 | 95 | 81 | 74 |
Also estimate the value of y when x = 82.
Solution:
Here ∑x = 644, ∑y = 567, n = 7
x̅ = ∑x/n = 92
y̅ = ∑y/n = 81
x | X = x - x̅ | X² | y | Y = y - y̅ | Y² | XY |
100 | 8 | 64 | 85 | 4 | 16 | 32 |
98 | 6 | 36 | 90 | 9 | 81 | 54 |
78 | -14 | 196 | 70 | -11 | 121 | 154 |
85 | -7 | 49 | 72 | -9 | 81 | 63 |
110 | 18 | 324 | 95 | 14 | 196 | 252 |
93 | 1 | 1 | 81 | 0 | 0 | 0 |
80 | -12 | 144 | 74 | -7 | 49 | 84 |
∑ = 644 | 0 | 814 | 567 | 0 | 544 | 639 |
Here the regression coefficients are:
byx = ∑XY/∑X² = 639/814 = 0.785
bxy = ∑XY/∑Y² = 639/544 = 1.175
Regression equation of y on x:
y - y̅ = byx(x - x̅)
y - 81 = 0.785(x - 92)
y = 0.785x + 8.78
Regression equation of x on y:
x - x̅ = bxy(y - y̅)
x - 92 = 1.175(y - 81)
x = 1.175y - 3.175
The coefficient of correlation:
r = √(bxy · byx)
= √((0.785)(1.175)) = 0.96
The positive root is taken because both regression coefficients are positive.
For x = 82, the value of y is obtained from the regression equation of y on x. Hence
y = (0.785)(82) + 8.78 = 73.15.
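The worked example can be checked with a short Python sketch that recomputes every quantity directly from the sales and purchase figures in the problem statement. The variable names are chosen for illustration. Because it works from unrounded values, its results may differ slightly from hand calculations that round at each step.

```python
import math

sales = [100, 98, 78, 85, 110, 93, 80]      # x
purchases = [85, 90, 70, 72, 95, 81, 74]    # y

n = len(sales)
mean_x = sum(sales) / n
mean_y = sum(purchases) / n

# Sums of squares and products of deviations from the means.
sum_xx = sum((x - mean_x) ** 2 for x in sales)
sum_yy = sum((y - mean_y) ** 2 for y in purchases)
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sales, purchases))

b_yx = sum_xy / sum_xx  # regression coefficient of y on x
b_xy = sum_xy / sum_yy  # regression coefficient of x on y

# r carries the sign shared by the two regression coefficients.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Prediction from the line of y on x at x = 82.
y_at_82 = mean_y + b_yx * (82 - mean_x)

print(f"means: x = {mean_x}, y = {mean_y}")
print(f"sums: XX = {sum_xx}, YY = {sum_yy}, XY = {sum_xy}")
print(f"b_yx = {b_yx:.3f}, b_xy = {b_xy:.3f}, r = {r:.2f}")
print(f"y at x = 82: {y_at_82:.2f}")
```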