The linear correlation coefficient r measures the strength and direction of the linear relationship between the two variables of a bivariate distribution. It is often called the Pearson product-moment correlation coefficient, after Karl Pearson, who developed it.
To measure the linear relationship between two variables x and y, the coefficient of correlation is

$$r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \, \sigma_y}$$

In this equation, Cov(x, y) is the covariance between x and y, σx is the standard deviation (S.D.) of x, σy is the S.D. of y, and n is the number of pairs of data. Written out in full:
$$r_{xy} = \frac{\frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}\;\sqrt{\frac{1}{n}\sum (y_i - \bar{y})^2}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2}\;\sqrt{\sum (y_i - \bar{y})^2}}$$
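The deviation form of the formula translates almost directly into code. The following is a minimal Python sketch; the function name `pearson_r` and the sample data are illustrative assumptions, not part of the original text.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation using sums of deviations from the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data with a strong positive linear trend
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]
r = pearson_r(xs, ys)
print(round(r, 4))
```

The 1/n factors cancel between numerator and denominator, which is why the sketch works with plain sums.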
Measuring correlation in this way has some drawbacks:
- The coefficient measures only the linear relationship between the two variables, so it can be close to zero even when the variables are strongly related in a non-linear way. This is exactly why scatter diagrams should be examined alongside it.
- If the data are drawn from a different source, a different coefficient of correlation will result.
- The two variables need not influence each other directly; both may be driven by other effects. If those effects are removed, the coefficient of correlation may well turn out to be zero.
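The limitation to linear relationships can be seen numerically. In the sketch below (Python; the helper name and the data are assumptions chosen for illustration), y is completely determined by x through y = x², yet the coefficient comes out as zero because the relationship is not linear.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation from sums of deviations (illustrative helper)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# x values symmetric about zero; y depends on x exactly, but not linearly
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_nonlinear = pearson_r(xs, ys)
print(r_nonlinear)
```

A scatter diagram of these points would show a clear parabola, which the single number r cannot reveal.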
Features:
This correlation coefficient has some specific characteristics:
- The correlation coefficient is independent of any change of origin and scale in the data.
- -1 ≤ rxy ≤ 1
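To prove this, let

$$u_i = \frac{x_i - \bar{x}}{\sigma_x}, \qquad v_i = \frac{y_i - \bar{y}}{\sigma_y}$$

Then we have

$$\frac{1}{n}\sum u_i^2 = 1, \qquad \frac{1}{n}\sum v_i^2 = 1, \qquad \frac{1}{n}\sum u_i v_i = r_{xy}$$

A mean of squares cannot be negative, so

$$\frac{1}{n}\sum (u_i - v_i)^2 \ge 0$$

$$\frac{1}{n}\sum u_i^2 + \frac{1}{n}\sum v_i^2 - \frac{2}{n}\sum u_i v_i \ge 0$$

$$2(1 - r_{xy}) \ge 0$$

$$r_{xy} \le 1$$

Similarly,

$$\frac{1}{n}\sum (u_i + v_i)^2 \ge 0$$

$$\frac{1}{n}\sum u_i^2 + \frac{1}{n}\sum v_i^2 + \frac{2}{n}\sum u_i v_i \ge 0$$

$$2(1 + r_{xy}) \ge 0$$

$$r_{xy} \ge -1$$

So we get -1 ≤ rxy ≤ 1.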
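The identities used in this proof can be checked numerically. The sketch below (Python, with hypothetical data) standardizes x and y using the 1/n form of the standard deviation, then confirms that the mean of the squares is 1, that the mean of the products equals r, and that the two non-negative quantities equal 2(1 - r) and 2(1 + r).

```python
import math

# Hypothetical paired data
xs = [2.0, 4.0, 6.0, 8.0, 10.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]
n = len(xs)

x_bar, y_bar = sum(xs) / n, sum(ys) / n
# Population standard deviations (divide by n, as in the text)
sd_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
sd_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / n)

# Standardized variables u_i and v_i
u = [(x - x_bar) / sd_x for x in xs]
v = [(y - y_bar) / sd_y for y in ys]

mean_u2 = sum(a * a for a in u) / n                  # should be 1
mean_v2 = sum(b * b for b in v) / n                  # should be 1
r = sum(a * b for a, b in zip(u, v)) / n             # equals r_xy

gap_upper = sum((a - b) ** 2 for a, b in zip(u, v)) / n   # 2(1 - r)
gap_lower = sum((a + b) ** 2 for a, b in zip(u, v)) / n   # 2(1 + r)

print(round(mean_u2, 6), round(mean_v2, 6), round(r, 6))
```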
Note that two independent variables are always uncorrelated, but the converse is not always true: uncorrelated variables need not be independent. If rxy is the correlation coefficient and n is the number of pairs of data, then the probable error of rxy, which is 0.6745 times its standard error, is found from:

$$\text{P.E.}(r_{xy}) = 0.6745 \, \frac{1 - r_{xy}^2}{\sqrt{n}}$$
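As a quick illustration of the probable error formula, here is a small Python sketch. The function name `probable_error` is an assumption made for this example; the inputs r = -0.94 and n = 9 are the values from the worked price and demand example in this section.

```python
import math

def probable_error(r, n):
    """Probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if n < 1:
        raise ValueError("n must be a positive number of pairs")
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

pe = probable_error(-0.94, 9)
print(round(pe, 4))  # roughly 0.026
```

Because r enters only through r², the sign of the correlation does not affect the probable error.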
By Step Deviation Method:
Let dx = x - A and dy = y - B be the deviations of x and y from the assumed values A and B. Then

$$r_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{n}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{n}}\;\sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{n}}}$$

Here n = number of observations.
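One useful consequence of this formula is that the result does not depend on which assumed values A and B are chosen; they only make the hand arithmetic easier. The Python sketch below illustrates this with hypothetical data; the function name `step_deviation_r` is an assumption for the example.

```python
import math

def step_deviation_r(xs, ys, a, b):
    """Correlation via deviations from assumed values a (for x) and b (for y)."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    # Numerator: sum(dx*dy) - sum(dx)*sum(dy)/n
    num = sum(p * q for p, q in zip(dx, dy)) - s_dx * s_dy / n
    # Denominator: product of the two corrected root sums of squares
    den_x = math.sqrt(sum(p * p for p in dx) - s_dx ** 2 / n)
    den_y = math.sqrt(sum(q * q for q in dy) - s_dy ** 2 / n)
    return num / (den_x * den_y)

# Hypothetical data
xs = [10, 20, 30, 40, 50]
ys = [12, 25, 33, 48, 52]

r_central = step_deviation_r(xs, ys, 30, 33)   # assumed values near the middle
r_zero = step_deviation_r(xs, ys, 0, 0)        # assumed values of zero
print(round(r_central, 6), round(r_zero, 6))
```

Both calls print the same value, so A and B can be picked purely for convenience.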
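When X and Y denote deviations from the respective means, the coefficient of correlation given by Karl Pearson is

$$r_{xy} = \frac{\sum XY}{\sqrt{\sum X^2 \, \sum Y^2}}$$

For example, with ∑XY = 85, ∑X² = 90 and ∑Y² = 112:

$$r_{xy} = \frac{85}{\sqrt{90 \times 112}} \approx 0.85$$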
Now let us calculate the coefficient of correlation for the following price and demand data:
Price (Rs.) | 60 | 65 | 70 | 75 | 80 | 85 | 90 | 95 | 100 |
Demand (Qts) | 35 | 30 | 25 | 25 | 23 | 21 | 20 | 20 | 18 |
Solution:
Taking the assumed values A = 80 and B = 25:

Price (x) | dx = x - 80 | dx² | Demand (y) | dy = y - 25 | dy² | dx·dy |
60 | -20 | 400 | 35 | 10 | 100 | -200 |
65 | -15 | 225 | 30 | 5 | 25 | -75 |
70 | -10 | 100 | 25 | 0 | 0 | 0 |
75 | -5 | 25 | 25 | 0 | 0 | 0 |
80 | 0 | 0 | 23 | -2 | 4 | 0 |
85 | 5 | 25 | 21 | -4 | 16 | -20 |
90 | 10 | 100 | 20 | -5 | 25 | -50 |
95 | 15 | 225 | 20 | -5 | 25 | -75 |
100 | 20 | 400 | 18 | -7 | 49 | -140 |
∑ | 0 | 1500 | - | -8 | 244 | -560 |
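Substituting the totals with n = 9:

$$r_{xy} = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{9}}{\sqrt{\sum d_x^2 - \dfrac{(\sum d_x)^2}{9}}\;\sqrt{\sum d_y^2 - \dfrac{(\sum d_y)^2}{9}}} = \frac{-560 - \dfrac{(0)(-8)}{9}}{\sqrt{1500 - 0}\;\sqrt{244 - \dfrac{64}{9}}} \approx -0.94$$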
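The worked example can be reproduced in a few lines of Python. This is only a sketch of the same arithmetic; the variable names are assumptions, while the data and the assumed values A = 80 and B = 25 come from the solution above.

```python
import math

price = [60, 65, 70, 75, 80, 85, 90, 95, 100]
demand = [35, 30, 25, 25, 23, 21, 20, 20, 18]
A, B = 80, 25          # assumed values used in the solution
n = len(price)

dx = [x - A for x in price]
dy = [y - B for y in demand]

# Column totals of the working table
sum_dx = sum(dx)
sum_dy = sum(dy)
sum_dx2 = sum(d * d for d in dx)
sum_dy2 = sum(d * d for d in dy)
sum_dxdy = sum(p * q for p, q in zip(dx, dy))

# Step deviation formula
numerator = sum_dxdy - sum_dx * sum_dy / n
denominator = (math.sqrt(sum_dx2 - sum_dx ** 2 / n)
               * math.sqrt(sum_dy2 - sum_dy ** 2 / n))
r = numerator / denominator

print(sum_dx, sum_dx2, sum_dy, sum_dy2, sum_dxdy)  # 0 1500 -8 244 -560
print(round(r, 2))                                 # -0.94
```

The strong negative value matches the data: as price rises, demand falls.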