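To compute the correlation in a bivariate frequency distribution, one has to look at both the joint and the marginal distributions of the variables. Consider the following table for two variables X and Y, where f(x, y) is the joint frequency of a cell, g(y) and h(x) are the marginal frequencies of Y and X, and N is the total frequency: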
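| Y \ X (class midpoints) | x1 | x2 | … | xn | Row total g(y) |
|---|---|---|---|---|---|
| y1 | f(x1, y1) | f(x2, y1) | … | f(xn, y1) | g(y1) = Σ_x f(x, y1) |
| y2 | f(x1, y2) | f(x2, y2) | … | f(xn, y2) | g(y2) = Σ_x f(x, y2) |
| … | … | … | … | … | … |
| ym | f(x1, ym) | f(x2, ym) | … | f(xn, ym) | g(ym) = Σ_x f(x, ym) |
| Column total h(x) | h(x1) = Σ_y f(x1, y) | h(x2) | … | h(xn) = Σ_y f(xn, y) | N = Σ_x Σ_y f(x, y) |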
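The means, variances and covariance follow from the marginal and joint frequencies:

$$\bar{x} = \frac{1}{N}\sum_{i} x_i\, h(x_i), \qquad \bar{y} = \frac{1}{N}\sum_{j} y_j\, g(y_j)$$

or, dropping the indices,

$$\bar{x} = \frac{1}{N}\sum_{x} x\, h(x), \qquad \bar{y} = \frac{1}{N}\sum_{y} y\, g(y)$$

$$\sigma_x^2 = \frac{1}{N}\sum_{x} x^2\, h(x) - \bar{x}^2, \qquad \sigma_y^2 = \frac{1}{N}\sum_{y} y^2\, g(y) - \bar{y}^2$$

$$\operatorname{Cov}(x, y) = \frac{1}{N}\sum_{x}\sum_{y} x\,y\, f(x, y) - \bar{x}\,\bar{y}$$

So we get,

$$r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x\, \sigma_y}$$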
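The formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bivariate_correlation`, its argument layout (rows of `freq` follow the Y midpoints, columns follow the X midpoints) and the small diagonal table used to try it out are assumptions made here, not part of the original text.

```python
from math import sqrt

def bivariate_correlation(x_mid, y_mid, freq):
    """Return r_xy for a two-way frequency table.

    freq[j][i] holds f(x_i, y_j): one row per Y midpoint, one column per X midpoint.
    """
    # Marginal frequencies: g(y) sums each row, h(x) sums each column.
    g = [sum(row) for row in freq]
    h = [sum(col) for col in zip(*freq)]
    n = sum(g)  # total frequency N

    mean_x = sum(x * hx for x, hx in zip(x_mid, h)) / n
    mean_y = sum(y * gy for y, gy in zip(y_mid, g)) / n
    var_x = sum(x * x * hx for x, hx in zip(x_mid, h)) / n - mean_x ** 2
    var_y = sum(y * y * gy for y, gy in zip(y_mid, g)) / n - mean_y ** 2

    # Covariance uses the joint frequencies of every cell.
    cov = sum(
        x * y * f
        for y, row in zip(y_mid, freq)
        for x, f in zip(x_mid, row)
    ) / n - mean_x * mean_y

    return cov / sqrt(var_x * var_y)

# Hypothetical table with all frequency on the diagonal,
# so X and Y rise together and r should be 1 (up to float rounding).
r_diag = bivariate_correlation(
    [1, 2, 3],
    [10, 20, 30],
    [[5, 0, 0], [0, 5, 0], [0, 0, 5]],
)
print(round(r_diag, 6))
```

The function assumes both variances are non-zero; a table with all frequency in a single row or column would make the denominator zero.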
Note that the two-way frequency table is especially advantageous for large data sets, because the N individual observations are condensed into a small grid of cell frequencies.
Calculate the correlation coefficient for the following table:
| Y \ X | 0 – 8 | 8 – 16 | 16 – 24 |
|---|---|---|---|
| 1 – 5 | 2 | 0 | 4 |
| 5 – 9 | 3 | 2 | 2 |
| 9 – 13 | 2 | 5 | 1 |
Solution:
| y mid values \ x mid values | 4 | 12 | 20 | g(y) |
|---|---|---|---|---|
| 3 | 2 | 0 | 4 | 6 |
| 7 | 3 | 2 | 2 | 7 |
| 11 | 2 | 5 | 1 | 8 |
| h(x) | 7 | 7 | 7 | 21 |
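$$\bar{x} = \frac{1}{21}\,[(4)(7) + (12)(7) + (20)(7)] = 12$$

$$\bar{y} = \frac{1}{21}\,[(3)(6) + (7)(7) + (11)(8)] = 7.38$$

$$\sigma_x^2 = \frac{1}{21}\,[(16)(7) + (144)(7) + (400)(7)] - (12)^2 = 42.67$$

$$\sigma_y^2 = \frac{1}{21}\,[(9)(6) + (49)(7) + (121)(8)] - (7.38)^2 = 10.54$$

The double sum in the covariance, taken column by column (x = 4, 12, 20), is 196 + 828 + 740 = 1764, so

$$\operatorname{Cov}(x, y) = \frac{1}{21}\sum_{x}\sum_{y} x\,y\, f(x, y) - \bar{x}\,\bar{y} = \frac{1764}{21} - (12)(7.38) = -4.56$$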
The correlation coefficient is:

$$r_{xy} = \frac{-4.56}{\sqrt{42.67}\,\sqrt{10.54}} = -0.22$$
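As a quick check of the arithmetic, the worked example can be recomputed directly from the cell frequencies. The sketch below is an illustration, not part of the original solution, and its variable names are chosen here. It keeps the mean of y unrounded, so the intermediate values differ slightly in the second decimal place (σy² ≈ 10.52 and Cov ≈ −4.57 rather than 10.54 and −4.56), while r still rounds to −0.22.

```python
from math import sqrt

x_mid = [4, 12, 20]   # midpoints of the X classes 0-8, 8-16, 16-24
y_mid = [3, 7, 11]    # midpoints of the Y classes 1-5, 5-9, 9-13
freq = [
    [2, 0, 4],        # row y = 3
    [3, 2, 2],        # row y = 7
    [2, 5, 1],        # row y = 11
]

# Flatten the table into (x, y, f) cells so every sum is a single pass.
cells = [(x, y, f) for y, row in zip(y_mid, freq) for x, f in zip(x_mid, row)]

n = sum(f for _, _, f in cells)
sum_x = sum(x * f for x, _, f in cells)
sum_y = sum(y * f for _, y, f in cells)
sum_xx = sum(x * x * f for x, _, f in cells)
sum_yy = sum(y * y * f for _, y, f in cells)
sum_xy = sum(x * y * f for x, y, f in cells)

mean_x, mean_y = sum_x / n, sum_y / n
var_x = sum_xx / n - mean_x ** 2
var_y = sum_yy / n - mean_y ** 2
cov = sum_xy / n - mean_x * mean_y
r = cov / sqrt(var_x * var_y)

print(n, sum_xy)      # 21 1764
print(round(r, 2))    # -0.22
```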