To understand correlation in a bivariate frequency distribution, one must look at both the joint distribution f(x, y) of the two variables and their marginal distributions h(x) and g(y). Consider the following two-way table for variables X and Y, in which each class is represented by its midpoint:

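| Y \ X (class midpoints) | $x_1$ | $x_2$ | … | $x_n$ | Total |
|---|---|---|---|---|---|
| $y_1$ | $f(x_1, y_1)$ | $f(x_2, y_1)$ | … | $f(x_n, y_1)$ | $g(y_1) = \sum_x f(x, y_1)$ |
| $y_2$ | $f(x_1, y_2)$ | $f(x_2, y_2)$ | … | $f(x_n, y_2)$ | $g(y_2) = \sum_x f(x, y_2)$ |
| ⋮ | ⋮ | ⋮ | | ⋮ | ⋮ |
| $y_m$ | $f(x_1, y_m)$ | $f(x_2, y_m)$ | … | $f(x_n, y_m)$ | $g(y_m) = \sum_x f(x, y_m)$ |
| Total | $h(x_1) = \sum_y f(x_1, y)$ | $h(x_2)$ | … | $h(x_n) = \sum_y f(x_n, y)$ | $N = \sum_x \sum_y f(x, y)$ |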

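$$\bar{x} = \frac{1}{N}\sum_i x_i\, h(x_i), \qquad \bar{y} = \frac{1}{N}\sum_j y_j\, g(y_j)$$

or, dropping the indices,

$$\bar{x} = \frac{1}{N}\sum_x x\, h(x), \qquad \bar{y} = \frac{1}{N}\sum_y y\, g(y)$$

$$\sigma_x^2 = \frac{1}{N}\sum_x x^2\, h(x) - \bar{x}^2, \qquad \sigma_y^2 = \frac{1}{N}\sum_y y^2\, g(y) - \bar{y}^2$$

$$\operatorname{Cov}(x, y) = \frac{1}{N}\sum_x \sum_y x\, y\, f(x, y) - \bar{x}\,\bar{y}$$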

So we get,

$$r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x\, \sigma_y}$$
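The formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `correlation_from_table`, its argument names, and the row-by-row layout of `f` (rows for Y, columns for X) are assumptions chosen to mirror the table, and the small diagonal table at the end is made up purely as a sanity check.

```python
from math import sqrt


def correlation_from_table(x_mid, y_mid, f):
    """Return (x_bar, y_bar, var_x, var_y, cov, r) for a two-way frequency table.

    x_mid: midpoints of the X classes (one per column)
    y_mid: midpoints of the Y classes (one per row)
    f:     f[j][i] is the joint frequency of the pair (x_mid[i], y_mid[j])
    """
    # Marginal frequencies: g(y) sums each row, h(x) sums each column.
    g = [sum(row) for row in f]
    h = [sum(col) for col in zip(*f)]
    n = sum(g)  # total frequency N

    # Means and variances use only the marginal frequencies.
    x_bar = sum(x * hx for x, hx in zip(x_mid, h)) / n
    y_bar = sum(y * gy for y, gy in zip(y_mid, g)) / n
    var_x = sum(x * x * hx for x, hx in zip(x_mid, h)) / n - x_bar ** 2
    var_y = sum(y * y * gy for y, gy in zip(y_mid, g)) / n - y_bar ** 2

    # The covariance needs the joint frequencies f(x, y).
    sum_xy = sum(x * y * f[j][i]
                 for j, y in enumerate(y_mid)
                 for i, x in enumerate(x_mid))
    cov = sum_xy / n - x_bar * y_bar

    r = cov / sqrt(var_x * var_y)
    return x_bar, y_bar, var_x, var_y, cov, r


# Made-up table with all frequency on the diagonal: r should be +1.
print(correlation_from_table([1, 2], [1, 2], [[3, 0], [0, 3]])[-1])
```

The function assumes that neither variable is constant; if every observation falls in a single X class or a single Y class, the corresponding variance is zero and the division fails.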

Note that the two-way frequency table is particularly advantageous for large data sets, because it summarizes many paired observations in a compact form.

Calculate the coefficient of correlation for the following table:

| Y \ X | 0 – 8 | 8 – 16 | 16 – 24 |
|---|---|---|---|
| 1 – 5 | 2 | 0 | 4 |
| 5 – 9 | 3 | 2 | 2 |
| 9 – 13 | 2 | 5 | 1 |

Solution:

| y (mid values) \ x (mid values) | 4 | 12 | 20 | g(y) |
|---|---|---|---|---|
| 3 | 2 | 0 | 4 | 6 |
| 7 | 3 | 2 | 2 | 7 |
| 11 | 2 | 5 | 1 | 8 |
| h(x) | 7 | 7 | 7 | 21 |

$$\bar{x} = \frac{1}{21}\left[(4)(7) + (12)(7) + (20)(7)\right] = 12$$

$$\bar{y} = \frac{1}{21}\left[(3)(6) + (7)(7) + (11)(8)\right] = 7.38$$

$$\sigma_x^2 = \frac{1}{21}\left[(16)(7) + (144)(7) + (400)(7)\right] - (12)^2 = 42.67$$

$$\sigma_y^2 = \frac{1}{21}\left[(9)(6) + (49)(7) + (121)(8)\right] - (7.38)^2 = 10.54$$

$$\operatorname{Cov}(x, y) = \frac{1}{21}\sum_x \sum_y x\, y\, f(x, y) - \bar{x}\,\bar{y} = \frac{1764}{21} - (12)(7.38) = -4.56$$

The coefficient of correlation is:

$$r_{xy} = \frac{-4.56}{\sqrt{42.67}\,\sqrt{10.54}} = -0.22$$
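The worked example can be checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and the table is entered with one row per Y midpoint and one column per X midpoint. Because the code carries full precision instead of rounding the mean of y to 7.38, it gives a variance of about 10.52 for y and a covariance of about -4.57, slightly different from the hand calculation, while the coefficient still rounds to -0.22.

```python
from math import sqrt

x_mid = [4, 12, 20]   # midpoints of the X classes 0-8, 8-16, 16-24
y_mid = [3, 7, 11]    # midpoints of the Y classes 1-5, 5-9, 9-13
f = [
    [2, 0, 4],        # frequencies for y = 3
    [3, 2, 2],        # frequencies for y = 7
    [2, 5, 1],        # frequencies for y = 11
]

g = [sum(row) for row in f]          # marginal frequencies of y: [6, 7, 8]
h = [sum(col) for col in zip(*f)]    # marginal frequencies of x: [7, 7, 7]
n = sum(g)                           # total frequency N = 21

x_bar = sum(x * hx for x, hx in zip(x_mid, h)) / n
y_bar = sum(y * gy for y, gy in zip(y_mid, g)) / n
var_x = sum(x * x * hx for x, hx in zip(x_mid, h)) / n - x_bar ** 2
var_y = sum(y * y * gy for y, gy in zip(y_mid, g)) / n - y_bar ** 2

# Sum of x * y * f(x, y) over every cell of the table (1764 here).
sum_xy = sum(x * y * f[j][i]
             for j, y in enumerate(y_mid)
             for i, x in enumerate(x_mid))
cov = sum_xy / n - x_bar * y_bar
r = cov / sqrt(var_x * var_y)

print(round(x_bar, 2), round(y_bar, 2))   # 12.0 7.38
print(round(var_x, 2), round(var_y, 2))   # 42.67 10.52
print(round(cov, 2), round(r, 2))         # -4.57 -0.22
```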
