In this post of the series, I’m writing about variance and standard deviation. I discussed Mean Deviation (also called Average Deviation) in my previous post.

Dispersion, or variance, in the observations is what we need to explain. Data researchers want to know why some samples fall above or below the average for a given variable.

Variance measures the degree of spread (dispersion) in a variable’s values.

The population variance of a variable X, with n observations X_{i} and mean X̄, is given below.

σ^{2} = Σ (X_{i}-X̄)^{2}/n
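As a minimal sketch, the formula above can be written in plain Python. The function name `population_variance` and the eight sample values are made up for illustration and do not come from the examples in this post.

```python
def population_variance(values):
    """Population variance: the mean of the squared deviations from the mean."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    # Sum the squared deviations and divide by n (not n - 1).
    return sum((x - mean) ** 2 for x in values) / n


# Illustrative data only: the mean is 5 and the squared deviations sum to 32.
print(population_variance([2, 4, 4, 4, 5, 5, 7, 9]))  # 4.0
```

Dividing by n matches the population formula used in this post.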

### Standard Deviation

Standard deviation measures the typical distance between each sample and a center point, such as the mean. The individual distances are called deviations from the measure of central tendency. The standard deviation is the square root of the variance.

#### Steps to calculate the standard deviation

- Calculate the mean of the series
- Find the deviation of each item from the mean: d = x – x̄
- Square the deviations: d^{2}
- Multiply by the respective frequencies: f * d^{2}
- Total the products: Σ f * d^{2}
- Apply the formula: σ = √(Σ f * d^{2} / N)
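The steps above can be sketched in Python roughly as follows. The function name `grouped_std_dev` and the tiny data set are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math


def grouped_std_dev(values, freqs):
    """Standard deviation of a frequency table, one step per line."""
    n = sum(freqs)                                          # N = total frequency
    mean = sum(f * x for x, f in zip(values, freqs)) / n    # step 1: mean
    deviations = [x - mean for x in values]                 # step 2: d = x - mean
    squared = [d ** 2 for d in deviations]                  # step 3: d^2
    weighted = [f * d2 for f, d2 in zip(freqs, squared)]    # step 4: f * d^2
    total = sum(weighted)                                   # step 5: sum of f * d^2
    return math.sqrt(total / n)                             # step 6: apply the formula


# Hypothetical data: the values 1, 2, 3 occur 1, 2 and 1 times.
# Mean = 2, sum of f * d^2 = 2, so the result is sqrt(2 / 4).
print(grouped_std_dev([1, 2, 3], [1, 2, 1]))
```

Keeping each step in its own variable mirrors the columns you would fill in when working the calculation out by hand.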

#### Example – 1

Let’s calculate the standard deviation for the following data set.

| Village code | 4 wheeler count |
|---|---|
| 10 | 8 |
| 20 | 12 |
| 30 | 20 |
| 40 | 10 |
| 50 | 7 |
| 60 | 3 |

Solution

| Village code (x) | 4 wheeler count (f) | f * x | d = x – x̄ | d^{2} | f * d^{2} |
|---|---|---|---|---|---|
| 10 | 8 | 80 | -20.83 | 434.03 | 3472.22 |
| 20 | 12 | 240 | -10.83 | 117.36 | 1408.33 |
| 30 | 20 | 600 | -0.83 | 0.69 | 13.89 |
| 40 | 10 | 400 | 9.17 | 84.03 | 840.28 |
| 50 | 7 | 350 | 19.17 | 367.36 | 2571.53 |
| 60 | 3 | 180 | 29.17 | 850.69 | 2552.08 |
| Σ x = 210 | N = Σ f = 60 | Σ f * x = 1850 | | | Σ f * d^{2} = 10858.33 |

Mean x̄ = Σ f * x / N

Mean = 1850/60

Mean = 30.83 (rounded to two decimal places)

Standard deviation σ = √(Σ f * d^{2} / N)

σ = √(10858.33 / 60)

σ = √180.97

σ = 13.45
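One way to cross-check the result of Example 1 is to expand the frequency table into individual observations and hand them to Python's standard `statistics` module. This is only a sanity check under the assumption that each count means "this many observations of that code". The variable names are my own.

```python
import statistics

village_codes = [10, 20, 30, 40, 50, 60]
counts = [8, 12, 20, 10, 7, 3]

# Repeat each village code as many times as its count: 60 observations in total.
observations = [x for x, f in zip(village_codes, counts) for _ in range(f)]

mean = statistics.mean(observations)
sigma = statistics.pstdev(observations)  # population standard deviation (divides by N)

print(round(mean, 2), round(sigma, 2))  # 30.83 13.45
```

`pstdev` divides by N, which is the same convention as the formula used in the worked solution.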

#### Example – 2

Five samples from a spare parts manufacturing plant are given below.

64, 68, 74, 76, 78

Mean x̄ = (64 + 68 + 74 + 76 + 78)/5 = 72

| Samples | 64 | 68 | 74 | 76 | 78 |
|---|---|---|---|---|---|
| Mean x̄ | 72 | 72 | 72 | 72 | 72 |
| (X_{i} – x̄)^{2} | 64 | 16 | 4 | 16 | 36 |
| Σ (X_{i} – x̄)^{2} | 136 | | | | |
| S^{2} = Σ (X_{i} – x̄)^{2} / N | 27.2 | | | | |
| σ = √S^{2} | 5.21536192416212 | | | | |

#### Example – 3

Let’s use a similar data set in which all five samples have the same value, 50, and see how the result differs.

| Samples | 50 | 50 | 50 | 50 | 50 |
|---|---|---|---|---|---|
| Mean x̄ | 50 | 50 | 50 | 50 | 50 |
| (X_{i} – x̄)^{2} | 0 | 0 | 0 | 0 | 0 |
| Σ (X_{i} – x̄)^{2} | 0 | | | | |
| S^{2} = Σ (X_{i} – x̄)^{2} / N | 0 | | | | |
| σ = √S^{2} | 0 | | | | |

Every sample equals the mean, so the standard deviation is 0.

#### Example – 4

Now let’s consider a similar data set with high variation.

| Samples | 0 | 30 | 60 | 90 | 120 |
|---|---|---|---|---|---|
| Mean x̄ | 60 | 60 | 60 | 60 | 60 |
| (X_{i} – x̄)^{2} | 3600 | 900 | 0 | 900 | 3600 |
| Σ (X_{i} – x̄)^{2} | 9000 | | | | |
| S^{2} = Σ (X_{i} – x̄)^{2} / N | 1800 | | | | |
| σ = √S^{2} | 42.4264068711929 | | | | |
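To see the three data sets from Examples 2 to 4 side by side, here is a small sketch using the standard `statistics` module. The dictionary layout and the names are my own choice, and `pvariance` / `pstdev` are used because they divide by N, like the tables above.

```python
import statistics

datasets = {
    "Example 2": [64, 68, 74, 76, 78],
    "Example 3": [50, 50, 50, 50, 50],
    "Example 4": [0, 30, 60, 90, 120],
}

# Population variance and standard deviation for each data set.
results = {
    name: (statistics.pvariance(values), statistics.pstdev(values))
    for name, values in datasets.items()
}

for name, (variance, std_dev) in results.items():
    print(f"{name}: variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
```

The identical samples give zero for both measures, while the widely spread samples give by far the largest values.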

Let’s try to visualize it.

Do you see the difference between graphs 1 and 2?

See you in another post with another interesting concept.
