 # Correlation and Pearson’s correlation coefficient

Correlation is a measure of the degree to which two variables are related.

For instance, the Tamilnadu government opened more primary schools and offered a free noon-meal scheme in the past, which improved the quality of education. The same government has opened liquor shops all over Tamilnadu, which increased the number of accidents.

So we are talking about two variables in each case. In the first example, the education promotion activities taken by the government are one variable and education quality is the other. The second example has the number of liquor shops as the first variable and the number of accidents as the second.

So, correlation is the linear association between two variables, which helps to determine the relationship between them. The correlation coefficient lies in the range of -1.00 to +1.00, and its sign shows whether the association is positive or negative.

It not only gives an estimate of the degree of association between two or more variables but also helps us to test the interdependence of the variables.

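Pearson’s coefficient r, which this article computes, suits continuous variables with a linear relationship. Spearman’s coefficient ρ (rho) is based on ranks and is apt for continuous, discrete and ordinal variables.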

### Types of Correlation

1. Positive Correlation
2. Negative Correlation
3. Simple Correlation
4. Multiple Correlation
5. Partial Correlation
6. Total Correlation
7. Linear Correlation
8. Non-linear Correlation

#### Positive Correlation

The correlation depends on the direction in which the variables move. When an increase in variable A is accompanied by an increase in variable B, the correlation is positive.

Examples

• Height and Weight
• Demand and Price

#### Negative Correlation

The correlation depends on the direction in which the variables move. When an increase in variable A is accompanied by a decrease in variable B, the correlation is negative.

Examples

• Number of files and free space in the hard drive
• Price and competition
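The direction of a relationship can be checked with a few lines of Python. This is only a minimal sketch: the helper name `co_movement` and all the numbers are made up for illustration and do not come from real measurements. It sums the products of each pair's deviations from the mean; a positive sum means the variables move together, a negative sum means they move in opposite directions.

```python
def co_movement(xs, ys):
    """Sum of products of deviations from the mean; its sign gives the direction."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


# Made-up sample values, for illustration only.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 66, 77, 85]

files_count = [100, 200, 300, 400, 500]
free_space_gb = [90, 80, 65, 50, 40]

# Height and weight rise together, so the sum is positive.
print(co_movement(heights_cm, weights_kg) > 0)

# More files means less free space, so the sum is negative.
print(co_movement(files_count, free_space_gb) < 0)
```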

#### Simple & Multiple Correlation

We have already seen that correlation is the relation between two variables. This is simple correlation. Sometimes more than two variables are involved. This is multiple correlation. For example, the number of students enrolled in a school, the number of similar schools available in its vicinity and the number of school-going children in the same area.

#### Partial & Total Correlation

Analyzing the correlation between two variables while excluding the influence of one or more other variables is called partial correlation. In a total correlation, we consider all the variables.
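As a rough illustration for three variables x, y and z, one commonly used form is the first-order partial correlation between x and y with the influence of z removed. The symbols $r_{xy}$, $r_{xz}$ and $r_{yz}$ are notation assumed here for the simple pairwise correlation coefficients; they are not used elsewhere in this article.

```latex
r_{xy \cdot z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{\left(1 - r_{xz}^{2}\right)\left(1 - r_{yz}^{2}\right)}}
```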

#### Linear & Non-linear Correlation

If the ratio of change between two variables is uniform (directly or inversely proportional), we say it is linear correlation. If not, it is non-linear.
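The difference matters because the correlation coefficient only captures the linear part of a relationship. The sketch below uses made-up data and a hypothetical helper `pearson_r` (written here for illustration, following Karl Pearson's formula) to compare a straight-line relationship with a curved one. In the curved case y is completely determined by x, yet the coefficient comes out as zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw sums (hypothetical helper for this illustration)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [-3, -2, -1, 0, 1, 2, 3]

# Linear: y changes by the same amount for every unit change in x.
linear_ys = [2 * x + 1 for x in xs]

# Non-linear: y = x squared, a symmetric curve.
curved_ys = [x * x for x in xs]

r_linear = pearson_r(xs, linear_ys)
r_curved = pearson_r(xs, curved_ys)

print(r_linear)  # close to 1: a perfect linear relationship
print(r_curved)  # 0: no linear relationship, although y depends entirely on x
```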

### Computing Coefficient of Correlation Manually

Let's take the following data set for analysis.

| x | y |
|---|---|
| 12 | 14 |
| 9 | 8 |
| 8 | 6 |
| 10 | 9 |
| 11 | 11 |
| 13 | 12 |
| 7 | 3 |

Karl Pearson’s formula for the coefficient of correlation r is given as:

$$
r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}}
$$

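| x | y | x² | y² | xy |
|---|---|----|----|----|
| 12 | 14 | 144 | 196 | 168 |
| 9 | 8 | 81 | 64 | 72 |
| 8 | 6 | 64 | 36 | 48 |
| 10 | 9 | 100 | 81 | 90 |
| 11 | 11 | 121 | 121 | 121 |
| 13 | 12 | 169 | 144 | 156 |
| 7 | 3 | 49 | 9 | 21 |
| Σx = 70 | Σy = 63 | Σx² = 728 | Σy² = 651 | Σxy = 676 |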

N is the number of (x, y) pairs, that is, the total number of values divided by the column count:

N = 14 / 2 = 7

Pearson equation for the correlation coefficient:

$$
r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - (\sum x)^2\right]\left[N\sum y^2 - (\sum y)^2\right]}}
$$

$$
r = \frac{(676 \times 7) - (70 \times 63)}{\sqrt{\left[728 \times 7 - 70^2\right]\left[651 \times 7 - 63^2\right]}}
$$

$$
r = \frac{4732 - 4410}{\sqrt{\left[5096 - 4900\right]\left[4557 - 3969\right]}}
$$

$$
r = \frac{322}{339.4819582835} = 0.948504013668671
$$

[Figure: Pearson correlation coefficient interpretation. (c) http://www.mathcaptain.com/statistics/pearson-correlation-coefficient.html]

On the scale of -1 to +1, our coefficient is +0.95, which shows a strong positive correlation.
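The manual computation can be cross-checked with a short Python script. This is a minimal sketch that follows the same raw-sum formula step by step. The variable names are chosen here for illustration; the data is the seven (x, y) pairs from the data set in this article.

```python
import math

# The seven (x, y) pairs from the data set in this article.
x = [12, 9, 8, 10, 11, 13, 7]
y = [14, 8, 6, 9, 11, 12, 3]

n = len(x)                                  # N, the number of pairs
sum_x = sum(x)                              # Σx
sum_y = sum(y)                              # Σy
sum_x2 = sum(v * v for v in x)              # Σx²
sum_y2 = sum(v * v for v in y)              # Σy²
sum_xy = sum(a * b for a, b in zip(x, y))   # Σxy

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 7 70 63 728 651 676
print(round(r, 4))                              # 0.9485
```

The sums match the bottom row of the table, and the rounded result agrees with the value of r obtained by hand.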