We know that the Pearson correlation coefficient (r) ranges from -1 to +1, and that a coefficient of zero means there is no linear relationship between the variables.
Here the word linear is crucial. Why? Let's find out using an example where the Pearson correlation coefficient is 0.
Consider a case where Y = X².
X | -3 | -2 | -1 | +1 | +2 | +3
Y | +9 | +4 | +1 | +1 | +4 | +9
If we calculate the Pearson Correlation Coefficient for this dataset, it will be 0.
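To check this by hand, here is a minimal sketch in plain Python. The helper name `pearson` is our own choice for illustration, not something from a library. The key point is that the mean of X is 0, so the covariance term reduces to the sum of X³, and the positive and negative cubes cancel.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

X = [-3, -2, -1, 1, 2, 3]
Y = [x ** 2 for x in X]  # [9, 4, 1, 1, 4, 9]

r = pearson(X, Y)
print(r)  # 0, up to floating-point rounding
```

The result is 0 here because the X values are symmetric around zero. With a different set of X values (for example, only positive ones), the same Y = X² relationship would give a non-zero coefficient.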
But we know that Y is dependent on X because Y = X².
This relationship is non-linear. You can see from the above graph that the relationship forms a curve. The Pearson correlation coefficient measures only linear association, so it is unable to capture this non-linear one.
Linear relations produce straight lines such as the ones shown below:
What to do when the relationship is non-linear?
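- Use other correlation measures such as Spearman's rank correlation coefficient or Kendall's rank correlation coefficient. These capture non-linear associations as long as they are monotonic (Y consistently increases or consistently decreases as X increases). For a non-monotonic relationship such as the Y = X² example above, they are also 0.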
- In some cases, a transformation of the variables, such as a log transformation, might be useful. For more on this, you can read this blog.
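Both options can be tried on a small example. The sketch below is in plain Python and assumes a made-up exponential dataset, Y = exp(X); the helpers `pearson`, `average_ranks`, and `spearman` are hypothetical names written for this illustration, not library functions. It computes Spearman's coefficient as the Pearson correlation of the ranks, with tied values given their average rank.

```python
from math import exp, log, sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with the current one.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Assumed example data: a monotonic but non-linear relationship.
X = [1, 2, 3, 4, 5, 6]
Y = [exp(x) for x in X]

r_raw = pearson(X, Y)                    # below 1: the relation is curved
rho = spearman(X, Y)                     # 1: the relation is monotonic
r_log = pearson(X, [log(y) for y in Y])  # about 1: log(Y) is linear in X

# The symmetric Y = X^2 data: not monotonic, so the rank correlation is 0 too.
X_sym = [-3, -2, -1, 1, 2, 3]
rho_sq = spearman(X_sym, [x ** 2 for x in X_sym])

print(r_raw, rho, r_log, rho_sq)
```

On the exponential data, the raw Pearson coefficient falls short of 1 while Spearman's coefficient reaches 1, and taking the log of Y makes the relationship linear again. For the symmetric Y = X² data, the rank-based coefficient is 0 as well, so in that case a plot of the data is the more reliable way to spot the dependence.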