Correlation is a statistical measure that indicates the extent to which two or more variables fluctuate in relation to each other. A positive correlation indicates the extent to which those variables increase or decrease in parallel; a negative correlation indicates the extent to which one variable increases as the other decreases.
A correlation coefficient is a statistical measure of the degree to which changes in the value of one variable predict changes in the value of another. When the fluctuation of one variable reliably predicts a similar fluctuation in another variable, there’s often a tendency to think that the change in one causes the change in the other. However, correlation does not imply causation. There may be, for example, an unknown factor that influences both variables similarly. Distinguishing between correlation and causation can yield valuable insights when analyzing consumer data patterns. The beer and diapers example is frequently used to highlight this in the context of marketing.
Here’s one example: A number of studies report a positive correlation between the amount of television children watch and the likelihood that they will become bullies. Media coverage often cites such studies to suggest that watching a lot of television causes children to become bullies. However, the studies report only a correlation, not causation. Some other factor, such as a lack of parental supervision, may be the influential factor.
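The effect of a hidden factor can be illustrated with a small simulation. The sketch below is hypothetical: the data is randomly generated, not taken from any study, and the names hidden, tv_hours and bullying_score are made up for illustration. It assumes the Pearson coefficient as the measure of correlation. Two variables are built so that each depends on the same hidden factor but not on the other, and they still come out correlated.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical hidden factor (for instance, a lack of supervision).
hidden = [rng.gauss(0, 1) for _ in range(n)]

# Each observed variable is the hidden factor plus its own independent noise.
# Neither variable appears in the other's formula, so neither causes the other.
tv_hours = [h + rng.gauss(0, 1) for h in hidden]
bullying_score = [h + rng.gauss(0, 1) for h in hidden]

r_raw = pearson_r(tv_hours, bullying_score)

# Subtract the hidden factor's contribution and correlate what remains.
# This works here only because we generated the data and know the factor;
# with real data the factor would have to be measured and modeled.
tv_resid = [t - h for t, h in zip(tv_hours, hidden)]
bully_resid = [b - h for b, h in zip(bullying_score, hidden)]
r_controlled = pearson_r(tv_resid, bully_resid)

print(f"raw correlation:         {r_raw:.2f}")
print(f"hidden factor removed:   {r_controlled:.2f}")
```

With this setup the raw correlation should come out at roughly 0.5, while the correlation left after removing the hidden factor should be close to zero. The apparent relationship between the two variables is produced entirely by the factor they share.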
See a brief demonstration of finding the correlation coefficient for two variables:
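As a minimal sketch of such a calculation, the Python code below computes a correlation coefficient from scratch. It assumes the Pearson coefficient is the one meant, since the text does not name a specific coefficient. Pearson's r divides the sum of the products of each variable's deviations from its mean by the square root of the product of the summed squared deviations, and it ranges from -1 to +1. The function name pearson_r and the sample values are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Deviation of every observation from its variable's mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Numerator: how the two variables move together.
    co_deviation = sum(a * b for a, b in zip(dx, dy))

    # Denominator: how much each variable moves on its own.
    sq_dev_x = sum(a * a for a in dx)
    sq_dev_y = sum(b * b for b in dy)
    if sq_dev_x == 0 or sq_dev_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return co_deviation / math.sqrt(sq_dev_x * sq_dev_y)

# Made-up sample data: five paired observations of two variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))   # 0.7746 (positive correlation)

# A variable that falls steadily as x rises gives a perfect negative correlation.
y_falling = [10, 8, 6, 4, 2]
print(pearson_r(x, y_falling))     # -1.0
```

A value near +1 means the variables rise and fall together, a value near -1 means one rises as the other falls, and a value near 0 means there is little linear relationship. On Python 3.10 or later, the standard library function statistics.correlation should give the same result.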