How to Find Pairwise Dependencies in Your Data with Map/Reduce
Hans-Henning Gabriel
Does your customers' browser choice relate to the amount of money they spend in your online store? Are people who come to your site through Pinterest more likely to download your trial than those who come from Facebook? In this class, you will learn how to answer such questions by computing a pairwise dependency value for every pair of columns in your data, applying Map/Reduce to a data stream. The measure is based on mutual information and is derived from two-dimensional histograms, so it works whether the columns are numerical or categorical. The final result is a heat map matrix that compares all columns with each other and visualizes their pairwise dependency values.
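To give a feel for the idea, here is a minimal sketch in plain Python, not the class's actual implementation. It assumes every column is already categorical or pre-binned (numerical columns would need a binning step first), and the function names `map_row`, `reduce_counts`, `mutual_information`, and `pairwise_mi`, as well as the sample rows, are hypothetical. The map step emits one histogram cell per column pair for each row, the reduce step sums those cells, and mutual information is then read off each two-dimensional histogram.

```python
import math
from collections import Counter
from functools import reduce
from itertools import combinations


def map_row(row):
    """Map step: emit a count of 1 for the histogram cell of every column pair."""
    counts = Counter()
    for i, j in combinations(range(len(row)), 2):
        counts[(i, j, row[i], row[j])] += 1
    return counts


def reduce_counts(total, partial):
    """Reduce step: merge a partial histogram into the running total."""
    total.update(partial)  # Counter.update adds counts instead of replacing them
    return total


def mutual_information(cells):
    """Mutual information (in bits) of one 2D histogram {(x, y): count}."""
    n = sum(cells.values())
    px, py = Counter(), Counter()
    for (x, y), c in cells.items():
        px[x] += c  # marginal counts of the first column
        py[y] += c  # marginal counts of the second column
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in cells.items()
    )


def pairwise_mi(rows):
    """Return {(i, j): mutual information} for all column pairs of a row stream."""
    total = reduce(reduce_counts, map(map_row, rows), Counter())
    histograms = {}
    for (i, j, x, y), c in total.items():
        histograms.setdefault((i, j), {})[(x, y)] = c
    return {pair: mutual_information(cells) for pair, cells in histograms.items()}


# Hypothetical sample: (browser, spend level, referrer)
rows = [
    ("chrome", "high", "pinterest"),
    ("chrome", "high", "facebook"),
    ("firefox", "low", "pinterest"),
    ("firefox", "low", "facebook"),
]
for pair, mi in sorted(pairwise_mi(rows).items()):
    print(pair, round(mi, 3))
```

In this toy sample, browser and spend level determine each other completely, so their pair scores one bit, while the referrer is independent of both and scores zero. The resulting dictionary is the kind of matrix a heat map would then visualize.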

Level: Intermediate