Crash Course in Machine Learning
Machine learning is a broad topic with many categories of problems and approaches, as well as lots of special-purpose tricks of the trade. We can’t cover all that, of course, but we can give you a taste of the most common problems addressed by machine learning and some of the techniques used to address them.
We’ll begin with supervised learning, where a system is trained on data that has already been labeled and is then used to label new data. Classification, which assigns one of a fixed set of labels to each data item, is the most common case. Our running example is classifying SPAM vs. non-SPAM e-mails, and we’ll experiment with a SPAM classifier. We’ll finish this section with a brief discussion of regression, where a continuous value is predicted for each data item, such as predicting housing prices from historical data.
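To make the train-then-classify idea concrete, here is a minimal sketch of a SPAM classifier. It assumes a naive Bayes model with add-one smoothing, which is one common choice and not necessarily the technique used in the tutorial. The names `train`, `classify`, and `TRAINING`, and the four training messages, are made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label from (text, label) pairs that are already classified."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()             # label -> number of messages
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the log prior: how common this label was in training.
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            score += math.log((word_counts[label][word] + 1)
                              / (words_in_label + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical, hand-labeled training data.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]

model = train(TRAINING)
print(classify(model, "free money"))       # a new, unlabeled message
print(classify(model, "monday meeting"))
```

A real classifier would be trained on thousands of messages and would need better tokenization, but the two phases shown here (train on labeled data, then label new data) are the same.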
How does Netflix determine what movies to recommend to you? How does Amazon know what products you might want? We’ll examine recommendation engines that compare either user preferences or item features to make recommendations to you. We’ll experiment with a movie-recommendation engine.
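As a rough illustration of recommending by comparing user preferences, the sketch below scores unseen movies for one user by weighting other users' ratings by how similar their tastes are. It assumes cosine similarity as the comparison measure, which is one option among several. The `RATINGS` data and the function names are hypothetical.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> movie -> rating.
RATINGS = {
    "alice": {"Alien": 5, "Blade Runner": 4, "Amelie": 1},
    "bob":   {"Alien": 4, "Blade Runner": 5, "The Matrix": 5},
    "carol": {"Amelie": 5, "Notting Hill": 4, "Alien": 1},
}

def cosine(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Rank movies the user has not rated, weighted by similar users' ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + similarity * rating
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice", RATINGS))
```

Here alice's ratings resemble bob's far more than carol's, so bob's favorite that alice has not seen ranks first. Comparing item features instead of user preferences follows the same pattern, with movies rather than users as the things being compared.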
Clustering is an example of unsupervised learning, where we don’t train the system in advance on labeled data. Instead, the system finds structure in the data "on its own." We’ll look at two examples: k-means clustering, where you find k clusters and their centers in a data set; and nearest neighbors, where you efficiently find the points closest to a given data point. We’ll experiment with k-means.
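The k-means idea can be sketched in a few lines: assign each point to its nearest center, move each center to the mean of its points, and repeat until nothing changes. This is a minimal sketch that assumes 2-D points and uses the first k points as initial centers to keep the run deterministic; practical implementations usually pick initial centers randomly or with a smarter scheme. The `POINTS` data and function names are made up for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iterations=100):
    """Return (centers, clusters) after iterating assignment and update steps."""
    # Assumption for this sketch: start from the first k points.
    centers = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for i, cluster in enumerate(clusters):
            if cluster:
                new_centers.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centers.append(centers[i])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: assignments can no longer change
        centers = new_centers
    return centers, clusters

# Hypothetical data: two visibly separate groups of points.
POINTS = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.0), (1.0, 0.5), (8.0, 9.5)]

centers, clusters = kmeans(POINTS, k=2)
print(centers)
print(clusters)
```

No labels are supplied anywhere; the two groups emerge from the distances alone, which is what distinguishes this from the supervised case.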
We’ll finish with a brief discussion of topics you might pursue on your own: the importance of data preparation, probability and statistics, and more advanced machine-learning concepts such as probabilistic graphical models and neural networks.
Side notes: This tutorial is aimed at intermediate to advanced developers and data analysts. Some programming ability is assumed, such as using a text editor, writing simple scripts in some language, and using simple Linux "shell" commands.
Level: Advanced