What is machine learning?
One definition: “Machine learning is the semi-automated extraction of knowledge from data”
- Knowledge from data: Starts with a question that might be answerable using data
- Automated extraction: A computer provides the insight
- Semi-automated: Requires many smart decisions by a human

What are the two main categories of machine learning?
Supervised learning: Making predictions using data
- Example: Is a given email “spam” or “ham”?
- There is an outcome we are trying to predict

Unsupervised learning: Extracting structure from data
- Example: Segment grocery store shoppers into clusters that exhibit similar behaviors
- There is no “right answer”: the data has no labeled outcome to check the result against
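
A minimal sketch of the shopper-segmentation idea, assuming each shopper is described by a single made-up number (average spend per visit). The data, the function name `kmeans_1d`, and the choice of k-means with two clusters are all illustrative assumptions, not something the notes prescribe. Note that no labels are supplied: the algorithm only groups similar values together.

```python
# Hypothetical data: average spend per visit (in dollars) for ten shoppers.
spend = [12.0, 15.0, 14.0, 11.0, 80.0, 85.0, 78.0, 13.0, 82.0, 79.0]

def kmeans_1d(values, centers, n_iter=10):
    """Plain k-means on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Assumption: start one center at the smallest value and one at the largest.
centers, clusters = kmeans_1d(spend, centers=[min(spend), max(spend)])
print(centers)   # prints [13.0, 80.8]
print(clusters)  # low spenders in the first list, high spenders in the second
```

The two groups that come out are a description of structure in the data, not a prediction that can be marked right or wrong; a different number of clusters would give a different, equally valid segmentation.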

How does machine learning “work”?
High-level steps of supervised learning:
- First, train a machine learning model using labeled data
- Then, make predictions on new data for which the label is unknown
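
The two steps above can be sketched with the spam/ham example. This is a toy illustration under stated assumptions: the four training emails are made up, and the "model" is just per-label word counts (a deliberately simple stand-in, not a method the notes specify). The function names `train` and `predict` are hypothetical.

```python
from collections import Counter

# Hypothetical labeled training data: (email text, label).
training_data = [
    ("win cash prize now", "spam"),
    ("cheap prize click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with team", "ham"),
]

def train(examples):
    """Step 1: learn per-label word counts from labeled examples."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts

def predict(model, text):
    """Step 2: pick the label whose training vocabulary overlaps the text most.

    Ties go to whichever label was seen first during training.
    """
    scores = {label: sum(word_counts[w] for w in text.split())
              for label, word_counts in model.items()}
    return max(scores, key=scores.get)

model = train(training_data)

# New emails whose labels are unknown to the model.
print(predict(model, "claim your cash prize"))      # prints "spam"
print(predict(model, "agenda for the team lunch"))  # prints "ham"
```

The labels are only used inside `train`; `predict` sees nothing but the text of the new email.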

The primary goal of supervised learning is to build a model that “generalizes”: It accurately predicts the future (new, unseen data) rather than the past (the data it was trained on)!
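
One way to see why generalization is the goal: compare two models on data they were trained on and on data they have never seen. The sketch below assumes a made-up one-feature dataset and two hypothetical models, a "memorizer" that only looks up training examples and a simple threshold rule. Both names and the data are illustrative assumptions.

```python
# Hypothetical data: (feature value, label). Labels are 1 for large values.
train_set = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
test_set = [(0.5, 0), (2.5, 0), (3.5, 0), (6.5, 1), (7.5, 1), (9.5, 1)]

def accuracy(predict, examples):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in examples) / len(examples)

def fit_memorizer(examples, default=0):
    """Memorizes the training pairs; guesses `default` for anything unseen."""
    table = dict(examples)
    return lambda x: table.get(x, default)

def fit_threshold(examples):
    """Learns a cut-off halfway between the mean of each class."""
    zeros = [x for x, y in examples if y == 0]
    ones = [x for x, y in examples if y == 1]
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x > cut)

memorizer = fit_memorizer(train_set)
threshold = fit_threshold(train_set)

print(accuracy(memorizer, train_set), accuracy(memorizer, test_set))  # prints 1.0 0.5
print(accuracy(threshold, train_set), accuracy(threshold, test_set))  # prints 1.0 1.0
```

Both models are perfect on the training data, so training accuracy alone cannot tell them apart. Only the held-out data shows that the memorizer has learned nothing it can apply to new examples.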