The Decision Tree is one of the most popular classification algorithms in current use in Data Mining and Machine Learning. This tutorial can be used as a self-contained introduction to the flavor and terminology of data mining without needing to review many statistical or probabilistic pre-requisites. If you're new to data mining you'll enjoy it, but your eyebrows will raise at how simple it all is! After having defined the job of classification, we explain how information gain (next Andrew Tutorial) can be used to find predictive input attributes. We show how applying this procedure recursively allows us to build a decision tree to predict future events. We then look carefully at a question so fundamental, it is the basis for much of all statistics and machine learning theory: how do you choose between a complicated model that fits the data really well and an “Occam's razor” model that is succinct yet not so good at fitting data (this topic will be revisited in later Andrew Lectures, including “Cross-validation” and “VC-dimension”). We also discuss the very wide world of improvements and tweaks on the basic decision tree idea.
Powerpoint Format: The Powerpoint originals of these slides are freely available to anyone who wishes to use them for their own work, or who wishes to teach using them in an academic institution. Please email Andrew Moore at firstname.lastname@example.org if you would like him to send them to you. The only restriction is that they are not freely available for use as teaching materials in classes or tutorials outside degree-granting academic institutions.
Advertisment: I have recently joined Google, and am starting up the new Google Pittsburgh office on CMU's campus. We are hiring creative computer scientists who love programming, and Machine Learning is one the focus areas of the office. If you might be interested, feel welcome to send me email: email@example.com .