16 Overview
Tree-based models are a class of nonparametric algorithms that work by partitioning the feature space into a number of smaller (non-overlapping) regions with similar response values using a set of splitting rules. Such divide-and-conquer methods can produce simple rules that are easy to interpret and visualize with tree diagrams. As we'll see, decision trees offer many benefits; however, they typically lack the predictive performance of more complex algorithms like neural networks and MARS. Fortunately, more advanced decision tree ensemble algorithms, such as bagging and random forests, combine many decision trees and can perform quite well.
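To make the partitioning idea concrete, here is a minimal sketch of recursive binary splitting for a regression tree on a toy one-dimensional dataset, written in plain Python. This is illustrative only: real implementations handle many features, categorical splits, pruning, and more sophisticated stopping rules, and all function names here (`sse`, `best_split`, `grow_tree`, `predict`) are made up for this example.

```python
def sse(ys):
    """Sum of squared errors around the mean response of a region."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def best_split(xs, ys):
    """Find the threshold on x that minimizes the total SSE of the two regions."""
    best = None
    for t in sorted(set(xs))[1:]:               # candidate thresholds
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (t, score)
    return best

def grow_tree(xs, ys, depth, max_depth):
    """Recursively partition until max_depth; leaves predict the region mean."""
    if depth == max_depth or len(set(xs)) < 2:
        return sum(ys) / len(ys)                # leaf node: mean response
    t, _ = best_split(xs, ys)
    left = [(x, y) for x, y in zip(xs, ys) if x < t]
    right = [(x, y) for x, y in zip(xs, ys) if x >= t]
    return (t,
            grow_tree(*zip(*left), depth + 1, max_depth),
            grow_tree(*zip(*right), depth + 1, max_depth))

def predict(node, x):
    """Route x down the tree to a leaf and return that leaf's prediction."""
    if not isinstance(node, tuple):
        return node
    t, left, right = node
    return predict(left if x < t else right, x)

# Toy data: a step function with a jump at x = 5
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 9.0, 9.0]
tree = grow_tree(xs, ys, depth=0, max_depth=1)
print(predict(tree, 2))   # → 1.0 (mean of the left region)
print(predict(tree, 7))   # → 9.0 (mean of the right region)
```

Note how `max_depth` controls the trade-off the learning objectives mention: a depth-1 tree (a "stump") captures only one split, while deeper trees carve the feature space into ever-finer regions and can overfit.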
16.1 Learning objectives
By the end of this module you should be able to:
- Explain how decision tree models partition data and how the depth of a tree impacts performance.
- Fit, tune, and assess decision tree models.
- Explain and apply decision tree ensemble algorithms such as bagging and random forests.
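The bagging idea from the objectives above can also be sketched in a few lines of plain Python: fit a depth-1 tree ("stump") to each bootstrap resample of the data and average the stumps' predictions. This is a self-contained toy illustration, not the course's implementation; a random forest would additionally sample a random subset of features at each split, which a one-dimensional example cannot show.

```python
import random

def fit_stump(xs, ys):
    """One split minimizing SSE; each side predicts its mean response."""
    best = None
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        err = (sum((y - sum(left) / len(left)) ** 2 for y in left)
               + sum((y - sum(right) / len(right)) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, t, sum(left) / len(left), sum(right) / len(right))
    _, t, lmean, rmean = best
    return lambda x: lmean if x < t else rmean

def bagged_predict(stumps, x):
    """Ensemble prediction: average over all bootstrapped stumps."""
    return sum(s(x) for s in stumps) / len(stumps)

random.seed(42)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 9.0, 9.0]

stumps = []
for _ in range(25):                          # 25 bootstrap resamples
    idx = [random.randrange(len(xs)) for _ in range(len(xs))]
    bx, by = [xs[i] for i in idx], [ys[i] for i in idx]
    if len(set(bx)) > 1:                     # need two distinct x values to split
        stumps.append(fit_stump(bx, by))

print(bagged_predict(stumps, 2), bagged_predict(stumps, 7))
```

Averaging over many trees grown on resampled data smooths out the high variance of any single deep tree, which is the core reason bagging and random forests tend to outperform an individual decision tree.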
16.2 Estimated time requirement
The estimated time to go through the module lessons is about:
- Reading only: 3-4 hours
- Reading + videos: 4 hours
16.3 Tasks
- Work through the 3 module lessons.
- Upon finishing each lesson, take the associated lesson quiz on Canvas. Be sure to complete each lesson quiz no later than the due date listed on Canvas.
- Check Canvas for this week's lab, the lab quiz due date, and any additional content (e.g., in-class material).