Tutorial: Predictive Modeling with R and the caret Package
Kuhn, Pfizer Global R&D, USA.
This course will provide an overview of using R for supervised
learning (aka machine learning, pattern recognition, predictive
analytics, etc.). The session will step through the process of
building, visualizing, testing and comparing models that are
focused on prediction. The goal of the course is to provide a
thorough workflow in R that can be used with many different
modeling techniques. A case study is used for illustration.
Topics will include:
Introduction (philosophy, case study)
General Strategies (data splitting, resampling, model
Data Pre-Processing (transformations, variable filtering)
Conventions in R (OOP, function interfaces, consistency)
Building and Tuning Models (performance metrics, trees,
kernel methods, model comparisons)
Other Topics (as time allows) (parallel processing,
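
As a rough illustration of how the topics above fit together, the following is a minimal sketch of a caret workflow: data splitting, resampling, pre-processing, tuning, and testing. It is not taken from the course materials. The built-in iris data, the rpart tree model, the 75/25 split, and 5-fold cross-validation are all assumptions chosen for brevity, and the caret and rpart packages are assumed to be installed.

```r
library(caret)  # assumption: caret and rpart are installed

set.seed(1)     # make the split and the resampling reproducible

# Data splitting: stratified 75/25 split on the outcome
in_train <- createDataPartition(iris$Species, p = 0.75, list = FALSE)
training <- iris[in_train, ]
testing  <- iris[-in_train, ]

# Resampling: 5-fold cross-validation on the training set
ctrl <- trainControl(method = "cv", number = 5)

# Pre-processing and tuning: center and scale the predictors,
# then evaluate 5 candidate values of the tree's complexity parameter
fit <- train(Species ~ ., data = training,
             method = "rpart",
             preProcess = c("center", "scale"),
             tuneLength = 5,
             trControl = ctrl)

# Testing: predict on the held-out rows and summarize performance
pred <- predict(fit, newdata = testing)
confusionMatrix(pred, testing$Species)
```

Changing the method argument (for example to a kernel method) while keeping the rest of the code fixed is what makes a single workflow reusable across modeling techniques.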
The length of the tutorial is not conducive to hands-on
exercises, so laptops are not required. However, the
illustrative data sets and code will be available online if
participants would like to follow along.
A basic understanding of R (matrices, data frames, functions,
etc.) is needed. Some familiarity with regression techniques
is helpful.