Title: Reshaping data in R
Author: Hadley Wickham

Restructuring data is a common task in practical data analysis, and it is often tedious and unintuitive. For example, to prepare data for a lattice plot we may need a cumbersome command of the form:

    d.music.df <- data.frame(
      artist = factor(rep(d.music[,1], 5)),
      y = as.vector(as.matrix(d.music[,3:7])),
      meas = factor(rep(1:5, rep(62,5)), labels = names(d.music[,-c(1,2)]))
    )

This command takes a data matrix with five variables (columns) and reshapes it as a single column of numbers (y) and two columns of labels (meas, artist). The code is hard to read, difficult to explain to students, and cumbersome to write.

This type of operation, where we reshape the data matrix, is extremely important and common to many analyses. Data often have multiple levels of grouping (nested treatments, split plot designs, or repeated measurements) and often require investigation at multiple levels. For example, in a long-term clinical study we may be interested in investigating relationships over time, between times, between patients, or between treatments. Performing these investigations fluently requires the data to be reshaped in different ways.

R currently supplies a reshape function that can perform some of these tasks, but it conflates multiple steps in the process and is hard to use. We propose a new conceptual framework for reshaping operations and an R package to "deshape" data frames and then flexibly "reshape" them to meet your needs. This framework also produces contingency tables, cross-tabulations, and summary statistics, and provides a natural link to graphical methods such as parallel coordinate plots, scatterplot matrices, mosaic plots and trellis graphics.
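
The two-step idea described above, first "deshape" to one value column with labels and then "reshape" into whatever form an analysis needs, can be sketched in base R. This is only an illustration under stated assumptions: the helper name stack_measures, its arguments, and the toy data are hypothetical and are not the interface of the proposed package.

```r
# Hypothetical helper (NOT the proposed package's API): stack the measured
# columns of a wide data frame into one value column (y) and a label
# column (meas), repeating the identifying columns for each measure.
stack_measures <- function(df, id_cols,
                           measure_cols = setdiff(names(df), id_cols)) {
  n <- nrow(df)
  # Repeat the identifier rows once per measured column.
  long <- df[rep(seq_len(n), times = length(measure_cols)), id_cols,
             drop = FALSE]
  # Label each block of n rows with the column it came from.
  long$meas <- factor(rep(measure_cols, each = n), levels = measure_cols)
  # Concatenate the measured columns in the same order as the labels.
  long$y <- unlist(lapply(measure_cols, function(m) df[[m]]),
                   use.names = FALSE)
  rownames(long) <- NULL
  long
}

# Toy data invented for illustration; it only mimics the layout of d.music
# (identifier columns first, measured columns after).
toy <- data.frame(
  artist = c("A", "B", "C"),
  type   = c("rock", "rock", "classical"),
  lvar   = c(1.5, 2.5, 3.5),
  lave   = c(10, 20, 30)
)

# "Deshape": one row per (artist, measure) pair.
toy_long <- stack_measures(toy, id_cols = c("artist", "type"))

# "Reshape": a summary statistic per measure, and a cross-tabulation of
# the values by artist and measure.
meas_means <- tapply(toy_long$y, toy_long$meas, mean)
by_artist  <- xtabs(y ~ artist + meas, data = toy_long)
```

Once the data are in the stacked form, each later view (a summary, a cross-tabulation, or input to a trellis plot) is a short call on the same long data frame, so the column indices and repeat counts of the original command do not have to be worked out by hand each time.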