useR

Tutorial: How to work with large R projects


Alex Zolotovitski, Medio Systems Inc.

Overview

The author shares his experience and techniques for working simultaneously on 3-5 large R projects, each consisting of a number of R source files with a few thousand lines of code.

Goals

To help attendees work effectively with large R projects.

Outline

The tutorial will cover the following topics:

  • Structure of a project directory. Package ProjectTemplate.
  • Reporting and Literate Programming. Packages knitr, highlight, brew, R2HTML.
  • IDEs: Eclipse+StatET, RStudio, and others.
  • R Work Journal: Code2HTML(). Features:
    1. Transforms an .R file into a self-documenting .html file containing all the R code together with output pictures, headers, and a table of contents.
    2. The titles in the body and in the table of contents are clickable, to navigate from contents to body and back.
    3. The pictures are clickable to resize.
    4. The html file has partial R syntax highlighting. Full R syntax highlighting in the resulting html is possible, but the resulting file becomes almost twice as large.
    5. Parts of the resulting html file can be folded.
    6. If you fold the TOC in the browser, select all, and copy and paste from the browser into a text editor, you should get the original R file back.
    7. After the .R code is modified, recreating the .html file is fast.
    8. It is not a replacement for knitr or Sweave, because the output is not a document to print but rather an R work journal.
    9. Other helper functions for working with a number of large projects.
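
To make the work-journal idea concrete, here is a minimal base R sketch of the first two features only: writing an .R file out as .html with a table of contents whose entries link to the section headers and back. It is not the author's Code2HTML(). The function names journal_sketch() and escape_html(), and the "#== Title ==" header convention, are assumptions made for illustration. The sketch does not run the code or capture output pictures.

```r
# Hypothetical sketch; NOT the author's Code2HTML().
# Assumption: section headers are comment lines of the form "#== Title ==".

# Escape the characters that have special meaning in HTML.
escape_html <- function(x) {
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  gsub(">", "&gt;", x, fixed = TRUE)
}

journal_sketch <- function(r_file,
                           html_file = sub("\\.[Rr]$", ".html", r_file)) {
  code <- readLines(r_file, warn = FALSE)

  # Find header lines and extract their titles.
  is_header <- grepl("^#==\\s*.+\\s*==\\s*$", code)
  titles <- trimws(gsub("^#==|==\\s*$", "", code[is_header]))
  ids <- seq_along(titles)

  # Table of contents: each entry links to its header in the body.
  toc <- sprintf('<li><a id="toc%d" href="#sec%d">%s</a></li>',
                 ids, ids, escape_html(titles))

  # Body: the original code, escaped; each header links back to its TOC entry.
  body <- escape_html(code)
  body[is_header] <- sprintf('<a id="sec%d" href="#toc%d"><b>%s</b></a>',
                             ids, ids, body[is_header])

  html <- c("<html><body>", "<ul>", toc, "</ul>",
            "<pre>", body, "</pre>", "</body></html>")
  writeLines(html, html_file)
  invisible(html_file)
}

# Usage: write a tiny example script and convert it.
src <- tempfile(fileext = ".R")
writeLines(c("#== Load data ==",
             "x <- c(1, 2, 3)",
             "#== Summarise ==",
             "if (mean(x) < 5) print('small')"), src)
out <- journal_sketch(src)
```

Because the body is the unmodified source wrapped in a pre element, copying the rendered body back out of a browser should give the original code, which is the property described in item 6.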

    Prerequisites

    Basic R programming and computing knowledge.

    Intended Audience

    Data mining professionals, statisticians, and anyone interested in learning how to work with large R projects.

    Workshop Materials

    The code can be downloaded from http://github.com/alexzolot/UseR-2013/tree/master/work/UseR-2013

    During the tutorial I will use R 3.0.1 with Eclipse (Kepler) and the StatET plugin. Although all the code and functions mentioned above can be used in any other IDE (e.g. RStudio), it would be most convenient and effective for attendees to use the same environment, which in my experience is the most productive for large R projects.
    I have created a package for the 64-bit Windows environment that can be downloaded from https://dl.dropboxusercontent.com/u/37458038/UseR-2013-Tutorial.zip. The package contains my code mentioned above, as well as Eclipse with the necessary plugins and R. The package does not require installation: just download it, unzip it into any folder, and click the .bat file in the package root.

    I would greatly appreciate it if attendees filled out a short survey (www.surveymonkey.com/s/DB9PM7D) to help me better adjust the content and format of the tutorial to their needs.

    Related Links

    Tutorial: Work with R on Amazon's Cloud, UseR-2010, July 2010, http://user2010.org//tutorials/Zolot.html

    Slides of the previous tutorial given by the author: http://user2010.org/tutorials/Zolot_tut.pdf