Based on ideas collected and discussed on the R Wiki, the projects and students listed below were selected for participation and are being sponsored by Google during summer 2011.
Mentor: Daniel Kaplan assisted by J J Allaire
Short description: The manipulate package in RStudio can be used to demonstrate mathematical ideas intuitively through interacting with sliders and watching a corresponding graph shift. I will create a variety of applets written in R to show basic calculus and statistical concepts for Professor Daniel Kaplan and J.J. Allaire’s contribution to Project MOSAIC.
Mentor: Kathryn Roeder assisted by Han Liu
Short description: Modern data acquisition routinely produces massive amounts of complex data. Despite the high dimensionality and complexity, many problems have hidden structure that makes efficient statistical inference possible. One important hidden structure is the sparse conditional independence graph (or undirected graphical model). Our HUGE project aims at providing a fast and scalable toolkit for nonparametric graphical models in ultrahigh-dimensional data analysis.
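To make the idea of a conditional independence graph concrete, here is a minimal sketch in plain Python. It assumes the simplest Gaussian (parametric) setting, not the nonparametric methods the project targets, and every name in it (`invert`, `rho`, `covariance`, `precision`, `edges`) is illustrative rather than part of any package. For jointly Gaussian variables, a zero entry in the inverse covariance (precision) matrix means the two variables are conditionally independent given the rest, so the graph's edges can be read off the nonzero off-diagonal entries.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]


# Hypothetical chain X0 - X1 - X2: correlation decays as rho^|i-j|.
rho = 0.6
covariance = [[rho ** abs(i - j) for j in range(3)] for i in range(3)]
precision = invert(covariance)

# An edge is present wherever the off-diagonal precision entry is nonzero.
edges = [
    (i, j)
    for i in range(3)
    for j in range(i + 1, 3)
    if abs(precision[i][j]) > 1e-9
]
```

In this toy chain X0 and X2 are marginally correlated (covariance 0.36), yet the corresponding precision entry vanishes, so the graph has no edge between them. In high dimensions the precision matrix cannot simply be inverted from data, which is why sparse estimation methods are needed.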
Mentor: John Nash assisted by Ben Bolker
Short description: This project aims to build a GUI-based R package to assist in the preparation and solution of optimization problems. It is expected to improve the usability of optimization tools in R by giving users meaningful suggestions on the choice of optimizer and parameters in a visual and interactive way. The program will also provide a mechanism to auto-generate code that can be run in R to solve a specific optimization problem.
Mentor: Felix Schönbrodt assisted by Stefan Schmukle
Short description: The Actor-Partner Interdependence Model (APIM; Kashy & Kenny, 1999; Kenny, Kashy, & Cook, 2006) is a model of dyadic relationships that integrates a conceptual view of interdependence in two-person relationships with the appropriate statistical techniques for measuring and testing it. There is only one R package (“dyad”) that helps researchers conduct dyadic analysis, and it has limitations, such as being unable to handle complex interaction effects. To overcome its functional deficiency,
Mentor: Ian Fellows
Short description: To bring full integration of ImageJ to R and to expand RImageJ into a fully functional R image analysis engine.
Mentor: C. Beleites assisted by Colin Gillespie
Short description: Currently hyperSpec provides only a limited graphical interface via the `locator()` function for basic graphics. This proposal will develop a Graphical User Interface for the hyperSpec package. This GUI will be made up of smaller widgets that can be chained, synchronised, and included in batch scripts.
Mentor: Antony Unwin
Short description: The project goal is to implement an interface in R which provides category order optimization for different types of input (such as tables, data frames or matrices) and 2- as well as k-dimensional categorical data.
Mentor: Virgilio Gómez-Rubio assisted by Barry Rowlingson
Short description: Analysis of disease data is important in order to detect disease outbreaks and risk factors. Some of the methods for cluster detection have been implemented in the DCluster package. However, a model-based approach would be of interest in order to relate disease incidence to potential risk factors. Model-based clustering will be implemented using Generalized Linear Models. Many possible clusters will be proposed, and the most likely cluster will be selected using model selection techniques.
Mentor: George Ostrouchov
Short description: This project, taken from the ideas list, aims to use multi-threaded programming to introduce parallelism on multicore/shared-memory architectures. OpenMP is a well-known specification for parallel programming; it expresses parallelism neatly, without the hassle of message passing or load balancing, and it supports hybrid programming with MPI as well. The expected results include a usable R-OpenMP package that will reside on CRAN servers with good performance, compatibility and user experience.
Mentor: Niels Richard Hansen
Short description: To contribute functions that help explore, visualize and analyze data from multivariate stochastic dynamic systems.
Mentor: Han Liu
Short description: The project aims at providing the “fastest and most scalable” implementations of three modern nonparametric predictive methods (SpAM, MT-SpAM and G-SpAM). This package has the potential to become a general-purpose exploratory data analysis toolbox for a wide range of data analysis practitioners. The targeted applications include large-scale scientific data analysis (e.g. genomics/proteomics/bio-imaging), social media data analysis (e.g. image/audio/video/text modeling) and financial time-series
Jennifer Feder Bobb
Mentor: Ravi Varadhan
Short description: The Expectation-Maximization (EM) algorithm is a useful and popular optimization approach that arises in a wide range of scientific applications. Adaptations of the original EM approach have been proposed that provide faster convergence rates without compromising its global convergence property. We propose to develop an R package which will provide a unified implementation of the diverse set of acceleration schemes for the EM algorithm in an open source, user-friendly environment.
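For readers unfamiliar with EM, the following is a minimal sketch of the plain (unaccelerated) algorithm for a two-component Gaussian mixture, written in plain Python. It is only an illustration of the fixed-point iteration that acceleration schemes would wrap, not the proposed package or any of its schemes; the function names (`em_step`, `run_em`), the simulated data, and the starting values are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(data, weight, means, sds):
    """Observed-data log-likelihood of the two-component mixture."""
    return sum(
        math.log(
            weight * normal_pdf(x, means[0], sds[0])
            + (1.0 - weight) * normal_pdf(x, means[1], sds[1])
        )
        for x in data
    )


def em_step(data, weight, means, sds):
    """One EM update: parameters in, improved parameters out."""
    # E-step: posterior probability that each point belongs to component 0.
    resp = []
    for x in data:
        a = weight * normal_pdf(x, means[0], sds[0])
        b = (1.0 - weight) * normal_pdf(x, means[1], sds[1])
        resp.append(a / (a + b))
    # M-step: weighted maximum-likelihood estimates.
    n0 = sum(resp)
    n1 = len(data) - n0
    m0 = sum(r * x for r, x in zip(resp, data)) / n0
    m1 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n1
    s0 = math.sqrt(sum(r * (x - m0) ** 2 for r, x in zip(resp, data)) / n0)
    s1 = math.sqrt(sum((1.0 - r) * (x - m1) ** 2 for r, x in zip(resp, data)) / n1)
    return n0 / len(data), (m0, m1), (s0, s1)


def run_em(data, weight, means, sds, tol=1e-8, max_iter=500):
    """Iterate em_step until the log-likelihood gain falls below tol."""
    history = [log_likelihood(data, weight, means, sds)]
    for _ in range(max_iter):
        weight, means, sds = em_step(data, weight, means, sds)
        history.append(log_likelihood(data, weight, means, sds))
        if history[-1] - history[-2] < tol:
            break
    return weight, means, sds, history


# Simulated data (an assumption for this example): two well-separated groups.
random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
data += [random.gauss(5.0, 1.0) for _ in range(200)]

weight, means, sds, history = run_em(data, 0.5, (1.0, 4.0), (1.0, 1.0))
```

Each call to `em_step` is a fixed-point map on the parameter vector, and the log-likelihood recorded in `history` never decreases from one iteration to the next. Acceleration schemes generally work by extrapolating from successive iterates of such a map, which is why a single wrapper can in principle speed up many different EM algorithms.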
Mentor: Roger Peng with assistance of Ravi Varadhan
Short description: This project aims to develop an R package that offers several of the latest acceleration schemes under a single call and can be used to accelerate any EM algorithm. In the proposal, I will show how flexible and convenient the package will be for any R user; a reasonable timeline, informed by prior learning, is also included. In addition, I would like the R Project as the mentoring organization and Professor Ravi Varadhan as my mentor.
Mentor: Brian G. Peterson
Short description: The existing packages include the necessary tools and functions to construct and apply trading strategies. More functions related to trading a portfolio, testing parameters and evaluating strategies can be added. This proposal focuses on some of the targets related to these new developments.
Mentor: Di Cook with assistance of Heike Hofmann
Short description: The project involves developing interactive time series and longitudinal data plots, in association with a new interactive graphics package for R called cranvas, which is based on Qt and can handle large amounts of data. The purpose is to improve R’s capabilities for exploring temporal data. The time series plot will enable exploring slightly irregular seasonality and associations between multiple series. The longitudinal plot will enable the study of individual variation.
Last modified: May 19, 2011 by John C. Nash