## Tutorial: Introductory Probability and Statistics Using R

G. Jay Kerns, Department of Mathematics and Statistics, Youngstown State University, USA.

### Abstract

The purpose of this tutorial is to introduce new users of R to the basic environment and to share resources and tricks I have learned while introducing R to my own students and colleagues.

### Goals

Individuals walking out of this tutorial should be:
• familiar with the tools available for basic data import, export, and manipulation, including reshaping data, subsetting by groups, and formatting.
• competent with basic descriptive statistics, including summaries and data display; I plan to discuss all three graphics engines: base, lattice, and ggplot2.
• proficient in the standard point estimation and hypothesis testing topics from a first course in statistics.
• equipped to perform standard linear regression modeling and diagnostics and, as time permits, to apply resampling methods and permutation tests.
• able to prepare randomized quizzes, exams, answer keys, and/or study materials for students or colleagues.

### Outline

Topics will include:
• Data import, export, and manipulation: reshaping, subsetting, display
• Descriptive statistics: graphical, numerical
• Probability and distributions: base distributions and the distr family of packages
• Point and interval estimation: maximum likelihood, confidence intervals
• Hypothesis testing: parametric, nonparametric
• Simple and multiple linear regression: fitting, prediction, diagnostics
• Resampling: bootstrap percentile confidence intervals, permutation tests
• Document creation: Sweave, odfWeave, HTML export, and more
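
To give a flavor of the resampling topic listed above, here is a minimal sketch of a bootstrap percentile confidence interval for a mean, using only base R. The simulated data, the seed, the number of resamples `B`, and the variable names are illustrative assumptions, not material taken from the tutorial itself.

```r
# Minimal sketch: bootstrap percentile confidence interval for a mean.
# All values below (seed, sample, B) are illustrative assumptions.
set.seed(42)                       # arbitrary seed, for reproducibility only
x <- rnorm(50, mean = 10, sd = 2)  # simulated sample standing in for real data

B <- 2000                          # number of bootstrap resamples (assumed)

# Resample x with replacement B times and record the mean of each resample
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the resampled means
ci <- quantile(boot_means, probs = c(0.025, 0.975))
ci
```

The same pattern extends to other statistics by replacing `mean` with, say, `median`; packages such as boot offer more refined intervals, but the percentile version is the easiest to explain in a first course.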

### Prerequisites

A person attending my tutorial should know how to turn on a computer. It would also help to have had at least one semester of an upper-division undergraduate course in statistics.

In the last third of the tutorial I will discuss creating assessment materials (exams, quizzes, class notes). A passing familiarity with LaTeX would be helpful for that part, but in the grand scheme of things it is not really required: freely available tools (LyX, GNU Emacs Org-mode + Babel) now automate the LaTeX process to a large extent.

### Intended Audience

This tutorial is targeted at A) established professionals from other fields (physical/biological/environmental sciences, business, economics) who are comfortable with statistics but are just starting with the R language, B) computer scientists who may be quite competent with the R language but feel tentative about their basic statistical literacy and would like to learn more, or C) individuals who expect to be teaching people from groups A) or B) in the near future.

### Workshop Materials

Attendees of my workshop should go here before the tutorial to get up to speed. Other materials will be distributed to the tutorial participants on-site. At this time I do not believe people will need to bring anything except themselves, but if this changes I will post instructions here.
Please check here for up-to-date tutorial resources.