Discovering Statistics

A den for Learning

Introduction to R Programming

What is R?

  • The R system for statistical computing is an environment for data analysis and graphics.
  • R has its roots in the S language, developed by John Chambers and colleagues (Becker et al., 1988; Chambers and Hastie, 1992; Chambers, 1998) at Bell Laboratories.
  • The base distribution of R and a large number of user-contributed extensions are available in source code form under the terms of the Free Software Foundation’s GNU General Public License.
  • The R system for statistical computing is available to everyone.
  • All scientists, including, in particular, those working in developing countries, now have access to state-of-the-art tools for statistical data analysis without additional costs.
  • With the help of the R system for statistical computing, research becomes reproducible when both the data and the results of all data analysis steps reported in a paper are available to readers through an R transcript file.
  • R is widely used for teaching undergraduate and graduate statistics classes at universities all over the world because students can use the statistical computing tools freely.
  • The base distribution of R is maintained by a small group of statisticians, the R Development Core Team.
  • A huge amount of additional functionality is implemented in add-on packages authored and maintained by a large group of volunteers. The main source of information about the R system is the World Wide Web, with the official home page of the R project being
  • All resources are available from this page: the R system itself, a collection of add-on packages, manuals, documentation and more.
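
The reproducibility point above can be illustrated with a minimal sketch. It assumes the built-in `cars` data set (shipped with R in the datasets package) and a hypothetical transcript file name; neither comes from the text above, and a real analysis would substitute its own data and model.

```r
# Minimal sketch: record an analysis and its environment in a transcript file.
# The file name and the choice of the built-in `cars` data set are assumptions.
transcript_file <- file.path(tempdir(), "analysis_transcript.txt")

# Fit a simple linear model of stopping distance on speed.
fit <- lm(dist ~ speed, data = cars)

# capture.output() collects the text that would be printed to the console.
transcript <- c(
  "## Model summary",
  capture.output(summary(fit)),
  "## Session information",
  capture.output(sessionInfo())
)

# Write the collected output to the transcript file.
writeLines(transcript, transcript_file)
```

Storing the output of `sessionInfo()` next to the results records the R version and the loaded packages, which helps a reader rerun the same steps later.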

How to install R?

  • The R system for statistical computing consists of two major parts: the base system and a collection of user-contributed add-on packages.
  • The R language is implemented in the base system. Implementations of statistical and graphical procedures are separated from the base system and are organised in the form of packages.
  • Both the base system and packages are distributed via the Comprehensive R Archive Network (CRAN), accessible under
  • The base system is available in source form and in precompiled form for various Unix systems, Windows platforms and Mac OS X. For the data analyst, it is sufficient to download the precompiled binary distribution and install it locally. Windows users can follow the link:
  • The base distribution already comes with some high-priority add-on packages, namely
  • Packages not included in the base distribution can be installed directly from the R prompt using: install.packages("package_name").
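
The installation step above can be sketched as follows. The helper name `ensure_package` is hypothetical (it is not part of R), and the example deliberately uses the "stats" package, which ships with every R installation, so that nothing is downloaded when the code runs.

```r
# Minimal sketch: install a package only if it is not already available.
# `ensure_package` is a hypothetical helper name, not part of R itself.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  # Report whether the package can be loaded now.
  requireNamespace(pkg, quietly = TRUE)
}

# Packages with priority "base" or "recommended" ship with the base distribution.
shipped <- rownames(installed.packages(priority = "high"))

# "stats" is part of every R installation, so nothing is downloaded here.
stats_available <- ensure_package("stats")

# Attach the package so its functions can be called by name.
library(stats)
```

Installing a package makes it available on disk; `library()` must still be called in each session before the package's functions can be used without the `pkg::` prefix.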


Roughly, three different forms of documentation for the R system for statistical computing may be distinguished:

  1. online help that comes with the base distribution or packages,
  2. electronic manuals and
  3. publications in the form of books, etc.
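
The first form, online help, can be tried directly from the R prompt. The following is a minimal sketch; the topic `mean` is only an example chosen for illustration.

```r
# Minimal sketch of the online help system; `mean` is just an example topic.
help("mean")  # show the help page for a function (equivalent to ?mean)

# apropos() returns the names of visible objects matching a pattern.
matches <- apropos("^mean")

# List the vignettes (longer package documentation) installed locally.
vignette_index <- vignette()
```

Printing `matches` or `vignette_index` at the prompt shows what was found; the exact contents depend on which packages are installed.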

More extensive documentation is available electronically from the collection of manuals at

Some of the electronic manuals available are:

  • An Introduction to R: A more formal introduction to data analysis with R than this chapter.
  • R Data Import/Export: A very useful description of how to read and write various external data formats.
  • R Installation and Administration: Hints for installing R on special platforms.
  • Writing R Extensions: The authoritative source on how to write R programs and packages.