What is R?
The R system for statistical computing is freely available to everyone. All scientists, including in particular those working in developing countries, have access to state-of-the-art tools for statistical data analysis at no additional cost.
With the help of the R system for statistical computing, research becomes reproducible when both the data and the results of all data analysis steps reported in a paper are available to readers through an R transcript file.
R is widely used for teaching undergraduate and graduate statistics classes at universities all over the world because students can use the statistical computing tools freely.
The base distribution of R is maintained by a small group of statisticians, the R Development Core Team.
A huge amount of additional functionality is implemented in add-on packages authored and maintained by a large group of volunteers. The main source of information about the R system is the World Wide Web, with the official home page of the R project being http://CRAN.R-project.org
All resources are available from this page: the R system itself, a collection of add-on packages, manuals, documentation and more.
According to the R Language Definition (Version 4.3.1 (2023-06-16) DRAFT): R is a system for statistical computation and graphics. It provides, among other things, a programming language, high level graphics, interfaces to other languages and debugging facilities.
Why R?
- Free, open source and available on every major platform
- A massive set of packages for statistical modelling, machine learning, visualisation, and importing and manipulating data
- Cutting edge tools
- Deep-seated language support for data analysis
- A vibrant community
- Powerful tools for communicating your results
- A strong foundation on functional programming
- A dedicated IDE, RStudio (www.rstudio.com/ide/)
- Powerful meta programming facilities
- Designed to connect to high-performance programming languages like C, Fortran and C++
Cons of R
- Much R code is written in haste
- The R community is more focused on results than on processes
- Code is often written to reduce the amount of typing and is therefore hard to understand
- Inconsistency across contributed packages (a result of almost 30 years of evolution)
- R is a profligate user of memory
Overview of the R-System
The R system for statistical computing consists of two major parts: the base system and a collection of user-contributed add-on packages. The R language is implemented in the base system. Implementations of statistical and graphical procedures are separated from the base system and are organised in the form of packages.
Both the base system and packages are distributed via the Comprehensive R Archive Network (CRAN), accessible at http://CRAN.R-project.org
The base system is available in source form and in precompiled form for various Unix systems, Windows platforms and Mac OS X. For the data analyst, it is sufficient to download the precompiled binary distribution and install it locally. Windows users can follow the link: http://CRAN.R-project.org/bin/windows/base/release.html
The base distribution already comes with some high-priority add-on packages, namely:
mgcv | KernSmooth | MASS | base |
boot | class | cluster | codetools |
datasets | foreign | grDevices | graphics |
grid | lattice | methods | nlme |
nnet | rcompgen | rpart | spatial |
splines | stats | stats4 | survival |
tcltk | tools | utils |
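To see which of these packages are present in a given installation, the package library can be queried by priority. The following is a minimal sketch assuming a standard R installation; the variable names `base_pkgs` and `recommended_pkgs` are illustrative only, and the exact lists depend on the R version.

```r
# Packages with priority "base" make up the base system itself;
# "recommended" packages are the add-ons bundled with most distributions.
base_pkgs <- rownames(installed.packages(priority = "base"))
recommended_pkgs <- rownames(installed.packages(priority = "recommended"))

print(base_pkgs)
print(recommended_pkgs)

# Check whether a particular package belongs to the base system
"stats" %in% base_pkgs
```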
Help and Documentation
Roughly, three different forms of documentation for the R system for statistical computing may be distinguished:
- online help that comes with the base distribution or packages,
- electronic manuals and
- published work in the form of books and similar publications.
More extensive documentation is available electronically from the collection of manuals at http://CRAN.R-project.org/manuals.html
Some of the electronic manuals available are:
- An Introduction to R: A more formal introduction to data analysis with R.
- R Data Import/Export: A very useful description of how to read and write various external data formats.
- R Installation and Administration: Hints for installing R on special platforms.
- Writing R Extensions: The authoritative source on how to write R programs and packages.
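The online help that comes with the base distribution can also be queried from code. The sketch below assumes the standard `utils` package is attached, as it is in a default session; the object name `topic` is illustrative. In an interactive session, typing `?mean` or printing `topic` would open the help page.

```r
# Look up the help page for a known function without displaying it;
# the result records where the documentation for the topic was found.
topic <- help("mean", package = "base")
class(topic)

# Find objects whose names match a regular expression
# among the packages currently on the search path
matches <- apropos("^read\\.table$")
matches
```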
Operators in R
Arithmetic Operators
Operators | Description |
---|---|
+ | Add |
- | Subtract |
* | Multiply |
/ | Divide |
^ | Exponentiation |
%% | Modulo (remainder) |
%/% | Integer division (quotient) |
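A short sketch of the arithmetic operators in use. The variable names are illustrative; the last lines show how `%%` and `%/%` fit together, including for a negative operand.

```r
x <- 7
y <- 3

x + y    # 10
x - y    # 4
x * y    # 21
x / y    # 2.333333
2^10     # 1024

q <- x %/% y   # integer division: 2
r <- x %% y    # remainder: 1

# The two operators are consistent with each other: x == q * y + r
x == q * y + r   # TRUE

# With a negative operand the quotient is rounded down,
# so the remainder takes the sign of the divisor
neg_q <- (-7) %/% 3   # -3
neg_r <- (-7) %% 3    # 2

# Arithmetic operators are vectorised
c(1, 2, 3) * 2   # 2 4 6
```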
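Relational Operators
Operator | Description |
---|---|
< | less than |
> | greater than |
<= | less than or equal to |
>= | greater than or equal to |
== | equal to |
!= | not equal to |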
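Relational operators compare element by element and return logical vectors, which can then be used for subsetting or counting. A minimal sketch with illustrative variable names:

```r
x <- c(1, 5, 10)

x < 5     # TRUE FALSE FALSE
x > 5     # FALSE FALSE TRUE
x <= 5    # TRUE TRUE FALSE
x >= 5    # FALSE TRUE TRUE

# A logical vector can select the matching elements ...
small <- x[x <= 5]      # 1 5

# ... or be summed to count them (TRUE counts as 1)
n_large <- sum(x > 2)   # 2
```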
Other Operators
Operator | Description |
---|---|
:: , ::: | access variables in a namespace |
$ , @ | component / slot extraction |
[ , [[ | indexing |
: | Sequence operator |
! | negation |
& , && | and |
| , || | or |
~ | as in formulae |
-> , ->> | rightwards assignment |
<- , <<- | leftwards assignment |
= | leftwards assignment |
? | help |
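The following sketch exercises several of these operators together. All object names (`v`, `lst`, `counter`, `increment`, `f`) are made up for illustration.

```r
v <- 1:5                          # sequence operator
lst <- list(a = 1, b = "two")

lst$a                             # component extraction: 1
lst[["b"]]                        # extracts the element itself: "two"
lst["a"]                          # single bracket keeps the list wrapper

picked <- v[v > 2 & v < 5]        # elementwise "and": 3 4
both <- TRUE && FALSE             # single-value "and": FALSE
!TRUE                             # negation: FALSE

10 -> w                           # rightwards assignment
med <- stats::median(v)           # access a function in a namespace

# <<- assigns in an enclosing environment rather than locally
counter <- 0
increment <- function() counter <<- counter + 1
increment()
counter                           # 1

f <- y ~ x                        # a formula; y and x are not evaluated
class(f)                          # "formula"
```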
Constants in R
Constant Name | Description |
---|---|
LETTERS | the 26 upper-case letters of the Roman alphabet |
letters | the 26 lower-case letters of the Roman alphabet |
month.abb | the three-letter abbreviations for the English month names |
month.name | the English names for the months of the year |
pi | the ratio of the circumference of a circle to its diameter |
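These constants are ordinary vectors and can be indexed like any other. A brief sketch; the variable `area` is illustrative.

```r
LETTERS[1:3]      # "A" "B" "C"
letters[24:26]    # "x" "y" "z"
month.abb[1:3]    # "Jan" "Feb" "Mar"
month.name[12]    # "December"

# Area of a circle of radius 2
area <- pi * 2^2
area
```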
Numeric Constants in R
Name | Description | typeof() |
---|---|---|
Inf | Infinity | double |
NaN | Not a Number | double |
NA_real_ | Not Available / Missing Real | double |
NA_integer_ | Not Available / Missing Integer | integer |
In R, numeric constants start with a digit or period and are either a decimal or hexadecimal constant, optionally followed by L.
Hexadecimal constants start with 0x or 0X followed by a non-empty sequence from 0-9 a-f A-F, which is interpreted as a hexadecimal number, optionally followed by a binary exponent. A binary exponent consists of a P or p followed by an optional plus or minus sign followed by a non-empty sequence of (decimal) digits, and indicates multiplication by a power of two.
Decimal constants consist of a non-empty sequence of digits possibly containing a period (the decimal point), optionally followed by a decimal exponent. A decimal exponent consists of an E or e followed by an optional plus or minus sign followed by a non-empty sequence of digits, and indicates multiplication by a power of ten.
A constant followed by i is regarded as an imaginary complex number. A numeric constant immediately followed by L is regarded as an integer number when possible.
Only the ASCII digits 0-9 are recognized as digits, even in languages which have other representations of digits. The 'decimal separator' is always a period and never a comma.
A leading plus or minus is not regarded by the parser as part of a numeric constant but as a unary operator applied to the constant.
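The rules for numeric constants can be checked directly at the console. The sketch below uses only base R; the variable names are illustrative.

```r
typeof(1)       # "double": a plain numeric constant
typeof(1L)      # "integer": the L suffix requests an integer
typeof(2i)      # "complex": the i suffix gives an imaginary number

hex_int <- 0xFFL     # hexadecimal constant with L suffix: 255 as an integer
hex_exp <- 0x1p4     # hexadecimal 1 with binary exponent 4: 1 * 2^4 = 16
dec_exp <- 1.5e-2    # decimal exponent: 1.5 * 10^-2 = 0.015

# Typed missing values and special values
typeof(NA_integer_)  # "integer"
typeof(NA_real_)     # "double"
typeof(Inf)          # "double"

# A leading minus is a unary operator applied to the constant,
# so the parsed expression is a call rather than a bare constant
is.call(quote(-1L))  # TRUE
```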
Simple Mathematical Functions in R
Function | Description |
---|---|
abs(x) | absolute value |
sign(x) | Sign of elements (1,0,-1) |
sqrt(x) | square root |
floor(x) | largest integer not greater than x |
ceiling(x) | smallest integer not less than x |
exp(x) | exponential |
expm1(x) | exp(x) - 1, computed accurately for small x |
log2(x) | log with base 2 |
log10(x) | log with base 10 |
log1p(x) | log(1+x) |
cos(x) | cosine |
sin(x) | sine |
tan(x) | tangent |
acos(x) | arc cosine |
asin(x) | arc sine |
atan(x) | arc tangent |
cosh(x) | hyperbolic cosine |
sinh(x) | hyperbolic sine |
tanh(x) | hyperbolic tangent |
acosh(x) | arc hyperbolic cosine |
asinh(x) | arc hyperbolic sine |
atanh(x) | arc hyperbolic tangent |
cospi(x) | cos(pi*x) |
sinpi(x) | sin(pi*x) |
tanpi(x) | tan(pi*x) |
gamma(x) | gamma function |
lgamma(x) | natural log of gamma function |
digamma(x) | first derivative of the log gamma function |
trigamma(x) | second derivative of the log gamma function |
cumsum(x) | cumulative sums |
cumprod(x) | cumulative products |
cummax(x) | cumulative maxima |
cummin(x) | cumulative minima |
Im(x) | Imaginary part |
Re(x) | Real part of Complex Number |
Arg(x) | Argument part of Complex Number |
Conj(x) | Complex conjugate |
Mod(x) | Modulus |
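A few of these functions in action. This is a small sketch with illustrative inputs; it highlights rounding on negative values, the cumulative functions and the complex-number helpers.

```r
x <- c(-2.7, 0, 2.7)

sign(x)      # -1 0 1
floor(x)     # -3 0 2
ceiling(x)   # -2 0 3

# expm1() and log1p() stay accurate when the argument is close to zero
tiny <- 1e-10
expm1(tiny)
log1p(tiny)

cumsum(1:5)                 # 1 3 6 10 15
cumprod(1:5)                # 1 2 6 24 120
cummax(c(1, 3, 2, 5, 4))    # 1 3 3 5 5

# Complex numbers
z <- complex(real = 3, imaginary = 4)
Re(z)     # 3
Im(z)     # 4
Mod(z)    # 5
Conj(z)   # 3-4i

# sinpi() is exact at integer arguments, unlike sin(pi * x)
sinpi(1)      # 0
sin(pi) == 0  # FALSE, because pi is only approximated

gamma(5)      # 24, i.e. factorial(4)
```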
Some Functions from base-package
Index | Description |
---|---|
.Call | Modern Interfaces to C/C++ code |
.Internal | Call an Internal Function |
.Primitive | Look Up a Primitive Function |
.Library | Search Paths for Packages |
capabilities | Report capabilities of this build of R |
Arithmetic | Arithmetic Operators |
Constants | Built-in Constants |
NA | Not Available / Missing Values |
NULL | The Null object |
Internal Methods | Many R-internal functions are generic and allow methods to be written for them. |
source | Read R code from a file, a connection or expressions |
library | loading/attaching and listing of packages |
getwd | Get Working Directory |
setwd | Set Working Directory |
ls | list all objects |
Sys.info | Extract system and user information |
Sys.time | Get current Date and Time |
Sys.timezone | Get current Time Zone |
Sys.localeconv | Get details of the numerical and monetary representations in the current locale |
Sys.sleep | Suspend execution for a time interval |
system.time | CPU time used |
system.file | Find names of R system files |
system | Invoke system command |
system2 | Invoke system command |
dyn.load | Foreign Function interface |
date | system date and time |
proc.time | running time of R |
quit | terminate an R session |
cat | concatenate and print |
typeof | The type of an object |
array | creates array |
as.array | coerce an object to an array |
is.array | test whether an object is an array |
c | Combine values into a vector or list |
cbind | combine R objects by columns |
rbind | combine R objects by rows |
integer | Creates objects of type integer |
as.integer | coerce an object to type integer |
is.integer | test whether an object is of type integer |
numeric | Numeric Vectors |
range | range of values |
vector | Produces a ‘simple’ vector of the given length and mode |
as.vector | A generic, attempts to coerce its argument into a vector of a given mode |
is.vector | returns TRUE if the vector is of the specified mode having no attributes other than names |
append | vector merging |
names | names of an object |
labels | Find labels from objects |
length | length of an object |
lengths | length of list or vector elements |
sum | sum of vector elements |
cumsum | cumulative sums |
cumprod | cumulative products |
cummax | cumulative maxima |
cummin | cumulative minima |
summary | object summaries |
attach | attach object to search path |
detach | detach object from search path |
diff | Lagged Differences |
function | function definition |
is.function | checks whether argument is a function |
lapply | Apply function over a list or vector |
data.frame | Data Frame |
data.matrix | convert dataframe to a numeric matrix |
mat.or.vec | Create matrix or vector |
t | transpose of a matrix |
subset | subsetting vectors, matrices or data frames |
max.col | Find maximum position in a matrix |
diag | Form diagonal matrix |
norm | Calculates the norm of the matrix |
lower.tri | lower triangle of a matrix |
upper.tri | upper triangle of a matrix |
dim | dimensions of an object |
dimnames | Dimension names of an object |
det | calculate determinant of a matrix |
chol | The Cholesky Decomposition |
chol2inv | Inverse from Cholesky (or QR) decomposition |
eigen | Spectral Decomposition of a matrix |
qr | The QR decomposition of a matrix |
qr.X | Reconstruct the Q, R or X matrices from a QR object |
colSums, colMeans | column Sums / Means |
rowSums, rowMeans | row Sums / Means |
crossprod | Matrix cross product |
unique | extract unique elements |
expression | unevaluated expression |
eval | evaluate an expression |
unlist | flatten list |
which | Which indices are true |
which.min, which.max | Where is the min() or max() or first TRUE or FALSE |
ifelse | conditional element selection |
warning | Print warning message |
warnings | Print warning messages |
write | Write data to a file |
tempfile | Create name for temporary file |
nchar | counts the number of characters |
rank | Returns the sample ranks of the values in a vector |
zapsmall | Rounding of numbers |
utf8ToInt | converts a length-one character string encoded in UTF-8 to an integer vector of Unicode code points |
intToUtf8 | converts a numeric vector of Unicode code points either to a single character string or a character vector |
cut | convert numeric to factor |
table | cross tabulation and table creation |
marginSums | compute table margins |
toString | convert an R object to a character string |
strsplit | split the elements of a character vector |
call | Function calls |
tapply | apply a function to each cell of a ragged array |
jitter | add noise to numbers |
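The sketch below combines several of the base functions listed above on small made-up inputs. The object names (`m`, `scores`, `groups` and so on) are illustrative, and `%*%` (matrix multiplication) is used only to verify the inverse.

```r
# A small symmetric positive definite matrix (filled column by column)
m <- matrix(c(2, 1, 1, 3), nrow = 2)

dim(m)        # 2 2
det(m)        # 5
rowSums(m)    # 3 4
colMeans(m)   # 1.5 2.0

# Cholesky factor R (upper triangular) and the inverse derived from it
R <- chol(m)
m_inv <- chol2inv(R)
m_inv %*% m   # approximately the identity matrix

# Locating and recoding elements of a named vector
scores <- c(a = 3, b = 9, c = 5)
above <- which(scores > 4)              # positions 2 and 3 (names b, c)
best <- which.max(scores)               # position 2 (name b)
label <- ifelse(scores > 4, "high", "low")

# Binning numbers into a factor and tabulating the result
groups <- cut(c(1, 5, 9), breaks = c(0, 3, 6, 10))
table(groups)

# Applying functions over lists and over groups
totals <- lapply(list(1:3, 4:6), sum)   # list(6, 15)
by_key <- tapply(c(10, 20, 30, 40), c("x", "y", "x", "y"), sum)

parts <- strsplit("a,b,c", ",")[[1]]    # "a" "b" "c"
nchar("statistics")                      # 10
```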
Some Functions from utils-package
Index | Description |
---|---|
Rhome | R Home directory |
RSiteSearch | Search for keywords or phrases in documentation |
help | documentation |
help.search | search the help system |
help.start | hypertext documentation |
nsl | look-up for IP address |
install.packages | install packages from repositories or local files |
installed.packages | find installed packages |
available.packages | list available packages at CRAN-like repositories |
update.packages | Compare Installed Packages with CRAN-like repositories |
remove.packages | remove installed packages |
browseEnv | Browse objects in environment |
browseURL | Load URL in an HTML browser |
head, tail | return the first or last part of an object |
object.size | report the size allocated to an object |
str | list the structure of an arbitrary R object |
ls.str | list objects and their structure |
zip | create zip archive |
unzip | extract or list zip archives |
read.fortran | Read Fixed-Format Data in a Fortran-like Style |
read.table | Data input |
write.table | Data output |
as.roman | get roman numeral |
sessionInfo | Get and report version information about R, the OS and attached or loaded packages |
SHLIB | Build shared object/DLL for Dynamic loading |
download.file | download a file from the Internet |
example | Run all the R code from the examples part of R’s online help |
maintainer | show Package maintainer |
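A minimal sketch of a data round trip using some of the `utils` functions above. The data frame and the temporary file are made up for illustration; nothing is written outside R's temporary directory.

```r
# A small illustrative data frame
df <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

# Write it to a temporary text file and read it back
path <- tempfile(fileext = ".txt")
write.table(df, file = path, row.names = FALSE)
df2 <- read.table(path, header = TRUE)

head(df2, 2)                          # first two rows
str(df2)                              # compact structure overview
print(object.size(df2), units = "Kb") # memory used by the object

roman <- as.roman(2023)               # MMXXIII
roman

unlink(path)                          # remove the temporary file
```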
Some Functions from graphics-package
Index | Description |
---|---|
barplot | Barplots |
hist | histograms |
dotchart | Cleveland’s dot plots |
plot | scatter plot |
stripchart | 1-D scatter plots |
smoothScatter | Scatterplots with smoothed densities color representation |
plot.xy | basic internal plot function |
stem | Stem-and-Leaf Plot |
boxplot | Box Plots |
boxplot.matrix | draw a boxplot for each column(row) of a matrix |
bxp | draw Box plots for summaries |
matplot | plot columns of matrices |
cdplot | Conditional Density Plots |
coplot | conditioning plots |
contour | display contours |
filled.contour | level (contour) plots |
curve | draw function plots |
mosaicplot | Mosaic plots |
pie | Pie Charts |
spineplot | Spine Plots and Spinograms |
stars | Star (Spider/Radar) Plots and segment diagrams |
strwidth | plotting dimensions of character strings and math expressions |
sunflowerplot | Produce a sunflower scatter plot |
symbols | Draw symbols (circles, squares, stars, thermometers, boxplots) |
frame | create / start a new plot frame |
layout | specifying complex plot arrangements |
par | set or query graphical parameters |
grid | add grid to plot |
abline | add straight lines to a plot |
lines | add connected line segments to a plot |
segments | Add line segments to a plot |
box | draw a box around a plot |
clip | set clipping region |
axis | add an axis to a plot |
axTicks | compute axis tickmark locations |
arrows | add arrows to a plot |
legend | add legends to plots |
text | add text to a plot |
title | plot annotation |
mtext | write text into the margins of a plot |
image | display a color image |
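High-level functions such as `hist` and `plot` start a new plot, while functions such as `abline`, `legend` and `mtext` add to the current one. The sketch below draws two panels into a temporary PDF file so that it also runs without a display; the simulated data, the file name `out` and the fitted line (via `lm` from the stats package) are illustrative assumptions.

```r
# Simulated data for illustration
set.seed(1)
x <- rnorm(100)
y <- 2 * x + rnorm(100)

# Draw into a temporary PDF file instead of an on-screen device
out <- tempfile(fileext = ".pdf")
pdf(out)

par(mfrow = c(1, 2))                      # two panels side by side

hist(x, main = "Histogram of x")          # high-level plot, panel 1

plot(x, y, main = "Scatter plot")         # high-level plot, panel 2
abline(lm(y ~ x), col = "red")            # add the least-squares line
legend("topleft", legend = "least-squares fit", col = "red", lty = 1)

# Write a title into the outer margin area of the page
mtext("Base graphics sketch", side = 3, outer = TRUE, line = -1)

dev.off()                                 # close the device and write the file
file.exists(out)
```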