R Statistics Tutorial: Creating Vectors and Using Basic Functions - Prof. A.L. Yuille, Study notes of Statistics

A tutorial on using R for statistical analysis, focusing on creating vectors and using basic functions such as help(), c(), seq(), rep(), length(), and log(). It also covers vector indexing and matrix creation using cbind(), rbind(), and matrix().


Uploaded on 08/30/2009

Statistics 153
R tutorial
Instructor: Prof. A.L. Yuille (Fall 2005).
Invoking R in the Boelter Hall 9413 lab
The Boelter Hall lab has R installed.
Survival Kit
(a) To exit R, type q().
(b) To start a help window, type help.start().
(c) To get text only help on a command, type help(command) or ?command
(d) To start a graphics window, type X11().
(e) To close a graphics window, type dev.off().
(f) To get help on a function (e.g. cor), type help(cor).
Vectors
The simplest data type in R is the vector. A scalar is just a vector of length 1. The following examples show how to create vectors.
> c(1, 3, 5, 9)
[1] 1 3 5 9
> 1:10
[1] 1 2 3 4 5 6 7 8 9 10
> seq(1, 2, 0.1)
[1] 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
> rep(2, 4)
[1] 2 2 2 2
> rep(2:3, 2)
[1] 2 3 2 3
> length(3:9)
[1] 7
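A few further ways to build vectors with the same functions are sketched here. The variable names (grid, blocks, joined) are illustrative only and are not part of the original handout.

```r
# seq() can take a target length instead of a step size
grid <- seq(0, 1, length.out = 5)
print(grid)

# rep() with each= repeats every element before moving to the next one
blocks <- rep(2:3, each = 2)
print(blocks)

# c() also concatenates existing vectors
joined <- c(1:3, blocks)
print(length(joined))
```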
Expressions
R evaluates any expression you type, prints the result, and then discards it.
For arithmetic operations, if the two operands are not of the same length, the shorter vector is recycled as often as needed
to match the length of the longer vector. For example,
> 1:4 + 2
[1] 3 4 5 6
> 1:4 / 2
[1] 0.5 1.0 1.5 2.0
> (2:3)^(2:3) ## parentheses are needed because ^ has higher precedence than :
[1] 4 27
> log(c(2, 3))
[1] 0.6931472 1.0986123
> sqrt(4)
[1] 2
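The recycling rule is easier to see when both operands have length greater than 1. The following is a minimal sketch; the variable names are illustrative only.

```r
# The shorter vector (length 2) is recycled three times to match length 6
long  <- 1:6
short <- c(10, 20)
recycled <- long + short
print(recycled)

# If the longer length is not a multiple of the shorter one, R still
# recycles but issues a warning (suppressed here to keep the output clean)
partial <- suppressWarnings(1:5 + c(10, 20))
print(partial)
```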

Assignments

An assignment evaluates an expression and stores the value in a variable; the result is not printed. Assignments are indicated by the assignment operators "<-" and "=". At the top level they behave the same. For example,

> x <- c(1,3,5,7,8,9)
> x
[1] 1 3 5 7 8 9

Vector and Matrix indexing

In R the subscripts of vectors and matrices start from 1. A negative index selects all the elements of the vector except the one at that position. If the index is out of bounds, the result is NA, the value R uses for a missing or undefined value.

> x <- c(1,3,5,7,8,9)
> x[3]
[1] 5
> x[1:3]
[1] 1 3 5
> x[-2]
[1] 1 5 7 8 9
> x[10]
[1] NA
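The same indexing rules extend to vectors of indices and to assignment. The sketch below reuses x from the example above; without_ends and y are illustrative names.

```r
x <- c(1, 3, 5, 7, 8, 9)

# Drop several elements at once with a vector of negative indices
without_ends <- x[-c(1, length(x))]
print(without_ends)

# Replace an element by assigning to an indexed position
y <- x
y[2] <- 100
print(y)

# An out-of-bounds index yields NA, which is.na() detects
print(is.na(x[10]))
```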

You can create a matrix with the cbind(), rbind() and matrix() functions. cbind() binds vectors together as columns, rbind() binds vectors together as rows, and matrix() fills a matrix with the elements of a vector, column by column by default.

> X <- cbind(x, y=1:6)
> X
     x y
[1,] 1 1
[2,] 3 2
[3,] 5 3
[4,] 7 4
[5,] 8 5
[6,] 9 6
> Y <- matrix(0,2,3)
> Y
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
> Y <- matrix(x,2,3)
> Y
     [,1] [,2] [,3]
[1,]    1    5    8
[2,]    3    7    9
> Y[1,2] ## extract element on the first row and the second column
[1] 5
> Y[1,] ## extract the first row
[1] 1 5 8
> Y[,1] ## extract the first column
[1] 1 3
> Y[2,c(1,3)] ## of row 2, extract elements (1,3)
[1] 3 9
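The examples above use cbind() and matrix() but not rbind(). The following sketch shows rbind() and the byrow argument of matrix(); the names R2, by_col and by_row are illustrative only.

```r
x <- c(1, 3, 5, 7, 8, 9)

# rbind() stacks the vectors as rows, giving a 2 x 6 matrix
R2 <- rbind(x, y = 1:6)
print(dim(R2))

# matrix() fills column by column unless byrow = TRUE is given
by_col <- matrix(x, 2, 3)
by_row <- matrix(x, 2, 3, byrow = TRUE)
print(by_row)
```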

Graphics output

R provides comprehensive graphics facilities. The most frequently used tools will probably be scatter plots and histograms. By default, the plot() function produces a scatter plot. You can change the graph style to a line plot with the argument type = "l", or draw both points and lines with type = "b". plot() accepts many other options, such as main, xlab and ylab for the title and axis labels.

> a <- rnorm(20)
> b <- rnorm(20)
> plot(a)
> plot(a, type="l")
> plot(a, type="b")
> plot(sort(a),sort(b),type="l")
> plot(a, b, main="Line Plot", xlab="X", ylab="Y")
> hist(a)

Saving a plot to a PDF file requires opening the file with the pdf() command, plotting the graph again, and closing the file with dev.off(). For example,

> pdf(file="histogram.pdf",encoding="MacRoman")
> hist(a)
> dev.off()
X11
  2

Another useful function is par(), which enables you to set or query graphics parameters. Calling par(mfrow=c(2,2)) divides the graphics window into 4 cells (2 rows and 2 columns), which is handy if you want to put several plots on one page. To restore the single-cell setting, use par(mfrow=c(1,1)).
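The PDF device and par(mfrow=...) can be combined to put several plots on one page of a file. This is a minimal sketch: the temporary file name is a placeholder, and saving the value returned by par() to restore the settings is one common idiom rather than the only option.

```r
# Write a 2 x 2 grid of plots into a temporary PDF file
out_file <- tempfile(fileext = ".pdf")   # placeholder output location
pdf(file = out_file)
old_par <- par(mfrow = c(2, 2))          # par() returns the previous settings

a <- rnorm(20)
plot(a)
plot(a, type = "l")
plot(a, type = "b")
hist(a)

par(old_par)                             # restore the previous layout
invisible(dev.off())                     # close the device so the file is written
```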

Writing your own functions

Technically, R is a functional language. As you have seen, it has many built-in functions, but you will soon come upon situations where you want to write one of your own. Here is a simple example.

std.dev <- function(x) {
  # Input: a vector x
  # Output: the standard deviation of x
  return(sqrt(var(x)))
}

The function takes one argument, a vector x, and returns a scalar. The lines beginning with # are comments. To invoke the function on a vector x, type std.dev(x). To see the commands which make up the function, just type std.dev (without any parentheses). A function can return several values by collecting them in a list, as shown below.

mean.stdev <- function(x){
  m <- mean(x)
  stdev <- sqrt(var(x))
  return(list(mean=m, stdev=stdev))
}
> mean.stdev(1:5)
$mean
[1] 3

$stdev
[1] 1.581139
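The components of the returned list can be extracted by name. The sketch below repeats the definition of mean.stdev so that it runs on its own; res is an illustrative variable name.

```r
mean.stdev <- function(x) {
  m <- mean(x)
  stdev <- sqrt(var(x))
  return(list(mean = m, stdev = stdev))
}

res <- mean.stdev(1:5)
print(res$mean)        # access a component by name with $
print(res[["stdev"]])  # or with double square brackets
print(names(res))      # the component names
```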

Logical objects

So far all the examples we've shown use numeric objects. R has other modes as well, namely logical, factor and character objects. Logical objects are used most often, so we discuss them here. A logical vector is a vector in which each element is either TRUE or FALSE. Comparison operators such as <, <=, >, >=, == and != (not equal) take two numeric arguments and return a logical vector. The operators |, & and ! are logical or, and, and negation. All of these operate element by element on vectors. Logical objects are often used in indexing. For example,

> x
[1] 1 3 5 7 8 9
> x>4
[1] FALSE FALSE TRUE TRUE TRUE TRUE
> x[x>4]
[1] 5 7 8 9
> x[x>4 & x<=7]
[1] 5 7

Logical vectors may be used in ordinary arithmetic. They are coerced into numeric vectors, with FALSE becoming 0 and TRUE becoming 1. This is useful for counting the elements of a vector that satisfy a certain condition.

> a <- c(-1, -4, 3, 5, 7)
> sum(a > 0)
[1] 3
> b <- c(NA, 3, 6, 9, NA, 11)
> is.na(b)
[1] TRUE FALSE FALSE FALSE TRUE FALSE
> sum(is.na(b))
[1] 2
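Two related uses of logical vectors are sketched here, reusing a and b from the example above: locating the elements that satisfy a condition, and filtering out missing values before summarizing. The names pos_idx, prop_pos and b_mean are illustrative.

```r
a <- c(-1, -4, 3, 5, 7)

# which() gives the positions where the condition holds
pos_idx <- which(a > 0)
print(pos_idx)

# mean() of a logical vector gives the proportion of TRUE values
prop_pos <- mean(a > 0)
print(prop_pos)

# Use a negated logical index to drop missing values before summarizing
b <- c(NA, 3, 6, 9, NA, 11)
b_mean <- mean(b[!is.na(b)])
print(b_mean)
```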

Control structures

(a) Conditional execution

The format is

if (condition) statement else statement

if (is.numeric(x) && min(x) > 0) {
  sx <- sqrt(x)
} else {
  stop("x must be all positive")
}

(b) Looping

The formats are

for (variable in sequence) statement
while (condition) statement

sum <- 0
for (i in 1:100) sum <- sum + i

Explicit loops can often be replaced by vectorized functions or by the apply() function, which is usually more concise and often faster. Try help(apply) to get more information on how to use this function.
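As an illustration of apply(), the sketch below computes row sums first with an explicit loop and then with apply(), using the matrix Y from the indexing section. The variable names are illustrative, and the final line shows that the loop summing 1 to 100 can be written as a single vectorized call.

```r
Y <- matrix(c(1, 3, 5, 7, 8, 9), 2, 3)

# Explicit loop: sum each row
row_sums_loop <- numeric(nrow(Y))
for (i in 1:nrow(Y)) {
  row_sums_loop[i] <- sum(Y[i, ])
}

# apply() with MARGIN = 1 works over rows, MARGIN = 2 over columns
row_sums  <- apply(Y, 1, sum)
col_means <- apply(Y, 2, mean)
print(row_sums)
print(col_means)

# The sum of 1 to 100 as one vectorized call instead of a for loop
total <- sum(1:100)
print(total)
```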

This tutorial has been adapted from a document by Tao Jiang (Stanford University) with permission.