RStudio Cheat Sheet: A Comprehensive Guide to Data Analysis and Statistical Inference

RStudio Cheat Sheet

by Adela Vrtkova and Martina Litschmannova, Department of Applied Mathematics, FEECS, VSB-TUO

via cheatography.com

Workspace, Using libraries

?boxplot

getting help documentation for the function boxplot

getwd()

returning the current working directory

setwd("C:/Users/RStudio")

setting the working directory to the specified folder

install.packages("packageZ")

downloading and installing a package called packageZ

library(packageZ)

activating an already installed package called packageZ

packageZ::functionF(x)

calling the function functionF from the specified package packageZ

moments, EnvStats, dunn.test, lsr, openxlsx, car, epiR

important packages

# After the hash, I can write whatever.

writing notes into the script
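The install, load and call steps above can be combined into one pattern. The following is a minimal sketch, assuming a hypothetical helper named use_package (not part of the cheat sheet) that installs a package only when it is missing; it is tried here on stats, which ships with R.

```r
# Hypothetical helper (an assumption, not from the cheat sheet):
# attach a package, installing it first only when it is not yet available.
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

use_package("stats")   # stats ships with R, so nothing is downloaded

# The :: operator calls a function without attaching its package first
med <- stats::median(c(3, 1, 2))
med
```

Calling a function through :: is handy when two attached packages export functions with the same name.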

Importing data

data = read.csv2("C:/Users/RStudio/data.csv")

importing data in csv format from the specified file and saving it as data

data = read.csv2("http://am-nas.vsb.cz/DATA/dataset.csv")

importing data in csv format from the internet and saving it as data

data = readWorkbook("C:/USER/DATA/dataset.xlsx", sheet=1,
                    startRow=4, colNames=TRUE, cols=2:9) # openxlsx package

importing data in xlsx format
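It may help to see what read.csv2 expects. The sketch below writes a throwaway file to a temporary path (the contents and column names are made up for illustration) and reads it back; read.csv2 defaults to a semicolon separator and a decimal comma, whereas read.csv expects commas and decimal points.

```r
# Write a tiny semicolon-separated file with decimal commas to a temp path
path <- tempfile(fileext = ".csv")
writeLines(c("group;values", "A;1,5", "B;2,25", "A;3,0"), path)

# read.csv2 uses sep = ";" and dec = "," by default
data <- read.csv2(path)

str(data)    # check that 'values' was imported as numeric
head(data)   # first rows of the imported table
```

If a numeric column arrives as character, the separator or decimal mark of the file probably does not match the import function.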

Working with data

data = as.data.frame(data)

saving imported data as an object of class data.frame

data.S = stack(data)

converting the data table into the standard data matrix (one column of values, one column of group labels)

data.S.omit = na.omit(data.S)

omitting entire rows with missing values (NAs)
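A small made-up example shows what these steps do. stack() names its output columns values and ind; renaming ind to group is an assumption made here so that the result has values and group columns.

```r
# Wide table: one column per group, padded with NA where groups differ in size
data <- data.frame(A = c(1.2, 3.4, 5.6),
                   B = c(2.1, 4.3, NA))

data.S <- stack(data)                      # long format: columns 'values' and 'ind'
colnames(data.S) <- c("values", "group")   # rename 'ind' (naming is an assumption)

data.S.omit <- na.omit(data.S)             # drops the row holding the NA

nrow(data.S)        # 6 rows before removing NAs
nrow(data.S.omit)   # 5 rows afterwards
```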

Probability distribution - Prefixes

r- generating random numbers from the distribution

d- probability density function f(x) or probability mass function P(X=x)

p- cumulative distribution function P(X≤x)

q- quantile function
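Each prefix is combined with a distribution suffix to form a function name. A minimal sketch with the normal distribution (suffix -norm), using arbitrary illustrative arguments:

```r
set.seed(1)                           # make the random draw reproducible

x     <- rnorm(5, mean = 0, sd = 1)   # r-: five random numbers from N(0, 1)
dens  <- dnorm(0, mean = 0, sd = 1)   # d-: density f(0)
prob  <- pnorm(1.96)                  # p-: P(X <= 1.96)
quant <- qnorm(0.975)                 # q-: quantile, the inverse of pnorm

c(dens, prob, quant)
```

The p- and q- functions are inverses of each other, so pnorm(qnorm(0.975)) returns 0.975.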

Probability distribution - Discrete

-binom Binomial distribution Bi(n, π)

-hyper Hypergeometric distribution H(N, M, n)

! R code requires H(M, N−M, n)

-nbinom Negative binomial distribution NB(k, π)

! definition in JASP/R: number of unsuccessful trials

-pois Poisson distribution Po(λt)
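The two parameterization warnings are easy to get wrong, so here is a sketch with made-up numbers. It assumes that NB(k, π) describes the total number of trials needed for the k-th success, so the value passed to R is the number of trials minus k.

```r
# Hypergeometric H(N, M, n): N items in total, M with the property, n drawn
N <- 10; M <- 4; n <- 3
# R expects dhyper(x, m = M, n = N - M, k = n)
p_hyper <- dhyper(1, m = M, n = N - M, k = n)          # P(X = 1)

# Negative binomial NB(k, pi): assumed to count trials until the k-th success
k <- 2; p <- 0.5; trials <- 5
# R counts failures, so pass trials - k
p_nbinom <- dnbinom(trials - k, size = k, prob = p)    # P(exactly 5 trials needed)

c(p_hyper, p_nbinom)   # values 0.5 and 0.125
```

Both results can be checked by hand: choose(4,1)*choose(6,2)/choose(10,3) = 0.5 and choose(4,1)*0.5^5 = 0.125.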

Probability distribution - Continuous

-unif Uniform distribution U(a, b)

-exp Exponential distribution Exp(λ)

-norm Normal distribution N(µ, σ²)

! JASP applet Distributions requires N(µ, σ²)

! R code requires N(µ, σ)
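Because R takes the standard deviation rather than the variance, a distribution written as N(µ, σ²) needs a square root before it is passed on. A short sketch with illustrative values µ = 10 and σ² = 4:

```r
mu     <- 10
sigma2 <- 4                      # variance, as in N(mu, sigma^2)

# Correct: pass the standard deviation
p_ok <- pnorm(12, mean = mu, sd = sqrt(sigma2))

# Common mistake: passing the variance as sd gives a different probability
p_wrong <- pnorm(12, mean = mu, sd = sigma2)

c(p_ok, p_wrong)
```

Here p_ok equals pnorm(1), while p_wrong equals pnorm(0.5), so the slip silently changes the answer instead of raising an error.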

EDA for a Qualitative Variable

data$group = as.factor(data$group)

redefining the group variable as a factor

table(data$group)

frequency table

barplot(table(data$group))

creating a bar plot

pie(table(data$group))

creating a pie chart
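Relative frequencies are often wanted alongside the counts. A minimal sketch on a made-up sample, using base R's prop.table (not mentioned in the cheat sheet):

```r
# Made-up sample; 'group' mirrors the column name used above
data <- data.frame(group = c("a", "b", "a", "c", "a", "b"))
data$group <- as.factor(data$group)

abs_freq <- table(data$group)       # absolute frequencies
rel_freq <- prop.table(abs_freq)    # relative frequencies (sum to 1)

round(100 * rel_freq, 1)            # percentages, rounded to one decimal place
```

Either table can be passed to barplot() or pie() in the same way as table(data$group).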

EDA for a Quantitative Variable

summary(data$values)                       summary statistics
length(data$values)                        sample size (attention if NAs present)
min(data$values)                           minimum
mean(data$values)                          arithmetic mean
quantile(data$values,probs=0.3)            30% quantile
max(data$values)                           maximum
sd(data$values)                            standard deviation
var(data$values)                           variance
moments::skewness(data$values)             skewness
moments::kurtosis(data$values)-3           kurtosis
boxplot(data$values)                       boxplot
hist(data$values)                          histogram
plot(density(data$values))                 plotting kernel density estimation
qqnorm(data$values); qqline(data$values)   QQ-plot
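The warning about NAs deserves a concrete illustration. In the sketch below (made-up numbers), length() counts missing values too, and most summary functions return NA unless na.rm = TRUE is set.

```r
values <- c(4.1, 5.3, NA, 6.8, 5.0)   # made-up sample with one missing value

length(values)                  # 5: counts the NA as well
n <- sum(!is.na(values))        # 4: number of observed values

mean(values)                    # NA, because one value is missing
m   <- mean(values, na.rm = TRUE)                  # mean of the observed values
s   <- sd(values, na.rm = TRUE)                    # also works for var, min, max
q30 <- quantile(values, probs = 0.3, na.rm = TRUE) # quantile() stops on NAs otherwise
```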

Function tapply()

tapply(dataS$values, dataS$group, mean)

calculates the mean of values for each group in dataS

tapply(dataS$values, dataS$group, quantile, probs=0.4)

calculates the 40% quantile of values for each group in dataS

tapply(dataS$values, dataS$group, moments::kurtosis)-3

calculates the kurtosis of values for each group in dataS
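A self-contained sketch of tapply() on a made-up data frame; the name dataS and its columns values and group follow the calls above. Any arguments placed after the function name are passed on to that function.

```r
dataS <- data.frame(
  values = c(1, 2, 3, 10, 20, NA),
  group  = factor(c("A", "A", "A", "B", "B", "B"))
)

# Extra arguments after the function are passed on to it
means <- tapply(dataS$values, dataS$group, mean, na.rm = TRUE)
means    # A = 2, B = 15

# The function may also return several values per group
quartiles <- tapply(dataS$values, dataS$group, quantile,
                    probs = c(0.25, 0.75), na.rm = TRUE)
quartiles$A
```

When the function returns a single number, the result is a named vector; when it returns several, tapply() gives a list with one element per group.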

Statistical inference - One variable

shapiro.test(data$values)

Shapiro-Wilk test

varTest(data$values, sigma.squared=400, alternative="two.sided",
        conf.level=0.95) # EnvStats package

confidence interval for variance and one-sample Chi-squared test on variance

(H0: σ² = 400, HA: σ² ≠ 400)

t.test(data$values, mu=5, alternative="less", conf.level=0.95)

condence interval for mean and one-sample Student's t-test

(H0∶µ=5, HA∶µ<5)

wilcox.test(data$values, mu=8, alternative="greater", conf.level=0.95,
            conf.int=TRUE)

confidence interval for median and one-sample Wilcoxon test

(H0: x_0.5 = 8, HA: x_0.5 > 8)

binom.test(x,n,p=0.18,alternative="two.sided",conf.level=0.95)

condence interval for probability and one-sample Binomial test

(Clooper-Pearson method)

(H0∶π=0.18, HA∶π≠0.18)
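Each of these test functions returns an object whose components can be read individually, which helps when reporting results. The sketch below uses simulated data and made-up counts, for illustration only.

```r
set.seed(42)
values <- rnorm(30, mean = 4.5, sd = 1)   # simulated sample, not real data

res <- t.test(values, mu = 5, alternative = "less", conf.level = 0.95)

res$statistic   # t statistic
res$p.value     # p-value for H0: mu = 5 against HA: mu < 5
res$conf.int    # one-sided interval: the lower bound is -Inf for "less"

# Exact binomial test: x successes in n trials (made-up counts)
bt <- binom.test(x = 25, n = 100, p = 0.18, alternative = "two.sided")
bt$estimate     # observed proportion, 0.25
bt$conf.int     # Clopper-Pearson interval for the probability
```

A one-sided alternative gives a one-sided confidence interval; use alternative = "two.sided" when a two-sided interval is needed.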
