SUMMARY: RStudio and Radiant, Statistics Notes

Università commerciale Luigi Bocconi, Statistics

Summary of the lectures and book on RStudio and Radiant for the statistics exam at Bocconi

Type: Notes

2020/2021

On sale since 22/06/2022

rebeccacordioli 🇮🇹

R STUDIO

INTRODUCTION

1) Console → This is where you type the commands you intend to execute and where the output produced by the commands is printed.

2) Script → It allows you to write several lines of code (while the console is useful for just a few lines).

• To execute just a single line, use Ctrl+Enter

• To create a new script: File → New File → R Script

• To save the script in a specific folder:

o Move to that folder in the Files tab

o “More”

o “Set as working directory”

o Now you can save it (top left corner)

3) Environment → It shows the list of objects created so far with a summary for each of them.

To clean up the environment use rm(list=ls()), where ls() is the function that lists the objects created so far.

4) History → It lists the commands executed so far; you can search for a variable and find all the occurrences where you used it.

5) Files → It shows the contents of the current directory, with the possibility to change it by clicking on the name of the current directory.

6) Plots → It sequentially shows all the graphs produced during the working session.

7) Packages → It lists the packages installed on your computer and those that are currently loaded in memory (indicated by the check-marks next to the package names).
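The behaviour of ls() and rm() can be sketched as follows. This is only an illustration: the objects a and b are hypothetical, and the sketch works on a separate environment created with new.env() so that it does not wipe a real session. In the console, rm(list=ls()) does the same thing to the global environment.

```r
# Hypothetical objects stored in a scratch environment (an assumption made
# for safety; in the console you would work in the global environment).
env <- new.env()
assign("a", 7, envir = env)
assign("b", c(1, 2, 3), envir = env)

before <- ls(env)                  # names of the objects created so far
rm(list = ls(env), envir = env)    # remove all of them
after <- ls(env)                   # nothing is left

print(before)   # "a" "b"
print(after)    # character(0)
```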

Load data

To load the contents of an .RData file in RStudio, you can:

• choose File → Open File..., select the file to open and confirm your choice

• click on the file name in the Files tab and confirm

• use the load("filename") function

If the file is in a directory other than the current one, you must first change the working directory: move to that directory in the Files tab, click on the gear icon (“More”) and choose “Set As Working Directory”.
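As a rough sketch of what load() does, the snippet below first creates an .RData file and then loads it back. The object scores and the file name scores.RData are hypothetical, and the file is written to a temporary folder rather than to the working directory.

```r
# Hypothetical object and file name, saved in a temporary folder.
scores <- c(24, 27, 30)
path <- file.path(tempdir(), "scores.RData")
save(scores, file = path)    # write the object to an .RData file

rm(scores)                   # remove it from the environment
loaded <- load(path)         # restore it; load() returns the object names

print(loaded)   # "scores"
print(scores)   # 24 27 30

# getwd() shows the current working directory; setwd("path/to/folder")
# is the console counterpart of "Set As Working Directory".
getwd()
```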

Assign a variable → <- or =

x <- 7
x
[1] 7

Note! The == logical operator, instead, checks if the left side is equal to the right side and returns TRUE or FALSE.
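A short sketch of the difference between assignment and comparison; the variable names are only illustrative.

```r
x <- 7            # assignment with <-
y = 7             # assignment with =
same <- x == y    # comparison: is x equal to y?
differ <- x == 8  # comparison with a different value

print(same)     # TRUE
print(differ)   # FALSE
```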

VECTORS

Create a vector → c(elements)

vector_name <- c(4, 6, 8, 12)

Selecting elements from vectors → [position_of_the_element]

vector_name[3]
[1] 8

Slicing a vector → :

x = 0:6 (from 0 to 6)

Functions of vectors

• mean(x)
• median(x)
• var(x)
• sd(x)
• cov(x, y) → covariance
• cor(x, y) → correlation coefficient ρ
• max(x)
• min(x)
• log(x)
• summary(x)
• quantile(x, probs = c(a, b…)) → it returns the quantiles of order a, b…
• length(x) → it returns the size of the vector
• sum(x)
• cumsum(x)
• cbind(x1, x2 …), rbind(x1, x2 …) → to combine two or more vectors by columns or by rows
• seq(start, end, difference_between_values)
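The functions listed above can be tried on two small vectors. The sketch reuses the vector from the notes; the second vector y is a hypothetical addition needed for cor(), rbind() and cbind().

```r
x <- c(4, 6, 8, 12)
y <- c(1, 3, 2, 5)        # hypothetical second vector

x[3]                      # 8
x[2:3]                    # 6 8 (elements in positions 2 to 3)
mean(x)                   # 7.5
median(x)                 # 7
length(x)                 # 4
sum(x)                    # 30
cumsum(x)                 # 4 10 18 30
quantile(x, probs = c(0.25, 0.75))
cor(x, y)
seq(0, 10, 2)             # 0 2 4 6 8 10

by_rows <- rbind(x, y)    # each vector becomes a row: 2 x 4 matrix
by_cols <- cbind(x, y)    # each vector becomes a column: 4 x 2 matrix
```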

DATASET

• data("namefile") → it loads the dataset named “namefile” into the environment
• head(namefile) → it shows the first rows of a dataset (useful if the dataset is too long)
• namefile[1,3] → it finds the element in the 1st row and 3rd column
• load("namefile") → it loads the dataset, only if it is contained in the working directory
• namefile$sub → to extract all the values of the variable “sub”

Saving the dataset

• “…” to retrieve the folder
• Open the folder
• Set the folder as a working directory by:
  o “More”
  o “Set as working directory”
  o Now everything you upload/save will be taken from/put in that folder
• Load the dataset by clicking on the name and confirm (or use the load() function)
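The dataset commands can be sketched with mtcars, a dataset that ships with R. Using mtcars in place of “namefile” is an assumption made only for illustration.

```r
data("mtcars")                 # built-in dataset, used here as an example
head(mtcars)                   # first rows of the dataset
dim(mtcars)                    # number of rows and columns

first_disp <- mtcars[1, 3]     # element in the 1st row and 3rd column
mpg_values <- mtcars$mpg       # all the values of the variable mpg

print(first_disp)              # 160
length(mpg_values)             # 32
```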

PACKAGES

To download a package:

• Go to the “Packages” tab
• Click “Install”
• Type the name of the package and ✓ it

Note! If you click on the name, you can read all the functions of that package
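The same steps can also be done from the console. The sketch below is one possible way, using install.packages() and library(); the package name is a placeholder (stats is used because it is always installed, so nothing is downloaded here).

```r
# Placeholder package name: replace "stats" with the package you need.
pkg <- "stats"

# Install only if the package is not already on this computer.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package in memory (equivalent to ticking it in the Packages tab).
library(pkg, character.only = TRUE)
```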

BINOMIAL DISTRIBUTION

→ PMF:

P(X = k) = choose(n, k) * p^k * (1-p)^(n-k)
dbinom(k, size=n, prob=p)
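A quick check that the formula and dbinom() agree; the values of n, k and p are hypothetical.

```r
n <- 10; k <- 3; p <- 0.5      # hypothetical values

by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)
by_dbinom  <- dbinom(k, size = n, prob = p)

print(by_formula)   # 0.1171875
print(by_dbinom)    # 0.1171875
```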

NORMAL DISTRIBUTION

→ PDF: f(x) = dnorm(x, mean, sd)
→ CDF: F(x) = P(X<x) = pnorm(x, mean, sd)
1 - F(x) = P(X>x) = 1 - pnorm(x, μ, σ) = pnorm(x, μ, σ, lower.tail=FALSE)
→ Quantiles of order x: Qx = qnorm(x, mean, sd)

If I don’t specify them, μ=0 and σ=1 (standard normal).
To find the z such that 5% of the values are more extreme than z, which means P(X<-z) + P(X>z) = 0.05 → z has to be the quantile of order 0.975, therefore z = qnorm(0.975)
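The normal-distribution functions can be sketched as follows; the mean 100 and standard deviation 15 are hypothetical values chosen for illustration.

```r
dnorm(0)                 # density of the standard normal at 0, about 0.3989
pnorm(1.96)              # P(X < 1.96), about 0.975

# Two equivalent ways to get the upper tail P(X > 110)
upper_a <- 1 - pnorm(110, mean = 100, sd = 15)
upper_b <- pnorm(110, mean = 100, sd = 15, lower.tail = FALSE)

# z such that 5% of the values are more extreme than z
z <- qnorm(0.975)        # about 1.96
two_tails <- pnorm(-z) + pnorm(z, lower.tail = FALSE)   # 0.05
```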

T-STUDENT DISTRIBUTION

Density → PDF: f(x) = dt(x, df)
Distribution function → CDF: F(x) = P(X<x) = pt(x, df)
1 - F(x) = P(X>x) = 1 - pt(x, df) = pt(x, df, lower.tail=FALSE)
Quantiles of order x → Qx = qt(x, df)
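A small sketch of the t functions, with a hypothetical df = 10. It also compares the t quantile with the normal one, which is larger for the t distribution because of its heavier tails.

```r
df <- 10                          # hypothetical degrees of freedom

t_crit <- qt(0.975, df)           # about 2.228
z_crit <- qnorm(0.975)            # about 1.96, smaller than t_crit

upper <- pt(t_crit, df, lower.tail = FALSE)   # 0.025
```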

CHI SQUARE DISTRIBUTION

Density → PDF: f(x) = dchisq(x, df)
Distribution function → CDF: F(x) = P(X<x) = pchisq(x, df)
1 - F(x) = P(X>x) = 1 - pchisq(x, df) = pchisq(x, df, lower.tail=FALSE)
Quantiles of order x → Qx = qchisq(x, df)
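A small sketch of the chi-square functions with a hypothetical df = 1. With one degree of freedom the chi-square variable is the square of a standard normal, so its 0.95 quantile equals qnorm(0.975) squared.

```r
crit <- qchisq(0.95, df = 1)                       # about 3.841
p_upper <- pchisq(crit, df = 1, lower.tail = FALSE)  # 0.05

z_squared <- qnorm(0.975)^2                        # same value as crit
```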

LINEAR REGRESSION MODEL

Define the model

• Simple linear regression → m = lm(Y ~ X)
• Multiple linear regression → m = lm(Y ~ X1 + X2)

Note! Copy and paste the tilde ~ symbol with ?tilde

Graph

• Scatterplot → plot(X, Y)
• Visualize the regression line → abline(m)

Anova table

Get the variance decomposition SST = SSR + SSE → anova(m)

Summary

summary(m)
If we store it as a variable info = summary(m), then we can recover the information we need by info$...

Confidence intervals for coefficients (b0, b1…)

confint(m, level = 1-α)
If I don’t specify level, level=0.95 and α=0.05. The argument must be named: the second positional argument of confint() selects the coefficients, not the level.

Find Y

• We can plug in X1=x1 and X2=x2 in the linear function to obtain the predicted Y:
  predict(m, data.frame(X1=x1, X2=x2))
• Or we can find a confidence interval for the predicted Y:
  a) Confidence interval for the average value of Y
     predict(m, data.frame(X1=x1, X2=x2), interval = "confidence", level = 1-α)
  b) Confidence interval for the real value of Y
     predict(m, data.frame(X1=x1, X2=x2), interval = "prediction", level = 1-α)
  If I don’t specify level, level=0.95 and α=0.05.
  (X2=x2 is needed only if it is a multiple linear regression model with more variables)

Check if assumptions are met through residual graphs

• Linearity → plot(m, 1) residuals should be around 0
• Homoscedasticity → plot(m, 1) variance of residuals should be constant
• Normality → Q-Q Plot plot(m, 2) it should follow the straight line
• Multicollinearity → cor(X1, X2) it should be < 0.
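The regression workflow can be sketched on the built-in mtcars data. The choice of variables is an assumption made for illustration: mpg plays the role of Y, wt and hp the roles of X1 and X2, and the new values wt = 3 and hp = 110 are hypothetical.

```r
# Multiple linear regression: Y = mpg, X1 = wt, X2 = hp
m <- lm(mpg ~ wt + hp, data = mtcars)

anova(m)                      # variance decomposition
info <- summary(m)            # store the summary...
info$r.squared                # ...and recover single items with info$

ci_coef <- confint(m, level = 0.90)    # 90% intervals for b0, b1, b2

new_x <- data.frame(wt = 3, hp = 110)  # hypothetical x1 and x2
y_hat   <- predict(m, new_x)           # predicted Y
ci_mean <- predict(m, new_x, interval = "confidence", level = 0.95)
pi_new  <- predict(m, new_x, interval = "prediction", level = 0.95)

cor(mtcars$wt, mtcars$hp)     # multicollinearity check
```

The interval for the real value of Y (pi_new) is wider than the interval for the average value (ci_mean), because it also accounts for the variability of a single observation around the regression line.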

RADIANT

“BASICS” TAB

PROBABILITY CALCULATOR

• No dataset required
• We can choose different distributions (normal, t-Student…)
• To calculate probabilities → input = values
• To calculate quantiles → input = probabilities

SINGLE MEAN (Hypothesis test case 2)

• You need a dataset (load it in the Data tab → Manage)
• To compute the sample mean (mean) of a normal distribution
• You need to select:
  o The numerical variable for which you want to calculate the mean
  o If H1 is one-sided or two-sided
  o The confidence level
  o The comparison value (= H0)

The summary shows:

• mean → sample mean
• n → sample size
• n_missing → number of missing data
• sd → standard deviation
• se → standard error, i.e. the estimated standard deviation of the sample mean (= sd/√n)
• me → margin of error, it is half of the width of the confidence interval (= se*corresponding quantile)
• diff → difference between the sample mean and the population mean under H0
• t.value → t-score
• p.value → p-value
• df → degrees of freedom
• [x% y%] → lower and upper bound of the confidence interval

COMPARE MEANS (Hypothesis test case 4 and 5)

• You need a dataset (load it in the Data tab → Manage)
• It compares different means
  Example: the mean wage (2nd variable, numerical) depending on ethnicity (1st variable, categorical)
• You need to select:
  o At least two variables (1st: anything, 2nd: numeric variable)
  o If H1 is one-sided or two-sided
  o The confidence level
  o Sample type → independent (for case 5), paired (for case 4)
  o Test type → ALWAYS t-test
• Assumptions:
  o Normally distributed populations
  o Population variances are unknown and possibly different
  o Independent samples

Note! Tick “Show additional statistics” to see se, t.value, df, and the confidence interval
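The quantities reported by the Single mean and Compare means summaries can be reproduced in base R with t.test(). This is a sketch, not Radiant's own code: it uses the built-in mtcars data, a hypothetical comparison value of 20 for the mean of mpg, and the variable am as a hypothetical grouping variable.

```r
# Single mean: H0: mean of mpg = 20, two-sided, 95% confidence
x <- mtcars$mpg
n <- length(x)
se <- sd(x) / sqrt(n)                  # standard error
me <- se * qt(0.975, df = n - 1)       # margin of error
difference <- mean(x) - 20             # "diff" in the summary
t_value <- difference / se             # t-score

test1 <- t.test(x, mu = 20, alternative = "two.sided", conf.level = 0.95)
test1$conf.int                         # mean - me, mean + me

# Compare means: independent samples, variances possibly different
test2 <- t.test(mpg ~ am, data = mtcars, var.equal = FALSE)
test2$p.value

# For paired samples, t.test(x1, x2, paired = TRUE) takes two vectors
# of equal length instead.
```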

SINGLE PROPORTION (Hypothesis test case 3)

• You need a dataset (load it in the Data tab → Manage)
• To compute the sample proportion (p)
• You need to select:
  o The categorical variable for which you want to calculate the proportion
  o The “level” (= value of the categorical variable for which to calculate the proportion)
  o If H1 is one-sided or two-sided
  o The confidence level
  o The comparison value (= H0)
• Test type → ALWAYS Z-test

GOODNESS OF FIT

• You need a dataset (load it in the Data tab → Manage)
• You need to select:
  o The categorical variable
  o The probabilities that this variable is expected to follow
• To check if the distribution of the variable is consistent with the specified distribution

The summary shows:

• “Observed” → to see the observed values
• “Expected” → to see the expected values, following the specified probabilities
• “Chi-squared” → contribution to the chi-square score
• Chi-square score χ²
• df → degrees of freedom (k-1)
• p.value

Note! Check that 0% of cells have expected values <5!

CROSS-TABS

• You need a dataset (load it in the Data tab → Manage)
• You need to select the TWO categorical variables
• To check if there is an association between these variables

The summary shows:

• “Observed” → to see the contingency table of the observed values
• “Expected” → to see the contingency table of the expected values
• “Chi-squared” → contribution to the chi-square score
• “Row percentages” → conditional frequency
• “Column percentages” → conditional frequency
• “Table percentages” → total relative frequency
• Chi-square score χ²
• df → degrees of freedom (r-1)(c-1)
• p.value

Note! Check that 0% of cells have expected values <5!
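The Goodness of fit and Cross-tabs outputs correspond to the chi-square tests available in base R through chisq.test(). The sketch below is not Radiant's own code, and all the counts and probabilities are hypothetical.

```r
# Goodness of fit: 3 categories, hypothetical counts and probabilities
observed <- c(45, 35, 20)
probs <- c(0.5, 0.3, 0.2)
gof <- chisq.test(observed, p = probs)
gof$expected                  # 50 30 20
gof$residuals^2               # contribution of each cell to the score
gof$statistic                 # chi-square score
gof$parameter                 # df = k - 1 = 2

# Cross-tabs: hypothetical 2 x 3 contingency table
tab <- matrix(c(20, 30, 30, 20, 25, 25), nrow = 2)
ind <- chisq.test(tab)
ind$expected                  # every expected value is 25
ind$statistic                 # chi-square score = 4
ind$parameter                 # df = (r-1)(c-1) = 2
ind$p.value

# Check that no cell has an expected value below 5
all(ind$expected >= 5)
```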