Statistics cheat sheet, Cheat Sheet of Statistics

Statistics cheat sheet for Ru students

Typology: Cheat Sheet

2023/2024

Uploaded on 07/10/2024

justdavid-poell

*Lecture 1: Intro + Discrete probability models
Need to know: Observational unit: the person or thing that is measured
Random variable: the quantity that is random
Probability calculus
Formula: Bernoulli and Binomial distributions, expected value, variance and covariance
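
The Lecture 1 formulas can be checked numerically. The sketch below is only an illustration: the function names and the values n = 10, p = 0.3 are assumptions, not part of the original sheet. It builds the Binomial pmf (the Bernoulli distribution is the case n = 1) and compares the expected value and variance computed from the pmf with the closed forms n*p and n*p*(1-p).

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); n = 1 gives the Bernoulli distribution."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def mean_and_variance(pmf):
    """Expected value and variance of a discrete distribution {value: probability}."""
    mean = sum(x * q for x, q in pmf.items())
    var = sum((x - mean) ** 2 * q for x, q in pmf.items())
    return mean, var


n, p = 10, 0.3  # illustrative values (assumption)
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
mean, var = mean_and_variance(pmf)

print(mean, n * p)           # both approximately 3.0
print(var, n * p * (1 - p))  # both approximately 2.1
```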

Poisson distribution
*Lecture 2: Continuous probability models, Exponential family
Need to know: Convolution
Formula: uniform and exponential distribution; formula for the exponential family of distributions
Uniform distribution
Exponential distribution
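
As a hedged illustration of the items listed above: the sketch below writes out the Poisson pmf and the uniform and exponential densities, and uses a discrete convolution to check that the sum of two independent Poisson variables is again Poisson. The function names, the rate parametrisation of the exponential distribution and the rates 1.0 and 2.0 are assumptions made for the example.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def uniform_pdf(x, a, b):
    """Density of the Uniform(a, b) distribution."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def exponential_pdf(x, lam):
    """Density of the Exponential distribution with rate lam (assumed parametrisation)."""
    return lam * exp(-lam * x) if x >= 0 else 0.0


def convolve_pmf(f, g, k):
    """P(X + Y = k) for independent X, Y on {0, 1, 2, ...} with pmfs f and g."""
    return sum(f(j) * g(k - j) for j in range(k + 1))


lam1, lam2 = 1.0, 2.0  # illustrative rates (assumption)
for k in range(5):
    lhs = convolve_pmf(lambda j: poisson_pmf(j, lam1),
                       lambda j: poisson_pmf(j, lam2), k)
    rhs = poisson_pmf(k, lam1 + lam2)
    print(k, round(lhs, 6), round(rhs, 6))  # the two columns agree
```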

Boxplot:

Median (M): the middle value, or the mean of the two middle values
Q1: the median of the values from the first value up to the median of the whole set
Q3: the median of the values from the median of the whole set up to the last value
*Lecture 4: Random sample
Book + slides; Sufficiency
Need to know: Everything about the Q-Q plot
Quantile: splitting the data into parts (the median is the 0.5 quantile). You can also take the median of the part below the overall median (0.25 quantile) and of the part above the overall median (0.75 quantile).
*Lecture 5: Sampling distribution
Sampling from a normal distribution
Need to know: Fisher-Neyman theorem
Formula: mean and variance of the sampling distribution of the sample mean of normally distributed random variables
Lecture 7: -
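
The quartile rule from the boxplot notes above can be sketched as follows. This is a minimal sketch under one stated assumption: for an odd number of values the overall median is left out of both halves (other conventions include it, and the notes do not say which is meant). The function name and the data are illustrative.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, median, Q3) using the 'median of each half' rule.

    Assumption: for an odd number of values, the overall median itself
    is excluded from both halves. Requires at least two values.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return median(lower), median(xs), median(upper)


values = [1, 3, 4, 7, 8, 10, 12]  # illustrative data (assumption)
q1, m, q3 = quartiles(values)
iqr = q3 - q1  # interquartile range, the length of the box
print(q1, m, q3, iqr)  # 3 7 10 7
```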

In other words, the method of moments says that we should equate the theoretical first moment with the observed empirical first moment. This yields an equation in which the unknown is the parameter we wish to estimate. Solving this equation for the unknown yields an estimator of the parameter, the Method of Moments estimator.
Remember: an estimator is a random variable. The realisation of this random variable is called an estimate.
Formula: Maximum likelihood estimator of the parameters of the Bernoulli and normal distributions
*Lecture 10: Test statistics
Need to know: what is a test statistic?
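
To make the estimator formulas above concrete, here is a minimal sketch with illustrative data and function names (both assumptions). For a Bernoulli sample the first moment is E[X] = p, so the method of moments gives the sample proportion, which is also the maximum likelihood estimate. For a normal sample the maximum likelihood estimates are the sample mean and the variance with divisor n rather than n - 1.

```python
from statistics import fmean


def bernoulli_mle(xs):
    """MLE (and method-of-moments estimate) of p: the sample proportion."""
    return fmean(xs)


def normal_mle(xs):
    """MLE of (mu, sigma^2) for a normal sample; note the 1/n divisor."""
    mu_hat = fmean(xs)
    sigma2_hat = fmean([(x - mu_hat) ** 2 for x in xs])
    return mu_hat, sigma2_hat


coin = [1, 0, 1, 1, 0, 1, 0, 1]   # illustrative 0/1 outcomes (assumption)
measurements = [2.0, 4.0, 4.0, 6.0]  # illustrative data (assumption)

print(bernoulli_mle(coin))       # 0.625
print(normal_mle(measurements))  # (4.0, 2.0)
```

The printed numbers are estimates, that is, realisations of the estimators for this particular sample.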

Optimal test:
Simple: an optimal test can always be found
Unilateral (one-sided): optimal tests exist in specific situations
Bilateral (two-sided): no optimal test
Say "we do not reject the null hypothesis H0" instead of "we accept H0"
*Lecture 11: likelihood ratio statistics
For simple vs simple hypotheses, the optimal test is the likelihood ratio test
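
A small sketch of the likelihood ratio for a simple vs simple problem, under assumptions that are not in the original notes: an i.i.d. normal sample with known sigma, H0: mu = 0 against H1: mu = 1, and made-up data. The test rejects H0 when the ratio exceeds a threshold chosen to give the desired significance level; the threshold itself is not computed here.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of the N(mu, sigma^2) distribution."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def likelihood_ratio(xs, mu0, mu1, sigma):
    """L(mu1) / L(mu0) for an i.i.d. normal sample with known sigma."""
    ratio = 1.0
    for x in xs:
        ratio *= normal_pdf(x, mu1, sigma) / normal_pdf(x, mu0, sigma)
    return ratio


sample = [0.9, 1.4, 0.2, 1.1]  # illustrative data (assumption)
lr = likelihood_ratio(sample, mu0=0.0, mu1=1.0, sigma=1.0)
print(lr)  # a value above 1 means the data are more likely under mu1
```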

Paired t-statistic
Two-sample t-test

In a two-sample test, you have two independent samples. Of course, by independence, we expect that the two sets of measurements will not be correlated. Sample sizes for the two samples need not be equal. (But it often makes sense for them to be approximately equal.)

For example: the incomes of two different types of people.

Paired t-test:

Pairs are compared with each other, and the two measurements within a pair do influence each other.

For example: the mean age of a brother/sister pair (before/after measurement).

Alternatively: define a random variable D and compute D = X - Y. Take the covariance into account.
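
The difference between the two tests can be sketched in code. Everything named here is an assumption for illustration: the data are made up, and the two-sample statistic uses the Welch (unequal variances) form, which the notes do not specify. The last lines check the identity Var(D) = Var(X) + Var(Y) - 2 Cov(X, Y), which is why the covariance matters for paired data.

```python
from math import sqrt
from statistics import fmean, variance


def paired_t(xs, ys):
    """t-statistic for H0: mean difference = 0, using D = X - Y per pair."""
    d = [x - y for x, y in zip(xs, ys)]
    return fmean(d) / sqrt(variance(d) / len(d))


def welch_t(xs, ys):
    """Two-sample t-statistic for independent samples (Welch form, an assumption)."""
    se = sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    return (fmean(xs) - fmean(ys)) / se


def sample_cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    mx, my = fmean(xs), fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


before = [10.0, 12.0, 9.0, 11.0, 13.0]  # illustrative paired data (assumption)
after = [11.0, 14.0, 10.0, 13.0, 15.0]

t_paired = paired_t(before, after)
t_welch = welch_t(before, after)
print(t_paired, t_welch)  # the paired statistic is much larger in magnitude

# Var(D) = Var(X) + Var(Y) - 2 Cov(X, Y)
d = [x - y for x, y in zip(before, after)]
lhs = variance(d)
rhs = variance(before) + variance(after) - 2 * sample_cov(before, after)
print(lhs, rhs)  # both approximately 0.3
```

Because the pairs are strongly positively correlated, the variance of D is small, so treating the data as two independent samples would hide the effect.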

*Lecture 12: likelihood ratio, Wald, p-value
Wald:

  1. State the hypotheses: Null hypothesis (H0): the parameter equals zero (no effect). Alternative hypothesis (H1): the parameter does not equal zero (there is an effect).
  2. Determine the estimated difference: Take the estimated value of the parameter from the regression model. This is the estimated difference you want to test.
  3. Compute the standard error of the estimated parameter: The standard error indicates how much we expect the estimated parameter to vary. It is usually provided as part of the output of the regression model.
  4. Compute the Wald statistic: The Wald statistic is computed by dividing the estimated difference by the standard error of the estimated parameter.

$$\operatorname{var}(\hat{\theta}) = \frac{1}{I(\theta)}, \qquad I(\theta) = -E\left[\frac{d^2}{d\theta^2} \ln L(\theta)\right]$$

Take the square root to obtain the standard deviation (the standard error).
Z-test
The Wald test can also be used to compare two populations.

  5. Determine the critical value: Compare the squared Wald statistic with the critical value of the chi-square distribution with the appropriate number of degrees of freedom (equivalently, compare the absolute value of the Wald statistic with the critical value of the standard normal distribution). The number of degrees of freedom is usually one, because we test only one parameter.
  6. Decide on the null hypothesis: If the squared Wald statistic is larger than the critical value, reject the null hypothesis and conclude that there is a significant effect. If it is not larger than the critical value, fail to reject the null hypothesis and conclude that no significant effect was found.
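
The Wald steps above can be sketched for a simple case. The notes describe a regression parameter; the example below instead assumes a Bernoulli proportion with H0: p = p0, because its Fisher information has the closed form n / (p(1 - p)). The function name, the counts and the 5% level are assumptions.

```python
from math import erfc, sqrt


def wald_test_proportion(successes, n, p0):
    """Wald test of H0: p = p0 for n Bernoulli trials (requires 0 < p_hat < 1)."""
    p_hat = successes / n
    fisher_info = n / (p_hat * (1 - p_hat))  # I(p), evaluated at the estimate
    se = sqrt(1 / fisher_info)               # square root of var(p_hat) = 1 / I(p)
    w = (p_hat - p0) / se                    # Wald statistic (z form)
    p_value = erfc(abs(w) / sqrt(2))         # two-sided p-value under N(0, 1)
    return w, w**2, p_value


CHI2_1_CRIT_5PCT = 3.841  # 0.95 quantile of the chi-square distribution with 1 df

w, w_squared, p_value = wald_test_proportion(successes=60, n=100, p0=0.5)
reject = w_squared > CHI2_1_CRIT_5PCT
print(w, w_squared, p_value, reject)
```

Comparing the squared statistic with the chi-square critical value gives the same decision as comparing |w| with the two-sided standard normal critical value.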

Two-sided interval at level alpha for the normal distribution (left and right): use the z-value that belongs to 1 - alpha/2
Right side confidence interval
In summary
If you do not know the variance

  • Z one-sided 1.
  • Z two-sided 1.
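
The z-values for one-sided and two-sided intervals can be computed from the standard normal quantile function. The sketch below assumes alpha = 0.05 purely as an example level, and the function names and data are hypothetical. It uses `statistics.NormalDist` from the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(alpha, two_sided=True):
    """z-value belonging to 1 - alpha/2 (two-sided) or 1 - alpha (one-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def z_interval(sample_mean, sigma, n, alpha):
    """Two-sided confidence interval for the mean when sigma is known."""
    margin = z_critical(alpha) * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin


print(round(z_critical(0.05, two_sided=False), 3))  # 1.645
print(round(z_critical(0.05, two_sided=True), 3))   # 1.96
print(z_interval(sample_mean=10.0, sigma=2.0, n=25, alpha=0.05))
```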

Variance: A measure of how spread out a set of values is.
Covariance: A measure of how much two random variables change together.

Lecture 2: Continuous Probability Models, Exponential Family
Need to Know:
Convolution: A mathematical operation on two functions that produces a third function; it gives the distribution of the sum of two independent random variables.
Formulas:
Uniform Distribution: Describes a constant probability density over an interval.
Exponential Distribution: Models the time until an event occurs.
Exponential Family of Distributions: A family of probability distributions that includes many well-known distributions.

Lecture 3: Describing Data
Need to Know:
Boxplot: A graphical representation of the distribution of a dataset.
Formulas:
Empirical Distribution: A distribution function that puts probability mass 1/n on each of the n observed data points.
Sample Mean: The average of a sample.
Sample Median: The middle value of a sample.
Sample Variance: A measure of the spread of a sample.
Quartiles, IQR: Measures of the spread of a dataset.
Outliers: Data points significantly different from the rest.

Lecture 4: Random Sample
Need to Know:
Q-Q Plot: Quantile-Quantile plot, used to compare the distribution of a sample to a theoretical distribution.

Lecture 5: Sampling Distribution
Need to Know:
Fisher-Neyman Theorem: The factorization criterion for identifying sufficient statistics.
Formulas:
Mean and Variance of Sampling Distribution of Sample Mean of Normally Distributed Random Variables: Important for understanding the distribution of sample means.

Lecture 6: Bivariate Distribution
Formulas:
Multinomial Distribution: Generalization of the binomial distribution to more than two categories.
Sample Correlation Coefficient: Measures the strength and direction of a linear relationship between two variables.

Lecture 8: Estimators
Need to Know:
What is an Estimator?: A statistic that provides an estimate of a population parameter.
Formulas:
Bias: The difference between the expected value of an estimator and the true value of the parameter being estimated.
Mean Squared Error (MSE): The expected value of the squared difference between the estimator and the true parameter value.

Lecture 9: MLE and MoM Estimators
Need to Know:
MoM Estimators: Method of Moments estimators for parameters.
Formulas:
Maximum Likelihood Estimator (MLE): The parameter value that maximizes the likelihood of the observed data.
MoM Estimators for Parameters of Bernoulli and Normal Distribution: Parameter estimates obtained with the method of moments.

Lecture 10: Test Statistics
Need to Know:
What is a Test Statistic?: A standardized value calculated from sample data during a hypothesis test.

Lecture 11: Likelihood Ratio Statistics
Need to Know:
Formula of Test Statistics: Includes Z-test, one-sample t-test, two-sample t-test, paired t-test. Understand when to apply each of these tests.

Lecture 12: Likelihood Ratio, Wald, p-value