Basic statistics formulas, Outlines and concept maps for Statistical Analysis

General statistics formulas

Type: Outlines and concept maps

2025/2026

Uploaded on 25/06/2026

magda-marianelli

MEASURES OF CENTRAL TENDENCY

ARITHMETIC MEAN

→ Measure of central tendency, but affected by extreme values.

$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$

MEDIAN

→ In an ordered list, the value in the "middle" (50% of the observations above, 50% below). Not affected by extreme values.

! For the position: (n+1) / 2

If the result is not a whole number, the position falls between two values and the median is their mean.

MODE

→ Measure of central tendency: the most frequently occurring value. Not affected by extreme values; used for both numerical and categorical data. It may not exist, and there may be several modes.
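The three measures above can be compared on a small sample. This is a minimal sketch: the data are made up, and `median_by_position` is a hypothetical helper that applies the (n+1) / 2 position rule by hand; `mean` and `multimode` come from Python's standard `statistics` module.

```python
from statistics import mean, multimode


def median_by_position(values):
    """Median via the (n + 1) / 2 position rule (positions are 1-based)."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2
    lower = ordered[int(position) - 1]   # value at the whole-number part of the position
    if position.is_integer():
        return lower
    upper = ordered[int(position)]       # next value in the ordered list
    return (lower + upper) / 2           # position falls between two values: average them


data = [2, 3, 3, 5, 7, 100]  # made-up sample with one extreme value

print(mean(data))                # 20  (pulled up by the extreme value 100)
print(median_by_position(data))  # 4.0 (position 3.5: mean of 3 and 5)
print(multimode(data))           # [3] (a list, because several modes may exist)
```

Only the mean moves noticeably when the extreme value changes; the median and the mode stay where they are.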

GEOMETRIC MEAN

→ Most often used to average percentages, rates, ..., i.e. cases where the variable undergoes a cumulative change. Typical when the TIME factor is involved.

$\bar{X}_g = \sqrt[n]{X_1 \cdot X_2 \cdots X_n}$

(on a calculator: raise the product to the power 1/n)
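As an illustration of the calculator rule (product raised to the power 1/n), here is a short sketch. The growth factors are hypothetical, and `geometric_mean` is a name chosen for this example.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


# Hypothetical yearly growth factors: +10%, +20%, -10%
growth_factors = [1.10, 1.20, 0.90]

g = geometric_mean(growth_factors)
print(f"average growth factor per year: {g:.3f}")  # roughly 1.059, i.e. about +5.9% per year
```

Applying the resulting factor three times in a row reproduces the same cumulative change as the three original factors, which is why this mean suits rates over time.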

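WEIGHTED MEAN

$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n}$

i.e. [(X1 * weight 1) + (X2 * weight 2) + ...] / sum of the weights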
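A minimal sketch of the weighted mean, using hypothetical exam grades weighted by course credits (both lists are invented for illustration):

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    weighted_sum = sum(x * w for x, w in zip(values, weights))
    return weighted_sum / sum(weights)


grades = [28, 30, 24]   # hypothetical exam grades
credits = [6, 9, 12]    # hypothetical weights (course credits)

# (28*6 + 30*9 + 24*12) / (6 + 9 + 12) = 726 / 27
print(round(weighted_mean(grades, credits), 2))
```

With equal weights the result reduces to the ordinary arithmetic mean.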

MEASURES OF VARIABILITY (DISPERSION)

RANGE

→ Difference between the largest and the smallest observation. It ignores how the data are distributed and is sensitive to extreme values.

X largest - X smallest

QUARTILES

Quartiles separate large data sets into four quarters. The first quartile, Q1, separates approximately the smallest 25% of the data from the remainder of the data. Q2 is the median. The third quartile, Q3, separates approximately the smallest 75% of the data from the remaining largest 25% of the data.

! For the position:

  • Q1 is 0.25(n+1)
  • Q2 is 0.50(n+1)
  • Q3 is 0.75(n+1)

If the result is not a whole number, the position falls between two values and we take their mean.

INTERQUARTILE RANGE

It addresses the extreme-values problem of the range: the highest and lowest observations are left out and the range is computed on the central 50% of the data.

Q3 - Q1
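The quartile positions and the interquartile range described above can be sketched as follows. The helper names and the data are invented for this example, and the code follows the rule used in these notes (a fractional position means averaging the two neighbouring values); other textbooks and libraries interpolate differently, so their results may not match.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position in an ordered list.

    A fractional position falls between two values, so their mean is returned.
    """
    lower = ordered[int(position) - 1]
    if float(position).is_integer():
        return lower
    return (lower + ordered[int(position)]) / 2


def quartiles(values):
    """Q1, Q2, Q3 at positions 0.25(n+1), 0.50(n+1), 0.75(n+1). Assumes n >= 3."""
    ordered = sorted(values)
    n = len(ordered)
    return tuple(value_at_position(ordered, p * (n + 1)) for p in (0.25, 0.50, 0.75))


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]           # made-up sample, n = 9
with_outlier = [11, 12, 13, 16, 16, 17, 18, 21, 220]  # same sample, one extreme value

for sample in (data, with_outlier):
    q1, q2, q3 = quartiles(sample)
    print("range:", max(sample) - min(sample), "IQR:", q3 - q1)
# range: 11 IQR: 7.0
# range: 209 IQR: 7.0
```

The extreme value inflates the range but leaves the interquartile range unchanged, since only the central 50% of the data enters the calculation.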

BOX PLOT

Graph that describes the shape of a distribution in terms of the five-number summary (minimum, Q1, median, Q3, maximum).

VARIANCE

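It is the sum of the squared differences between each observation and the sample mean, divided by the sample size.

$s^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n}$

where $\bar{X}$ is the sample mean. (Many textbooks divide by n - 1 instead of n for the sample variance.)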

STANDARD DEVIATION

It is the most commonly used measure of dispersion. It shows the dispersion around the sample mean X. It has the same units as the original data.

● low SD: the observations are very concentrated around the mean value.
● high SD: the observations are widely dispersed.
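A short sketch of the variance and standard deviation, assuming the standard relationship that the standard deviation is the square root of the variance. It divides by n, as these notes do (the same convention as `statistics.pvariance`); dividing by n - 1 would give the other common definition. The function names and data are invented for illustration.

```python
import math


def variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def standard_deviation(values):
    """Square root of the variance; same units as the original data."""
    return math.sqrt(variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample with mean 5

print(variance(data))            # 4.0
print(standard_deviation(data))  # 2.0
```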

COEFFICIENT OF VARIATION

It measures the relative dispersion, usually expressed as a percentage (%). Because it relates the standard deviation to the mean, it does not depend on the measurement units, so it is useful for comparing two or more data sets measured in different units.

$CV = \frac{\mathrm{SD}}{\bar{X}} \cdot 100\%$ (the result is a percentage), where $\bar{X}$ is the sample mean.

! Useful when two situations (e.g. two stocks and their average prices over the last year) have the same standard deviation: the stock with the higher average price (stock B) has the lower relative dispersion of its price.

MEASURES OF THE RELATIONSHIPS BETWEEN TWO VARIABLES

COVARIANCE

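→ Measure of the linear relationship between two variables; its sign gives the direction of the relationship.

$\mathrm{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n}$

where $\bar{X}$ and $\bar{Y}$ are the sample means.

● > 0 → the variables tend to move in the same direction
● < 0 → the variables tend to move in opposite directions
● = 0 → no linear relationship (the variables are uncorrelated, which does not by itself mean they are independent)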

CORRELATION COEFFICIENT

→ Measures the relative strength of the linear relationship between two variables. It is dimensionless and varies between -1 and 1.

$r = \frac{\mathrm{Cov}(X, Y)}{\mathrm{SD}_X \cdot \mathrm{SD}_Y}$

● closer to -1 → the stronger the negative linear relationship
● closer to 1 → the stronger the positive linear relationship
● closer to 0 → the weaker the linear relationship
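The covariance and the correlation coefficient can be sketched together. This example divides by n throughout and uses invented data and function names. The last data set is a caution: y is completely determined by x, yet the covariance is zero, because both measures capture only linear relationships.

```python
import math


def covariance(xs, ys):
    """Mean of the products of paired deviations from the means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # the covariance of a variable with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # moves with x
y_down = [10, 8, 6, 4, 2]   # moves against x

print(covariance(x, y_up), round(correlation(x, y_up), 6))      # 4.0 1.0
print(covariance(x, y_down), round(correlation(x, y_down), 6))  # -4.0 -1.0

# Zero covariance without independence: y = x squared
x_sym = [-2, -1, 0, 1, 2]
y_sq = [v * v for v in x_sym]
print(covariance(x_sym, y_sq))  # 0.0
```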