Understanding Data: A Statistics Primer, Study notes of Statistics

An introduction to the concept of data, its structure, and the role of statistics in collecting, describing, and analyzing data. It covers the basics of cases and variables, categorical and quantitative data, and the importance of statistics in making informed decisions. Real-life examples are used to illustrate the concepts.

Typology: Study notes

2021/2022

Uploaded on 08/05/2022

dirk88
Statistics: Unlocking the Power of Data Lock5
Section 1.1
The Structure of Data

Outline

  • Data
  • Cases and variables
  • Categorical and quantitative variables
  • Explanatory and response variables
  • Using data to answer a question

Data

  • Data are a set of measurements taken on a set of individual units.
  • Data are usually stored and presented in a dataset, composed of variables measured on cases.

Cases and Variables

We obtain information about cases or units.

A variable is any characteristic that is recorded for each case.

  • Generally, each case makes up a row in a dataset, and each variable makes up a column.
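The row-and-column layout described above can be sketched in a few lines of Python. This is only an illustration: the survey variables (`student`, `year`, `hours_sleep`), the helper `column`, and every value are invented for the example and do not come from the course data.

```python
# Hypothetical survey data: every value below is invented for illustration.
# Each dictionary is one case (a row); each key is one variable (a column).
dataset = [
    {"student": "A", "year": "first-year", "hours_sleep": 7.5},
    {"student": "B", "year": "senior", "hours_sleep": 6.0},
    {"student": "C", "year": "junior", "hours_sleep": 8.0},
]

# The variables are the characteristics recorded for every case.
variables = list(dataset[0].keys())


def column(rows, variable):
    """Return the values of one variable across all cases (one column)."""
    return [row[variable] for row in rows]


number_of_cases = len(dataset)
sleep_values = column(dataset, "hours_sleep")
```

Reading across one dictionary gives everything recorded for a single case; calling `column` gives one variable measured on all cases.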

Intro Statistics Survey Data

Diet Coke and Calcium

Drink        Calcium Excreted
Diet cola
Diet cola
Diet cola
Diet cola
Diet cola
Diet cola
Diet cola
Diet cola
Water
Water
Water
Water
Water
Water
Water
Water

Data Applicable to You

  • Think of a potential dataset (it doesn’t have to actually exist) that you would be interested in analyzing.

What are the cases?

What are the variables?

What interesting questions could it help you answer?

Kidney Cancer

Source: Gelman et al., Bayesian Data Analysis, CRC Press, 2004.

Counties with the highest kidney cancer death rates

Kidney Cancer

  • If the values in the kidney cancer dataset are rates of kidney cancer deaths, then the cases are counties.
  • If the values in the kidney cancer dataset are yes/no, then the cases are people.
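The distinction between person-level and county-level cases can be made concrete with a small sketch. The records, county names, and the helper `county_death_rates` are all hypothetical; they only show how yes/no values recorded on people turn into rates recorded on counties.

```python
from collections import defaultdict

# Invented person-level data: each case is a person, and the value is yes/no.
people = [
    {"county": "County A", "kidney_cancer_death": True},
    {"county": "County A", "kidney_cancer_death": False},
    {"county": "County A", "kidney_cancer_death": False},
    {"county": "County A", "kidney_cancer_death": False},
    {"county": "County B", "kidney_cancer_death": False},
    {"county": "County B", "kidney_cancer_death": False},
]


def county_death_rates(rows):
    """Aggregate person-level yes/no values into one rate per county."""
    deaths = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        totals[row["county"]] += 1
        deaths[row["county"]] += int(row["kidney_cancer_death"])
    return {county: deaths[county] / totals[county] for county in totals}


# County-level data: each case is now a county, and the value is a rate.
counties = county_death_rates(people)
```

The same underlying information yields six cases in `people` but only two cases in `counties`, which is why the kind of value recorded tells you what the cases are.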

Categorical versus Quantitative

Variables are classified as either categorical or quantitative:

  • A categorical variable divides the cases into groups.
  • A quantitative variable measures a numerical quantity for each case.
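As a rough illustration of the two kinds of variable, the sketch below labels a column by checking whether all of its values are numbers. The function `variable_type` and the sample values are assumptions made for this example, and the check is only a heuristic: numeric codes such as zip codes would be labeled quantitative even though they really divide cases into groups.

```python
from collections import Counter
from numbers import Number
from statistics import mean


def variable_type(values):
    """Rough heuristic: all-numeric values count as quantitative."""
    numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    return "quantitative" if numeric else "categorical"


# Invented values for two hypothetical variables measured on four cases.
year = ["first-year", "senior", "first-year", "junior"]
hours_sleep = [7.5, 6.0, 8.0, 6.5]

# A categorical variable divides the cases into groups that can be counted.
group_sizes = Counter(year)

# A quantitative variable supports arithmetic such as averaging.
average_sleep = mean(hours_sleep)
```

Counting makes sense for `year` and averaging makes sense for `hours_sleep`; asking for the average of `year` would not be meaningful, which is one practical way to tell the two types apart.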

Data can be used to answer interesting questions!

Using Data to Answer a Question

1. Can eating a yogurt a day cause you to lose weight?
2. Do males find females more attractive if they wear red?
3. Does louder music cause people to drink more beer?
4. Are lions more likely to attack after a full moon?
(the answer to all of these questions is yes!)

Explanatory and Response

If we are using one variable to help us understand or predict values of another variable, we call the former the explanatory variable and the latter the response variable.

Examples:

  • Does meditation help reduce stress?
  • Does sugar consumption increase hyperactivity?
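One way to picture the two roles is to summarize the response variable separately for each value of the explanatory variable. The sketch below does this for the meditation question, treating `meditates` as the explanatory variable and `stress_score` as the response. The variable names, the helper `mean_response_by_group`, and all values are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Invented data: "meditates" is the explanatory variable,
# "stress_score" is the response variable.
study = [
    {"meditates": "yes", "stress_score": 4},
    {"meditates": "yes", "stress_score": 5},
    {"meditates": "yes", "stress_score": 3},
    {"meditates": "no", "stress_score": 6},
    {"meditates": "no", "stress_score": 8},
    {"meditates": "no", "stress_score": 7},
]


def mean_response_by_group(rows, explanatory, response):
    """Average the response variable within each explanatory group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[explanatory]].append(row[response])
    return {group: mean(values) for group, values in groups.items()}


summary = mean_response_by_group(study, "meditates", "stress_score")
```

Swapping the two arguments would ask a different question, so deciding which variable is explanatory and which is the response comes before any calculation.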

Variables

For each of the following situations:

  • Which is the explanatory variable and which is the response variable?
1. Can eating a yogurt a day cause you to lose weight?
2. Do males find females more attractive if they wear red?
3. Does louder music cause people to drink more beer?
4. Are lions more likely to attack after a full moon?