Statistics Lecture Notes: 'W's, Graphs & Descriptive Stats (Ch. 3-5)

A set of lecture notes for a statistics course, covering chapters 3-5. The notes include an introduction to the 'W's of a data set (Who, What, Why, How, When/Where), an overview of graphs and descriptive statistics for both categorical and quantitative variables, and rules for data analysis. The document also includes examples and exercises.

2019/2020

Uploaded on 08/29/2020

husnain-butt

Introduction to Statistics
Lecture Notes
Chapters 3-5
Please sign in (SIGNATURES) as you come in to class. It will save
my voice instead of my taking attendance (this is only to settle the
class roster).


What’s up with the PowerPoint?

• I don’t usually use slides, but I am going to try these to save my voice somewhat.
• Notes:
    ◦ Still working on getting the class roster settled. There has been some movement on the waitlist; I will keep in touch as things develop. Be sure you’ve signed in!
    ◦ The first homework is posted (on our course website), but isn’t due until next Friday (the 4th). The additional problem is NOT optional; "additional" just means it is not a book problem.

The “W”’s of a Data Set

• Who – the observations. The population is the set of all objects for which you want the value of some parameter. Since we usually can’t observe every object, we take a sample: a subset of the population that we actually observe.
    ◦ Note: There is NO such thing as a population sample or sample population.
• What – the variables
• Why – why the data were collected
• How – how the data were collected (related to design/sampling in Chapters 12-13)
• When/Where – more information that could be relevant

Chapters 3-5 Overview

• Covers basic graphs and descriptive statistics for both categorical and quantitative variables
    ◦ This is what you would do as a “preliminary analysis” for a variable.
• Recall: a data set can have multiple variables in it.
    ◦ These chapters focus mostly on univariate (single-variable) analyses. There is one comparative graph: a side-by-side boxplot in Chapter 5.

Categorical Variable Prelim Analysis

• Frequency tables (one variable) – summarize counts by category
• Contingency tables (2 or more variables) – summarize counts by category for multiple variables
• Bar charts
• Pie charts

Frequency

• What is frequency?
    ◦ Frequency is the number of objects/cases per category.
• You can also look at relative frequency.
    ◦ Relative frequency is the number of objects/cases per category divided by the total number of objects.
    ◦ Hence it gives the proportion of the total that falls in each category.
    ◦ It is often converted to a percentage.
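To make the two definitions concrete, here is a minimal sketch in Python. The eye-color data and the function name `frequency_table` are made up for illustration; they are not from the book or the handout.

```python
from collections import Counter

# HYPOTHETICAL categorical data: eye color recorded for 10 people.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]


def frequency_table(values):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(values)          # frequency: cases per category
    total = len(values)               # total number of cases
    return {cat: (n, n / total) for cat, n in counts.items()}


table = frequency_table(eye_colors)
for category, (count, rel) in sorted(table.items()):
    # Relative frequency shown both as a proportion and as a percentage.
    print(f"{category:<6} count={count:>2}  relative={rel:.2f}  ({rel:.0%})")
```

The relative frequencies always add up to 1 (100%), which is a quick check that no case was dropped or counted twice.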

Pie Charts

• Take 100% of cases and divide up the 360 degrees of the circle based on relative frequencies: each category’s slice gets (relative frequency × 360) degrees.
• We will favor bar charts over pie charts.
• Note that for bar charts you do not need to create bars for 100% of the cases. You could look at the top three risk factors for a disease, etc. However, we usually do have 100% of cases shown.
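The pie-chart rule (slice angle = relative frequency × 360 degrees) can be sketched as follows. The counts and the function name `pie_angles` are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def pie_angles(counts):
    """Convert category counts to pie-slice angles in degrees."""
    total = sum(counts.values())
    # relative frequency (n / total) times the 360 degrees of the circle
    return {cat: 360 * n / total for cat, n in counts.items()}


# HYPOTHETICAL counts for four categories (10 cases in total).
counts = {"brown": 5, "blue": 3, "green": 1, "hazel": 1}
angles = pie_angles(counts)
for category, angle in angles.items():
    print(f"{category:<6} {angle:6.1f} degrees")
```

Here half of the cases are "brown", so that slice takes half the circle (180 degrees), and the angles sum to 360 because the relative frequencies sum to 1.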

Contingency Tables - Example

• See the first page of the handout.
• Totals for rows/columns give the marginal distribution of each variable.
• You can also look at conditional distributions: fix a row or column and work solely within that row or column.
• Concept of independence (will formalize later):
    ◦ If the distribution of one variable is the same for all categories of another variable, then the two variables are independent.
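A small sketch of marginal and conditional distributions may help here. The table below is NOT the handout's table; the counts, category names, and function names are all hypothetical, and the counts were chosen so that the two variables come out independent under the informal definition above.

```python
# HYPOTHETICAL counts, made up for illustration (not from the handout).
# Rows: class year. Columns: preferred drink.
counts = {
    "first-year": {"coffee": 30, "tea": 20},
    "senior": {"coffee": 45, "tea": 30},
}


def row_totals(table):
    """Marginal counts for the row variable."""
    return {row: sum(cells.values()) for row, cells in table.items()}


def column_totals(table):
    """Marginal counts for the column variable."""
    totals = {}
    for cells in table.values():
        for col, n in cells.items():
            totals[col] = totals.get(col, 0) + n
    return totals


def conditional_distribution(table, row):
    """Distribution of the column variable within one fixed row."""
    cells = table[row]
    total = sum(cells.values())
    return {col: n / total for col, n in cells.items()}


grand_total = sum(row_totals(counts).values())
marginal_drink = {col: n / grand_total
                  for col, n in column_totals(counts).items()}
print("Marginal distribution of drink:", marginal_drink)
for row in counts:
    print(f"Drink given {row}:", conditional_distribution(counts, row))
```

In this made-up table, 60% prefer coffee within each class year, which matches the marginal 60%. Since the distribution of drink is the same for every category of class year, the two variables look independent.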

Simpson’s Paradox

• Something that can happen when you aggregate categorical data
    ◦ Looking at overall averages or percentages can be misleading.
    ◦ You can get different results when looking at the breakdown.
• Berkeley Discrimination Data Example (see the bottom of page one of the handout)
    ◦ Claims of sexual discrimination in 1973 graduate school admissions
    ◦ Overall, 44.28% of males who applied were admitted, while only 34.58% of females were admitted.
    ◦ Look what happens when you break down by the 6 largest departments, though! (Try this on your own or with a partner.) Is there evidence of discrimination against females at the department level? What is going on?
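To see how aggregation can reverse a comparison, here is a toy sketch. The numbers are HYPOTHETICAL and deliberately exaggerated; they are not the Berkeley data from the handout, and the department names and function names are invented for illustration.

```python
# HYPOTHETICAL counts (NOT the Berkeley data): (admitted, applicants)
admissions = {
    "Dept X": {"male": (80, 100), "female": (18, 20)},   # easy to get into
    "Dept Y": {"male": (4, 20), "female": (30, 100)},    # hard to get into
}


def department_rate(data, dept, group):
    """Admission rate for one group within one department."""
    admitted, applicants = data[dept][group]
    return admitted / applicants


def overall_rate(data, group):
    """Admission rate for one group after pooling all departments."""
    admitted = sum(d[group][0] for d in data.values())
    applicants = sum(d[group][1] for d in data.values())
    return admitted / applicants


for dept in admissions:
    m = department_rate(admissions, dept, "male")
    f = department_rate(admissions, dept, "female")
    print(f"{dept}: male {m:.0%}, female {f:.0%}")

print(f"Overall: male {overall_rate(admissions, 'male'):.0%}, "
      f"female {overall_rate(admissions, 'female'):.0%}")
```

In this toy example females have the higher admission rate in each department, yet the lower rate overall, because most females applied to the department that admits few applicants. Keep that mechanism in mind when you break down the handout data.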

Quantitative Variables Preliminary Analysis

• Graphs
    ◦ Dot plot – won’t use much – read about on your own
    ◦ Stem and leaf – won’t use much – read about on your own
    ◦ Histogram
    ◦ Boxplot (Chapter 5)
    ◦ QQ plot (Friday or next week)
    ◦ Time plot (Friday or next week)
• Descriptive statistics
    ◦ Measures of center: mean, median
    ◦ Measures of spread: standard deviation, IQR, range
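As a rough sketch, the measures of center and spread listed above can be computed with Python's standard `statistics` module. The data are the pre-test scores that appear in the stem-and-leaf section of these notes; the function name `describe` is just an illustrative choice. One caveat: quartiles (and so the IQR) can differ slightly between textbooks and software, and this sketch simply uses the module's default method.

```python
import statistics

# Pre-test scores from the stem-and-leaf section of these notes.
scores = [5, 11, 13, 21, 34, 36, 45, 47, 48, 48, 49]


def describe(values):
    """Return basic measures of center and spread for a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points (Q1, median, Q3),
    # using the module's default ("exclusive") method.
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),   # sample standard deviation (n - 1)
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

For these scores the mean (about 32.5) sits below the median (36) because the few low scores pull the mean down.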

Dot Plot – On Your Own

• Most basic quantitative graph
• Use for a low number of observations (<50)
• Basically, draw a number line and place a dot above it for each value you have observed.
• Example from Wikipedia: [figure not included]
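Separately from the Wikipedia figure, the dot plot recipe (a number line with one mark stacked above it per observed value) can be sketched in plain text. The sibling counts and the function name `dot_plot` are hypothetical, and the sketch assumes non-negative whole-number data.

```python
from collections import Counter


def dot_plot(values):
    """Return a text dot plot: one '*' stacked above the axis per observation.

    Assumes the values are non-negative integers.
    """
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    width = max(len(str(x)) for x in axis) + 1   # room for the axis labels
    lines = []
    # Build rows from the tallest stack down to the row just above the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[x] >= level else " ").rjust(width)
                      for x in axis)
        lines.append(row.rstrip())
    lines.append("".join(str(x).rjust(width) for x in axis))   # number line
    return "\n".join(lines)


# HYPOTHETICAL data: number of siblings reported by 12 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1, 2, 0]
plot = dot_plot(siblings)
print(plot)
```

Each column's height is the frequency of that value, so the tallest stack marks the most common value.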

Stem and Leaf – On Your Own

• Your book discusses lots of options for these, including split leaves (which is something R/Rcmdr will do).
• Basics: Take your values and choose a stem, maybe the tens place. Then the leaves are the ones place. For each stem, list the leaves that belong to it in numeric order.
• Usually works decently for fewer than 100 observations
• Try it. Suppose you have scores on a pre-test for an at-risk youth group as follows:
    ◦ 5, 11, 13, 21, 34, 36, 45, 47, 48, 48, 49
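Once you have tried it by hand, you can check your answer against this minimal sketch, which uses the tens place as the stem and the ones place as the leaf. The function name `stem_and_leaf` is an illustrative choice, and the sketch assumes non-negative whole-number scores; it does not do split leaves.

```python
from collections import defaultdict

# Pre-test scores from the exercise above.
scores = [5, 11, 13, 21, 34, 36, 45, 47, 48, 48, 49]


def stem_and_leaf(values):
    """Map each tens-place stem to its ones-place leaves in numeric order."""
    stems = defaultdict(list)
    for v in sorted(values):            # sorting keeps the leaves in order
        stems[v // 10].append(v % 10)
    # Include empty stems so that gaps in the data stay visible.
    return {s: stems.get(s, []) for s in range(min(stems), max(stems) + 1)}


plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

# Prints:
# 0 | 5
# 1 | 1 3
# 2 | 1
# 3 | 4 6
# 4 | 5 7 8 8 9
```

Read sideways, the display looks like a histogram with bins of width 10, but it keeps every individual value.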

Examples

• See page 2 of the handout.
    ◦ Try to describe the shape of each histogram.
• Then see page 3 of the handout.
    ◦ We’re going to create a histogram by hand if there is time.
    ◦ If there is no time, you can do this on your own.

Cookie Lab

• Time permitting (otherwise, Friday)
• The last page (to turn in) is not due until the end of class tomorrow, so don’t worry if we don’t get to it today. You can look at it tonight or tomorrow in class (I’ll give you the last five minutes of class to work on it).