Stats Start Here - Statistical Methods - Lecture Notes | STT 200, Study notes of Data Analysis & Statistical Methods

Material Type: Notes; Class: Statistical Methods; Subject: Statistics and Probability; University: Michigan State University; Term: Unknown 2007;

Typology: Study notes

Pre 2010

Uploaded on 07/28/2009

koofers-user-2mr-1
Stat 200, Sec-106 Lecture Note-1 05/14/2007
CHAPTER 1
STATS START HERE
What is statistics?
The word "statistics" has two different meanings.
In the plural sense, statistics means a set of observations, usually collected by measuring or
counting, collectively known as data. It also refers to particular calculations made from data.
In the singular sense, statistics refers to a group of scientific methods used for
1. Collecting data
2. Organizing and summarizing data
3. Interpreting and analyzing data
4. Drawing conclusions
CHAPTER 2
DATA
What is Data?
Data are systematically recorded information, whether numbers or labels, together with their context.
Data Table
Context tells who, what, when, where, how, and why the data were measured:
1. Who – the individuals about whom data are collected (participants, respondents, subjects,
experimental units, records, cases)
2. What – the characteristics recorded about each individual (variables)
3. When – time
4. Where – place
5. How – the method of collecting the data
6. Why – the purpose of the study

Variable, Characteristic: what you actually measure or record
Types of Variables
Quantitative: the outcome is a number
• height or concentration (continuous) - can be any (positive) real number
• number of flowers on a plant (discrete) - the scale is made up of distinct values with nothing in between. Usually these are whole numbers, but not always, e.g. a grade in an MSU course: 0, 1, 1.5, 2, 2.5, 3, 3.5, 4
Categorical: the outcomes fall into categories we can list
• eye color - blue, black
• sometimes numbers are used as names - male=1, female=2
• choices on a survey: never, rarely, occasionally, often, always
CHAPTER 3
DISPLAYING CATEGORICAL DATA
Example: For each of the 2201 people on the Titanic, the following variables were recorded: Ticket Class (First, Second, Third, Crew), Survival (Dead or Alive), Age (Adult or Child), and Sex (Male or Female).
ONE VARIABLE
Who = people on the Titanic
What = Ticket Class
Distribution
Frequency Table = a list of the categories with the count or percentage of observations in each category.
Class     Count (frequency)   % (relative frequency)
First     325                 14.
Second    285                 12.
Third     706                 32.
Crew      885                 40.
Total
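The relative frequencies in a frequency table are each count divided by the total count. A minimal sketch in Python, using the ticket-class counts from the table above (the variable names are illustrative and not part of the notes):

```python
# Ticket-class counts from the Titanic frequency table in the notes.
counts = {"First": 325, "Second": 285, "Third": 706, "Crew": 885}

# The total number of people is the sum of the category counts.
total = sum(counts.values())

# Relative frequency (in percent) = count / total * 100.
relative = {cls: 100 * n / total for cls, n in counts.items()}

for cls, n in counts.items():
    print(f"{cls:<8}{n:>6}{relative[cls]:>8.1f}%")
print(f"{'Total':<8}{total:>6}{sum(relative.values()):>8.1f}%")
```

The counts add up to the 2201 people mentioned in the example, and the relative frequencies add up to 100%.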

Variables are either Categorical (Qualitative) or Quantitative (Numerical). Quantitative variables are either Discrete or Continuous.
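When numbers are used as names for categories (as in male=1, female=2), the variable is still categorical, so it is summarized by counting categories rather than by doing arithmetic on the codes. A small sketch, assuming a short list of hypothetical coded responses that is not from the notes:

```python
from collections import Counter

# Code-to-label mapping as given in the notes: male=1, female=2.
SEX_LABELS = {1: "male", 2: "female"}

# Hypothetical coded responses, made up purely for illustration.
coded_sex = [1, 2, 2, 1, 2, 2]

# Counting the labels is a meaningful summary of a categorical variable.
label_counts = Counter(SEX_LABELS[code] for code in coded_sex)
print(label_counts)

# Averaging the codes would run without error, but the result would not
# describe anything, because the numbers are only names for the categories.
```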

Percentages of Table
                      Ticket Class
Survival   First          Second   Third   Crew   Total
Alive      202   9.18%
Dead       123   5.59%
Total      325   14.8%

Marginal distributions
Survival: Alive 710 (32.3%), Dead 1491 (67.7%)
Ticket Class: First, Second, Third, Crew

Percentages of Column
                      Ticket Class
Survival   First          Second   Third   Crew   Total
Alive      202   62.2%
Dead       123   37.8%
Total      325   100%

Percentages of Row
                      Ticket Class
Survival   First          Second   Third   Crew   Total
Alive      202   28.5%
Dead       123   8.25%
Total      325   14.8%
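The three kinds of percentages in the tables above use the same cell count and differ only in the denominator: the grand total, the column total, or the row total. A minimal sketch for the First-class cells, using only counts that appear in the notes (the function and variable names are illustrative):

```python
# Counts from the tables in the notes.
alive_first = 202    # First-class passengers who survived
dead_first = 123     # First-class passengers who died
first_total = 325    # column total for First class
alive_total = 710    # row total: all survivors
dead_total = 1491    # row total: all who died
grand_total = 2201   # everyone on board


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


table_pct = pct(alive_first, grand_total)      # percentage of table
column_pct = pct(alive_first, first_total)     # percentage of column
row_pct = pct(alive_first, alive_total)        # percentage of row
marginal_alive = pct(alive_total, grand_total) # marginal percentage alive

print(f"table {table_pct:.2f}%  column {column_pct:.1f}%  row {row_pct:.1f}%")
print(f"marginal alive {marginal_alive:.1f}%")
```

The column percentage (survival among First-class passengers) is well above the marginal percentage of survivors, which is the kind of comparison used to judge whether two variables are related.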

Pie charts
Figure: two pie charts, one for Alive and one for Dead, each divided by ticket class (First, Second, Third, Crew)

Segmented Bar Chart
Figure: segmented bar chart with one bar each for Alive and Dead, divided by ticket class (First, Second, Third, Crew), on a scale from 0% to 100%
A distribution of one variable given the value of another is called a conditional distribution.
Variables are independent if the conditional distribution for each category is the same as the corresponding marginal distribution.
CONCLUSION. Survival rate and ticket class are dependent.
CHAPTER 4
DISPLAYING QUANTITATIVE DATA
Changes in Enron’s stock prices (Jan. 1997 to Dec. 1999)
Histogram
Sorted data
-4.50 -3.25 -1.75 -1.59 -1.44 -1.22 -0.88 -0.88 -0.75 -0.72 -0.69 -0.
-0.50 -0.38 -0.28 -0.22 -0.16 0.03 0.12 0.34 0.47 0.62 0.69 0.
0.78 0.81 1.16 1.47 2.06 2.22 2.44 3.28 3.34 4.12 4.31 5.
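A histogram groups sorted data like the values above into equal-width bins and counts how many values fall in each. The sketch below assumes a bin width of 1 (the notes do not state the bins used) and includes only the 33 values that are fully legible above; the three truncated values are left out, so the counts are illustrative rather than a reproduction of the original histogram.

```python
import math
from collections import Counter

# The 33 fully legible price changes from the sorted data in the notes.
changes = [
    -4.50, -3.25, -1.75, -1.59, -1.44, -1.22, -0.88, -0.88, -0.75, -0.72, -0.69,
    -0.50, -0.38, -0.28, -0.22, -0.16, 0.03, 0.12, 0.34, 0.47, 0.62, 0.69,
    0.78, 0.81, 1.16, 1.47, 2.06, 2.22, 2.44, 3.28, 3.34, 4.12, 4.31,
]

BIN_WIDTH = 1.0  # assumed bin width, not specified in the notes


def bin_counts(values, width):
    """Count values per bin; each bin is keyed by its left edge."""
    return Counter(math.floor(v / width) * width for v in values)


counts = bin_counts(changes, BIN_WIDTH)

# Text histogram: one star per observation in the bin [left, left + width).
left = min(counts)
while left <= max(counts):
    print(f"[{left:5.1f}, {left + BIN_WIDTH:5.1f})  {'*' * counts[left]}")
    left += BIN_WIDTH
```

With these bins, most of the values fall between -1 and 1, with a few large changes in each tail.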

Graphs convey information about the distribution of the data, such as whether it
• is symmetric (the same from right to left)
• is bell-shaped
• has long tails
• has unusual observations that do not “fit” the rest of the data (outliers)
• is skewed to the right or left
• is unimodal (one major peak)
• is bimodal (two major peaks)
• has a wide or narrow spread
Dot Plot
Figure: dot plot with a horizontal axis labeled Time, running from 0 to 3 in steps of 0.5