STATISTIC REVIEWER FOR CLASS, Study notes of Statistics

Statistics reviewer, simple and detailed

Typology: Study notes

2024/2025

Uploaded on 04/22/2025

unknown user 🇭🇰

GIZMO: STATWRE GIZMO DECK
MIDTERMS
MODULE 1A: NATURE OF STATISTICS
1.1 Definition of Statistics
Statistics is the branch of mathematics concerned with collecting, organizing, analyzing, and interpreting numerical data to support decision-making. It allows businesses, governments, and individuals to make informed choices based on data.
1.2 Importance of Studying Statistics
Decision-makers use statistics to:
● Present and describe business data properly.
● Draw conclusions about a population using a sample.
● Make forecasts about business and economic activities.
● Improve business processes through data-driven decision-making.
1.3 Types of Statistics
1. Descriptive Statistics – Summarizes and describes a set of data (e.g., mean, median, mode, charts).
2. Inferential Statistics – Draws conclusions or makes predictions about a population using sample data.
1.4 Basic Vocabulary of Statistics
● Variable: A characteristic that can take different values (e.g., height, weight, income).
● Data: The values collected from variables.
● Population: The entire group being studied.
● Sample: A subset of the population used for analysis.
● Parameter: A numerical measure that describes a characteristic of a population (e.g., population mean).
● Statistic: A numerical measure that describes a characteristic of a sample (e.g., sample mean).
1.5 Sources of Data
1. Primary Sources: Data collected first-hand for analysis (e.g., surveys, experiments).
2. Secondary Sources: Data collected by someone else (e.g., census reports, published research).
1.6 Methods of Data Collection
● Survey (Questionnaire/Interview) – Directly asking respondents.
● Observation – Recording behaviors without interference.
● Experiment – Conducting controlled studies to determine cause-effect relationships.
● Registration – Collecting official records (e.g., birth certificates, company records).

1.7 Types of Sampling Methods

  1. Probability Sampling (Random selection):
○ Simple Random Sampling (SRS): Each member has an equal chance of selection.
○ Systematic Sampling: Selecting every k-th element.
○ Stratified Sampling: Dividing the population into subgroups and randomly selecting within each.
○ Cluster Sampling: Selecting entire groups instead of individuals.
  2. Non-Probability Sampling (Non-random selection):
○ Convenience Sampling: Choosing readily available subjects.
○ Judgment Sampling: Researcher selects subjects based on judgment.
○ Quota Sampling: Ensuring representation of subgroups.
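
The four probability sampling methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 student IDs, the two strata, and the five sections are hypothetical, and the function names are made up for this example.

```python
import random

def simple_random_sample(population, n, rng):
    """Simple random sampling: every member has an equal chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic sampling: random start within the first k items, then every k-th element."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sampling: randomly select within each subgroup (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Cluster sampling: randomly pick whole groups and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)                 # fixed seed so the sketch is repeatable
students = list(range(1, 101))          # hypothetical population: 100 student IDs

srs = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)

# Hypothetical strata: first 50 IDs are freshmen, the rest are seniors.
strata = {"freshmen": students[:50], "seniors": students[50:]}
stratified = stratified_sample(strata, 5, rng)

# Hypothetical clusters: five class sections of 20 students each.
clusters = {f"section_{i}": students[i * 20:(i + 1) * 20] for i in range(5)}
clustered = cluster_sample(clusters, 2, rng)
```

Note the difference in what is randomized: stratified sampling draws individuals from every subgroup, while cluster sampling draws whole groups and keeps everyone in them.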

MODULE 1B: BASIC STATISTICAL CONCEPTS

2.1 Types of Variables

● Categorical (Qualitative) Variables – Represent categories (e.g., Gender: Male/Female).
● Numerical (Quantitative) Variables – Represent numbers (e.g., Height, Weight).
○ Discrete Variables: Countable values (e.g., number of children).
○ Continuous Variables: Measurable values (e.g., temperature).

2.2 Levels of Measurement

  1. Nominal Scale: Categories with no ranking (e.g., eye color, nationality).
  2. Ordinal Scale: Categories with ranking but no measurable differences (e.g., survey satisfaction levels).
  3. Interval Scale: Numeric values with equal intervals but no true zero (e.g., temperature in Celsius).
  4. Ratio Scale: Numeric values with a true zero (e.g., weight, income).

2.3 Data Organization Techniques

● Frequency Tables: Organize data into classes with counts.
● Bar Graphs & Pie Charts: Display categorical data.
● Histograms & Line Graphs: Display numerical data.

MODULE 1C: DESCRIPTIVE STATISTICS

3.1 Measures of Central Tendency

  1. Mean (Arithmetic Mean)

○ Formula: $\text{Mean} = \dfrac{\sum x}{n}$

○ Affected by extreme values (outliers).

  2. Median (Middle Value)

● Mean = Median = Mode.
● Total area under the curve = 1.

4.2 Standardized Normal Distribution (Z-Score)

● Converts any normal distribution into a standard normal distribution (mean = 0, SD = 1).

● Formula: $Z = \dfrac{X - \mu}{\sigma}$

● Z-scores above 0: Values above the mean.
● Z-scores below 0: Values below the mean.
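
As a quick illustration of the z-score formula, the sketch below standardizes two values. The exam scores, the mean of 70, and the standard deviation of 10 are hypothetical numbers chosen for this example.

```python
def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical exam with mean 70 and standard deviation 10.
z_above = z_score(85, mu=70, sigma=10)   # positive: 85 is above the mean
z_below = z_score(60, mu=70, sigma=10)   # negative: 60 is below the mean

print(z_above, z_below)
```

A score of 85 sits 1.5 standard deviations above the mean, and a score of 60 sits 1 standard deviation below it.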

MODULE 2A: CHARACTERISTICS OF A GOOD MEASUREMENT TOOL

4.1 Measurement in Research

● Measurement in research consists of assigning numbers to empirical events, objects, or properties in compliance with a set of rules.
● It involves a three-part process:

  1. Selecting observable empirical events.
  2. Developing mapping rules for assigning numbers or symbols.
  3. Applying the rules to observations.

4.2 What is Measured?

● Variables in research can be classified as:
○ Objects: Tangible items (e.g., people, cars, buildings).
○ Properties: Characteristics of objects (e.g., height, weight, attitude, intelligence).
● Constructs like satisfaction, leadership, and engagement cannot be measured directly but require observation of indicants.

4.3 Sources of Measurement Differences

  1. The Respondent: Factors like social class, mood, fatigue, anxiety, and distractions can influence responses.
  2. Situational Factors: Interview setting, presence of others, and lack of anonymity can distort responses.
  3. The Measurer: Interviewer bias, rewording, non-verbal cues, and careless recording can introduce errors.
  4. The Instrument: Confusing questions, poor printing, and leading questions reduce reliability.

4.4 Characteristics of Good Measurement

● Validity: Measures what it intends to measure.
○ External Validity: Generalizability across populations, settings, and times.
○ Internal Validity: Accuracy in measuring the intended concept.
● Reliability: Produces consistent results over repeated trials.
○ Stability: Consistency over time.
○ Equivalence: Consistency among different observers.
○ Internal Consistency: Homogeneity among measurement items.
● Practicality: Measurement must be economical, convenient, and interpretable.

4.5 Selecting a Measurement Scale

Factors influencing selection:
● Research objectives
● Response types
● Data properties
● Balanced vs. unbalanced scales
● Forced vs. unforced choices
● Number of scale points
● Rater errors

MODULE 2B: SURVEY RESEARCH

5.1 Steps in Survey Research

  1. Define research goals.
  2. Develop a budget and resources.
  3. Design research (target population, sampling frame, sample size).
  4. Choose a survey method (mail, phone, interview, web, direct observation).
  5. Design the questionnaire.
  6. Pretest and revise the survey.
  7. Administer the survey and collect responses.
  8. Code and analyze data.

5.2 Types of Survey Methods

● Mail Surveys: Low response rates, high nonresponse bias.
● Telephone Surveys: Low response rates, difficult targeting.
● Interviews: High cost, high quality, useful for sensitive topics.
● Web Surveys: Fast, cost-effective, but susceptible to bias.
● Direct Observation: Unobtrusive, but may require informed consent.

5.3 Survey Guidelines

● Consider staff expertise, budget, and precision.
● Ensure high-quality survey design.
● Conduct pilot tests before full deployment.
● Increase response rates by explaining purpose and offering incentives.
● Work with experts for better design and analysis.

5.4 Questionnaire Design Best Practices

● Use white space for readability.
● Provide clear instructions.
● Ensure anonymity.
● Organize questions logically.
● Use filters (e.g., "If no, skip to question 7").
● Keep surveys as short as possible.

5.5 Types of Survey Questions

Open-ended: "Describe your job goals."

Errors in Hypothesis Testing

● Type I Error (α): Rejecting H₀ when it is true.
● Type II Error (β): Failing to reject H₀ when it is false.

Z-Test for Population Mean (σ Known)

● Used when population standard deviation (σ) is known.

● Requires n ≥ 30 or normally distributed population.

Formula:

$z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Where:

○ X̄ = sample mean
○ μ = population mean (hypothesized value)
○ σ = population standard deviation
○ n = sample size

Decision Making:

● Critical Value Approach: Compare z-computed to z-critical.
● P-Value Approach: If p-value < α, reject H₀.

Example:

● A researcher claims that students score higher than 515 on a test.
● Sample: n = 40, X̄ = 540, σ = 114.
● Compute z and compare to z-critical at α = 0.05.
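
The example can be worked through in Python as a rough check. This sketch assumes a right-tailed test (H₁: μ > 515), since the claim is that students score higher; the function name `z_test_mean` is made up for illustration, and `statistics.NormalDist` from the standard library stands in for the z-table.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n):
    """z statistic for a population mean when sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))

alpha = 0.05
z = z_test_mean(x_bar=540, mu0=515, sigma=114, n=40)

# Right-tailed test (assumed): reject H0 when z exceeds the critical value.
z_critical = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)      # area to the right of z
reject_h0 = z > z_critical

print(f"z = {z:.2f}, z-critical = {z_critical:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Under these assumptions z is about 1.39, which is below the one-tailed critical value of about 1.645, so H₀ is not rejected. Both the critical value approach and the p-value approach lead to the same decision.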

JASP Output Interpretation

● JASP is statistical software used for hypothesis testing.
● The output includes:
○ Test statistic (z-value)
○ P-value
○ Confidence intervals
○ Decision: Reject or fail to reject H₀

Example JASP Output Interpretation:

● Computed z-value = 1.
● Critical z-value at α = 0.05 (one-tailed) = 1.
● P-value = 0.
● Decision: Fail to reject H₀ (not enough evidence to support the claim).

T-Test for Population Mean (σ Unknown)

● Used when σ is unknown.

● Uses sample standard deviation (s).

● When n < 30, the sample must come from a normally distributed population.

Formula:

$t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$

Where s = sample standard deviation.

● Uses Student’s t-distribution with df = n - 1.

● Critical values obtained from t-table.

Example:

● A researcher wants to test if the mean salary of a sample differs from $59,.
● A t-test is used since the population σ is unknown.

Choosing Between Z-Test and T-Test

Situation | Test to Use
σ is known, n ≥ 30 | Z-Test
σ is unknown, n < 30 | T-Test
σ is unknown, n ≥ 30 (CLT applies) |

One-Tailed vs. Two-Tailed Tests

● One-tailed: Tests for direction (e.g., greater than, less than).
● Two-tailed: Tests for difference (e.g., not equal to).

Hypothesis | Type of Test | Rejection Region
H₁: μ > μ₀ | Right-tailed | z > z-critical

Statistical tests help determine if there is enough evidence to support or reject a claim.

Types of Hypotheses

  1. Null Hypothesis (H₀)

○ States there is no significant difference between the sample and the population.
○ Contains “=”, “≤”, or “≥” symbols.
○ Example: H₀: μ = 5,320 (The mean household electricity expense has not changed.)

  2. Alternative Hypothesis (H₁)

○ Suggests a difference or effect exists.
○ Uses “≠”, “>”, or “<”.
○ Example: H₁: μ ≠ 5,320 (The mean electricity expense has changed.)

Steps in Hypothesis Testing

  1. State H₀ and H₁
  2. Collect sample data
  3. Compute the test statistic (t-value)
  4. Compare the test statistic to the critical value OR compute the p-value
  5. Make a decision (Reject or fail to reject H₀) and conclude.

Errors in Hypothesis Testing

● Type I Error (α): Rejecting H₀ when it is true.
● Type II Error (β): Failing to reject H₀ when it is false.

T-Test for Population Mean (σ Unknown)

● Used when population standard deviation (σ) is unknown.

● Uses sample standard deviation (s).

● Requires Student’s t-distribution.

Formula:

$t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$

Where:

○ X̄ = sample mean
○ μ = population mean (hypothesized value)
○ s = sample standard deviation
○ n = sample size

Decision Making:

● Critical Value Approach: Compare t-computed to t-critical (from t-table).
● P-Value Approach: If p-value < α, reject H₀.
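
A minimal sketch of the critical value approach for a two-tailed t-test follows. The seven monthly expense figures are hypothetical data invented for this illustration; the hypothesized mean of 5,320 is the value used in these notes, and the critical value 2.447 is the t-table entry for df = 6 at α = 0.05 (two-tailed).

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """t statistic for a population mean when sigma is unknown (uses sample SD)."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

# Hypothetical sample of monthly electricity expenses (PhP), n = 7.
expenses = [5900, 6400, 6100, 7000, 6600, 5800, 6300]
mu0 = 5320                      # hypothesized mean under H0

t = t_statistic(expenses, mu0)
df = len(expenses) - 1          # degrees of freedom = n - 1
t_critical = 2.447              # t-table: df = 6, alpha = 0.05, two-tailed

# Two-tailed test: reject H0 when |t| exceeds the critical value.
reject_h0 = abs(t) > t_critical
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

The critical value is looked up by hand here to mirror the t-table workflow; software such as JASP reports the p-value directly instead.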

Example:

● A researcher tests if household electricity expenses have changed from PhP 5,320.
● Sample: n = 35, X̄ = 6,480, s = 1,.
● Compute t and compare to t-critical at α = 0.05.

Properties of the Student’s T-Distribution

● Similarities to Z-Distribution:
○ Bell-shaped, symmetric.
○ Mean, median, and mode = 0.
● Differences:
○ Variance > 1.
○ Based on degrees of freedom (df = n - 1).
○ Approaches normal distribution as n increases.

JASP Output Interpretation

● JASP is statistical software for hypothesis testing.
● The output includes:
○ Test statistic (t-value)
○ P-value
○ Confidence intervals
○ Decision: Reject or fail to reject H₀

Example JASP Output Interpretation:

● Computed t-value = 3.
● Degrees of freedom (df) = 6
● P-value < 0.
● Decision: Reject H₀ (there is significant evidence to support the claim).

One-Tailed vs. Two-Tailed Tests

● One-tailed: Tests for direction (e.g., greater than, less than).
● Two-tailed: Tests for difference (e.g., not equal to).

Hypothesis | Type of Test | Rejection Region
H₁: μ > μ₀ | Right-tailed | t > t-critical
H₁: μ < μ₀ | Left-tailed | t < −t-critical

● 59% of consumers purchase gifts for their fathers.
● 50.3% of businessmen own stocks and mutual funds.
● 55% of Filipinos buy generic products.
● 40% of Filipino families eat out once a week.

2. Conditions for a Valid Z-Test for Proportions

Before using the z-test, the following conditions must be met:

  1. The sample is a random sample.
  2. The sample size must be large enough, satisfying:
○ $n \cdot p \ge 5$
○ $n \cdot (1 - p) \ge 5$
  3. Sampled values are independent of each other.

3. Formula for the Test Statistic

The test statistic for a z-test for a population proportion is:

$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1 - p)}{n}}}$

Where:

● p = population proportion
● p̂ = sample proportion
● n = sample size
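
The sample-size conditions and the test statistic can be combined in one small helper. This is a sketch, not a reference implementation: the function names are invented for illustration, the hypothesized proportion 0.40 echoes the "40% of Filipino families eat out once a week" claim, and the sample of 230 out of 500 families is hypothetical.

```python
from math import sqrt

def conditions_met(n, p0):
    """Large-sample check: both n*p and n*(1-p) must be at least 5."""
    return n * p0 >= 5 and n * (1 - p0) >= 5

def z_test_proportion(successes, n, p0):
    """z statistic for a population proportion, after checking sample size."""
    if not conditions_met(n, p0):
        raise ValueError("sample too small: need n*p >= 5 and n*(1-p) >= 5")
    p_hat = successes / n                       # sample proportion
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Hypothetical survey: 230 of 500 families eat out once a week; H0: p = 0.40.
z = z_test_proportion(successes=230, n=500, p0=0.40)
print(f"z = {z:.4f}")
```

The randomness and independence conditions cannot be checked from the numbers alone; they depend on how the sample was collected.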

4. Hypothesis Testing Approaches

A. Traditional Method

  1. State the hypotheses:
○ Null hypothesis (H₀): p = p₀
○ Alternative hypothesis (H₁): p ≠ p₀, p > p₀, or p < p₀
  2. Compute the test statistic using the formula above.
  3. Find the critical value from the z-table (based on the chosen significance level α).
  4. Make a decision:
○ If the computed z-value falls in the rejection region, reject H₀.
  5. Summarize the conclusion.

B. P-Value Approach

  1. State the hypotheses (same as above).
  2. Compute the test statistic.
  3. Find the p-value from the z-table:
○ Two-tailed: Multiply the tail area by 2.
○ Left-tailed: p-value is the area to the left of z.
○ Right-tailed: p-value is the area to the right of z.
  4. Compare the p-value with α:
○ If p-value < α, reject H₀.
  5. Summarize the conclusion.

5. Examples

Example 1: Traditional Method

A survey in 2018 showed that 41% of families ate dinner together every night. A recent study of 1,127 families found that 405 had dinner together every night. At a 0.05 significance level, has the proportion decreased?

● Given:
○ p = 0.41, α = 0.05
○ p̂ = 405 / 1,127 = 0.3594
● Compute z-value:

○ $z = \dfrac{0.3594 - 0.41}{\sqrt{\dfrac{0.41(1 - 0.41)}{1{,}127}}} \approx -3.46$

● Compare with Critical Value (−1.645)
○ Since −3.46 < −1.645, reject H₀.
○ Conclusion: The proportion of families eating dinner together every night has decreased.
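
Example 1 can be reproduced with a few lines of Python as a check on the arithmetic. The sketch assumes a left-tailed test (H₁: p < 0.41) and uses `statistics.NormalDist` in place of the z-table; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.41            # hypothesized proportion from the 2018 survey
n = 1127             # families in the recent study
successes = 405      # families that had dinner together every night
alpha = 0.05

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Left-tailed test: the rejection region lies below the critical value.
z_critical = NormalDist().inv_cdf(alpha)
reject_h0 = z < z_critical

print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, z-critical = {z_critical:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these inputs z comes out near −3.46, well inside the rejection region below −1.645.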

Example 2: P-Value Approach

Twenty years ago, 76% of Americans preferred American cars. A new survey of 56 people found that 38 still preferred American cars. At a 0.01 significance level, is this proportion different?

● Given:
○ p = 0.76, α = 0.01
○ p̂ = 38 / 56 = 0.6786
● Compute z-value:

○ $z = \dfrac{0.6786 - 0.76}{\sqrt{\dfrac{0.76(1 - 0.76)}{56}}} = -1.43$

● Find p-value:
○ From the z-table, p = 0.
○ Since it's a two-tailed test, multiply by 2 → p = 0.
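
The p-value approach for Example 2 can be sketched the same way. This is an illustration only: `statistics.NormalDist` replaces the z-table lookup, and because the code uses the unrounded z rather than −1.43, its p-value may differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.76            # proportion from 20 years ago
n = 56               # people surveyed
successes = 38       # still prefer American cars
alpha = 0.01

p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed test: take the area in one tail beyond |z| and multiply by 2.
one_tail_area = NormalDist().cdf(-abs(z))
p_value = 2 * one_tail_area
reject_h0 = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Under these assumptions the two-tailed p-value is roughly 0.15, which is larger than α = 0.01, so H₀ is not rejected.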