Mathematics & Statistics, Study notes of Mathematical Statistics

Mathematical Statistics notes and mini self made study guide

Typology: Study notes

2025/2026

Available from 06/04/2026

tk-chauke
tk-chauke 🇺🇸



UNIVERSITY TEXTBOOK SERIES

Quantitative Analysis & Document Engineering

A Unified Framework for Structural Modeling, Inferential Statistics, and Modern Typesetting

Chapter 1: Foundations of Continuous Probability & Sampling Theory

1.1 The Transition to Continuous Frameworks

In introductory mathematics, probability spaces are usually discrete and countable, such as the permutations of a card deck or the finite outcomes of rolling dice. At the university level, real-world quantities (e.g., fluid velocities, economic asset lifespans, or other continuous physical measurements) call for continuous random variables. A continuous variable can take any of the uncountably many real values within an interval. Because the probability that such a variable equals any single exact value is zero, probabilities are instead assigned to intervals.

This formulation is achieved by evaluating the definite integral of a Probability Density Function (PDF), denoted as f(x), over the interval:

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

To be a valid density, any continuous model must satisfy two constraints: the function must be non-negative for every real x ( f(x) ≥ 0 ), and the total area under the density curve must equal one:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
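As a minimal sketch of these two ideas, the following checks the unit-area constraint numerically and computes an interval probability. The exponential density, its rate of 0.5, the helper name `integrate`, and the finite upper limit of 60 standing in for infinity are all illustrative assumptions, not part of the notes.

```python
import math

def integrate(f, a, b, steps=100_000):
    """Approximate the definite integral of f over [a, b] with the midpoint rule."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def exponential_pdf(x, rate=0.5):
    """Assumed example density: rate * exp(-rate * x) for x >= 0, zero otherwise."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

# Total area under the density. The upper limit 60 stands in for infinity,
# because the tail beyond it is negligible for this rate.
total_area = integrate(exponential_pdf, 0.0, 60.0)

# P(1 <= X <= 3) is the area under the density between 1 and 3.
prob_1_to_3 = integrate(exponential_pdf, 1.0, 3.0)

print(f"total area     ~ {total_area:.6f}")
print(f"P(1 <= X <= 3) ~ {prob_1_to_3:.6f}")
```

For this density the interval probability can also be checked by hand, since the integral of the exponential density from 1 to 3 is exp(-0.5) - exp(-1.5).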

1.2 Theoretical Distribution Systems

Many random phenomena are well described by a small set of standard distributions:

The Binomial Distribution: A discrete distribution counting successes ( k ) over a fixed number of independent trials ( n ), each with the same success probability ( p ).

The Poisson Distribution: A discrete distribution counting occurrences in a fixed interval of space or time, given an average occurrence rate ( λ ).

The Normal Distribution: A symmetric continuous curve completely defined by its mean ( μ ) and variance ( σ² ).
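The notes name these distributions by their parameters only. A short sketch of their standard probability functions follows; the function names and the example arguments are illustrative choices.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k): k occurrences in an interval with average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

print(binomial_pmf(3, 10, 0.5))   # 3 successes in 10 fair trials
print(poisson_pmf(2, 3.0))        # 2 events when 3 are expected on average
print(normal_pdf(0.0, 0.0, 1.0))  # peak height of the standard normal curve
```

Note that `normal_pdf` takes the standard deviation σ rather than the variance σ², which is a convention chosen here for convenience.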

When data follow a normal distribution, they obey the Empirical Rule (the 68-95-99.7 Rule): approximately 68.27% of observations fall within one standard deviation of the mean ( μ ± 1σ ), 95.45% within two standard deviations ( μ ± 2σ ), and 99.73% within three standard deviations ( μ ± 3σ ).

Figure 1.1: Standard Gaussian distribution highlighting primary standard deviation limits.
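The percentages in the Empirical Rule can be reproduced from the error function, since for any normal distribution P(|X - μ| ≤ kσ) = erf(k / √2). A brief sketch (the function name `prob_within` is an illustrative choice):

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k) * 100:.2f}%")
# within 1 standard deviation(s): 68.27%
# within 2 standard deviation(s): 95.45%
# within 3 standard deviation(s): 99.73%
```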

1.3 The Central Limit Theorem (CLT)

The Central Limit Theorem underpins inferential statistics. It states that if independent random samples of size n are drawn from any population with finite variance, whether that population is highly skewed, uniform, or otherwise asymmetric, the sampling distribution of the sample mean ( X̄ ) approaches a normal distribution as n increases. A common rule of thumb treats n ≥ 30 as large enough, although heavily skewed populations may need more.

The expected value of this sampling distribution equals the population mean ( μX̄ = μ ), while its spread narrows as the sample size grows. This variability is captured by the Standard Error of the Mean (SE):

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
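A small simulation can illustrate both claims. The sketch below assumes a skewed exponential population with mean 1 and standard deviation 1, a sample size of 30, and 20,000 repeated samples; all of these values, and the fixed random seed, are illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

RATE = 1.0            # exponential population: mean 1/RATE, standard deviation 1/RATE
SAMPLE_SIZE = 30      # n, the size of each sample
NUM_SAMPLES = 20_000  # how many sample means to collect

def sample_mean(n):
    """Draw n values from the skewed population and return their mean."""
    return statistics.fmean(random.expovariate(RATE) for _ in range(n))

means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

observed_mean = statistics.fmean(means)          # should sit near the population mean
observed_se = statistics.stdev(means)            # spread of the sample means
theoretical_se = (1 / RATE) / SAMPLE_SIZE**0.5   # sigma / sqrt(n)

print(f"mean of sample means: {observed_mean:.4f}")
print(f"observed SE:          {observed_se:.4f}")
print(f"sigma / sqrt(n):      {theoretical_se:.4f}")
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the underlying population is strongly right-skewed.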


$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
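Reading O as observed counts and E as expected counts, which is the usual meaning of this statistic, the sum can be computed directly. The die-roll counts below are hypothetical data invented for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 120 rolls of a die, so a fair die predicts 20 per face.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6

stat = chi_square_statistic(observed, expected)
print(f"chi-square statistic: {stat:.2f}")
```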

Chapter 3: Linear Regression & Diagnostic Modeling

3.1 Ordinary Least Squares (OLS) Optimization

Regression modeling describes how a continuous response ( Y ) depends on an independent variable ( X ). The population relationship is modeled as the line:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

The sample estimates ( ŷ = b₀ + b₁x ) are calculated by Ordinary Least Squares (OLS). OLS minimizes the Sum of Squared Residuals (SSR), where each residual is the vertical distance between an observed point and the fitted line:

$$\text{minimize} \quad \sum_i e_i^2 = \sum_i (y_i - \hat{y}_i)^2$$
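For a single predictor, the minimization has a well-known closed-form solution: b₁ = Σ(xᵢ - x̄)(yᵢ - ȳ) / Σ(xᵢ - x̄)² and b₀ = ȳ - b₁x̄. A minimal sketch follows; the function name `ols_fit` and the five data points are hypothetical.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def sum_squared_residuals(xs, ys, b0, b1):
    """SSR for a candidate line y = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data lying close to the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(xs, ys)
ssr = sum_squared_residuals(xs, ys, b0, b1)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSR = {ssr:.4f}")
```

Any other choice of intercept or slope gives a larger SSR on the same data, which is what "least squares" means in practice.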

3.2 Core Modeling Assumptions and Diagnostic Metrics

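For OLS estimators to be the Best Linear Unbiased Estimators (BLUE) and for the usual inference to be valid, the residual errors should satisfy four criteria. Strictly, normality is not needed for the BLUE property (the Gauss-Markov theorem); it is what justifies exact t and F inference in small samples.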

| Assumption | Theoretical Requirement | Diagnostic Check |
|---|---|---|
| Linearity | The relationship between the variables must be linear. | Residuals vs. fitted scatter with no systematic pattern. |
| Homoscedasticity | Residual error variance must remain uniform across all levels. | Absence of funnel shapes in residual plots. |
| Independence | Errors must remain uncorrelated across observations. | Durbin-Watson statistic close to 2.0. |
| Normality | Residuals must follow a normal distribution. | Points aligned along the diagonal reference line of a Q-Q plot. |
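Of these diagnostics, the Durbin-Watson statistic is the simplest to compute by hand: it is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The sketch below uses two hypothetical residual sequences to show values far from 2.0.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 suggests no first-order autocorrelation."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e**2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that flip sign every step (negative autocorrelation).
alternating = [1, -1, 1, -1]
# Hypothetical residuals that stay on one side, then the other (positive autocorrelation).
drifting = [1, 1, 1, -1, -1, -1]

print(durbin_watson(alternating))  # well above 2
print(durbin_watson(drifting))     # well below 2
```

The statistic ranges from 0 to 4, so values near either end are a warning that the independence assumption may not hold.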

Chapter 4: The Architecture of Academic Document Layout

4.1 Page Geometry & The Typographic Grid

In academic text design, visual presentation affects how credible the data appear. Every element follows strict geometric layout constraints. Pages use asymmetric but balanced margins, typically in a 2:3:4:6 ratio (inside, top, outside, bottom). Inside margins include an additional gutter of 5 mm to 10 mm to compensate for the curvature of the bound page, so the text near the spine stays flat and readable.

Line length is limited to between 45 and 75 characters. Longer lines tire the reader's eyes as they track across the page, while shorter lines break up the reading rhythm.
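The margin ratio and the line-length limits in this section reduce to simple arithmetic. The sketch below assumes an A4 page (210 mm by 297 mm) and a base unit of 10 mm per ratio part; both values and all function names are illustrative assumptions, and the gutter allowance is left out for brevity.

```python
def margins_from_ratio(unit_mm, ratio=(2, 3, 4, 6)):
    """Scale the inside:top:outside:bottom ratio by a base unit in millimetres."""
    inside, top, outside, bottom = (part * unit_mm for part in ratio)
    return {"inside": inside, "top": top, "outside": outside, "bottom": bottom}

def text_block(page_width_mm, page_height_mm, margins):
    """Width and height left for text once the margins are subtracted."""
    width = page_width_mm - margins["inside"] - margins["outside"]
    height = page_height_mm - margins["top"] - margins["bottom"]
    return width, height

def line_length_ok(chars_per_line, low=45, high=75):
    """True when a line falls inside the 45 to 75 character range."""
    return low <= chars_per_line <= high

margins = margins_from_ratio(unit_mm=10)          # assumed base unit
block_width, block_height = text_block(210, 297, margins)  # assumed A4 page

print(margins)
print(f"text block: {block_width} mm x {block_height} mm")
print(line_length_ok(66), line_length_ok(90))
```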

4.2 The Three-Line Rule for Data Presentation

Academic tables discard heavy black vertical borders and alternating colored row fills. In accordance with professional styling standards, data organization relies exclusively on three horizontal lines: a top boundary line opening the table frame, a secondary header separator line isolating title cells, and a single bottom baseline rule closing the data matrix. Furthermore, academic layouts dictate that table captions must always be positioned above the data grid, while figure captions are placed below the visual asset frame.
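To make the rule concrete, here is a small sketch that renders a plain-text table with exactly three horizontal rules and the caption placed above. The function name `three_line_table` and the sample rows are hypothetical.

```python
def three_line_table(caption, header, rows):
    """Render a plain-text table: caption, top rule, header, mid rule, rows, bottom rule."""
    table = [header] + rows
    # Each column is as wide as its longest cell.
    widths = [max(len(str(row[i])) for row in table) for i in range(len(header))]

    def fmt(row):
        return "  ".join(str(cell).ljust(w) for cell, w in zip(row, widths)).rstrip()

    rule = "-" * (sum(widths) + 2 * (len(widths) - 1))
    lines = [caption, rule, fmt(header), rule]
    lines += [fmt(row) for row in rows]
    lines.append(rule)
    return "\n".join(lines)

# Hypothetical data for illustration only.
output = three_line_table(
    "Table 1: Hypothetical sample summary",
    ["Group", "n", "Mean"],
    [["A", 30, 4.2], ["B", 28, 5.1]],
)
print(output)
```

No vertical borders or row fills are produced; the only separators are the three rules.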

Chapter 5: Digital Document Engineering: The LaTeX Paradigm

5.1 Programmatic Layout Compilation

Advanced scientific and quantitative documentation often replaces visual word processors with programmatic markup systems such as LaTeX. Visual editors can distort mathematical formulas or break cross-references when a file moves between operating systems. In programmatic typesetting, the layout is compiled from explicit commands in a plain-text source document, so complex equations render with sub-micrometer positioning accuracy regardless of the local hardware platform.

5.2 Micro-Spacing Mechanics

Programmatic layout engines handle fine typographic details that are easy to get wrong by hand. For example, three consecutive hyphens (---) compile to an em dash (—) for strong parenthetical breaks. The page builder also weighs penalties (in TeX, \clubpenalty and \widowpenalty) to discourage orphans (a paragraph's first line stranded at the bottom of a page) and widows (a paragraph's last line stranded at the top of the next page), keeping the text block consistent across the publication.