Inferential Statistics
Inferential Statistics covers parametric and non-parametric methods for hypothesis testing and data-driven decision-making. It addresses estimation, t-tests, ANOVA, and corresponding non-parametric approaches, accommodating a variety of data scenarios, including those that deviate from normality assumptions. Python-based techniques are used throughout, facilitating reliable analyses and effective communication of statistical results.
Introduction to Statistical Inference
Learning Objectives
Explain foundational statistical inference terminology (parameter, estimator, estimate) and differentiate point vs. interval estimation to build core inferential skills.
Indicative Content
Definition and importance of statistical inference
Key terms: Variable, Population, Sample, Statistical Distribution, Factor, Descriptive Statistics
Parameter vs. estimator vs. estimate
Confidence intervals with scipy.stats.t.interval()
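As an illustration of point versus interval estimation, the following minimal sketch computes a sample mean (the point estimate) and a 95% confidence interval around it. The data values are hypothetical and chosen only for demonstration.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of eight measurements (illustrative values only).
sample = np.array([4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0])

n = sample.size
mean = sample.mean()      # point estimate of the population mean
sem = stats.sem(sample)   # standard error of the mean (uses ddof=1)

# 95% interval estimate. The confidence level and degrees of freedom are
# passed positionally because the keyword name of the first argument
# differs between SciPy versions.
ci_low, ci_high = stats.t.interval(0.95, n - 1, loc=mean, scale=sem)

print(f"point estimate: {mean:.3f}")
print(f"95% CI: ({ci_low:.3f}, {ci_high:.3f})")
```

Here the sample mean is the estimate, the rule "average the sample" is the estimator, and the unknown population mean is the parameter.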
Hypothesis Testing
Learning Objectives
Construct null and alternative hypotheses and evaluate test statistics, errors, and p-values to guide data-driven decisions.
Indicative Content
Null Hypothesis (H₀) vs. Alternative Hypothesis (H₁)
Test statistic and rejection region
Errors in hypothesis testing (Type I and II)
p-value concept and interpretation
scipy.stats.ttest_1samp() for hypothesis testing
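The pieces listed above (test statistic, rejection region, p-value) can be sketched by hand and compared against SciPy. This is an illustrative example with hypothetical data, a hypothesized mean of 5.0, and a significance level of 0.05, all of which are assumptions made for the demonstration.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements; H0: population mean = 5.0, H1: mean != 5.0.
sample = np.array([5.4, 5.1, 5.6, 5.3, 4.9, 5.5, 5.2, 5.7, 5.0, 5.4])
mu0 = 5.0
alpha = 0.05   # accepted probability of a Type I error

n = sample.size
df = n - 1

# Test statistic: distance of the sample mean from mu0 in standard errors.
t_stat = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))

# Two-sided rejection region: |t| greater than the critical value.
t_crit = stats.t.ppf(1 - alpha / 2, df)
reject_h0 = abs(t_stat) > t_crit

# Two-sided p-value: probability of a statistic at least this extreme under H0.
p_value = 2 * stats.t.sf(abs(t_stat), df)

print(f"t = {t_stat:.3f}, critical value = {t_crit:.3f}, p = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The rejection-region decision and the p-value decision always agree: the statistic falls in the rejection region exactly when the p-value is below alpha.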
Normality Assessment
Learning Objectives
Apply graphical (Q-Q plots) and numerical tests (Shapiro-Wilk, Kolmogorov-Smirnov) to determine if data meet normality assumptions for valid parametric testing.
Indicative Content
Q-Q Plot: scipy.stats.probplot()
Shapiro-Wilk Test: scipy.stats.shapiro()
Kolmogorov-Smirnov Test: scipy.stats.kstest()
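The three checks can be run side by side, as in the sketch below. The data are simulated from a normal distribution purely for illustration, so the seed, mean, and spread are assumptions. Note that feeding estimated parameters into the Kolmogorov-Smirnov test, as done here, tends to make its p-value conservative.

```python
import numpy as np
from scipy import stats

# Simulated data (hypothetical): 200 draws from a normal distribution.
rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=5, size=200)

# Q-Q plot quantities without drawing: theoretical vs. ordered sample values,
# plus the fitted line. r close to 1 suggests the points follow the line.
(osm, osr), (slope, intercept, r) = stats.probplot(data, dist="norm")

# Shapiro-Wilk: H0 is that the data come from a normal distribution.
shapiro_stat, shapiro_p = stats.shapiro(data)

# Kolmogorov-Smirnov against a normal with the sample mean and std.
ks_stat, ks_p = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))

print(f"Q-Q correlation r = {r:.4f}")
print(f"Shapiro-Wilk p = {shapiro_p:.4f}")
print(f"Kolmogorov-Smirnov p = {ks_p:.4f}")
```

To draw the Q-Q plot itself, a matplotlib axes object can be passed through the plot argument of probplot.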
T-Distribution and Degrees of Freedom
Learning Objectives
Recognize the properties of the t-distribution, compare it to the normal distribution, and articulate how degrees of freedom impact parametric tests.
Indicative Content
Fundamentals of the t-distribution
Degrees of freedom (df)
Computing densities with scipy.stats.t.pdf() and probabilities with scipy.stats.t.cdf()
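To see how degrees of freedom shape the t-distribution, the following sketch compares upper-tail probabilities beyond 2.0 for several df values with the same tail of the standard normal. The cut-off of 2.0 and the chosen df values are arbitrary illustrations.

```python
from scipy import stats

x = 2.0
dfs = (2, 10, 30, 1000)

# Density at x and upper-tail probability P(T > x) for each df.
t_tails = []
for df in dfs:
    density = stats.t.pdf(x, df)   # height of the curve, not a probability
    tail = stats.t.sf(x, df)       # survival function = 1 - cdf
    t_tails.append(tail)
    print(f"df={df:>4}: pdf={density:.4f}, P(T > {x}) = {tail:.4f}")

normal_tail = stats.norm.sf(x)
print(f"normal : P(Z > {x}) = {normal_tail:.4f}")
```

The tails shrink as df grows and approach the normal tail, which is why the t-distribution matters most for small samples.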
One-Sample T-Test
Learning Objectives
Conduct and interpret a one-sample t-test to compare a sample mean with a hypothesized population mean.
Indicative Content
Practical usage of one-sample t-tests
scipy.stats.ttest_1samp()
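A minimal usage sketch follows. The scenario (battery lifetimes compared with a claimed mean of 100 hours) and all values are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical battery lifetimes in hours; the claimed mean is 100.
sample = np.array([98.2, 101.5, 97.8, 99.1, 96.4, 100.3, 97.5, 98.9, 99.6, 97.2])
mu0 = 100.0

# Two-sided one-sample t-test of H0: population mean = mu0.
result = stats.ttest_1samp(sample, popmean=mu0)

print(f"sample mean = {sample.mean():.2f}")
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

A small p-value suggests the sample mean differs from the hypothesized value by more than sampling variation alone would explain.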
Independent Samples T-Test
Learning Objectives
Evaluate mean differences between two groups using independent samples t-tests, ensuring normality and equal variance assumptions are verified.
Indicative Content
Key assumptions (random sampling, normality, equal variance)
scipy.stats.ttest_ind() for mean comparisons
Checking variance equality with scipy.stats.levene()
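One possible workflow, sketched below with hypothetical scores for two groups, checks variance equality first and then lets that result decide between the pooled-variance test and Welch's version. The 0.05 threshold for Levene's test is an assumption for illustration, not a fixed rule.

```python
import numpy as np
from scipy import stats

# Hypothetical scores for two independent groups.
group_a = np.array([72, 75, 78, 71, 74, 77, 73, 76])
group_b = np.array([68, 70, 74, 66, 71, 69, 72, 67])

# Levene's test: H0 is that both groups have equal variances.
levene_stat, levene_p = stats.levene(group_a, group_b)
equal_var = bool(levene_p > 0.05)

# equal_var=True gives the pooled-variance t-test; False gives Welch's t-test.
result = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

print(f"Levene p = {levene_p:.4f}, assume equal variances: {equal_var}")
print(f"t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```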
Paired T-Test
Learning Objectives
Analyze mean differences for paired data (same subjects, two conditions/time points) using paired t-tests.
Indicative Content
Considerations for paired data
scipy.stats.ttest_rel()
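The sketch below uses hypothetical before/after measurements on the same eight subjects. It also shows that a paired t-test is equivalent to a one-sample t-test on the within-subject differences.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements on the same subjects at two time points.
before = np.array([82.0, 75.5, 90.2, 68.4, 77.9, 85.1, 79.3, 88.0])
after = np.array([80.1, 74.0, 87.5, 68.9, 75.2, 83.0, 78.1, 85.4])

# Paired t-test: H0 is that the mean within-subject difference is zero.
paired = stats.ttest_rel(before, after)

# Equivalent formulation: one-sample t-test on the differences.
on_diffs = stats.ttest_1samp(before - after, popmean=0.0)

print(f"paired:      t = {paired.statistic:.3f}, p = {paired.pvalue:.4f}")
print(f"differences: t = {on_diffs.statistic:.3f}, p = {on_diffs.pvalue:.4f}")
```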
T-Test for Correlation
Learning Objectives
Assess the statistical significance of correlation between two continuous variables with a t-test approach.
Indicative Content
Pearson correlation analysis
scipy.stats.pearsonr()
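The link between Pearson's r and the t-distribution can be sketched as follows: with n pairs, t = r * sqrt((n - 2) / (1 - r^2)) follows a t-distribution with n - 2 degrees of freedom under the null hypothesis of no correlation. The data are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical paired observations (e.g. hours studied and exam score).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([52.0, 55.0, 61.0, 58.0, 66.0, 70.0, 68.0, 75.0])

r, p_value = stats.pearsonr(x, y)

# Manual t-test for H0: population correlation = 0.
n = x.size
t_stat = r * np.sqrt((n - 2) / (1 - r**2))
p_manual = 2 * stats.t.sf(abs(t_stat), n - 2)

print(f"r = {r:.3f}, t = {t_stat:.3f}")
print(f"p (pearsonr) = {p_value:.5f}, p (manual t-test) = {p_manual:.5f}")
```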
F-Test for Equality of Variances
Learning Objectives
Determine variance equality between two populations, a prerequisite for many parametric methods.
Indicative Content
Concept of F-tests
Variance-ratio F statistic with p-values from the scipy.stats.f distribution
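A classical two-sample F-test for equal variances can be built directly from the sample variances and the F distribution in scipy.stats.f, as in this minimal sketch. The data are hypothetical, and the two-sided p-value is formed by doubling the smaller tail, which is one common convention. The test is sensitive to non-normality.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements from two populations.
a = np.array([10.2, 11.5, 9.8, 12.1, 10.9, 11.3, 9.5, 12.4])
b = np.array([10.8, 11.0, 10.6, 11.2, 10.9, 11.1, 10.7, 11.3])

# F statistic: ratio of the two sample variances (ddof=1).
var_a = a.var(ddof=1)
var_b = b.var(ddof=1)
f_stat = var_a / var_b
dfn, dfd = a.size - 1, b.size - 1

# Two-sided p-value: double the smaller tail, capped at 1.
lower = stats.f.cdf(f_stat, dfn, dfd)
upper = stats.f.sf(f_stat, dfn, dfd)
p_value = min(2 * min(lower, upper), 1.0)

print(f"F = {f_stat:.3f} on ({dfn}, {dfd}) df, p = {p_value:.4f}")
```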
One-Way ANOVA
Learning Objectives
Compare three or more group means under normality and homogeneity of variances using one-way ANOVA.
Indicative Content
Concept and assumptions of one-way ANOVA
scipy.stats.f_oneway()
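A minimal usage sketch with three hypothetical groups follows. The values are invented so that the group means are clearly separated.

```python
import numpy as np
from scipy import stats

# Hypothetical response values for three independent groups.
group_1 = np.array([20, 21, 19, 22, 20])
group_2 = np.array([25, 27, 26, 24, 26])
group_3 = np.array([30, 29, 31, 32, 30])

# One-way ANOVA: H0 is that all group means are equal.
result = stats.f_oneway(group_1, group_2, group_3)

print(f"F = {result.statistic:.2f}, p = {result.pvalue:.6f}")
```

A significant F statistic indicates that at least one mean differs, but not which one; identifying the differing groups requires a follow-up comparison.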
Two-Way ANOVA
Learning Objectives
Incorporate two independent variables into ANOVA, analyzing main and interaction effects on group means.
Indicative Content
Main vs. interaction effects
statsmodels.formula.api.ols() and statsmodels.api.stats.anova_lm()
Multi-Way ANOVA
Learning Objectives
Extend ANOVA to three or more independent variables, exploring multifaceted factor interactions.
Indicative Content
Multi-factor study designs
statsmodels.formula.api.ols() for multi-way ANOVA
Introduction to Non-Parametric Tests
Learning Objectives
Explain the rationale for non-parametric tests when parametric assumptions fail and weigh the trade-offs between these approaches.
Indicative Content
Definition of distribution-free methods
Conditions under which parametric assumptions break down
Power considerations in parametric vs. non-parametric tests
Relevant Python equivalents (replacing R functions)
Mann-Whitney U Test (Two Independent Groups)
Learning Objectives
Apply rank-based methods to compare two independent groups under non-normal or ordinal data conditions.
Indicative Content
Ranking combined samples and calculating sum of ranks
Computing the U test statistic
scipy.stats.mannwhitneyu() for implementation in Python
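The ranking steps above can be sketched by hand and checked against SciPy. The two samples are hypothetical and contain no ties, which keeps the manual calculation simple.

```python
import numpy as np
from scipy import stats

# Hypothetical observations from two independent groups (no ties).
x = np.array([12.1, 14.3, 11.8, 15.2, 13.4, 12.9])
y = np.array([16.5, 15.8, 17.2, 14.9, 18.1, 16.0])
n1, n2 = x.size, y.size

# Rank the combined sample, then sum the ranks belonging to x.
ranks = stats.rankdata(np.concatenate([x, y]))
r1 = ranks[:n1].sum()

# U statistics for each group; they always sum to n1 * n2.
u1 = r1 - n1 * (n1 + 1) / 2
u2 = n1 * n2 - u1

result = stats.mannwhitneyu(x, y, alternative="two-sided")

print(f"rank sum of x = {r1}, U1 = {u1}, U2 = {u2}")
print(f"scipy: U = {result.statistic}, p = {result.pvalue:.4f}")
```

Which of the two U values SciPy reports has depended on the library version, so the comparison is best made against both.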
Wilcoxon Signed Rank Test (Paired Samples)
Learning Objectives
Evaluate differences in paired observations using rank-based methods for non-normal data.
Indicative Content
Computing differences between paired observations
Ranking absolute differences and deriving the W statistic
scipy.stats.wilcoxon() for performing the test in Python
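The steps above can be traced in a short sketch. The paired values are hypothetical and were chosen so that no difference is zero and no two absolute differences tie.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements on the same eight subjects.
before = np.array([125, 130, 118, 140, 135, 128, 132, 121])
after = np.array([118, 126, 119, 131, 129, 125, 120, 116])

# Step 1: differences between paired observations.
diffs = before - after

# Step 2: rank the absolute differences.
ranks = stats.rankdata(np.abs(diffs))

# Step 3: sum ranks by sign; for a two-sided test W is the smaller sum.
w_plus = ranks[diffs > 0].sum()
w_minus = ranks[diffs < 0].sum()
w = min(w_plus, w_minus)

result = stats.wilcoxon(before, after)

print(f"W+ = {w_plus}, W- = {w_minus}, W = {w}")
print(f"scipy: W = {result.statistic}, p = {result.pvalue:.4f}")
```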
Kruskal-Wallis Test (Multiple Independent Groups)
Learning Objectives
Compare three or more independent groups using a rank-based approach to detect distribution differences without normality assumptions.
Indicative Content
Ranking combined groups and deriving sum of ranks
Calculating the H test statistic
scipy.stats.kruskal() for multi-group comparisons
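For data without ties, the H statistic is H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), where R_i is the rank sum of group i and N the total sample size. The sketch below applies this to three hypothetical tie-free groups and compares the result with SciPy.

```python
import numpy as np
from scipy import stats

# Hypothetical observations for three independent groups (no ties).
groups = [
    np.array([6.1, 7.3, 5.8, 6.9]),
    np.array([8.2, 9.1, 7.8, 8.7]),
    np.array([10.4, 9.9, 11.2, 10.8]),
]

# Rank all observations together, then split the ranks back by group.
sizes = [g.size for g in groups]
ranks = stats.rankdata(np.concatenate(groups))
rank_groups = np.split(ranks, np.cumsum(sizes)[:-1])

# H statistic from the per-group rank sums.
n_total = sum(sizes)
rank_term = sum(r.sum() ** 2 / r.size for r in rank_groups)
h_manual = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

result = stats.kruskal(*groups)

print(f"manual H = {h_manual:.4f}")
print(f"scipy  H = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```

When ties are present SciPy applies a correction, so the manual formula would no longer match exactly.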
Chi-Square Test for Independence (Categorical Data)
Learning Objectives
Determine the relationship between two categorical variables by comparing observed vs. expected frequencies and testing for independence.
Indicative Content
Constructing a contingency table
Computing expected frequencies and the χ² statistic
pandas.crosstab() for contingency tables
scipy.stats.chi2_contingency() for the Chi-Square test in Python
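The workflow above can be sketched end to end. The survey data (gender and drink preference) are hypothetical. For 2x2 tables chi2_contingency applies Yates' continuity correction by default.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical survey of 60 respondents.
survey = pd.DataFrame({
    "gender": ["F"] * 30 + ["M"] * 30,
    "preference": ["tea"] * 18 + ["coffee"] * 12 + ["tea"] * 10 + ["coffee"] * 20,
})

# Contingency table of observed counts (rows: gender, columns: preference).
observed = pd.crosstab(survey["gender"], survey["preference"])

# Expected counts under independence: row total * column total / grand total.
chi2, p_value, dof, expected = stats.chi2_contingency(observed)

print(observed)
print("expected counts:")
print(expected)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
```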
Tools and Methodologies
Python Data Environment
pandas for data manipulation and constructing contingency tables (e.g., pandas.crosstab())
numpy for foundational numeric operations (array management, mathematical functions)
Statistical Testing and Analysis
scipy.stats for parametric tests (t-tests, F-tests, ANOVA, normality checks) and non-parametric tests (Mann-Whitney U, Wilcoxon, Kruskal-Wallis, Chi-Square)
statsmodels.formula.api.ols() and statsmodels.api.stats.anova_lm() for one-way, two-way, and multi-way ANOVA analyses