
Statistical Modeling for Data Science Applications

University of Colorado Boulder

Statistical Modeling for Data Science Applications is a three-credit sequence in advanced statistical modeling for data science. Learners apply the underlying theory using the R programming language and build a deep understanding of linear regression analysis, ANOVA, experimental design, and generalized linear and additive models.

Throughout the course, learners will explore recommended practices for ethical behavior and communication in statistics and data science, interpret important components of modern regression analysis, and implement testing-based procedures for model selection. They will also gain the ability to identify and interpret two-way ANOVA models and apply the concepts of replication, repeated measures, and full factorial design in the context of experimental design. Additionally, learners will describe how to generalize the linear model framework to accommodate data not suitable for the standard linear regression model, and they will learn about the advantages and disadvantages of generalized additive models.

Certificate Available ✔

Course Modules

This course comprises three modules that cover modern regression analysis in R, ANOVA and experimental design, and generalized linear models and nonparametric regression.

Modern Regression Analysis in R

Modern Regression Analysis in R introduces learners to recommended practices for ethical behavior and communication in statistics and data science. It covers important components of the multiple linear regression (MLR) model, including its “systematic” and “random” components, and testing-based procedures for model selection. Learners will be able to select the “best” model according to a given procedure.

ANOVA and Experimental Design

ANOVA and Experimental Design teaches learners to identify and interpret two-way ANOVA (and ANCOVA) models as linear regression models. Learners will use these models to answer research questions with real data and apply the concepts of replication, repeated measures, and full factorial design in the context of two-way ANOVA.

Generalized Linear Models and Nonparametric Regression

Generalized Linear Models and Nonparametric Regression teaches learners to generalize the linear model framework to accommodate data not suitable for the standard linear regression model. It also covers the advantages and disadvantages of (generalized) additive models and how an additive model can be generalized to incorporate non-normal response variables.

More Probability and Statistics Courses

Causal Inference

Columbia University

Causal Inference is a Master's level course providing a rigorous mathematical survey of inferring causation, offering methods to estimate causal relationships and...

Inferential Statistics

University of Amsterdam

Inferential Statistics introduces students to making inferences from sample relations to population, covering significance testing, statistical tests, and software...

Stability and Capability in Quality Improvement

University of Colorado Boulder

Stability and Capability in Quality Improvement is a comprehensive course focusing on analyzing process stability, statistical control, and capability for quality...

Model Diagnostics and Remedial Measures

Illinois Tech

Model Diagnostics and Remedial Measures is a comprehensive course focusing on detecting and remedying violations of linear regression model assumptions. Participants...