Course

Introduction to Data Science and scikit-learn in Python

LearnQuest

This course, "Introduction to Data Science and scikit-learn in Python," teaches Python and artificial intelligence techniques for hypothesis testing and machine learning. It begins with the fundamentals of Python for data science, then moves on to the core libraries NumPy, pandas, and scikit-learn for exploratory data analysis and machine learning. Learners study the theory and mathematics of linear regression and build a full pipeline for estimating diabetes progression. They also gain hands-on experience using classification models to predict the presence or absence of heart disease from patient health data.

By the end of this course, participants will be able to employ artificial intelligence techniques, test hypotheses, and apply machine learning models using Python, NumPy, pandas, and scikit-learn.

Certificate Available ✔

Course Modules

The course modules move step by step from Python programming for hypothesis testing, through NumPy, pandas, and scikit-learn, to applying machine learning techniques to predict the presence of heart disease.

Introduction to Python Programming for Hypothesis Testing

This module introduces learners to Python programming for hypothesis testing. It covers the basics of Python and Jupyter Notebook, including lists, dictionaries, loops, functions, and libraries. Participants will gain hands-on experience through coding assignments and quizzes, setting the stage for the subsequent modules.
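The building blocks named above (lists, dictionaries, loops, and functions) can be sketched in a few lines. This is a minimal illustration, not material from the course: the group names and heart-rate values are made-up sample data, and `mean` is a hypothetical helper written out by hand to show a loop.

```python
# Hypothetical sample data (not from the course): resting heart rates in
# beats per minute for two made-up groups, stored as a dictionary of lists.
measurements = {
    "control": [72, 75, 71, 78, 74],
    "treatment": [68, 70, 66, 73, 69],
}


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = 0
    for value in values:  # explicit loop to show iteration over a list
        total += value
    return total / len(values)


# Dictionary comprehension: one mean per group.
group_means = {name: mean(values) for name, values in measurements.items()}

# A difference in group means is a typical starting point for a hypothesis test.
difference = group_means["control"] - group_means["treatment"]

print(group_means)
print(f"Difference in means: {difference:.1f}")
```

In practice a library function such as `statistics.mean` would replace the hand-written loop; the explicit version is kept here only to show the language features.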

Creating a Hypothesis: Numpy, Pandas, and Scikit-Learn

In this module, learners take a deep dive into NumPy, pandas, and scikit-learn. They manipulate and join DataFrames, reshape data, and combine datasets. The module also includes a case study on finding outliers, giving practical experience with data analysis and manipulation.
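The operations listed above (joining, reshaping, and outlier detection) might look like the following in pandas. This is a sketch under stated assumptions: the `patients` and `labs` tables are invented sample data, and the 1.5 × IQR rule is one common outlier heuristic, not necessarily the one used in the course's case study.

```python
import pandas as pd

# Hypothetical sample data (not from the course).
patients = pd.DataFrame(
    {"patient_id": [1, 2, 3, 4, 5, 6], "age": [34, 51, 47, 29, 62, 45]}
)
labs = pd.DataFrame(
    {"patient_id": [1, 2, 3, 4, 5, 6], "glucose": [92, 101, 97, 88, 240, 95]}
)

# Join the two DataFrames on their shared key.
merged = patients.merge(labs, on="patient_id", how="inner")

# Reshape from wide to long: one row per (patient, measure) pair.
long_form = merged.melt(
    id_vars="patient_id", var_name="measure", value_name="value"
)

# Flag outliers with the 1.5 * IQR rule (an assumed heuristic).
q1 = merged["glucose"].quantile(0.25)
q3 = merged["glucose"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = merged[(merged["glucose"] < lower) | (merged["glucose"] > upper)]

print(long_form.head())
print(outliers)
```

With this sample, only the glucose reading of 240 falls outside the IQR fences, so a single row is flagged.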

Scikit-Learn Revisited: ML for Hypothesis Testing

This module revisits scikit-learn for machine learning in hypothesis testing. Participants learn to apply linear regression, use train/test splits and cross-validation, and study the math behind machine learning. The module also covers loading and analyzing datasets, building practical skills for applying machine learning models.
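A linear regression workflow with a train/test split and cross-validation can be sketched as follows. The example assumes scikit-learn's bundled diabetes dataset (`load_diabetes`) as a stand-in for the course's diabetes progression data; the split size, fold count, and `random_state` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Assumption: the bundled diabetes dataset stands in for the course data.
X, y = load_diabetes(return_X_y=True)

# Hold out 20% of the rows so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit ordinary least squares on the training portion only.
model = LinearRegression()
model.fit(X_train, y_train)

# R^2 on the held-out test set.
test_r2 = model.score(X_test, y_test)

# 5-fold cross-validation on the training set gives a less
# split-dependent estimate of R^2 than a single train/test split.
cv_scores = cross_val_score(
    LinearRegression(), X_train, y_train, cv=5, scoring="r2"
)

print(f"Test R^2: {test_r2:.3f}")
print(f"Mean cross-validated R^2: {cv_scores.mean():.3f}")
```

Cross-validating on the training portion keeps the test set untouched until the final evaluation, which is the usual reason for combining the two techniques.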

Using Classification to Predict the Presence of Heart Disease

The final module focuses on using classification to predict the presence of heart disease. Working through this prediction task strengthens learners' understanding of how to apply classification models to real-world health data.
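A binary classification workflow of this kind can be sketched as follows. The course's heart disease dataset is not reproduced here, so the example uses synthetic data from `make_classification` as a placeholder; the choice of logistic regression with feature scaling is also an assumption, not a statement of the model the course uses.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data stands in for real patient records.
# Label 1 plays the role of "disease present", 0 of "disease absent".
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=0
)

# A stratified split keeps the class balance similar in both portions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a logistic regression classifier.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)

print(f"Test accuracy: {accuracy:.3f}")
```

For health data, accuracy alone can be misleading when classes are imbalanced; metrics such as precision and recall from `sklearn.metrics` are commonly reported alongside it.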
