In the "Web Applications and Command-Line Tools for Data Engineering" course, you will delve into advanced concepts of Python, Bash, and SQL for data engineering. This course, part of the Python, Bash, and SQL Essentials for Data Engineering Specialization offered by Duke University, provides an in-depth exploration of creating and deploying models for machine learning tasks, constructing Python microservices with FastAPI, and building command-line tools in Python using Click.
Throughout the course, you will learn how to break up your data warehouse into small, portable solutions that can scale using Python microservices, and automate testing and quality control for publishing and sharing your tools with a data registry. The course is designed to equip you with the skills needed to address real-world data engineering challenges effectively.
Certificate Available ✔
The course modules cover leveraging Jupyter notebooks for machine learning tasks, working in cloud-hosted notebook environments such as Colab and SageMaker, constructing Python microservices with FastAPI, and building command-line tools in Python using Click.
In the "Jupyter Notebooks" module, you will learn how to use Jupyter notebooks for machine learning tasks. You will explore the code and text cells in Jupyter, as well as the use of magics and JupyterLab. By the end of this module, you will have a strong foundation in using Jupyter notebooks for data engineering tasks.
The "Cloud-Hosted Notebooks" module focuses on tools like Colab and SageMaker for cloud-hosted notebook environments. You will learn about the features and functionality offered by these platforms, gaining practical experience in using Colab, SageMaker, and Jupyter notebooks to work with data and documents effectively.
The "Python Microservices" module introduces you to the world of Python microservices, emphasizing the benefits and components of microservices. You will learn how to set up project structures, build microservices with FastAPI, and deploy containerized microservices. By the end of this module, you will be equipped to create efficient and scalable Python microservices for data engineering tasks.
In the "Python Packaging and Command Line Tools" module, you will delve into packaging and distributing Python projects, focusing on building command-line tools using frameworks like Click. You will also explore continuous integration for command-line tools, including automation of testing and publishing processes. By the end of this module, you will have a comprehensive understanding of building and managing command-line tools in Python.
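As an illustration of the kind of command-line tool Click makes possible, the sketch below counts the data rows in a delimited file. It is an assumption-laden example rather than course code: the names `count_rows` and `main` and the `--delimiter` option are hypothetical.

```python
import csv

import click


def count_rows(path: str, delimiter: str = ",") -> int:
    """Return the number of data rows in a delimited file, excluding the header."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        next(reader, None)  # skip the header row if the file has one
        return sum(1 for _ in reader)


@click.command()
@click.argument("path", type=click.Path(exists=True, dir_okay=False))
@click.option("--delimiter", default=",", show_default=True,
              help="Field separator used in the file.")
def main(path: str, delimiter: str) -> None:
    """Print the number of data rows in PATH."""
    click.echo(count_rows(path, delimiter))
```

Keeping the logic in a plain function (`count_rows`) and the Click wrapper (`main`) separate makes the tool easier to test automatically, which matters once continuous integration is involved. To run it directly, `main()` could be called under an `if __name__ == "__main__":` guard or registered as a console-script entry point when the project is packaged.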