Embark on a career in the high-growth field of data engineering with IBM's Data Engineering program. You'll acquire in-demand skills in Python, SQL, and databases, preparing you for entry-level data engineering roles in less than 5 months.
Throughout the program, you'll master practical skills such as creating, designing, and managing relational databases, working with NoSQL and Big Data technologies, and implementing ETL and data pipelines. With a focus on hands-on experience, you'll develop proficiency in using Python and Linux/UNIX shell scripts for data extraction, transformation, and loading.
Upon completion, you will showcase your expertise with a portfolio of projects and earn a Professional Certificate from IBM, empowering you to pursue entry-level data engineering opportunities. Additionally, you'll have the opportunity to earn up to 12 college credits and gain access to career resources including mock interviews and resume support.
Certificate Available ✔
Master the most up-to-date practical skills and knowledge data engineers use in their daily roles. Gain expertise in Python, SQL, ETL, Data Warehousing, NoSQL, Big Data, and Spark through hands-on labs and projects.
List basic skills required for an entry-level data engineering role.
Discuss various stages and concepts in the data engineering lifecycle.
Summarize concepts in data security, governance, and compliance.
Describe Python Basics including Data Types, Expressions, Variables, and Data Structures.
Demonstrate proficiency in using Python libraries such as Pandas, NumPy, and Beautiful Soup.
Access web data using APIs and web scraping from Python in Jupyter Notebooks.
Demonstrate your skills in Python for working with and manipulating data.
Implement web scraping and use APIs to extract data with Python (see the sketch after these objectives).
Use Jupyter notebooks and IDEs to complete your project.
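The extraction objectives above lend themselves to a short illustration. Below is a minimal sketch of pulling data from an API and scraping an HTML table with Python; the URLs and field names are placeholders, not materials from the course.

```python
# Minimal sketch of API access and web scraping with Python.
# The URLs below are hypothetical placeholders, not course endpoints.
import requests
import pandas as pd
from bs4 import BeautifulSoup

# API extraction: GET a JSON payload and normalize it into a table.
api_url = "https://api.example.com/v1/rates"  # hypothetical endpoint
payload = requests.get(api_url, timeout=10).json()
api_df = pd.json_normalize(payload)

# Web scraping: parse an HTML table into rows of cell text.
page = requests.get("https://example.com/listing.html", timeout=10)
soup = BeautifulSoup(page.text, "html.parser")
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all("td")]
    for tr in soup.find_all("tr")
]
scraped_df = pd.DataFrame(rows)
print(api_df.head(), scraped_df.head(), sep="\n")
```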
Describe data, databases, relational databases, and cloud databases.
Explain an Entity Relationship Diagram and design a relational database for a specific use case.
Develop a working knowledge of popular DBMSes including MySQL, PostgreSQL, and IBM DB2.
Analyze data within a database using SQL and Python.
Construct basic to intermediate level SQL queries using DML commands.
Compose more powerful queries with advanced SQL techniques like views, transactions, stored procedures, and joins.
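As a compact illustration of the SQL techniques named above (DML, transactions, joins, views), here is a sketch using Python's built-in sqlite3 module with an in-memory database; the course itself works against MySQL, PostgreSQL, and IBM DB2, so treat this only as a portable stand-in.

```python
# Joins, views, and transactions demonstrated with sqlite3 in memory.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(id),
                         total REAL);
""")

# DML inside an explicit transaction: both inserts commit or neither does.
with conn:
    cur.execute("INSERT INTO customers VALUES (1, 'Ada')")
    cur.execute("INSERT INTO orders VALUES (1, 1, 42.50)")

# A view encapsulating a join, then a query against it.
cur.execute("""
    CREATE VIEW customer_totals AS
    SELECT c.name, SUM(o.total) AS spend
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
""")
print(cur.execute("SELECT * FROM customer_totals").fetchall())
```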
Describe the Linux architecture and common Linux distributions, and update and install software on a Linux system.
Develop shell scripts using Linux commands, environment variables, pipes, and filters.
Schedule cron jobs in Linux with crontab and explain the cron syntax.
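To keep a single language across these examples, the following sketch drives a shell pipeline (pipes and filters) from Python via subprocess rather than as a standalone Bash script, and shows the cron syntax from the objective above in a comment; the script path in the crontab line is hypothetical.

```python
# Run a shell pipeline of filters from Python; assumes a Unix-like
# system with the standard sort, cut, and head utilities available.
import subprocess

pipeline = "sort /etc/passwd | cut -d: -f1 | head -5"
result = subprocess.run(
    pipeline, shell=True, capture_output=True, text=True, check=True
)
print(result.stdout)

# Cron syntax: minute hour day-of-month month day-of-week command.
# A crontab entry running a (hypothetical) script daily at 02:30:
#   30 2 * * * /home/user/etl.sh
```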
Create, query, and configure databases, and access and build database objects such as tables.
Perform basic database management including backing up and restoring databases as well as managing user roles and permissions.
Monitor and optimize important aspects of database performance.
Explain batch vs concurrent modes of execution.
Implement an ETL pipeline through shell scripting.
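The ETL objective above is implemented in the course with shell scripting; the sketch below shows the same extract-transform-load shape in Python for consistency with the other examples, using an in-memory CSV as a stand-in for a real source file.

```python
# Toy ETL run: extract rows from CSV, transform them, load into SQLite.
import csv
import io
import sqlite3

raw = io.StringIO("name,price\nwidget,9.99\ngadget,19.50\n")  # stand-in source

def extract(fh):
    """Extract: yield dict rows from a CSV file handle."""
    yield from csv.DictReader(fh)

def transform(rows):
    """Transform: uppercase names and cast prices to float."""
    for row in rows:
        yield (row["name"].upper(), float(row["price"]))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL)")
# Load: insert all transformed rows in a single transaction.
with conn:
    conn.executemany("INSERT INTO products VALUES (?, ?)",
                     transform(extract(raw)))
print(conn.execute("SELECT * FROM products").fetchall())
```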
Describe data pipeline components, processes, tools, and technologies.
Explore the architecture, features, and benefits of data warehouses, data marts, and data lakes, and identify popular data warehouse system vendors.
Design and populate a data warehouse, and model and query data using CUBE, ROLLUP, and materialized views.
Design and load data into a data warehouse, write aggregation queries, create materialized query tables, and create an analytics dashboard.
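As an illustration of the ROLLUP-style aggregation named above, here is a sketch that uses DuckDB as a stand-in analytical engine (the course targets full data warehouse systems); the table and column names are invented, and it assumes `pip install duckdb`.

```python
# GROUP BY ROLLUP: per-group totals, per-region subtotals, grand total.
import duckdb

con = duckdb.connect()
con.execute("CREATE TABLE sales (region TEXT, product TEXT, amount DOUBLE)")
con.execute("""
    INSERT INTO sales VALUES
        ('East', 'widget', 100), ('East', 'gadget', 150),
        ('West', 'widget', 200), ('West', 'gadget', 50)
""")
# NULLs in the result mark the rolled-up (subtotal / grand-total) rows.
rows = con.execute("""
    SELECT region, product, SUM(amount) AS total
    FROM sales
    GROUP BY ROLLUP (region, product)
    ORDER BY region, product
""").fetchall()
for row in rows:
    print(row)
```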
Differentiate between the four main categories of NoSQL repositories.
Describe the characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools.
Perform common MongoDB tasks, including create, read, update, and delete (CRUD) operations (sketched below).
Explain the impact of big data, including use cases, tools, and processing methods.
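A minimal sketch of the MongoDB CRUD operations referenced above, using pymongo and assuming a local mongod on the default port; the database and collection names are illustrative only.

```python
# CRUD against a local MongoDB instance via pymongo.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
orders = client["shop"]["orders"]  # illustrative names

# Create
inserted = orders.insert_one({"item": "widget", "qty": 3})
# Read
doc = orders.find_one({"_id": inserted.inserted_id})
# Update
orders.update_one({"_id": inserted.inserted_id}, {"$set": {"qty": 5}})
# Delete
orders.delete_one({"_id": inserted.inserted_id})
print(doc)
```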
Apply Spark programming basics, including parallel programming for DataFrames, datasets, and Spark SQL.
Use Spark’s RDDs and data sets, optimize Spark SQL using Catalyst and Tungsten, and use Spark’s development and runtime environment options.
Describe ML, explain its role in data engineering, summarize generative AI, discuss Spark's uses, and analyze ML pipelines and model persistence.
Construct data analysis processes using Spark SQL, and perform regression, classification, and clustering using SparkML.
Demonstrate connecting to Spark clusters, building ML pipelines, performing feature extraction and transformation, and persisting models (see the sketch below).
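Tying the Spark objectives together, the sketch below creates a DataFrame, runs a Spark SQL query over a temporary view, and fits and persists a small SparkML pipeline; it assumes a local pyspark installation, and the data and save path are made up.

```python
# DataFrame + Spark SQL + a SparkML pipeline with model persistence.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("sketch").getOrCreate()

df = spark.createDataFrame(
    [(1.0, 2.0, 0), (2.0, 1.0, 1), (3.0, 4.0, 0), (4.0, 3.0, 1)],
    ["x1", "x2", "label"],
)

# Spark SQL over a temporary view of the DataFrame.
df.createOrReplaceTempView("points")
spark.sql("SELECT label, COUNT(*) AS n FROM points GROUP BY label").show()

# ML pipeline: assemble features, fit a classifier, persist the model.
pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["x1", "x2"], outputCol="features"),
    LogisticRegression(featuresCol="features", labelCol="label"),
])
model = pipeline.fit(df)
model.write().overwrite().save("/tmp/lr_pipeline_model")  # made-up path
spark.stop()
```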
Demonstrate proficiency in skills required for an entry-level data engineering role.
Design and implement various concepts and components in the data engineering lifecycle such as data repositories.
Showcase working knowledge of relational databases, NoSQL data stores, big data engines, data warehouses, and data pipelines.
Apply skills in Linux shell scripting, SQL, and Python to data engineering problems.
Business Intelligence and Visual Analytics is a comprehensive course focusing on data visualization, visual analytics, and advanced business intelligence topics...
Kickstart your journey into Data Warehousing and BI Analytics with this self-paced course offered by IBM. Gain practical knowledge and hands-on experience in designing,...
Showcase your Python skills in this hands-on Data Engineering Project. Apply ETL techniques, web scraping, and APIs to extract, transform, and load data using Python....
Advanced Database Engineer Project is a comprehensive course where you'll create a database and customer system for Little Lemon restaurant.