Big Data Engineers and professionals with NoSQL skills are highly sought after in the data management industry. This Specialization is designed for those seeking to develop fundamental skills for working with Big Data, Apache Spark, and NoSQL databases.
The course covers popular NoSQL databases like MongoDB and Apache Cassandra, the widely used Apache Hadoop ecosystem of Big Data tools, and the Apache Spark analytics engine for large-scale data processing.
Certificate Available ✔
This specialization covers the fundamentals of NoSQL databases, Big Data with Spark and Hadoop, and Machine Learning with Apache Spark, providing a comprehensive understanding and practical experience in these areas.
Differentiate between the four main categories of NoSQL repositories. Describe the characteristics, features, benefits, limitations, and applications of the more popular Big Data processing tools. Perform common tasks in MongoDB, including create, read, update, and delete (CRUD) operations. Execute keyspace, table, and CRUD operations in Cassandra.
Explain the impact of big data, including use cases, tools, and processing methods. Describe Apache Hadoop architecture, ecosystem, practices, and user-related applications, including Hive, HDFS, HBase, Spark, and MapReduce. Apply Spark programming basics, including parallel programming basics for DataFrames, Datasets, and Spark SQL. Use Spark’s RDDs and Datasets, optimize Spark SQL using Catalyst and Tungsten, and use Spark’s development and runtime environment options.
Describe ML, explain its role in data engineering, summarize generative AI, discuss Spark's uses, and analyze ML pipelines and model persistence. Evaluate ML models, distinguish between regression, classification, and clustering models, and compare data engineering pipelines with ML pipelines. Construct data analysis processes using Spark SQL, and perform regression, classification, and clustering using SparkML. Connect to Spark clusters, build ML pipelines, perform feature extraction and transformation, and persist models.