Explore the intensive one-week course Leveraging Unstructured Data with Cloud Dataproc on Google Cloud, offered in Brazilian Portuguese, which builds on the foundational knowledge from Data Engineering on Google Cloud Platform. Through video lectures, demonstrations, and hands-on labs, learn to create and manage computing clusters that run Hadoop, Spark, Pig, and/or Hive jobs on Google Cloud Platform.
Learn to access various cloud storage options and to create and manage Dataproc clusters using the web console and CLI. Discover how to use clusters for Spark and Pig jobs, create IPython notebooks integrated with BigQuery and storage, and integrate Google's machine learning APIs into your data analysis. This course requires a basic understanding of Big Data and Machine Learning on Google Cloud Platform (or equivalent experience) and some knowledge of Python.
Certificate Available ✔
This course is divided into four modules, guiding you through Cloud Dataproc fundamentals, executing Dataproc jobs, using GCP, and analyzing unstructured data with machine learning.
Module 1: Introduction to Cloud Dataproc
Explore the foundational concepts of Cloud Dataproc, including defining unstructured data, extracting value from it, and the comparison between Cloud Dataproc and Hadoop options. Learn how to create and customize a Dataproc cluster, and gain hands-on experience through practical labs.
Module 2: How to Run Dataproc Jobs
Discover the methods for submitting jobs, the separation of storage and compute, and the importance of networking in data processing. Gain insights into submitting Spark jobs and working with structured and semi-structured data, and build proficiency through hands-on labs.
Module 3: How to Use GCP
Learn to utilize GCP, leverage BigQuery support, and customize clusters. Master the installation of software in a Dataproc cluster and automate cluster tasks using CLI commands, all through interactive labs and demonstrations.
Module 4: How to Analyze Unstructured Data
Delve into the details of machine learning, its application, and natural language processing. Gain practical experience in adding machine learning to your data analysis through comprehensive hands-on labs.