Big Data and Data Analytics Cloudera.
Duration: 3 Days
Overview
Cloudera Educational Services' three-day Data Analyst Training course will teach you to apply
traditional data analytics and business intelligence skills to big data. This course presents the
tools data professionals need to access, manipulate, transform, and analyze complex data
sets using SQL and familiar scripting languages.
What to Expect
This course is designed for data analysts, business intelligence specialists,
developers, system architects, and database administrators. Some knowledge of SQL is
assumed, as is basic Linux command-line familiarity. Prior knowledge of Apache Hadoop is
not required.
Apache Hive makes transformation and analysis of complex, multi-structured data scalable
in Cloudera environments. Apache Impala enables real-time interactive analysis of the data
stored in Hadoop using a native SQL environment. Together, they make multi-structured data
accessible to analysts, database administrators, and others without Java programming
expertise.
Course Detail
Introduction
• Apache Hadoop Fundamentals
• The Motivation for Hadoop
• Hadoop Overview
• Data Storage: HDFS
• Distributed Data Processing: YARN, MapReduce, and Spark
• Data Processing and Analysis: Pig, Hive, and Impala
• Database Integration: Sqoop
• Other Hadoop Data Tools
• Exercise Scenario Explanation
Data Management
• Data Storage
• Creating Databases and Tables
• Loading Data
• Altering Databases and Tables
• Simplifying Queries with Views
• Storing Query Results