Data Analyst
Data analysts translate data and numbers into plain language so organizations can make decisions. They inspect and clean data to derive insights; identify correlations, find patterns, and apply statistical methods to analyze and mine data; and visualize data to interpret and present the findings of their analysis.
New technologies like cloud computing, machine learning, and big data have a significant influence on the data ecosystem, providing access to limitless storage, powerful computing, and advanced tools for data analysis.
Data Analytics
Data analytics is the process of gathering, cleaning, analyzing and mining data, interpreting results, and reporting the findings.
Data analysis vs. data analytics:
Definition: Analysis is the detailed examination of the elements or structure of something; analytics is the systematic computational analysis of data or statistics.
Use of numbers/data: Analysis can be done without numbers or data (e.g., business analysis, psychoanalysis); analytics almost invariably implies the use of data for numerical manipulation and inference.
Historical data: Analysis is often based on inferences from historical data; analytics is not limited to historical data and can include predictive elements.
A data analyst draws on three groups of skills:
Technical skills: Proficiency in spreadsheets, statistical and visualization tools, programming, querying languages, and working with various data repositories and big data platforms.
Functional skills: Understanding of statistics, analytical techniques, problem-solving, data visualization, and project management.
Soft skills: Collaboration, effective communication, storytelling with data, stakeholder engagement, curiosity, and intuition.
Types of Data
Data is unorganized information that is processed to make it meaningful.
Structured data: Well-defined structure, tabular format, schemas. Example sources: SQL databases, spreadsheets, online forms, sensors, logs.
Semi-structured data: Some organizational properties, metadata-driven. Example sources: e-mails, XML, binary executables, data integration.
Unstructured data: Lacks a specific structure; no mainstream database fit. Example sources: web pages, social media feeds, images, audio/video, documents.
Data professionals work with a variety of data file types and formats, including delimited text files (CSVs and TSVs), Microsoft Excel XLSX, XML, PDF, and JSON.
These formats are used for storing, organizing, and sharing data in different ways, offering flexibility and compatibility with a wide range of applications and systems.
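For illustration, the snippet below loads the same hypothetical dataset from three of these formats into pandas DataFrames; the file names are placeholders, and reading XLSX assumes the openpyxl engine is installed.

```python
import pandas as pd

# Placeholder file names; substitute real data files.
df_csv = pd.read_csv("sales.csv")        # delimited text (use sep="\t" for a TSV)
df_xlsx = pd.read_excel("sales.xlsx")    # Microsoft Excel XLSX (needs openpyxl)
df_json = pd.read_json("sales.json")     # JSON records

# Whatever the source format, each result is a DataFrame ready for analysis.
print(df_csv.head())
```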
Data Sources
Relational Databases: Systems like SQL Server, Oracle, MySQL, and IBM DB2, used for structured data storage.
Flat Files & XML Datasets: Plain text formats with delimited values (CSV, TSV) or hierarchical structures (XML) for data organization.
APIs and Web Services: Interfaces for interacting with data providers or applications, returning data in various formats.
Web Scraping: Techniques for extracting specific data from web pages based on parameters, using tools like BeautifulSoup, Scrapy, and Selenium (a short sketch of an API call and a scrape follows this list).
Data Streams: Continuous flows of data from various sources (IoT devices, GPS data, web clicks, etc.), often timestamped and geo-tagged.
RSS Feeds: Sources for capturing updated data from forums and news sites, streamed to user devices via a feed reader.
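As a minimal sketch of the API and web-scraping sources above, the code below pulls JSON from a hypothetical REST endpoint with requests and extracts headings from a hypothetical page with BeautifulSoup; the URLs, query parameter, and CSS class are all placeholders.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical endpoints used only for illustration.
API_URL = "https://example.com/api/v1/products"
PAGE_URL = "https://example.com/catalog"

# 1. APIs and web services: many providers return JSON that maps cleanly to Python objects.
response = requests.get(API_URL, params={"category": "books"}, timeout=10)
response.raise_for_status()
products = response.json()  # list/dict parsed from the JSON payload

# 2. Web scraping: extract specific elements from an HTML page with BeautifulSoup.
page = requests.get(PAGE_URL, timeout=10)
soup = BeautifulSoup(page.text, "html.parser")
titles = [tag.get_text(strip=True) for tag in soup.find_all("h2", class_="title")]

print(len(products), "records from the API;", len(titles), "titles scraped from the page")
```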
Data Repositories
A Data Repository is a general term that refers to data that has been collected, organized, and isolated so that it can be used for reporting, analytics, and also for archival
purposes. The different types of Data Repositories include:
Databases, which can be relational or non-relational, each following organizational principles based on the kind of data they can store and the tools used to query, organize, and retrieve data (a small relational example follows this list).
Data Lakes, which serve as storage repositories for large amounts of structured, semi-structured, and unstructured data in their native format.
Big Data Stores, which provide distributed computational and storage infrastructure to store, scale, and process very large data sets.
Data Warehouses, which consolidate incoming data into one comprehensive storehouse.
Data Marts, which are essentially sub-sections of a data warehouse, built to isolate data for a particular business function or use case.
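The snippet below is a small, self-contained illustration of the relational idea using Python's built-in sqlite3 module; the table and rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EMEA", 120.0), ("APAC", 75.5), ("EMEA", 42.0)],
)

# A relational repository is organized so that questions become declarative queries.
for region, total in conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
):
    print(region, total)

conn.close()
```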
The ETL (Extract, Transform, and Load) process is an automated process that converts raw data into analysis-ready data by extracting it from source systems, transforming it into a format suitable for analysis, and loading it into a target data repository.
Data Pipeline, sometimes used interchangeably with ETL, encompasses the entire journey of moving data from the source to a destination data lake or application, using
the ETL process.
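A toy sketch of the extract-transform-load pattern is shown below, assuming a hypothetical raw CSV file as the source and a SQLite database as the destination; the column names and cleaning rules are illustrative only.

```python
import csv
import sqlite3

def extract(path):
    # Extract: read raw rows from a hypothetical CSV source file.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transform: fix types, drop incomplete records, standardize a field.
    cleaned = []
    for row in rows:
        if not row.get("amount"):
            continue  # skip rows with a missing amount
        cleaned.append({"region": row["region"].strip().upper(),
                        "amount": float(row["amount"])})
    return cleaned

def load(rows, conn):
    # Load: write analysis-ready rows into a destination table.
    conn.execute("CREATE TABLE IF NOT EXISTS sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (:region, :amount)", rows)
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect("analytics.db")            # hypothetical destination
    load(transform(extract("raw_sales.csv")), conn)   # hypothetical source file
    conn.close()
```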
Data Sources can be internal or external to the organization, and they can be primary, secondary, or third-party, depending on whether you are obtaining the data directly from the original source, retrieving it from externally available data sources, or purchasing it from data aggregators.
Data that has been identified and gathered from the various data sources is combined using a variety of tools and methods to provide a single interface through which the data can be queried and manipulated.
The data you identify, the source of that data, and the practices you employ for gathering the data have implications for quality, security, and privacy, which need to be
considered at this stage.
Data Wrangling
Data Wrangling is an iterative process that involves data exploration, transformation, and validation. Typical transformation tasks include the following (a small pandas sketch follows the list):
* Structurally manipulate and combine the data using Joins and Unions.
* Denormalize data, that is, combine data from multiple tables into a single table so that it can be queried faster.
* Clean data, which involves profiling data to uncover quality issues, visualizing data to spot outliers, and fixing issues such as missing
values, duplicate data, irrelevant data, inconsistent formats, syntax errors, and outliers.
* Enrich data, which involves considering additional data points that could add value to the existing data set and lead to a more
meaningful analysis.
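The sketch below runs these wrangling steps with pandas on made-up tables: a union via concat, a join/denormalization via merge, de-duplication, missing-value handling, and a simple enrichment column.

```python
import pandas as pd

# Hypothetical tables used only to illustrate the operations above.
customers = pd.DataFrame({"cust_id": [1, 2, 3], "region": ["EMEA", "APAC", None]})
orders_q1 = pd.DataFrame({"cust_id": [1, 1, 2], "amount": [10.0, 12.5, 7.0]})
orders_q2 = pd.DataFrame({"cust_id": [2, 3, 3], "amount": [5.0, 9.9, 9.9]})

# Union: stack tables that share the same columns.
orders = pd.concat([orders_q1, orders_q2], ignore_index=True)

# Join / denormalize: combine orders and customers into one wide table.
flat = orders.merge(customers, on="cust_id", how="left")

# Clean: remove duplicate rows and fill a missing categorical value.
flat = flat.drop_duplicates()
flat["region"] = flat["region"].fillna("UNKNOWN")

# Enrich: derive a new column that may add analytical value.
flat["amount_usd"] = flat["amount"] * 1.0  # placeholder conversion rate

print(flat)
```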
Statistical Analysis
Statistical Analysis involves the use of statistical methods in order to develop an understanding of what the data represents.
Descriptive statistical analysis: provides a summary of what the data represents. Common measures include Central Tendency, Dispersion, and Skewness.
Inferential statistical analysis: involves making inferences, or generalizations, about data. Common measures include Hypothesis Testing, Confidence Intervals, and
Regression Analysis.
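As a rough illustration on synthetic data, the snippet below computes common descriptive measures and then runs a simple inferential step (an independent-samples t-test and a 95% confidence interval) using NumPy and SciPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample_a = rng.normal(loc=100, scale=15, size=200)   # hypothetical measurements
sample_b = rng.normal(loc=104, scale=15, size=200)

# Descriptive statistics: central tendency, dispersion, skewness.
print("mean:", np.mean(sample_a), "median:", np.median(sample_a))
print("std dev:", np.std(sample_a, ddof=1), "skewness:", stats.skew(sample_a))

# Inferential statistics: hypothesis test and confidence interval.
t_stat, p_value = stats.ttest_ind(sample_a, sample_b)
ci = stats.t.interval(0.95, df=len(sample_a) - 1,
                      loc=np.mean(sample_a), scale=stats.sem(sample_a))
print("t-test p-value:", p_value, "95% CI for mean of A:", ci)
```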
Data Mining
Data Mining, simply put, is the process of extracting knowledge from data. It involves the use of pattern recognition technologies, statistical analysis, and mathematical
techniques, in order to identify correlations, patterns, variations, and trends in data.
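A small, hypothetical example of both ideas: computing a correlation matrix with pandas and discovering groupings with k-means clustering from scikit-learn (one of many pattern-recognition techniques that could be used here).

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical customer metrics for illustration.
df = pd.DataFrame({
    "visits":  [5, 3, 8, 1, 9, 2, 7, 4],
    "spend":   [50, 35, 90, 10, 95, 20, 80, 40],
    "returns": [0, 1, 0, 2, 0, 2, 1, 1],
})

# Correlations: which variables move together?
print(df.corr())

# Pattern discovery: group similar customers with k-means clustering.
df["segment"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(df)
print(df)
```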
Data Visualization
Data visualization is the discipline of communicating information through the use of visual elements such as graphs, charts, and maps. The goal of visualizing data is to
make information easy to comprehend, interpret, and retain.
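To make this concrete, the snippet below draws a simple line chart of made-up monthly figures with Matplotlib; the data and labels are placeholders.

```python
import matplotlib.pyplot as plt

# Hypothetical monthly figures used only to illustrate the idea.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
revenue = [12.1, 13.4, 11.8, 15.2, 16.0, 17.3]

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(months, revenue, marker="o")        # trend over time
ax.set_title("Monthly revenue (example data)")
ax.set_xlabel("Month")
ax.set_ylabel("Revenue ($M)")
fig.tight_layout()
plt.show()                                  # or fig.savefig("revenue.png")
```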