Data Analysis

The document outlines the phases and importance of data analysis, detailing steps such as identifying problems, cleaning data, analyzing, and visualizing insights. It emphasizes the need for effective communication, technical knowledge, and understanding stakeholder needs throughout the data-driven decision-making process. Additionally, it discusses data types, sources, and the ETL process, highlighting the significance of data transformation and consolidation for meaningful insights.

Uploaded by

Tobias Ramirez

The importance of using data analysis


Phases
⁃ Identifying the problem and gathering data
⁃ Cleaning and processing data
⁃ Analyzing data
⁃ Drawing insights and making recommendations
⁃ Implementing changes and measuring impact

Skills
⁃ Communication: conveying complex information clearly and
concisely, for example through storytelling
⁃ Diplomacy: navigating delicate situations and maintaining
positive relationships, even when disagreements arise
⁃ Understanding end users
⁃ Technical knowledge

Steps in data-driven decision-making


Prepare
Data preparation is the crucial first step in the data analysis
process. In this stage, data analysts gather, clean, and pre-
process the raw data to make it suitable for analysis. This often
involves removing any inaccuracies, inconsistencies, or
duplicate records, as well as filling in missing values.
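
The cleaning tasks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (order_id, region, amount), and the choice to fill gaps with the mean are all hypothetical, not part of the original notes.

```python
# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"order_id": 1, "region": "North", "amount": 120.0},
    {"order_id": 2, "region": "South", "amount": None},
    {"order_id": 2, "region": "South", "amount": None},    # duplicate record
    {"order_id": 3, "region": " north ", "amount": 80.0},  # inconsistent text
]


def prepare(rows):
    """Return cleaned copies of rows: deduplicated, normalized, gaps filled."""
    # 1. Remove duplicate records, keeping the first row seen per order_id.
    seen = set()
    unique = []
    for row in rows:
        if row["order_id"] not in seen:
            seen.add(row["order_id"])
            unique.append(dict(row))

    # 2. Fix inconsistencies: trim whitespace and normalize capitalization.
    for row in unique:
        row["region"] = row["region"].strip().title()

    # 3. Fill missing amounts with the mean of the known amounts
    #    (an assumed strategy; other fills may suit other data).
    known = [row["amount"] for row in unique if row["amount"] is not None]
    fill_value = sum(known) / len(known)
    for row in unique:
        if row["amount"] is None:
            row["amount"] = fill_value
    return unique


clean_rows = prepare(raw_rows)
```

Filling with the mean is one of several possible choices; depending on the data, dropping incomplete rows or using the median may be more appropriate.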

Model
In the modeling stage, data analysts create a data model that
represents the structure, relationships, and constraints of the
data. This involves designing a schema, which is a blueprint of
how the data is organized and stored.
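
As a rough illustration of a schema that captures structure, relationships, and constraints, the sketch below uses Python's built-in sqlite3 module. The customers and orders tables and their columns are hypothetical examples, not something the notes prescribe.

```python
import sqlite3

# Hypothetical schema: each order must reference an existing customer.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL CHECK (amount >= 0)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

conn.execute("INSERT INTO customers VALUES (1, 'Acme')")
conn.execute("INSERT INTO orders VALUES (10, 1, 250.0)")

# An order for an unknown customer violates the relationship and is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 40.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Here the schema itself, rather than the analysis code, guards the relationship between the two tables.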

Analyze
This step is the core of the data analysis process, where data
analysts dig deep into the data to uncover insights and answer
specific questions. Analysis can take many forms, including:
• Descriptive analysis: Describe what the data looks like in its
basic form.
• Exploratory analysis: Dig deeper to try and find interesting
patterns or relationships between different parts of the data.
• Inferential analysis: Use available data to make guesses or
predictions about things outside the data.
• Predictive analysis: Use statistics to predict what might happen
in the future based on what's happened in the past.
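
To make the difference between descriptive and predictive analysis concrete, here is a minimal sketch. The monthly sales figures are invented for illustration, and the straight-line trend is just one simple assumed way to predict; real predictive work usually calls for more careful modeling.

```python
import statistics

# Hypothetical monthly sales figures (month number, units sold).
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

# Descriptive analysis: summarize what the data looks like.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "min": min(sales),
    "max": max(sales),
}


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_mean - slope * x_mean


# Predictive analysis: extend the past trend to a future month.
slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
```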

Visualize
By creating charts, graphs, and other visual representations of
data, analysts can more easily spot trends, outliers, and
relationships between variables. This helps them gain a deeper
understanding of the data and communicate their findings to
stakeholders in a way that's easy to understand.

Manage
Data management is a critical aspect of the data analysis
process that ensures the integrity, consistency, and security of
the data being used. This involves implementing best practices
for data storage, backup, and access control, as well as
maintaining data documentation and metadata.

Stages
Data processing: prepare raw data for analysis
Data analysis: transform data into insights

Stakeholder experience
The specific needs, preferences, and expectations of the
stakeholders who engage with the visualizations and insights in a
data analysis report. It affects how relevant, useful, and
understandable the visualizations and analysis are.

Identify and analyze data


- A good place to start the analysis is to simplify the business
requirement, and then establish how its different topics relate to
each other.

- The first step is to determine the data to be measured. This can
include:
• Internal company data
• Data from social media
• Sensor-generated data

- A critical source of this information is the company's Enterprise
Resource Planning (ERP) system. ERP systems are designed to
collect, store, manage, and interpret structured data from various
business activities.
• Structured data: organized into a formatted repository, typically
a database, so it's easily searchable
• Semi-structured data: contains tags or other markers that
separate data elements and enforce hierarchies of records and
fields within the data
• Unstructured data: has no predefined structure or format, for
example text files, audio, and video
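
The contrast between structured and semi-structured data may be easier to see in a small example. In this sketch the CSV text stands in for a structured table and the JSON text for semi-structured records; all field names and values are hypothetical.

```python
import csv
import io
import json

# Structured: every row has the same columns, so it is easy to search and sort.
structured_text = "order_id,customer,amount\n1,Acme,250\n2,Globex,90\n"
structured_rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: keys act as tags, and records can nest or omit fields.
semi_structured_text = """
[
  {"order_id": 1, "customer": {"name": "Acme", "country": "US"}, "items": ["bolt", "nut"]},
  {"order_id": 2, "customer": {"name": "Globex"}}
]
"""
semi_structured = json.loads(semi_structured_text)

# Structured data can be filtered by column directly.
large_orders = [row["order_id"] for row in structured_rows if int(row["amount"]) > 100]

# Semi-structured data needs defaults for fields that may be absent.
item_counts = [len(record.get("items", [])) for record in semi_structured]
```

Note that the CSV reader returns every value as text, so numeric columns have to be converted before comparison.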

ETL process
Extract: involves retrieving raw data from different sources, such
as databases, files, or other data storage systems

Transform: involves cleaning, structuring, and enriching the data
to make it more suitable for analysis

Load: involves loading the transformed data into the final storage
system
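
The three ETL stages can be sketched as three small functions. This is a minimal illustration under stated assumptions: the CSV text stands in for a real source, an in-memory SQLite database stands in for the final storage system, and the sales table and its columns are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source: CSV text standing in for a file or database export.
SOURCE_CSV = "name,amount\n alice ,10.5\nBOB,\ncarol,7\n"


def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, drop rows with no amount, convert types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # incomplete record, skipped in this sketch
        cleaned.append((row["name"].strip().title(), float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the transformed rows into the final storage system."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
loaded = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
```

Keeping each stage in its own function makes it possible to change the source or the destination without touching the cleaning rules.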

Data sources
• SQL Server databases
• Cloud-based data sources
• Microsoft Excel spreadsheets
• On-premises data sources
• Web-based data sources
• NoSQL databases

Flat file: file type that contains a single data table, with
a uniform structure for every row of data, and does not
have hierarchies.

Types of data
• Structured: quantitative, searchable, sortable, and easy to
analyze
• Unstructured: does not have a predefined structure or format. It
is best used for qualitative analysis and usually resides in
non-relational databases or unprocessed file formats. Examples:
text files, audio, video

Data serialization
Data serialization is the process of converting semi-structured
data into a specific format that can be easily transmitted, stored,
or processed. The format lets both the sender and the receiver
understand the data without needing to know all of its specific
details.
One format that allows the storage of unstructured or
semi-structured data is a blob (binary large object), in which the
data is stored in binary form, as ones and zeros.
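
As a small illustration of serialization, the sketch below converts a record to JSON text, encodes it as bytes (the kind of binary payload a blob column could hold), and rebuilds it on the receiving side. JSON is an assumed choice of format here, and the sensor record is hypothetical.

```python
import json

# Hypothetical semi-structured record.
record = {"sensor": "t-01", "readings": [21.5, 22.0], "meta": {"unit": "C"}}

# Serialize: convert the in-memory object to a text format (JSON here).
serialized = json.dumps(record, sort_keys=True)

# Encoding the text as bytes gives a binary payload that could be stored
# in a blob column or sent over a network.
blob = serialized.encode("utf-8")

# Deserialize: the receiver rebuilds the same structure from the bytes.
restored = json.loads(blob.decode("utf-8"))
```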

Data transformation
Data from different sources can be untidy, incomplete,
and inconsistent, making it difficult to draw meaningful
insights. That's why data transformation is a crucial
step.

Data combination
Consolidating information means bringing information from various
sources or tables together into a single table that provides a
unified view of the data.

Instead of working with multiple separate tables, having a single
consolidated table reduces complexity and makes it easier to handle
data updates, refreshes, and maintenance tasks.

Append: adding the rows of one table or query to another table or
query. Because the lists are stacked one below the other, the
number of rows increases.
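
An append can be sketched with plain Python lists of rows. The january and february tables and the append_tables helper are hypothetical names used only for this illustration.

```python
# Hypothetical monthly tables with the same columns.
january = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": 80.0},
]
february = [
    {"order_id": 3, "amount": 60.0},
]


def append_tables(*tables):
    """Stack non-empty tables one below the other, checking that columns match."""
    columns = set(tables[0][0].keys())
    combined = []
    for table in tables:
        for row in table:
            if set(row.keys()) != columns:
                raise ValueError("tables must share the same columns to be appended")
            combined.append(dict(row))
    return combined


all_orders = append_tables(january, february)
```

The result has as many rows as the inputs combined, while the set of columns stays the same.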

Merge: consolidating data from multiple tables into a single entity
by leveraging a shared column between the tables

Join: merging or combining data from different places to create a
bigger and more complete dataset.
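
Merging on a shared column can be sketched as follows. The customers and orders tables, the merge helper, and its how parameter are hypothetical; the two modes shown (inner and left) are common join types, used here as an assumption about what the notes mean by merge and join.

```python
# Hypothetical tables that share the customer_id column.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 3, "amount": 15.0},  # no matching customer
]


def merge(left, right, key, how="inner"):
    """Merge two tables on a shared column.

    how="inner" keeps only left rows with a match in right;
    how="left" keeps every left row, filling unmatched right columns with None.
    """
    # Index the right table by the shared column for quick lookups.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)
    right_columns = [c for c in right[0] if c != key]

    merged = []
    for left_row in left:
        matches = right_by_key.get(left_row[key], [])
        for right_row in matches:
            merged.append({**left_row, **right_row})
        if not matches and how == "left":
            merged.append({**left_row, **{c: None for c in right_columns}})
    return merged


inner = merge(orders, customers, "customer_id")
left = merge(orders, customers, "customer_id", how="left")
```

Unlike an append, a merge adds columns: each order row gains the customer's name, and the left variant keeps the unmatched order with an empty name.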
