Data Analysis
Skills
Communication: conveying complex information clearly and
concisely, for example through storytelling.
Diplomacy: the art of navigating delicate situations and
maintaining positive relationships, even when disagreements
arise.
Understand end-users
Technical knowledge
Model
In the modeling stage, data analysts create a data model that
represents the structure, relationships, and constraints of the
data. This involves designing a schema, which is a blueprint of
how the data is organized and stored.
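As a rough illustration of a schema acting as a blueprint, the sketch
below defines two related tables in an in-memory SQLite database. The
table and column names (customers, orders, amount) are hypothetical
and chosen only for this example. They are not taken from any
particular dataset.

```python
import sqlite3

# Hypothetical schema: structure (columns and types), a relationship
# (orders.customer_id -> customers.customer_id), and constraints.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL CHECK (amount >= 0)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(SCHEMA)

conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (10, 1, 25.0)")


def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An order for a customer that does not exist breaks the relationship.
orphan_ok = try_insert("INSERT INTO orders VALUES (11, 99, 5.0)")
# A negative amount breaks the CHECK constraint.
negative_ok = try_insert("INSERT INTO orders VALUES (12, 1, -5.0)")
```

Both rejected inserts show the model doing its job: the schema, not
the analyst, keeps invalid rows out of the data.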
Analyze
This step is the core of the data analysis process, where data
analysts dig deep into the data to uncover insights and answer
specific questions. Analysis can take many forms, including:
• Descriptive analysis: Describe what the data looks like in its
basic form.
• Exploratory analysis: Dig deeper to try and find interesting
patterns or relationships between different parts of the data.
• Inferential analysis: Use available data to make guesses or
predictions about things outside the data.
• Predictive analysis: Use statistics to predict what might happen
in the future based on what's happened in the past.
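A minimal sketch of two of these forms, descriptive and predictive
analysis, using only Python's standard library. The sales figures are
made up for illustration, and the straight-line fit is just one
simple way to project a trend, assumed here for brevity.

```python
import statistics

# Made-up monthly sales figures, for illustration only.
months = [1, 2, 3, 4, 5]
sales = [10, 13, 13, 17, 17]

# Descriptive analysis: summarize what the data looks like.
summary = {
    "mean": statistics.mean(sales),
    "median": statistics.median(sales),
    "stdev": statistics.stdev(sales),
}

# Predictive analysis: fit a least-squares line to past sales
# and extend it one month into the future.
mean_x = statistics.mean(months)
mean_y = statistics.mean(sales)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales)) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x
forecast_month_6 = intercept + slope * 6

print(summary)
print(round(forecast_month_6, 2))
```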
Visualize
By creating charts, graphs, and other visual representations of
data, analysts can more easily spot trends, outliers, and
relationships between variables. This helps them gain a deeper
understanding of the data and communicate their findings to
stakeholders in a way that's easy to understand.
Manage
Data management is a critical aspect of the data analysis
process that ensures the integrity, consistency, and security of
the data being used. This involves implementing best practices
for data storage, backup, and access control, as well as
maintaining data documentation and metadata.
Stages
Data processing: prepare raw data for analysis
Data analysis: transform data into insights
Stakeholder experience
The specific needs, preferences, and expectations of the
stakeholders who engage with the visualizations and insights in
a data analysis report. It affects how relevant, useful, and
understandable the visualizations and analysis are.
ETL process
Extract: retrieving raw data from different sources, such as
databases, files, or other data storage systems.
Data sources
• SQL Server databases
• Cloud-based data sources
• Microsoft Excel spreadsheets
• On-premises data sources
• Web-based data sources
• NoSQL databases
Flat file: a file that contains a single data table with a
uniform structure for every row of data and no hierarchies.
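A CSV file is a common example of a flat file. The sketch below
extracts its rows with Python's csv module. The file content and
column names are made up, and an in-memory string stands in for a
real file so the example is self-contained.

```python
import csv
import io

# Made-up flat file content: one table, same columns in every row.
RAW = """order_id,region,amount
1,North,120.50
2,South,80.00
3,North,45.25
"""


def extract_rows(text_stream):
    """Read every row of a CSV flat file into a list of dicts keyed by column name."""
    return list(csv.DictReader(text_stream))


rows = extract_rows(io.StringIO(RAW))
columns = list(rows[0].keys())
```

With a real file, the stream would typically come from
open(path, newline="") instead of io.StringIO. Note that every
extracted value is still a string at this point; converting types
belongs to the transformation step.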
Types of data
• Structured: quantitative, searchable, sortable, and readily analyzed
Data serialization
Data serialization is the process of converting
semi-structured data into a specific format that can be
easily transmitted, stored, or processed. The format lets
both the sender and the receiver understand the data
without needing to know all of its specific details.
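JSON is one widely used serialization format. As a small sketch, the
record below (a made-up example with nested fields) is serialized to
text, encoded to bytes as it might be for transmission or storage,
and then restored on the receiving side.

```python
import json

# Made-up semi-structured record: nested fields and a list.
record = {
    "id": 7,
    "name": "Ada",
    "tags": ["vip", "newsletter"],
    "address": {"city": "Porto"},
}

# Sender: serialize the record to a JSON string, then to bytes.
payload = json.dumps(record, sort_keys=True)
encoded = payload.encode("utf-8")

# Receiver: decode the bytes and rebuild the same structure.
restored = json.loads(encoded.decode("utf-8"))
```

The receiver only needs to know that the payload is JSON, not how the
sender built the record.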
One format that allows the storage of unstructured or
semi-structured data is a blob (binary large object), in
which the data is stored in binary form, as ones and
zeros.
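As an illustration, SQLite offers a BLOB column type that stores raw
bytes unchanged. In this sketch the table name, file name, and byte
values are all hypothetical.

```python
import sqlite3

blob_conn = sqlite3.connect(":memory:")
blob_conn.execute("CREATE TABLE attachments (name TEXT PRIMARY KEY, content BLOB)")

# Arbitrary bytes standing in for an image, document, or other binary file.
original = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
blob_conn.execute("INSERT INTO attachments VALUES (?, ?)", ("logo.bin", original))

# Reading the column back returns the same bytes, untouched.
stored = blob_conn.execute(
    "SELECT content FROM attachments WHERE name = ?", ("logo.bin",)
).fetchone()[0]
```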
Data transformation
Data from different sources can be untidy, incomplete,
and inconsistent, making it difficult to draw meaningful
insights. That's why data transformation is a crucial
step.
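A minimal sketch of what tidying such data can look like. The rows
and the cleaning rules (trim whitespace, normalize case, turn empty
amounts into missing values) are assumptions made for this example.
Real rules depend on the data source.

```python
# Made-up rows showing untidy, incomplete, and inconsistent values.
raw_rows = [
    {"name": "  ada ", "country": "pt", "amount": "120.50"},
    {"name": "Grace", "country": "US", "amount": ""},
    {"name": "LINUS", "country": "fi", "amount": "80"},
]


def clean_row(row):
    """Return a tidied copy of one row; empty amounts become None."""
    amount_text = row["amount"].strip()
    return {
        "name": row["name"].strip().title(),         # consistent capitalization
        "country": row["country"].strip().upper(),   # consistent country codes
        "amount": float(amount_text) if amount_text else None,  # text to number
    }


clean_rows = [clean_row(r) for r in raw_rows]
```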
Data combination
Consolidating information means bringing data from
various sources or tables together into a single table to
provide a unified view of the data.
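One simple way to sketch this is a key-based lookup that attaches
customer details to each order. Both tables and the shared
customer_id key are hypothetical.

```python
# Two made-up source tables that share the key "customer_id".
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 2, "amount": 40.0},
    {"order_id": 12, "customer_id": 1, "amount": 5.0},
]

# Index customers by key, then add the customer name to every order.
# Orders with an unknown customer_id would get name=None.
names_by_id = {c["customer_id"]: c["name"] for c in customers}
combined = [{**o, "name": names_by_id.get(o["customer_id"])} for o in orders]
```

The result is a single table with one row per order, which is the
unified view described above.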