Introduction to Data Analysis

The document provides an introduction to data analysis, covering key concepts such as types of data, the data analysis process, and the differences between data analysis and data science. It outlines the importance of data cleaning, exploratory data analysis, and data visualization, along with common tools used for analysis like Excel and Python. The document also emphasizes practical applications of data analysis through examples and visualization techniques.

Uploaded by

elishaboateng17

INTRODUCTION TO DATA ANALYSIS
BENEDICTA MENSAH
Data Analyst/Scientist
AGENDA
What is Data
Data Analysis
Difference between Data Analysis and Data Science
Sources of Data
The Data Analysis Process
Tools used for Data Analysis
Practical Session with Excel
Python for Data Analysis
WHAT IS DATA?
Data is a collection of facts and statistics used for analysis.
It can be numbers, text, images, audio, or video.

TYPES OF DATA
1. Structured Data: Organized data with a fixed format (spreadsheets, databases, etc.). E.g.: sales records, customer information…
2. Unstructured Data: Unorganized data without a fixed format (social media posts, emails, images). E.g.: tweets, product reviews.
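A minimal sketch of the distinction, using made-up records and review text (none of it comes from a real dataset): structured data can be queried directly by field name, while unstructured text needs some processing before it answers a question.

```python
# Structured data: every record has the same named fields (hypothetical sales records).
sales_records = [
    {"order_id": 1, "product": "Laptop", "quantity": 2, "unit_price": 850.0},
    {"order_id": 2, "product": "Mouse", "quantity": 5, "unit_price": 12.5},
]

# Because the format is fixed, a question maps directly onto the fields.
total_revenue = sum(r["quantity"] * r["unit_price"] for r in sales_records)

# Unstructured data: free text with no fixed fields (hypothetical product reviews).
reviews = [
    "Love this laptop, the battery lasts all day!",
    "The mouse stopped working after a week. Disappointed.",
]

# Answering a question needs processing first; here, a crude keyword check.
negative_words = {"disappointed", "stopped", "broken"}
negative_reviews = [
    text for text in reviews
    if any(word.strip(".,!").lower() in negative_words for word in text.split())
]

print(total_revenue)
print(len(negative_reviews))
```

The keyword check is only a stand-in for real text processing; proper sentiment analysis would use a dedicated library or model.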
EXAMPLES OF DATA IN REAL LIFE
1. Sales Data: Tracks revenue, customer purchases, and product performance.
2. Healthcare Data: Patient records, treatment outcomes, and medical research.
3. Social Media Data: User interactions, trends, and sentiment analysis.
DATA IS EVERYWHERE…
DATA ANALYSIS
Data analysis is the process of inspecting,
cleaning, transforming, and modelling
data to discover useful information, draw
conclusions, and support decision-
making.
The main objectives of data analysis are to
identify trends and patterns, detect
anomalies, and make informed decisions.
DIFFERENCE BETWEEN DATA ANALYSIS & DATA SCIENCE
Data Analysis
• Goal: Extract insights from data
• Tools: Excel, SQL, Python (Pandas, NumPy)
• Output: Reports, Dashboards
Data Science
• Goal: Build predictive models using AI/ML
• Tools: Python (Scikit-Learn, TensorFlow), R
• Output: AI-powered solutions, Forecasting Models
THE DATA ANALYSIS PROCESS
1. Data Collection
2. Data Cleaning
3. Exploratory Data Analysis (EDA)
4. Data Visualization
5. Decision Making
DATA COLLECTION
It’s the process of gathering raw data from various sources for
analysis.

What are the sources of data?

o Internal Sources: Data generated within an organization.
Example: Sales records, customer databases, employee performance records, financial transactions.
o External Sources: Data collected from outside the organization.
Example: Government databases (GSS), social media platforms, APIs (Google Maps API), public datasets from platforms like Kaggle, World Bank Open Data, Data Bank, surveys…
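As a rough illustration of the collection step in Python, the sketch below reads a small CSV into a pandas DataFrame. The data is inlined and entirely made up so the example runs on its own; with a real internal or external source you would pass a file path or URL to `pd.read_csv` instead.

```python
import io
import pandas as pd

# Stand-in for an exported sales file (hypothetical columns and values).
# With a real file this would be e.g. pd.read_csv("sales.csv"), where the
# file name is an assumption for illustration.
raw_csv = io.StringIO(
    "order_id,region,product,quantity,unit_price\n"
    "1,Accra,Laptop,2,850.0\n"
    "2,Kumasi,Mouse,5,12.5\n"
    "3,Accra,Keyboard,1,45.0\n"
)

sales = pd.read_csv(raw_csv)

# A first look at what was collected: size, column names, and inferred types.
print(sales.shape)
print(sales.dtypes)
```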
SOURCES OF DATA
• Primary Data – Data you collected
• Secondary Data – Data collected by an external source
o Kaggle
o World Bank
o Ghana Statistical Service
o Data Bank
DATA CLEANING
Data cleaning involves detecting and correcting (or removing) errors, inconsistencies, and inaccuracies in a dataset. It ensures that the data is accurate, complete, and ready for analysis.

Why is this step important?

Dirty data can lead to incorrect analysis, misleading insights, and poor decision-making.
Clean data improves the reliability and credibility of your results.
DATA CLEANING CONTINUED
COMMON DATA CLEANING TASKS
1. Handling missing values: Remove rows or columns with too many null values, or fill missing values with the average, median, or mode.
2. Standardizing formats: Ensuring consistency in date formats, …
3. Correcting errors: Fixing typos, incorrect values, or outliers
4. Removing duplicates
5. Removing empty spaces using the trim feature
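The sketch below shows how several of the tasks listed above might look in pandas. The DataFrame, its column names, and the choice of the median as the fill value are all assumptions for illustration. Correcting typos and outliers is left out because it depends on the dataset.

```python
import pandas as pd

# Hypothetical messy data containing stray spaces, mixed date formats,
# a duplicated row, and a missing amount.
raw = pd.DataFrame({
    "customer": [" Ama ", "Kofi", "Kofi", "Esi", "Yaw"],
    "order_date": ["2024-01-05", "2024/01/06", "2024/01/06", "2024-01-07", "2024-01-08"],
    "amount": [120.0, 80.0, 80.0, None, 100.0],
})

clean = raw.copy()

# Remove empty spaces: strips leading and trailing spaces, similar to Excel's TRIM.
clean["customer"] = clean["customer"].str.strip()

# Standardize formats: make the separators consistent, then parse as dates.
clean["order_date"] = pd.to_datetime(
    clean["order_date"].str.replace("/", "-", regex=False), format="%Y-%m-%d"
)

# Remove duplicates: fully identical rows are kept once.
clean = clean.drop_duplicates().reset_index(drop=True)

# Handle missing values: fill with the median of the column.
clean["amount"] = clean["amount"].fillna(clean["amount"].median())

print(clean)
```

Filling with the median is only one option; the mean or mode may suit other columns better, and dropping the row is sometimes the safer choice.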
EXPLORATORY DATA ANALYSIS (EDA)
EDA is the process of examining data to uncover patterns, trends, relationships, and anomalies. It helps you understand the data before modelling or decision-making.
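A first pass at EDA in pandas often starts with summary statistics, grouped totals, and a correlation check. The sketch below uses a small made-up sales table; the column names and values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical cleaned sales data.
sales = pd.DataFrame({
    "category": ["Electronics", "Electronics", "Furniture", "Furniture", "Stationery"],
    "unit_price": [850.0, 12.5, 300.0, 150.0, 2.0],
    "quantity": [2, 5, 1, 3, 40],
})
sales["total_sales"] = sales["unit_price"] * sales["quantity"]

# Summary statistics: count, mean, spread, and min/max for numeric columns.
summary = sales.describe()

# Patterns by group: which category brings in the most revenue?
sales_by_category = (
    sales.groupby("category")["total_sales"].sum().sort_values(ascending=False)
)

# Relationships: correlation between price and quantity sold.
price_quantity_corr = sales["unit_price"].corr(sales["quantity"])

print(summary)
print(sales_by_category)
print(price_quantity_corr)
```

With only five rows the correlation is not meaningful; it is shown only to illustrate the call.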
DATA VISUALIZATION
Data visualization is creating graphical representations of data to communicate insights effectively. It helps identify patterns, trends, and relationships.

Why is Data Visualization Important?

• Makes complex data easier to understand
• Enhances storytelling with data
• Reveals trends, correlations, and outliers
• Engages stakeholders with interactive dashboards
COMMON TYPES OF DATA VISUALIZATIONS
1. Bar Chart – Compare categories
2. Line Chart – Show trends over time (weekly, monthly, and yearly growth)
3. Pie Chart – Display proportions or percentages
4. Scatter Plot – Identify relationships between two variables
5. Histogram – Understand distributions of numerical data
6. Dashboards – Combine multiple visualizations into a single interactive view
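As a sketch of how four of these chart types could be drawn with Matplotlib, the example below puts a bar chart, line chart, scatter plot, and histogram on one figure. All numbers are made up, and the output file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file; no display window needed
import matplotlib.pyplot as plt

# Hypothetical figures for illustration only.
categories = ["Electronics", "Furniture", "Stationery"]
category_sales = [1762.5, 750.0, 80.0]
months = ["Jan", "Feb", "Mar", "Apr"]
monthly_sales = [900, 1100, 1050, 1300]
prices = [850.0, 12.5, 300.0, 150.0, 2.0]
quantities = [2, 5, 1, 3, 40]

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Bar chart: compare categories.
axes[0, 0].bar(categories, category_sales)
axes[0, 0].set_title("Bar chart: sales by category")

# Line chart: show a trend over time.
axes[0, 1].plot(months, monthly_sales, marker="o")
axes[0, 1].set_title("Line chart: sales trend over time")

# Scatter plot: relationship between two variables.
axes[1, 0].scatter(prices, quantities)
axes[1, 0].set_title("Scatter plot: price vs quantity")

# Histogram: distribution of a numerical variable.
axes[1, 1].hist(prices, bins=5)
axes[1, 1].set_title("Histogram: distribution of prices")

fig.tight_layout()
fig.savefig("chart_types.png")  # hypothetical output file name
```

A static grid like this is not an interactive dashboard; tools such as Power BI, Tableau, or Plotly are the usual choice for that.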
TOOLS FOR DATA ANALYSIS/VISUALIZATION
1. Excel – Pivot Tables/Charts, Column/Bar Charts, Line Charts, Pie Charts
2. Python – NumPy, Pandas, Matplotlib, Seaborn, Plotly
3. Power BI/Tableau – Interactive Dashboards
4. Looker Studio
DATA ANALYSIS WITH EXCEL
o Download Sales Data
o Data Cleaning
o Create new columns – Total Sales & Profit Margin
o Create pivot tables – Sales Per Category, Profit by Region, Quantity sold by price, Sales Trend Over time, Average price per payment method…
o Create pivot charts
o Create a Dashboard
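For comparison, the calculated columns and pivot tables from the Excel steps above can be sketched in pandas. The data and column names are hypothetical and not taken from the workshop file. The formulas for total sales, profit, and profit margin are assumptions, since the slides do not define them.

```python
import pandas as pd

# Hypothetical sales data; column names are assumptions for illustration.
sales = pd.DataFrame({
    "category": ["Electronics", "Electronics", "Furniture", "Furniture"],
    "region": ["North", "South", "North", "South"],
    "payment_method": ["Cash", "Card", "Card", "Cash"],
    "quantity": [2, 5, 1, 3],
    "unit_price": [850.0, 12.5, 300.0, 150.0],
    "unit_cost": [700.0, 8.0, 210.0, 120.0],
})

# New columns. Assumed formulas:
#   total sales   = quantity * unit price
#   profit        = quantity * (unit price - unit cost)
#   profit margin = profit / total sales
sales["total_sales"] = sales["quantity"] * sales["unit_price"]
sales["profit"] = sales["quantity"] * (sales["unit_price"] - sales["unit_cost"])
sales["profit_margin"] = sales["profit"] / sales["total_sales"]

# Pivot tables mirroring "Sales Per Category", "Profit by Region",
# and "Average price per payment method".
sales_per_category = sales.pivot_table(
    index="category", values="total_sales", aggfunc="sum"
)
profit_by_region = sales.pivot_table(
    index="region", values="profit", aggfunc="sum"
)
avg_price_by_payment = sales.pivot_table(
    index="payment_method", values="unit_price", aggfunc="mean"
)

print(sales_per_category)
print(profit_by_region)
print(avg_price_by_payment)
```

`pivot_table` plays the role of Excel's PivotTable here: `index` corresponds to the Rows area, `values` to the Values area, and `aggfunc` to the summary function.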
