Introduction

Uploaded by

Kingsley Esedebe
Copyright
© All Rights Reserved

INTRODUCTION

FIRSTLY, WHAT IS Data?


Data is simply unprocessed facts. Any fact (raw information) that is unprocessed (not yet examined closely enough to
know ‘EXACTLY’ what it is saying) can be called Data.

So basically, ‘EVERYTHING’ can be Data: from the food we eat, to the hair we make, to the clothes we
wear, to the music we listen to... EVERYTHING!

Types of Data
There are 3 main types of data, namely:

1. Structured Data: This is data that is arranged in a rigid tabular format. E.g. exam result data,
attendance list data.
2. Semi-structured Data: This is mixed data. It is not in a rigid format, but it has some of the consistent
characteristics seen in structured data. E.g. a market list arranged in CSV (comma-separated
values) format: [Bread, Fish, Meat, Milk]
3. Unstructured Data: This is complex data that cannot easily be arranged in a rigid tabular
format. E.g. videos, images, text messages.

*NOTE: Data analysts usually work with structured and semi-structured data because they are far
easier to analyze. Analysis of unstructured data is usually done by more advanced big-data scientists.
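
As a rough illustration of the first two types, the sketch below holds one small structured dataset (every row has the same columns) and one small semi-structured dataset (JSON records whose fields can vary). All names, scores and prices are made up for the example, and JSON is used here only as one common way to write semi-structured data.

```python
import json

# Structured: every row has exactly the same columns (made-up exam results).
exam_results = [
    {"student": "Ada", "score": 78},
    {"student": "Bola", "score": 64},
    {"student": "Chidi", "score": 91},
]

# Semi-structured: records share a general shape, but fields can vary.
market_list_text = """
[
  {"item": "Bread", "price": 500},
  {"item": "Fish", "price": 1500, "tags": ["fresh"]},
  {"item": "Milk"}
]
"""
market_list = json.loads(market_list_text)

# Structured data can be used directly: every row has a score.
average_score = sum(row["score"] for row in exam_results) / len(exam_results)

# Semi-structured data needs a check first: a field may be missing.
known_total = sum(entry.get("price", 0) for entry in market_list)
missing_price = [entry["item"] for entry in market_list if "price" not in entry]

print(round(average_score, 2))
print(known_total)
print(missing_price)
```

The extra check for a missing field is the practical difference: with structured data you can rely on every column being there, with semi-structured data you cannot.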

Sources of Data for analysis


1. Databases: these are simply places where large banks of data are stored.
2. Open files: these are usually datasets available to the world for analysis. Examples are weather
forecast datasets, football season datasets, etc.
3. Web services: data can be gotten from the web using a process known as web scraping.
4. Data from applications using APIs: APIs are middlemen that help you source data from
an application/software by making requests on your behalf, and facts from applications are
a very good source of data. E.g. getting data about the most talked-about song on Twitter.
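
To make two of these sources a bit more concrete, here is a minimal sketch that reads an open-file style dataset (CSV text) and an API style response (JSON text). Both texts are hard-coded stand-ins invented for this example: no real file is opened and no real API is called, and the field names `temperature_c` and `mentions` are assumptions.

```python
import csv
import io
import json

# Open file: a tiny stand-in for a public CSV dataset (values are made up).
open_file_text = "city,temperature_c\nLagos,31\nAbuja,29\nKano,35\n"
weather_rows = list(csv.DictReader(io.StringIO(open_file_text)))

# API: a hypothetical JSON body, shaped like what an application might return.
api_response_text = (
    '{"songs": [{"title": "Song A", "mentions": 1200},'
    ' {"title": "Song B", "mentions": 3400}]}'
)
api_data = json.loads(api_response_text)

# Once loaded, data from both sources can be questioned in the same way.
hottest = max(weather_rows, key=lambda row: int(row["temperature_c"]))
most_mentioned = max(api_data["songs"], key=lambda song: song["mentions"])

print(hottest["city"])
print(most_mentioned["title"])
```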

NOW, WHAT IS Data Analysis/Data Analytics?


Data Analysis/Analytics is the discipline that involves collecting, cleaning and transforming
raw data, with the aim of getting useful information (insights) that will help key stakeholders
of businesses and organizations to make well-informed, accurate decisions.

WHAT is the Difference between Data Analysis and Data Analytics?

First, let’s define the 2 terms:

-Data analysis involves extracting meaning from data in a way that’s useful to a decision-maker. Data
analytics is broader in scope: it refers to the process of using data, analytical tools and
techniques to find new insights and make new predictions, usually for the benefit of
the organization.

-PLAINLY: Data analytics is the broad field of using special analytical tools and techniques to
generate useful insights from data, which then help organizations to make correct data-driven
decisions, whereas data analysis is the specific process of analyzing raw data.

*Many people in MANY different fields can apply the data analysis process to analyze
a particular dataset, but only a true ‘Data Analyst’ has ALL the required skills and knowledge to
work in a firm or organization and actively help it work with its data, so that it can make
well-informed and accurate decisions that take the company or business further.

SUMMARY: So what we’re actually studying right now is ‘Data analytics’, because ‘Data analysis’
as a discipline focuses on the process of analyzing data to get insights, whereas ‘Data
analytics’ as a discipline focuses on both the process of analyzing data and its direct real-world
application in businesses and organizations.

*The best use case for the data analysis process is in businesses and specific organizations
(governmental or non-governmental), because you can’t just analyze data without an aim in mind.
You analyze with the aim of improving something, such as making a governmental organization much
better, or helping a business move to the next level.

[FOR THIS COURSE THOUGH, WE’LL BE USING THE NAME ‘DATA ANALYSIS’]

WHAT is the Difference between DATA ANALYSIS and DATA SCIENCE?


Data Analysis and Data Science are closely related, but have some differences:

-Data Science is the Father of Data analysis:

*ALL Data Scientists can be called ‘Data Analysts’ and can actively ‘analyze data’, whereas not all Data
Analysts can be called Data Scientists.

-Whereas Data analysis is mainly focused on finding reasonable and useful information (insights),
Data Science is focused on discovering hidden patterns and relationships that can help a
business or corporation stay far ahead of its competitors or contemporaries.

-Data Scientists use more advanced statistical and analytical tools and techniques because they are
more focused on making predictions about the ‘Future’, and some of the normal Data analyst tools and
techniques are not best suited for that.

-Data Science also involves analyzing much larger datasets, because data scientists usually do far more
advanced work than a regular data analyst’s work scheme.

KEY SUMMARY

1. D.Sc is the Senior brother of Data analysis: ALL data scientists are data analysts
2. D.Sc aims to find hidden patterns and relationships, unlike D.A, which just wants to find insights
3. D.Sc is usually focused on ‘Making Predictions’ for the future, while D.A mostly wants to check and
understand what happened in the ‘Past’
4. Data Scientists usually analyze much larger datasets, as they have more advanced skillsets

What are the types of Data analysis?


1. Descriptive analysis: aims to answer the question [‘WHAT HAPPENED?’]
2. Diagnostic analysis: aims to answer the question [‘WHY DID IT HAPPEN?’]
3. Predictive analysis: aims to answer the question [‘WHAT WILL HAPPEN NEXT?’]
4. Prescriptive analysis: aims to answer the question [‘WHAT SHOULD BE DONE ABOUT IT?’]
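
The first and third types can be sketched on a tiny dataset. The monthly sales figures below are invented, and the straight-line fit is just one simple way to make a prediction, chosen here for illustration. Diagnostic and prescriptive analysis need knowledge of the business itself, so they are not shown.

```python
from statistics import mean

# Hypothetical monthly sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [100, 120, 140, 160, 180]

# Descriptive analysis: WHAT HAPPENED?
total_sales = sum(sales)
average_sales = mean(sales)

# Predictive analysis: WHAT WILL HAPPEN NEXT?
# Fit a straight line (least squares) and extend it to month 6.
mean_x = mean(months)
mean_y = mean(sales)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales)) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x
forecast_month_6 = slope * 6 + intercept

print(total_sales, average_sales)
print(forecast_month_6)
```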

Process of Data analysis


1. Identifying the Data to analyze
2. Data Gathering and Collection
3. Data Cleaning
4. Data Transformation
5. Data Analyzing proper [where we get our results from]
6. Data Visualization(Interpretation of Results)
7. Presentation of results and key findings [the most important step]

Short Notes on these Processes

1. Identifying the Data to analyze: involves knowing what kind of data you need for your
particular analysis project. It’s a ‘Mental Process’ (done in the head).
2. Data Gathering and Collection: involves actively extracting and collecting data from many different
sources based on its importance to your analytical project, then compiling and joining ‘ALL’ the
different datasets into 1 file from which they can be imported at once.
Basic Breakdown
a. Data Extraction: Sourcing data from many different, accurate sources
b. Data Gathering: Bringing all the different datasets together at once
c. Data Arranging: Compiling the whole data into 1 big file (mostly an Excel file)
3. Data Cleaning: involves removing all the inconsistencies and inaccuracies in your
dataset, thereby making it more consistent and accurate.
4. Data Transformation: involves making the needed changes or additions to your
dataset that will make it more ready for analysis.
5. Data Analyzing proper: involves using statistical and analytical tools to properly analyze
and understand what your data is saying to you.
6. Data Visualization [Interpretation (Explanation) of Results]: involves using charts and
graphs to interpret and represent the result of your analysis on that particular dataset.
7. Presentation of results and key findings: involves actively speaking to the
key stakeholders of the business or organization you work for about the
findings you’ve gotten from your analysis process. It’s a ‘Communication Process’ (done by
speaking).
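
Steps 2 to 5 can be walked through on two tiny datasets. This is only a sketch: the attendance records, the names, and the assumed term length `TOTAL_DAYS` are all invented for the example, and real projects would normally use a tool such as Excel or a data library rather than plain loops.

```python
# Two small, made-up attendance datasets from different sources.
source_a = [
    {"name": " Ada ", "days_present": "18"},   # stray spaces in the name
    {"name": "Bola", "days_present": ""},      # missing value
]
source_b = [
    {"name": "Chidi", "days_present": "20"},
    {"name": "Ada", "days_present": "18"},     # duplicate of a row in source_a
]
TOTAL_DAYS = 20  # assumed length of the term

# 2. Gathering: bring the datasets together.
gathered = source_a + source_b

# 3. Cleaning: trim names, drop rows with missing values, remove duplicates.
cleaned = []
seen = set()
for row in gathered:
    name = row["name"].strip()
    if not row["days_present"] or name in seen:
        continue
    seen.add(name)
    cleaned.append({"name": name, "days_present": int(row["days_present"])})

# 4. Transformation: add a derived column that makes analysis easier.
for row in cleaned:
    row["attendance_rate"] = row["days_present"] / TOTAL_DAYS

# 5. Analysis: compute a summary figure.
average_rate = sum(row["attendance_rate"] for row in cleaned) / len(cleaned)

print([row["name"] for row in cleaned])
print(average_rate)
```

Here the row with the missing value is simply dropped. Whether to drop, fill in or go back and ask for a missing value is a judgment call that depends on the project.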

Data Analysis tools


1. Tools focused on Data Extraction: Apache Hadoop
2. Tools focused on Data Cleaning: Excel, Python and R
3. Tools focused on Data Analysis proper: STATA and SPSS
4. Tools focused on Data Visualization: Power BI, Tableau
5. Tools that analyze data in a database: SQL
6. OGAs (the bosses), tools that can do everything well: Excel, Python and R
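
As a small taste of two of these tools working together, the sketch below uses Python's built-in `sqlite3` module to run an SQL query on a throwaway in-memory database. The `sales` table and its rows are invented for the example. A real project would connect to an existing database instead.

```python
import sqlite3

# Create a temporary in-memory database with a made-up sales table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
connection.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Bread", 500), ("Milk", 800), ("Bread", 700), ("Milk", 200)],
)

# SQL does the analysis inside the database: total sales per product.
totals = connection.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()
connection.close()

print(totals)
```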
