
Data Analysis

1. The document outlines the key steps in the data analysis process including: defining the problem, exploring and analyzing the data, and communicating results. 2. Some important aspects of data analysis covered are identifying the problem statement, question to be answered, and sources of data. The document also discusses data wrangling, biases, and different data types. 3. Key analysis techniques mentioned are descriptive statistics, sorting and exploring data distributions, and finding patterns and relationships in the data. The overall goal is to gain insights from the data to help solve problems.

Uploaded by

ukm esensial

DATA ANALYSIS

1. Data analysis process


a. Question to solve or answer (problem statement)
b. Identify, explore and define the problem:
i. Identify
Clarify the situation (USGF, SWOT, comparative study, survey, performance
reports), then summarize. If the problem is unclear, use the 5 Ws (who, what,
why, where, when).
Data come from: in-person research, online research, government agencies,
colleagues (identify what is missing)
ii. Explore: conclusion, prediction (needs statistics and ML)
iii. Define
1. Build intuition: build a conceptual model (predict something by
stating what might happen)
2. Find patterns (forming a hypothesis):
If (hypothesis) Then (conclusion/prediction): deductive reasoning
Deductive: general to specific (provided the premises are true);
applies theories to a particular situation
Inductive: specific to general (starts with observations, looks for
patterns, then creates a tentative hypothesis from which a theory
emerges); the conclusion can be false
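The If/Then form of a hypothesis can be sketched in a few lines of Python. This is only an illustration: the observations, the 5-hour threshold and the function names are hypothetical and do not come from the notes. It shows why an inductive conclusion can be false: the hypothesis fits every observation collected so far, yet one new observation contradicts it.

```python
# Hypothetical observations: (hours_of_training, passed_inspection)
observations = [(2, False), (5, True), (8, True), (3, False), (9, True)]


def hypothesis(hours):
    """IF training lasts at least 5 hours THEN the inspection is passed."""
    return hours >= 5


def counterexamples(data):
    """Return the observations that contradict the hypothesis."""
    return [(hours, passed) for hours, passed in data if hypothesis(hours) != passed]


# Inductive step: the pattern holds for everything observed so far.
print(counterexamples(observations))

# A single new observation is enough to show the generalisation was wrong.
new_observation = (7, False)
print(counterexamples(observations + [new_observation]))
```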
c. Analyze the data
Definition: data is a value assigned to an attribute of something (qualitative,
or quantitative which can be measured)
Wrangling data: data finding/acquisition and data cleaning
1. Finding data / data acquisition: internal sources, open sources and paid sources
a. Types of data
i. Qualitative: described subjectively, non-numerical
1. Binary: either/or categories
2. Nominal: unordered categories (e.g. colour, shape)
3. Ordinal: implied ordering of categories (e.g. small,
medium, large)
ii. Quantitative: described and manipulated numerically
1. Discrete: a distinct, integer-based count (no
fractional numbers)
2. Continuous: can take any value within a range (e.g.
height, weight)
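The data types above determine which operations are meaningful. The sketch below uses small made-up columns (all values and variable names are hypothetical): nominal data can be counted, ordinal data can be sorted once an order is stated, binary data gives a share, and quantitative data supports arithmetic.

```python
from collections import Counter
from statistics import mean

# Hypothetical example columns, one per data type.
has_permit = [True, False, True, True]           # binary
colours = ["red", "blue", "red", "green"]        # nominal
sizes = ["small", "large", "medium", "small"]    # ordinal
visits = [3, 0, 2, 5]                            # discrete
heights_cm = [162.5, 171.0, 180.2, 158.9]        # continuous

# The ordering of an ordinal variable has to be stated explicitly.
SIZE_ORDER = ["small", "medium", "large"]

share_with_permit = sum(has_permit) / len(has_permit)       # binary: proportion
most_common_colour = Counter(colours).most_common(1)[0][0]  # nominal: counting only
sorted_sizes = sorted(sizes, key=SIZE_ORDER.index)          # ordinal: sorting is valid
total_visits = sum(visits)                                  # discrete: whole-number counts
mean_height = mean(heights_cm)                              # continuous: any value in a range

print(share_with_permit, most_common_colour, sorted_sizes, total_visits, mean_height)
```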
Bias, opinion and assumption
FACTS: can be proved and checked, reflect what happened and are verifiable by
objective data.
OPINION: cannot be proved, reflects thoughts or feelings, includes words such as
"feel, seem, always, never".
'suspect' ≠ opinions
BIAS: a cautionary tale; anything that impacts the results of an activity
(including data analysis), e.g. selection bias (a cherry-picked data selection
process, which will impact the result)
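Selection bias can be illustrated with a few numbers. In this sketch the response times are invented and the one-week cutoff is an arbitrary assumption; the point is only that averaging a cherry-picked subset gives a very different answer from averaging all the data.

```python
from statistics import mean

# Hypothetical response times (in days) for ten service requests.
response_days = [2, 3, 4, 5, 6, 9, 12, 15, 20, 24]

# Unbiased view: use every record.
full_mean = mean(response_days)

# Cherry-picked view: keep only requests closed within a week (assumed cutoff).
cherry_picked = [days for days in response_days if days <= 7]
biased_mean = mean(cherry_picked)

print("all requests:", full_mean)
print("cherry-picked:", biased_mean)
```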
Downloading and Manipulating data
• CSV: comma-separated values, a standard format for storing tabular data (e.g.
Excel)
• Clean data and dirty data
• Dirty data: must be cleaned; cleaning should be automated and repeatable
o spelling errors, duplicate rows
o blank fields, lower/uppercase text
o spaces and non-printing characters in text
o improperly formatted numbers and number signs
o improperly formatted dates and times
• Structured (machine readable): CSV, HTML, XML and JSON
• Unstructured (human readable): PDF, TXT
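One way to make cleaning automated and repeatable is to put each fix in a small function and run every row through the same steps. The sketch below uses only the Python standard library. The sample data, column names and helper functions are hypothetical, and the choice to read `05/03/2021` as day/month/year is an assumption that would need checking against the real source.

```python
import csv
import io
from datetime import datetime

# Hypothetical dirty data: stray spaces, mixed case, a duplicate row,
# a blank field, a thousands separator and two different date formats.
RAW_CSV = """name,city,amount,date
 Alice ,JAKARTA,"1,200",2021-03-05
alice,jakarta,"1,200",2021-03-05
Budi,Bandung\u00a0,350,05/03/2021
Citra,,n/a,2021-03-07
"""


def clean_text(value):
    """Drop non-printing characters, trim spaces and normalise case."""
    value = value.replace("\u00a0", " ")  # non-breaking space
    value = "".join(ch for ch in value if ch.isprintable())
    return " ".join(value.split()).title()


def clean_number(value):
    """Parse a number with thousands separators; None if it is not a number."""
    try:
        return float(value.replace(",", "").strip())
    except ValueError:
        return None


def clean_date(value):
    """Return an ISO date; day/month/year is assumed for slash-separated dates."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(raw_csv):
    """Clean every row the same way and drop rows that become duplicates."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        record = (
            clean_text(row["name"]),
            clean_text(row["city"]) or None,  # blank field becomes None
            clean_number(row["amount"]),
            clean_date(row["date"]),
        )
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


for record in clean_rows(RAW_CSV):
    print(record)
```

Because the rules live in functions rather than in manual edits, the same cleaning can be rerun whenever a new copy of the data is downloaded.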

Descriptive Statistics (distributions, central tendencies and standard deviation)
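As a rough illustration of these three ideas, the sketch below summarises a small invented list of ages with Python's `statistics` module. The data are hypothetical, and the sample standard deviation (`stdev`) is used here as one possible choice; `pstdev` would be the population version.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Hypothetical sample: ages of seven survey respondents.
ages = [23, 25, 25, 29, 31, 35, 42]

# Distribution: how often each value occurs.
distribution = Counter(ages)

# Central tendencies.
average = mean(ages)
middle = median(ages)
most_frequent = mode(ages)

# Spread: sample standard deviation.
spread = stdev(ages)

print("distribution:", dict(distribution))
print("mean:", average, "median:", middle, "mode:", most_frequent)
print("standard deviation:", round(spread, 2))
```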

ANALYZE STEPS:
1. Get the data sets
2. Make a copy
3. Sort the data
4. Find the min and max for each variable
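The four steps above can be sketched as follows. The data set, field names and the choice of sort key are hypothetical; in practice step 1 would load a downloaded file instead of a hard-coded list.

```python
import copy

# Step 1: get the data set (hypothetical records, hard-coded for illustration).
dataset = [
    {"district": "North", "population": 12000, "requests": 340},
    {"district": "South", "population": 8500, "requests": 410},
    {"district": "East", "population": 15300, "requests": 290},
]

# Step 2: make a copy so the original stays untouched.
working = copy.deepcopy(dataset)

# Step 3: sort the copy (here by population, smallest first).
working.sort(key=lambda row: row["population"])

# Step 4: find the min and max for each numeric variable.
numeric_variables = ["population", "requests"]
ranges = {
    name: (min(row[name] for row in working), max(row[name] for row in working))
    for name in numeric_variables
}

print([row["district"] for row in working])
print(ranges)
```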

https://www.udemy.com/course/introduction-to-data-analysis-for-government/learn/lecture/9223368#reviews

d. Communication phase (share): data visualization (presentation)
