Data Analysis Training Final

Advanced Data Analysis Training Using Spreadsheets, General Education Inspection Desk, Ministry of Education

Uploaded by

Gesese Ganka

Advanced Data Analysis Training Using Spreadsheets
Input (ግብዓት), Process (ሂደት), Output (ውጤት)

General Education Inspection Desk


Ministry of Education

Adama
Dire International Hotel
Outline
01 Introduction
02 Objective
03 Quality data, information, characteristics, and challenges
04 How can we build a data flow system
05 Data flow problems, solutions, and updates
06 How to develop a reporting template
07 Data analysis using Excel
08 Graph and chart development

Set By Alamerewu Aklilu

Introduction

Data analysis is the process of inspecting, cleaning, transforming, and modeling data to extract insights and useful information.


Cont’d

Key skills include


 Data cleaning,
 Statistical knowledge,
 Database querying languages,
 Data visualization,
 Machine learning,
 Programming languages,
 Spreadsheets, and
 Data manipulation tools.
Objective

Participants learn how to gather, clean, and organize data, use statistical analysis methods, make use of data analysis tools, and make decisions based on school inspection data.


Data and Information

 Data and information are related but different concepts.
 Data refers to raw, objective facts or observations that have not been given context or interpretation.
 Examples of data include numbers, images, text, and audio recordings.


Data and Information

 Information, on the other hand, is data that has been processed, organized, and given meaning in a particular context.
 Information is typically more structured and useful than data because it provides insights, knowledge, or understanding of a situation or subject.
 For example, the temperature readings from a thermometer are data, and a list of customer names and purchase amounts is also data; they become information once they are processed and placed in context.
Importance of information for education
Information is essential for
 Decision-making,
 Communication,
 Innovation,
 Competitive advantage, and
 Risk management.
It also helps people to
 Share ideas,
 Collaborate and coordinate activities, and
 Gain a competitive advantage by analyzing education system trends and school status.
Problems During Data Collection

 Inconsistent and unreliable information exchange and collection system
 Inconsistent data collection templates
 Data encoding problems
 No system of responsibility
 Data not used for the desired purpose
 Data not sent on time: “ከዘገየ መረጃ የቀረ ይሻላል” ("missing information is better than late information")
 Template not cascaded hierarchically to the regional, zonal, and Woreda levels
Characteristics of quality data

Data quality is determined by six main characteristics.
1. Accuracy /ትክክለኛነቱ
2. Reliability /አስተማማኝ
3. Completeness /የተሟላ
4. Timeliness /ወቅታዊ
5. Relevance/appropriateness /ተገቢነቱ
6. Presentation /አቀራረቡ
Data flow system building

 Develop school inspection data needs and goals.
 Define an inspection data management plan.
 Select hardware, database software, and networking technologies.
 Implement school inspection data processing techniques.
 Create an inspection data flow architecture.
 Test and refine the system.
Data Flow Diagram

[Diagram: data flow among School, Woreda, Zone, Region, and Ministry of Education]


Solving quality Data Problem
[Diagram: problem-solving cycle. Labels: Real World Problem; Formulation (define, scope, specify, acquire data); Math Model; Calculate/Deduce; Model Solution; Interpretation; Real World Solution; Test; Sensitive; Refine; Communicate; Use; Implement]


[Diagram: school inspection framework organized as Input, Process, and Output. Labels: Infrastructure; Human Resource; Financial Resource; Participatory SIP Plan; Teaching and Learning; Curriculum; M & E and Effective Resource Utilization; School Community Engagement; Internal Efficiency; Student Attainment; Personal Development]
Responsibilities of school inspection data analyst

 Compile, analyze, and report the results of data sent from the region/zone/sub-city/Woreda every quarter to the relevant parties.
 Develop an inspection workflow plan.
Data Analysis in Spreadsheet Goals

 Create a Workbook
 Make Basic Formatting Changes
 Save and Print a Workbook
 Previous Versions of Excel
 Excel Basics Formulas
 Graph, Chart, Table
 Reporting template, data cleaning, testing data by statistical tools


Introduction to Excel
Blank workbook
Enter Data into a Worksheet

 A worksheet is a grid array of cells, into which you can enter data.
 The columns are lettered,
 and the rows are numbered.
 A cell can be referred to by its “reference”, which is its column letter followed by its row number.
Creating a table

To define a table in Excel, first write all the required data in a new sheet and name them.
Steps
1. Select all the data.
2. Right-click, select Define Name, and write the name of the file.
3. Go to the data collection format sheet and select all the arrays that belong to the defined file.
4. Go to the Data tab and select Data Validation.
Formula

 Overview:
 Every formula in Excel starts with the equal sign (“=”).
 This sign is very meaningful and tells the computer:
 “What’s written next to the equal sign is not simple text. I want to calculate something.
If a combination of a letter and a number appears next to it (e.g. B6, G78, D13), it refers to the address of a cell in the worksheet.
If arithmetic signs appear (signs like + - * /), they refer to the operations they mean (e.g. adding, subtracting, etc.).”
Formula

 Creating a formula step by step
1. Type the = sign.
2. Type the address of the cell to include in the formula.
-or-
2. Click with the mouse on the cell you want to include in the formula (the cell’s address will automatically appear in the formula, no need to type it).
3. Type an arithmetic sign (e.g. +, -, *, /, see list below).
4. Continue adding your desired elements to the formula (each could be a number, another cell’s address, or a function).
5. End by pressing the [Enter] key.
Cont’d

 Examples of formulas:
 =B4
 In words: Show me the value of the cell at address B4.
 =B4*7+3
 In words: Multiply the value of cell B4 by 7, and add 3 to it.
 =(B4+B5+B6)*D6
 In words: Add up the values of cells B4, B5, and B6, and multiply their sum by the value of cell D6.
Cont’d

You can include functions in your formulas:
 =SUM(B4:B6)*D6
 This does exactly the same as the formula =(B4+B5+B6)*D6.
 =B4*MIN(C1:C10)+7
 In words: Multiply the value of cell B4 by the smallest number in the range of cells C1:C10, and add 7 to it.
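As a rough cross-check outside Excel, the formulas above can be mirrored in Python. This is only a sketch: the dictionary `cells` and its values are made-up sample data, and the range C1:C3 is used instead of C1:C10 to keep the sample small.

```python
# Hypothetical worksheet values, keyed by cell address (sample data only).
cells = {"B4": 2, "B5": 3, "B6": 5, "D6": 10,
         "C1": 9, "C2": 4, "C3": 7}

# =(B4+B5+B6)*D6
explicit = (cells["B4"] + cells["B5"] + cells["B6"]) * cells["D6"]

# =SUM(B4:B6)*D6 gives the same result using a range
with_sum = sum(cells[a] for a in ("B4", "B5", "B6")) * cells["D6"]

# =B4*MIN(C1:C3)+7
with_min = cells["B4"] * min(cells[a] for a in ("C1", "C2", "C3")) + 7

print(explicit, with_sum, with_min)  # 100 100 15
```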
Data cleaning
1. Trim()
2. Fill color using conditional formatting number of
3. Conditional and COUNT functions to check that the required list is filled
4. Data validation to prevent unnecessary data entry
5. Data format conversion (percentage and numeric)
Data cleaning

1. Blank cells: selecting a strategy

 Select the range in which you want to identify blank cells. On the Home tab, go to Find & Select, click the drop-down arrow, and select Go To Special. In the dialog box, select Blanks and press OK, then type Not available (NA) and press Ctrl + Enter simultaneously.
Data cleaning
 Select the column that contains duplicate values, go to Conditional Formatting, and select the “Duplicate Values” option under Highlight Cells Rules; you can then easily sort the data by expanding your selection.
 To remove duplicate values, select the column that contains them, go to the Data tab, select Remove Duplicates, choose to expand the selection, tick “My data has headers”, select the school code only, then click OK.
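The cleaning steps in this section (trimming spaces, filling blank cells with NA, and removing duplicates by school code) can be sketched in Python as follows. The rows and the helper name `excel_trim` are hypothetical and only illustrate the idea; they are not part of the training material.

```python
import re

# Hypothetical rows of (school code, school name); sample data only.
rows = [
    ("S001", "  Adama   Primary "),
    ("S002", ""),
    ("S001", "Adama Primary"),
    ("S003", "Dire  Secondary"),
]

def excel_trim(text):
    """Mimic TRIM(): drop leading/trailing spaces, collapse runs of spaces."""
    return re.sub(" +", " ", text.strip(" "))

cleaned = []
seen_codes = set()
for code, name in rows:
    if code in seen_codes:   # keep only the first row per school code
        continue
    seen_codes.add(code)
    name = excel_trim(name)
    cleaned.append((code, name if name else "NA"))  # fill blanks with NA

print(cleaned)
```

Keeping the first occurrence of each school code matches what Remove Duplicates does when only the school code column is ticked.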
Percentage
Excel understands the % sign when attached to numbers inside a formula.
=A5*20%
In words: Multiply the value of cell A5 by 0.2 (or: give me 20 percent of A5).
=A5*115%
In words: Show me the value of cell A5 plus 15 percent of it.
=A5*(1+B5)
Let’s assume the value 15% is written inside cell B5; then the formula in words will be:
Show the value of A5 plus 15 percent of it. That seems the same as the previous formula, but the difference is that now you can easily change the percentage: just type a new value in cell B5 (e.g. 20%) and the formula gives a newly updated result.
List of Arithmetic Sign

 + plus
 - minus
 * multiplication
 / division
There is also:
 ^ exponentiation, for example =2^4 (gives the value 16)
Cont’d

Examples of formulas:
 One note:
You should take into consideration the order in which arithmetic is calculated. Consider the following two formulas:
=1+2*3+4
=(1+2)*3+4
They are not equal: the first one gives the value 11 (multiplication is calculated first), and the second gives the value 13.
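The same order of calculation can be checked in Python, which also multiplies before it adds. This is just an illustrative sketch; note that Python writes exponentiation as `**`, whereas Excel uses `^`.

```python
# Multiplication is evaluated before addition, as in Excel.
without_brackets = 1 + 2 * 3 + 4   # 1 + 6 + 4
with_brackets = (1 + 2) * 3 + 4    # 3 * 3 + 4
power = 2 ** 4                     # Excel: =2^4

print(without_brackets, with_brackets, power)  # 11 13 16
```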
Common mistakes when using formulas
 Forgetting to write the equal sign
 Always start a formula with the “=” sign; otherwise, it will just be written to the cell as text.
 Mixing text with numbers
 Formulas can’t be based on cells that have mixed numbers and text. For example, if a cell contains something like this:
120 Kg. or 60 Km/h or 2,500 Birr
Formulas can’t be based on such kinds of cells.
You have 2 options:
A. Write the text (signs or letters) in an adjacent cell, and not together with the number.
B. Write only the number in the cell, and afterward use cell formatting to add special signs or letters (but either way don’t type them inside the cell with the numbers).
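Option A (number and text kept apart) can be illustrated with a short Python sketch that splits such mixed entries into a number and a unit. The function name `split_number_and_unit` is hypothetical and the pattern only covers simple cases like the three examples above.

```python
import re

def split_number_and_unit(cell_text):
    """Split text such as '2,500 Birr' into a number and the trailing text."""
    match = re.match(r"\s*([\d,]+(?:\.\d+)?)\s*(.*)", cell_text)
    if match is None:
        raise ValueError(f"no leading number in {cell_text!r}")
    number = float(match.group(1).replace(",", ""))  # drop thousands separators
    return number, match.group(2).strip()

for text in ["120 Kg.", "60 Km/h", "2,500 Birr"]:
    print(split_number_and_unit(text))
```

Once the number is stored on its own, it can be used in calculations, and the unit can be shown through formatting or in a neighbouring cell.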
Relative Referencing

The default behavior of formulas in Excel is relative referencing.
Let’s understand this concept:
Assume that inside cell B5 you have the following formula:
=G2+1
If you copy this formula two cells downwards into cell B7 (either by doing “copy” and “paste”, or by dragging the fill handle), the formula in B7 will be:
=G4+1
Absolute Referencing

Look at the following formula:
=$G$4+1
It’s actually the same formula as =G4+1 but with $ signs.
The $ signs might look a bit annoying and make the formula less readable, but they have only one meaning to Excel: “Don’t change the address of cell G4 inside the formula when copied”.
That’s why it’s called “Absolute Referencing”.
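To make the copying rule concrete, here is a minimal Python sketch of how a single cell reference changes when a formula is copied a given number of rows or columns. The function names (`shift_reference`, `col_to_num`, `num_to_col`) are hypothetical helpers written for illustration, not Excel features.

```python
import re

def col_to_num(col):
    """Convert a column letter such as 'G' to its number (A=1)."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def num_to_col(n):
    """Convert a column number back to letters (1 -> 'A')."""
    col = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        col = chr(ord("A") + rem) + col
    return col

def shift_reference(ref, rows=0, cols=0):
    """Shift only the parts of the reference that have no $ in front."""
    m = re.fullmatch(r"(\$?)([A-Z]+)(\$?)(\d+)", ref)
    if m is None:
        raise ValueError(f"not a cell reference: {ref!r}")
    col_abs, col, row_abs, row = m.groups()
    if not col_abs:
        col = num_to_col(col_to_num(col) + cols)
    if not row_abs:
        row = str(int(row) + rows)
    return f"{col_abs}{col}{row_abs}{row}"

print(shift_reference("G2", rows=2))            # G4
print(shift_reference("$G$4", rows=2))          # $G$4
print(shift_reference("G$4", rows=2, cols=1))   # H$4
```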
19 Excel Tips that could save your time

1. Split windows and freeze panes
2. Hide and Unhide command
3. Moving around a spreadsheet with Ctrl, Shift, and Arrow keys
4. Name cells/ranges
5. Sort command
6. Toggling among relative and absolute references
7. Fill down and fill right commands
8. IF function
9. AND and OR functions
10. SUM and SUMIF functions
11. Subtotals and Totals
12. SUMPRODUCT function
13. NPV function
14. COUNT functions
15-17. ROUND, ROUNDUP, and ROUNDDOWN functions
18. VLOOKUP and HLOOKUP functions
19. Insert Function command
Cont’d

18. Paste Special command
19. Auditing features
20. Goal Seek add-in
21. Solver add-in
22. Data tables
23. Scenarios add-in
24. Pivot Tables
25. Protecting cells and worksheets
26. Editing multiple worksheets simultaneously
27. Conditional formatting
28. Autofilter command
29. Customize tool bars
30. Changing default workbook
31. Group and Ungroup your spreadsheet
32. Switch off the Microsoft Actors
33. Clean up text
34. Keyboard shortcuts
35. Final thoughts
SUMIF

What it does?
sums items in a list matching a condition
Syntax:
sumif(in this range,values that meet this criteria,[sum-this-range])
Example:
=sumif(A1:A20,10) = sums the cells with the value of "10"
SUMIF

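• We want to know how many primary and secondary schools we have.
• =sumif(a2:a14,"Primary",c2:c14)
• =sumifs(c2:c14,a2:a14,"Primary",b2:b14,"Secondary")
• Note that SUMIFS takes the sum range first, followed by pairs of condition range and condition.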
SUMIF

=sumif(condition range,condition, sum range)

Sum alternate Rows/Columns

Number    Amount    Condition
1         56        0
2         35        1
3         66        0
4         23        1
5         98        0
6         125       1
7         65        0

Alt. Row Sum 1: 183    =sumif(E15:E21,1,D15:D21)
Alt. Row Sum 0: 285    =sumif(E15:E21,0,D15:D21)
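The alternate-row example can be reproduced with a small Python sketch. The function `sumif` here is a simplified stand-in written for illustration (exact-match conditions only), not Excel's implementation; the amounts and conditions are the ones from the table.

```python
amounts = [56, 35, 66, 23, 98, 125, 65]   # the Amount column (D15:D21)
conditions = [0, 1, 0, 1, 0, 1, 0]        # the Condition column (E15:E21)

def sumif(condition_range, condition, sum_range):
    """Add the values whose matching condition cell equals `condition`."""
    return sum(value for cond, value in zip(condition_range, sum_range)
               if cond == condition)

print(sumif(conditions, 1, amounts))  # 183
print(sumif(conditions, 0, amounts))  # 285
```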
AVERAGE
What it does?
averages a group of numbers
Syntax:
average(of this number range)
Example:
=average(2,4,6) 4
=average(c9:d11) 12.4166667
where cells C9:D11 contain:
5      10
3.5    20
6      30
Median Function
Returns the median of the given numbers. The median is the number in the middle of a set of numbers; that is, half the numbers have values that are greater than the median, and half have values that are less.
=MEDIAN(number1,number2,...)
Number1, number2, ... are 1 to N numeric arguments for which you want the median.
Example: data = 1, 2, 3, 4, 5, 6 (in cells A2:A7)
=MEDIAN(A2:A6) Median of the first 5 numbers in the list above (3)
=MEDIAN(A2:A7) Median of all the numbers above, or the average of 3 and 4 (3.5)
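The AVERAGE and MEDIAN examples can be verified with Python's standard `statistics` module. This is only a cross-check of the numbers quoted in the slides, using the same sample values.

```python
from statistics import mean, median

avg_small = mean([2, 4, 6])                    # =average(2,4,6)
avg_range = mean([5, 10, 3.5, 20, 6, 30])      # =average(c9:d11)
median_first_five = median([1, 2, 3, 4, 5])    # =MEDIAN(A2:A6)
median_all = median([1, 2, 3, 4, 5, 6])        # =MEDIAN(A2:A7)

print(avg_small, round(avg_range, 7), median_first_five, median_all)
```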
Standard Deviation
Estimates standard deviation based on a sample. The standard deviation is a measure of how widely values are dispersed from the average value (the mean).
= STDEV(number1,number2,...)
Number1, number2, ... are 1 to 30 number arguments corresponding to a sample of a population. You can also use a single array or a reference to an array instead of arguments separated by commas.
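The difference between the sample and the population standard deviation can be sketched with Python's `statistics` module: `stdev` divides by n-1, like STDEV, and `pstdev` divides by n, like STDEVP. The list `scores` is invented sample data.

```python
from statistics import stdev, pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical sample values

sample_sd = stdev(scores)        # comparable to =STDEV(range)
population_sd = pstdev(scores)   # comparable to =STDEVP(range)

print(round(sample_sd, 4), round(population_sd, 4))
```

For the same data the sample value is always a little larger than the population value, because it divides by n-1 instead of n.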
Correlation
Returns the correlation coefficient of the array1 and
array2 cell ranges. Use the correlation coefficient to
determine the relationship between two properties. For
example, you can examine the relationship between a
location's average temperature and the use of air
conditioners.

= CORREL(array1,array2)
Array1 is a cell range of values.
Array2 is a second cell range of values.
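As an illustration of what CORREL computes, the Pearson correlation coefficient can be written out by hand in Python. The function name `correl` and the temperature and air-conditioner figures are hypothetical sample data chosen to match the example in the text.

```python
from math import sqrt

def correl(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("both ranges need the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

temperature = [18, 22, 27, 31, 35]   # hypothetical average temperatures
ac_usage = [2, 3, 5, 6, 9]           # hypothetical hours of air conditioning

r = correl(temperature, ac_usage)
print(round(r, 3))
```

A value close to 1 means the two properties rise together, a value close to -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.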
Rank Function

Returns the rank of a number in a list of numbers. The rank of a number is its size relative to other values in a list. (If you were to sort the list, the rank of the number would be its position.)
= RANK(number,ref,order)
Number is the number whose rank you want to find.
Ref is an array of, or a reference to, a list of numbers. Non-numeric values in ref are ignored.
Order is a number specifying how to rank the number.
If order is 0 (zero) or omitted, Microsoft Excel ranks the number as if ref were a list sorted in descending order.
If order is any nonzero value, Microsoft Excel ranks the number as if ref were a list sorted in ascending order.
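The ranking rule described above can be sketched in Python as follows. The function `rank` is a simplified illustration with hypothetical scores: tied values share the same rank, and non-numeric entries are skipped.

```python
def rank(number, ref, order=0):
    """Rank `number` within `ref`; order=0 ranks descending, else ascending."""
    values = [v for v in ref
              if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if number not in values:
        raise ValueError("number not found in ref")  # Excel shows #N/A here
    if order == 0:
        return 1 + sum(1 for v in values if v > number)
    return 1 + sum(1 for v in values if v < number)

scores = [70, 85, 85, 60, "absent"]   # hypothetical inspection scores

print(rank(85, scores))      # 1 (both 85s share first place)
print(rank(70, scores))      # 3
print(rank(60, scores, 1))   # 1 (smallest value when ranked ascending)
```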
ROUND
What it does?
rounds a number to the nearest decimal you specify
Syntax:
round(this number, to this many digits after decimal)
Example:
=round(12.416667,2) 12.42
Other:
=rounddown(12.416667,2) 12.41
=roundup(12.416667,2) 12.42
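The three rounding functions can be imitated in Python with the `decimal` module. This is a sketch under the assumption that ROUND rounds halves away from zero, ROUNDDOWN rounds toward zero, and ROUNDUP rounds away from zero; the helper name `excel_round` is hypothetical. Python's built-in `round` is avoided because it rounds halves to the nearest even digit.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_DOWN, ROUND_UP

def excel_round(value, digits, mode=ROUND_HALF_UP):
    """Round `value` to `digits` decimal places using the given mode."""
    step = Decimal(10) ** -digits          # digits=2 gives Decimal('0.01')
    return Decimal(str(value)).quantize(step, rounding=mode)

print(excel_round(12.416667, 2))               # 12.42  (=round)
print(excel_round(12.416667, 2, ROUND_DOWN))   # 12.41  (=rounddown)
print(excel_round(12.416667, 2, ROUND_UP))     # 12.42  (=roundup)
```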
VLOOKUP AND HLOOKUP FUNCTIONS
These functions allow you to automatically look up a particular cell of data from a larger data range. This is especially useful when you have
 A large data section that contains information for multiple records somewhere on the spreadsheet (e.g., a small database)
 A calculation area somewhere else, and you need to refer to some specific data elements for specific records
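The idea behind a vertical lookup can be sketched in Python: search the first column of a table for a key and return the value from another column of the same row. The function `vlookup` below is a simplified, exact-match illustration, and the school records are made-up sample data.

```python
def vlookup(lookup_value, table, col_index):
    """Return the value in column `col_index` (1-based) of the first row
    whose first column equals `lookup_value`; '#N/A' if nothing matches."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return "#N/A"

# Hypothetical data section: school code, school name, inspection level.
schools = [
    ("S001", "Adama Primary", 3),
    ("S002", "Dire Secondary", 2),
    ("S003", "Woreda 5 Primary", 4),
]

print(vlookup("S002", schools, 2))   # Dire Secondary
print(vlookup("S003", schools, 3))   # 4
print(vlookup("S999", schools, 2))   # #N/A
```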
Group Frequency distribution

Step 1 :- Go to Insert and select Pivot Table.
Step 2 :- Define the data series and cell.
Step 3 :- Drag the pivot chart field to the axis category.
Step 4 :- Rename by clicking the selected term and modify each value.
Step 5 :- Drag the pivot chart fields to Values and define the name and category.
Step 6 :- Select the value of the data in the pivot chart and define the name and category.
Step 7 :- Go to the Insert tab, select the group arrow, and define the upper limit, lower limit, and interval value.
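What the grouping in Step 7 produces can be illustrated with a short Python sketch that counts values per class interval. The function name `grouped_frequency` and the scores are hypothetical, and the sketch assumes whole-number values with the upper limit excluded.

```python
def grouped_frequency(values, lower, upper, interval):
    """Count integer values in classes of width `interval` from `lower`
    up to (but not including) `upper`."""
    counts = {}
    for start in range(lower, upper, interval):
        counts[f"{start}-{start + interval - 1}"] = 0
    for v in values:
        if lower <= v < upper:
            start = lower + ((v - lower) // interval) * interval
            counts[f"{start}-{start + interval - 1}"] += 1
    return counts

scores = [42, 55, 58, 61, 67, 73, 79, 88]   # hypothetical inspection scores
table = grouped_frequency(scores, 40, 90, 10)
for group, count in table.items():
    print(group, count)
```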
ROMAN/ARABIC
What it does?
converts a number to Roman numeral format
or vice versa
Syntax:
roman(number) arabic("text")
Example:
=roman(65) LXV
=arabic("LXV") 65
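The conversion itself can be sketched in Python. The functions `roman` and `arabic` below are simplified illustrations for the classic numeral form in the range 1 to 3999; they are not Excel's own implementation.

```python
ROMAN_PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
               (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
               (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50,
                "C": 100, "D": 500, "M": 1000}

def roman(number):
    """Convert an integer from 1 to 3999 to a Roman numeral."""
    if not 0 < number < 4000:
        raise ValueError("number must be between 1 and 3999")
    result = ""
    for value, symbol in ROMAN_PAIRS:
        while number >= value:
            result += symbol
            number -= value
    return result

def arabic(text):
    """Convert a Roman numeral back to an integer."""
    letters = text.upper()
    total = 0
    for i, ch in enumerate(letters):
        value = ROMAN_VALUES[ch]
        # A smaller symbol before a larger one is subtracted (e.g. IV = 4).
        if i + 1 < len(letters) and ROMAN_VALUES[letters[i + 1]] > value:
            total -= value
        else:
            total += value
    return total

print(roman(65), arabic("LXV"))   # LXV 65
```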
Logical Formulas

 If
 And
 Or
 Not
 Choose
 Iferror
 Istext
IF
What it does?
checks whether a condition is met and returns one value if TRUE and another if FALSE
Syntax:
if(is-this-true,then do this, or this)
Example:
=if(25<15,"loser","winner") = winner
Other:
=sumif(condition range,condition, sum range)
IFERROR
What it does?
an easy way to handle errors in formulas
IFERROR returns the value you want in case of
an error with the formula
Syntax:
iferror(formula, value to return if there is an error)
Example:
=iferror(1/0,"can't divide by zero") can't divide by zero
=iferror(0/1,"can't divide by zero") 0
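The behaviour of IFERROR can be sketched in Python with a try/except block. The helper `iferror` is hypothetical: the formula is passed as a function so that the error happens inside the helper and can be caught there.

```python
def iferror(formula, value_if_error):
    """Run `formula` (a zero-argument function); on error return the fallback."""
    try:
        return formula()
    except (ZeroDivisionError, ValueError, KeyError, TypeError):
        return value_if_error

first = iferror(lambda: 1 / 0, "can't divide by zero")
second = iferror(lambda: 0 / 1, "can't divide by zero")

print(first)    # can't divide by zero
print(second)   # 0.0 (Excel displays this as 0)
```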
Text Formulas

 Proper
 Trim
 Dollar
 Rept
 Text
 Type
Error Types
Error Type    When It Happens
#DIV/0!       When you divide by zero.
#N/A          When a formula or a function inside a formula cannot find the referenced data.
#NAME?        When the text in a formula is not recognized.
#NULL!        When a space was used instead of a comma in formulas that reference multiple ranges. A comma is necessary to separate range references.
Error Types
Error Type    When It Happens
#NUM!         When a formula has invalid numeric data.
#REF!         When a reference is invalid.
#VALUE!       When the wrong type of operand or function argument is used.
Absolute and Relative Referencing
• An absolute cell reference contains a dollar sign ($) before the row and/or the column
– It does not change when copied or filled
– Use it when you want to consistently refer to a certain cell
A1 Relative
A$1 Column is relative; row is absolute
$A1 Row is relative; column is absolute
$A$1 Both are absolute
Show Formulas
Merge & Center
Database In Excel

Introduction
 A database is a collection of logically related data designed to meet the information needs of one or more users.
 A database defines a structure for storing information.
 Databases are typically organized into tables, which are collections of related items.
 A database is a collection of information that is organized so that it can easily be accessed, managed, and updated.
Database Function
Averages the values in a column of a list or database that match the conditions you specify.
= DAVERAGE (database,field,criteria)
 Database is the range of cells that makes up the list or database.
 Field indicates which column is used in the function. Field can be given as text with the column label enclosed between double quotation marks, such as “-----”, or as a number that represents the position of the column within the list.
 Criteria is the range of cells that contains the conditions you specify.
DSUM Function
Adds the numbers in a column of a list or database that match the conditions you specify.
= DSUM (database,field,criteria)
 Database is the range of cells that makes up the list or database.
 Field indicates which column is used in the function. Field can be given as text with the column label enclosed between double quotation marks, such as “------”, or as a number that represents the position of the column within the list.
 Criteria is the range of cells that contains the conditions you specify.
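The database functions can be illustrated with a small Python sketch in which each row of the list is a dictionary, the field is a column label, and the criteria are column/value pairs that must all match. The function names, the simplified equality-only criteria, and the records are assumptions made for illustration.

```python
# Hypothetical database: one dictionary per row, keys are column labels.
database = [
    {"Woreda": "A", "Level": "Primary", "Students": 400},
    {"Woreda": "A", "Level": "Secondary", "Students": 250},
    {"Woreda": "B", "Level": "Primary", "Students": 300},
    {"Woreda": "B", "Level": "Primary", "Students": 500},
]

def matches(record, criteria):
    """True when the record satisfies every column/value pair in criteria."""
    return all(record[column] == value for column, value in criteria.items())

def dsum(database, field, criteria):
    return sum(r[field] for r in database if matches(r, criteria))

def daverage(database, field, criteria):
    values = [r[field] for r in database if matches(r, criteria)]
    if not values:
        raise ValueError("no records match")  # Excel shows #DIV/0! here
    return sum(values) / len(values)

print(dsum(database, "Students", {"Level": "Primary"}))                     # 1200
print(daverage(database, "Students", {"Level": "Primary"}))                 # 400.0
print(dsum(database, "Students", {"Level": "Primary", "Woreda": "B"}))      # 800
```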
Summary
Function                            What it does
SUM(range)                          Adds a range of cells
SUMIF(range,criteria,sum_range)     Adds cells from sum_range if the condition specified in criteria on range is met
AVERAGE(range)                      Calculates the mean (arithmetic average) of a range of cells
MEDIAN(range)                       Calculates the median value for a data set; half the values in the data set are greater than the median and half are less than the median
MAX(range)                          Returns the maximum value of a data set
MIN(range)                          Returns the minimum value of a data set
SMALL(range,k)                      Returns the kth smallest value in a specified data range
LARGE(range,k)                      Returns the kth largest value in a specified data range
COUNT(range)                        Counts the number of cells containing numbers in a range
COUNTA(range)                       Counts the number of non-blank cells within a range
COUNTBLANK(range)                   Counts the number of blank cells within a range
COUNTIF(range,value)                Counts the number of cells in range that are the same as value
VAR(range) and VARP(range)          Calculates the variance of a sample (VAR) or an entire population (VARP); equivalent to the square of the standard deviation
STDEV(range) and STDEVP(range)      Calculates the standard deviation of a sample (STDEV) or an entire population (STDEVP); the standard deviation is a measure of how much values vary from the mean
Summary
Mode =Mode(Range)
Median =Median(Range)
Mean =Average(Range)
Variance = var(range)
Standard Deviation =sqrt(var(range))
Correlation Coefficient =CORREL(Range1,Range2)
RAND( ) Returns a random value between 0 and 1.
ROUND( X, Y) Returns the value X rounded to Y digits after the decimal point.
SIN( X ) Returns the sine of X.
COUNT( X:Y ) Returns the count of numerical values in the range X:Y.
AVERAGE( X:Y) Returns the mean value of the range X:Y.
STDEV( X:Y) Returns the standard deviation of the range X:Y.
Exercise
1. Prepare a table that shows Level 1 to 4 primary and secondary schools in number and percentage.
2. Show the table from activity 1 as a comparative bar graph.
3. Compare all Woredas in levels 1 to 4 using bar graphs.
4. Compare all schools in levels 1 to 4 using input, process, and output standards.
5. Show the percentage of schools that meet the input, process, and output standards using a bar graph.
6. Show the percentage of schools by level, urban, rural, and Woreda using a bar graph in a comparison format.
7. Using a bar graph, compare the average of all school inspections with the national average.
8. Using a bar graph, compare the average of all school inspections with the national average.
9. Which input, process, and output standards most affected the 2006-2008 school inspection?
10. Show all standards in a bar graph for your zone inspection report.
11. Show all Woredas in a bar graph that compares the worst standards (at least three standards).
12. Show all standards in a line graph that compares the statistical tools of range, variance, standard deviation, and coefficient of variation in a single table.
Thank-you
