100% found this document useful (5 votes)
5K views

Ethio Coders

Uploaded by

ABDI
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
100% found this document useful (5 votes)
5K views

Ethio Coders

Uploaded by

ABDI
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 4

EthioCoders

 Centre: Mean, median and mode


 Four main aspects to analyse data: centre, spread, shape and outliers
 The row in spread sheet is commonly known as instance or individual.
 spread: range, interquartile range, standard deviation and variance
 Histogram is mostly and commonly used for data spread
 The five number summaries: Consists of five values: 1. Minimum, 2. Q1, 25% data
fall below; 3.Q2. 50% data fall below; 4. Q3. 75% of data fall below and 5.
Maximum, the largest value in the data set.
 The interquartile range = Q3-Q1
 Q2 is median
 Box plot is used to compare quartiles, minimum and maximum
 The box itself represents IQR
 Standard deviation is the average distance of each observation from the mean
 To create column B for calculation of variance in spread sheet for 10 data set, =A1-
AVERAGE(A$1:A$10), then for variance we create column C using formula =B1^2 in
cell C1, and in Cell C11, type =SUM(C1:C10). In Cell C12, =C11/10 is therefore
variance of the data set. For standard deviation, in cell C12, =sqrt(C12)
 Advantages of single number (standard deviation):it simplifies amount of information,
useful for inferential statistics
 normal: mean = median; right skewed: mean>median; left skewed: mean<median
 when we consider outliers:,

1. We focus on noting they exist and impact on summary statistics

2. If typo-remove or fix

3. Understand why they exist, and the impact on questions we are trying to answer about
our data

4. The 5 number summary are important than mean and standard deviation when we have
outliers

5. be careful in reporting

 To check whether your data is normally distributed, use normal quintile plot and
statistical methods like Kolmogorov-Smirnov test.

-Parameter: numerical summary about population

-statistic: numeric summary about sample

Limitations of spread sheets:

-lots of users and large datasets; variance on sheet size and how many there can be
between programs
-In excel, for help menu ‘press-F1

- to customize the quick access toolbar with file > Options> Quick Access Toolbar.

- the data menu has items such as sorting and filtering data

-Undo: Crrl+Z and Redo: Ctrl+Y

- the Range is the group of cells: =SUM(B2:C7) for example

- to quickly toggle from the relative to absolute address, use F4

- Adding dollar sign to an address makes it absolute based on where dollar signs are
placed. A dollar sign in front of the letter fixes a column, while the dollar sign in front of
the number fixes the row; if both are prefixed with a dollar sign, it is fixed to the exact
cell.

- to save data for the first time, save in SAVE AS, then save continuously in the same file.

- the excel format files are *.xlsx, *.xls and *.csv

-Excel formulas: converts a text string to upper case letters: UPPER(text); returns the
logical value TRUE: TRUE(); returns the number of characters in a text string:
LEN(text); estimates standard deviation based on sample (ignores logical values and
text): STDEV(n1, n2, …); adds all the numbers in a range of cells: SUM(n1, n2, …);
removes all spaces from a text string except single spaces between words: TRIM(text)

- to substitute will not change the original cell, instead showing the transformed data in
the new cell. SUBSTITUTE({text}, {old_text}, {new_text}), where {new_text} is the
cell to change, {old_text} is the string sequence to be replaced, and {new_text} is the new
string in place of the old one.

-To extract word from the middle: MID(text, start_num, num_chars)

text: text string containing the characters you want to extract

start_num: position of the first character you want to extract in the text

num_chars: the number of characters you want MID to return from the text

-For FIND function: FIND(find_text, within_text, [start_num])

find_text: the text you want to find

within_text: the text containing the text you want to find

start_num: OPTIONAL character position at which to start the search

- CONCATENATE: will join together two or more strings.


- TRIM: will help to remove excess white space from a string
- PROPER: sets the first letter of each word to upper case, with the rest lower case
- UPPER: sets all letters to upper case
- LOWER: sets all letters to lower case

Data Analysing in Spread Sheet

*aggregation functions: SUM, AVERAGE, MAX, MIN, MEDIAN, STDEV, COUNT

* Logical functions IF, AND, OR, NOT: =IF(condition, value if TRUE, [value if
FALSE]); for AND: IF(AND({condition1}, {condition2}), …)

*Conditional aggregation functions COUNTIF and SUMIF

*Pivot tables

*Named ranges

- The operators: <> for not equal, >= greater than or equal to, <= less than or equal to, =
equal to

Example: If the maximum number of any one fruit exceeds 10 pieces in an order, or the
total order exceeds 20 pieces (all fruit) then mark it with ‘‘Special order’’. Otherwise,
mark it with ‘‘NO’’ is given as =IF(OR(Max(B2:D2)>10, E2>20), ‘‘Special order’’,
‘‘NO’’)

- Lookup functions: uses a keyword and index to ‘’look up’’ a value in a table. There
are both horizontal and vertical lookup functions, although we will focus on a vertical
one called VLOOKUP

-Data Visualization: Pie chart, bar charts, scatter and line pilots, histograms, box plots

A pie chart illustrate the proportionality

To draw pie chart: select categories at the top of the tabe> hold down the control key and
select the bottom row with your mouse>copy highlighted data to another location>paste
using the transpose feature so it creates columns instead of rows > select and choose
‘’insert pie chart’’ as before.

- Bar chart is used for comparing values


- Pie chart is used to sow proportionality but used for categorical data
- Line chart uses for numerical values
- To add trend line in scatter: scatter chart> design>add chart element>trendline>linear.

5. Business Metrics

-Marketing: sales: leads, bookings, deals, margins; customers: CAC, CPA, Life time
value, Active users, chum; web traffic: CTR, CPC, CPL, Stickiness

-Growth:

-Sale
-Finance

Click Though Rate (CTR) = (Cicks/Impressions)*100

Cost Per Click (CPC) = Cost of advertisement/number of views

Cost Per Lead = cost of advertising/total cost

Customer Acquisition Cost (CAC) = (total marketing expenses + total marketing and
sales salaries)/number of customers converted to paid customers

-conversion rates based on impression = numbers/arrived on time numbers

-conversion rates based on each level = current level numbers/ prior level numbers

-total sales and marketing cost = (marketing cost + sales and marketing salaries +
overhead costs for sales and marketing)/number of leads

-Life Time Value (LTV)= Average sale*number of repeated sales *expected


retention*profit margin

-Profit Margin = (average sale-average cost of sale)/average sale

You might also like