Lesson 9 Using Macros For Analytics

Uploaded by

sbicapsec.ambala
Business Analytics with Excel

Using Macros for Analytics


Learning Objectives

By the end of this lesson, you will be able to:

Create Macros and functions

Examine mean of data using Macros

Describe the five point summary using Macros

Identify how to remove duplicates using Macros


A Day in the Life of a Business Analyst

As a business analyst of an organization:

You are required to perform a few tasks in Microsoft Excel that have to be repeated often.

You also need to create and run a macro that quickly applies formatting changes
to the selected cells.

To achieve these tasks, you will learn a few concepts, such as macros for analytics,
the mean of data using macros, the correlation coefficient, and removing duplicates using macros.
Using Macros for Analytics

We use functions within Excel to perform data analysis, charting, and predictive analytics.

Macros are an important feature of Excel that allows VBA programming within an Excel workbook.

VBA

Visual Basic for Applications (VBA) provides a programmable interface to Excel.

Create Macros and Functions

Macros and Functions are created to:

• Perform operations on the data
• Fix missing values
• Perform predictive analysis
• Validate data
Types of Macros

There are 3 types of Macros:

• Event based
• Subroutine/Sub Procedure
• Functions
Event Based

An event-based macro runs in response to an event. For instance, it can display a message box
whenever the worksheet is activated.
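A minimal sketch of such an event-based macro, assuming it is placed in the code module of the worksheet itself (not in a standard module). The message text is illustrative only.

```vba
' Assumed to live in the worksheet's own code module (for example Sheet1).
' Excel calls this procedure automatically each time the sheet is activated.
Private Sub Worksheet_Activate()
    MsgBox "Worksheet activated"   ' illustrative message text
End Sub
```

Switching to another sheet and back again triggers the message box.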
Subroutine

A subroutine is a set of commands that performs some processing on the worksheet and does not return any value.
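A minimal sketch of a sub procedure that applies a formatting change. The procedure name, the choice of row 1 as a header row, and the fill color are all assumptions made for illustration.

```vba
' Hypothetical example: format the first row of the active sheet as a header.
' The procedure changes the worksheet but returns no value.
Sub HighlightHeaderRow()
    With ActiveSheet.Rows(1)
        .Font.Bold = True                        ' bold text
        .Interior.Color = RGB(221, 235, 247)     ' light blue fill (arbitrary choice)
    End With
End Sub
```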
Functions

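Functions are similar to sub procedures, but they return a value to the calling procedure.

• For a user-defined function named sqr that squares its argument, the call is: MsgBox (sqr(9))
• This displays 81 in a message box.
• VBA's built-in Sqr function returns the square root instead (3 for this input), so the example relies on the user-defined function of the same name.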
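A minimal sketch of a squaring function and a sub procedure that calls it. The name Square is a hypothetical choice, used here to avoid a clash with VBA's built-in Sqr (square root) function.

```vba
' Hypothetical function: returns the square of its argument to the caller.
Function Square(ByVal x As Double) As Double
    Square = x * x
End Function

' Calling sub procedure: uses the returned value.
Sub ShowSquare()
    MsgBox Square(9)   ' displays 81
End Sub
```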
Create Macros and Functions

To create a macro, press Alt+F11 on the sheet with the data to open the VBA editor.

The white area on the right side of the editor can be used to write all the macros and functions that work on the data.

A macro is usually written as a sub procedure:

Sub macro_name()
    ' Commands to run go here
End Sub

Regular VB programming can be done within the procedure.

A procedure can be based on any event that occurs on the worksheet.

To choose an event to work on, select Worksheet in the first drop-down and then Activate
in the second drop-down.

This creates a procedure whose commands run whenever the worksheet is activated.

A cell value within the sheet can be accessed using Cells(row, column).

• Row numbering starts at 1
• Column numbering starts at 1
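A short sketch of reading and writing cell values with Cells. The specific cells used (A2 as the source, P2 as the target) are assumptions for illustration.

```vba
' Cells(row, column) uses 1-based indexes: Cells(2, 1) is A2, Cells(2, 16) is P2.
Sub CellAccessDemo()
    Dim v As Variant
    v = Cells(2, 1).Value        ' read row 2, column 1 (cell A2)
    Cells(2, 16).Value = v       ' write the same value to row 2, column 16 (cell P2)
    MsgBox "A2 contains: " & v
End Sub
```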
Mean of Data Using Macros
Mean

Mean is defined as the sum of the values in a data set divided by the number of values
in the data set.

We can find the mean of a column of values using macros in Excel.
Steps to Find Mean

Step 1: Open the boston_housing.xlsx file

Step 2: Press Alt+F11 to open the macro editor

Step 3: Create a macro that computes the mean of each column

Step 4: Run the macro using the run button

The means are populated in row 452 for all 14 columns.

Step 5: We can check the values with the AVERAGE function.
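A minimal sketch of a macro consistent with these steps. It assumes the data occupies rows 2 to 451 of columns A to N on the active sheet (the range used elsewhere in this lesson) and that every cell is numeric. The procedure and constant names are illustrative.

```vba
' Writes the mean of each of the 14 data columns into row 452.
Sub ColumnMeans()
    Const FIRST_ROW As Long = 2      ' assumed first data row (row 1 holds headers)
    Const LAST_ROW As Long = 451     ' assumed last data row
    Const NUM_COLS As Long = 14      ' columns A to N

    Dim col As Long, r As Long
    Dim total As Double

    For col = 1 To NUM_COLS
        total = 0
        For r = FIRST_ROW To LAST_ROW
            total = total + Cells(r, col).Value
        Next r
        ' Mean = sum of values / number of values
        Cells(LAST_ROW + 1, col).Value = total / (LAST_ROW - FIRST_ROW + 1)
    Next col
End Sub
```

Entering =AVERAGE(A2:A451) in a spare cell should give the same value as cell A452.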


Five Point Summary Using Macros
Five Point Summary

The five point summary in statistics specifies five values to describe a set of numeric values.
Values of Five Point Summary

The values are:

• Minimum value
• 25th percentile value (Q1)
• 50th percentile value (Median or Q2)
• 75th percentile value (Q3)
• Maximum value

The five point summary can be visualized using a box and whisker chart.

The lowest point is the minimum value, and the topmost point is the maximum value.

The line within the box is Q2 (median).

Q1 and Q3 are the bottom and top edges of the box.


Interquartile Range

There is a metric called IQR (Interquartile Range), which is Q3 - Q1.

It corresponds to the height of the box.

IQR is used to find outliers in the data set.

Values of a variable greater than Q3 + 1.5 * IQR or
less than Q1 - 1.5 * IQR are considered suspected
outliers.
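The outlier rule can be sketched as a macro. This sketch assumes a single numeric column in N2:N451 on the active sheet. The procedure and variable names are illustrative.

```vba
' Counts values outside the fences Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
Sub CountSuspectedOutliers()
    Dim data As Range, c As Range
    Dim q1 As Double, q3 As Double, iqr As Double
    Dim lowerFence As Double, upperFence As Double
    Dim outliers As Long

    Set data = Range("N2:N451")                 ' assumed data location
    q1 = WorksheetFunction.Quartile(data, 1)    ' 25th percentile
    q3 = WorksheetFunction.Quartile(data, 3)    ' 75th percentile
    iqr = q3 - q1
    lowerFence = q1 - 1.5 * iqr
    upperFence = q3 + 1.5 * iqr

    For Each c In data.Cells
        If c.Value < lowerFence Or c.Value > upperFence Then
            outliers = outliers + 1
        End If
    Next c

    MsgBox outliers & " suspected outlier(s) outside [" & _
           lowerFence & ", " & upperFence & "]"
End Sub
```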
Calculate Five Point Summary

Press Alt+F11 on the Excel sheet that holds the Boston_housing data set.

Copy the macro code into the editor.

The Quartile function of the WorksheetFunction object is used.

• The function's second argument takes a value from 0 to 4, one for each of the five
point metrics (0 = minimum, 1 = Q1, 2 = median, 3 = Q3, 4 = maximum).
• The values are stored in columns P and Q.
• The macro is executed.

The macro values are the same as those in the box and whisker plot.
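A minimal sketch of such a macro, assuming the column being summarized is in N2:N451 and that labels go in column P with values in column Q, starting at row 2. The output rows and the label text are assumptions.

```vba
' Writes the five point summary of one column into columns P (label) and Q (value).
Sub FivePointSummary()
    Dim data As Range
    Dim labels As Variant
    Dim i As Long

    Set data = Range("N2:N451")   ' assumed data location
    ' Array() is zero-based by default (no Option Base 1 assumed)
    labels = Array("Minimum", "Q1", "Median (Q2)", "Q3", "Maximum")

    For i = 0 To 4
        Cells(i + 2, 16).Value = labels(i)                            ' column P
        Cells(i + 2, 17).Value = WorksheetFunction.Quartile(data, i)  ' column Q
    Next i
End Sub
```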
Correlation Coefficient Using Macros

Let us consider an example: for the Boston housing data, we can implement a macro to calculate the
correlation coefficient between ‘INDUS’ and ‘MEDV’.

The mathematical equation for the correlation coefficient is:
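Assuming the Pearson correlation coefficient, which is what Excel's CORREL function computes, the equation for n paired values x_i and y_i can be written in two equivalent forms. The symbols are introduced here for illustration.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
  = \frac{n \sum x_i y_i - \sum x_i \sum y_i}
         {\sqrt{\left(n \sum x_i^2 - \left(\sum x_i\right)^2\right)
                \left(n \sum y_i^2 - \left(\sum y_i\right)^2\right)}}
```

The second form needs only running sums, which makes it convenient to compute in a single loop.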


This equation is implemented as a macro.

When we verify the result against the CORREL function, both values are the same.
Steps to Find Correlation Coefficient

The following are the steps to find the correlation coefficient.

Step 1: Open the macro editor using Alt+F11

Step 2: Copy the macro code into the editor

Step 3: The code calculates the correlation coefficient for the ‘INDUS’ and ‘MEDV’ columns
using the mathematical formula

Step 4: The code stores the results in columns P and Q

Step 5: Run the macro using F5 or the run button

Step 6: After the run, the results appear in the same worksheet in columns P and Q

We can see that the calculated correlation coefficient and the result of the CORREL function are the same.

Step 7: Any formula can be assigned to a cell using macros. This is done by the following
command:

Cells(10, 17) = "=CORREL(C2:C451,N2:N451)"
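A minimal sketch of a correlation macro consistent with these steps. It assumes ‘INDUS’ is in column C and ‘MEDV’ in column N, rows 2 to 451, as in the CORREL formula above. It uses the running-sum form of the Pearson formula, r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)). The output rows, labels, and names are assumptions.

```vba
' Computes the Pearson correlation of columns C and N and writes it next to CORREL.
Sub CorrelationIndusMedv()
    Const FIRST_ROW As Long = 2
    Const LAST_ROW As Long = 451
    Const X_COL As Long = 3      ' INDUS, column C (assumed)
    Const Y_COL As Long = 14     ' MEDV, column N (assumed)

    Dim n As Long, r As Long
    Dim x As Double, y As Double, corr As Double
    Dim sumX As Double, sumY As Double
    Dim sumXY As Double, sumX2 As Double, sumY2 As Double

    n = LAST_ROW - FIRST_ROW + 1
    For r = FIRST_ROW To LAST_ROW
        x = Cells(r, X_COL).Value
        y = Cells(r, Y_COL).Value
        sumX = sumX + x
        sumY = sumY + y
        sumXY = sumXY + x * y
        sumX2 = sumX2 + x * x
        sumY2 = sumY2 + y * y
    Next r

    ' Sqr is VBA's built-in square root function
    corr = (n * sumXY - sumX * sumY) / _
           Sqr((n * sumX2 - sumX ^ 2) * (n * sumY2 - sumY ^ 2))

    Cells(9, 16).Value = "Correlation (macro)"       ' P9
    Cells(9, 17).Value = corr                        ' Q9
    Cells(10, 16).Value = "Correlation (CORREL)"     ' P10
    Cells(10, 17).Formula = "=CORREL(C2:C451,N2:N451)"  ' Q10, for comparison
End Sub
```

After the run, Q9 and Q10 should agree up to floating-point rounding.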


Removing Duplicates Using Macros
Removing Duplicates

To remove duplicates using macros in a data set within a range, we can use the
RemoveDuplicates function.
Removing Duplicates

Use this command to remove duplicates with RemoveDuplicates function.

Range("A2:A451").RemoveDuplicates Columns:=1, Header:=xlNo


Columns:=1 specifies that the first column of the range must be used for checking duplicates.

Header:=xlNo specifies that there is no header row in the data range.


We will remove duplicate rows from the entire Boston housing data set, using column A
to check for duplicates.

Range("A2:N451").RemoveDuplicates Columns:=1, Header:=xlNo
Steps to Remove Duplicates

Open the macro editor by pressing Alt+F11

Create a macro on the range A2:N451 to remove the duplicates

Run the macro using F5 or the run button

The macro automatically removes every row whose value in column A duplicates an earlier row.

If there are no duplicates in column A, nothing is removed.
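A minimal sketch of such a macro. It assumes the data is in A2:N451 with no header row inside the range and no blank cells in column A. The procedure name and the row-count message are illustrative additions.

```vba
' Removes rows whose column A value repeats an earlier row, then reports the count.
Sub RemoveDuplicateRows()
    Dim data As Range
    Dim rowsBefore As Long, rowsAfter As Long

    Set data = Range("A2:N451")   ' assumed data location
    rowsBefore = WorksheetFunction.CountA(data.Columns(1))

    ' Columns:=1 checks the first column of the range; xlNo means no header row
    data.RemoveDuplicates Columns:=1, Header:=xlNo

    rowsAfter = WorksheetFunction.CountA(data.Columns(1))
    MsgBox (rowsBefore - rowsAfter) & " duplicate row(s) removed"
End Sub
```

When column A has no repeated values, the message reports 0 and the data is unchanged.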


Key Takeaways

Macros are an important feature of Excel that allows VBA programming
within any Excel workbook.

We can find the mean of a column of values using macros in Excel.

The five point summary in statistics specifies five values to describe a
set of numeric values.

The correlation coefficient calculated by a macro is the same as the
result of the CORREL function.

To remove duplicates using macros in a data set within a range, we
can use the RemoveDuplicates function.
Knowledge Check
Knowledge Check 1

Which of the following functionalities in Excel do Macros allow?

A. Programming

B. Formula writing

C. Charting

D. All the above

The correct answer is D

All of the above.


Knowledge Check 2

Macros are based on which programming language?

A. Visual Basic

B. VC++

C. VBA

The correct answer is C

Visual Basic for Applications is used to program macros in Excel.


Knowledge Check 3

VBA in Excel macros stands for?

A. Visual Basic for Automation

B. Visual Basic for Applications

C. Visual Basic Application

The correct answer is B

VBA stands for Visual Basic for Applications.


Knowledge Check 4

What is the mathematical formula for mean of a set of values?

A. Sum of all values

B. Sum of all values/number of values

C. Sum of all values/number of values -1

D. Number of values/Sum of values

The correct answer is B

Average is defined as sum of values/number of values.


Knowledge Check 5

What is the macro way to set the value of C11 to mean of cells A1:A24?

A. Cells(11,3).values="=MEAN(A1:A24)"

B. Cells(11,3)="=MEAN(A1:A24)"

C. Cells(11,3)="=AVERAGE(A1:A24)"

D. Cells(11,3).values="=AVERAGE(A1:A24)"

The correct answer is C

AVERAGE function gives mean in Excel.


Knowledge Check 6

To find the median of a data set using macros, the data is sorted in ascending order and
the middle value is found programmatically. True or False?

A. True

B. False

The correct answer is A

True. By definition, the median is the middle value of the data sorted in ascending order, or the average of the two middle values when the number of values is even.
Knowledge Check 7

Which of the following defines Interquartile range?

A. Q1-Q2

B. Q2-Q3

C. Q3-Q1

The correct answer is C

IQR is 3rd quartile minus 1st quartile.


Knowledge Check 8

Which of the following is NOT a part of the 5-point summary?

A. Mean

B. Median

C. Maximum

D. Minimum

The correct answer is A

Mean is not a part of the 5-point summary.


Knowledge Check 9

Which plot is based on the 5-point summary?

A. Bar chart

B. Box-and-whisker

C. Line graph

D. Histogram

The correct answer is B

Box-and-Whisker plot shows the 5-point summary


Knowledge Check 10

What is the maximum value of the correlation coefficient?

A. 0

B. 1

C. 2

The correct answer is B

Maximum value of correlation coefficient is 1.


Knowledge Check 11

If two variables are uncorrelated, the value of the correlation coefficient is around
which value?

A. -1

B. 0

C. 1

The correct answer is B

If two variables are uncorrelated, the value of the correlation coefficient is 0.


Knowledge Check 12

What is typically done with highly correlated variables for data analytics?

A. They are retained for the model

B. They are removed for the model

The correct answer is B

Among a group of highly correlated variables, only one is retained for the model and the others are removed.
Knowledge Check 13

What is the range of Correlation values?

A. 0 and 1

B. 1 and 2

C. -1 and +1

D. -1 and 0

The correct answer is C

Correlation coefficient is between -1 and +1


Knowledge Check 14

Duplicates cannot be removed using macros. True or False.

A. True

B. False

The correct answer is B

False. Duplicates can be removed using macros.


Knowledge Check 15

How is the RemoveDuplicates function told that there is no header in
the data?

A. Header:=xlNo

B. Header:=No

C. Header:=FALSE

D. Header:=None

The correct answer is A

Header:=xlNo is used to specify that the dataset has no headers.


Knowledge Check 16

The Columns keyword is used to specify the column number to check for duplicates in
the RemoveDuplicates macro function. True or False.

A. True

B. False

The correct answer is A

True. Columns:=1 is specified to remove duplicates using column 1.
