How To Utilize Data Analysis in Excel
COLLEGE
BSC HONS MATHEMATICS
SUBMITTED TO:
You can perform many operations with ranges in your Excel worksheet, such as copying a dataset, moving
data from one position to another, formatting cells, and even naming a range. In this tutorial, we
will briefly cover the main topics about ranges.
Select a Range
While working with Excel, you may want to select a multi-cell range so that you can apply a command
to all of its cells at once. For example, suppose you want to highlight the headers in the cell range A2:E2.
You select the range and then change the background colour of the cells.
1. Select the cell from where you want to start selecting your range. In our case, we have selected the
B2 cell.
2. Drag your cursor to the last cell of your range. As you can see, we have dragged our pointer to the
D5 cell.
3. The range B2:D5 is now selected.
Similarly, you can select any range of cells in your Excel worksheet.
1. Select the cell from where you want to start selecting your range. In our case, we have selected the
B2 cell.
2. Hold the CTRL key on your keyboard and select the various non-adjacent cells. We have selected the
B2:B6, C3:C6, and D4 ranges of cells.
Types of Ranges
1. Vertical Range
A vertical range is a selection of cells within a single column. For example, in the image below,
the vertical range is A1:A5. However, if you select the entire column, the vertical range would be A:A.
2. Horizontal Range
A horizontal range is a selection of cells within a single row. For example, in the image below, the
horizontal range is A2:E2. However, if you select the entire row, the horizontal range would be 2:2.
3. Mixed Range
A mixed range is a collection of cells formed by combining adjacent rows and columns. For
instance, in the example below, the mixed range is A2:E10.
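The three range types above can be told apart from the reference alone. The following is a minimal Python sketch, not anything built into Excel: `classify_range` and `column_to_index` are hypothetical helper names, and the sketch assumes a plain A1-style reference with both corners given (whole-column or whole-row references such as A:A or 2:2 are not handled).

```python
import re


def column_to_index(letters: str) -> int:
    """Convert column letters such as 'A' or 'AB' to a 1-based column index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def classify_range(ref: str) -> str:
    """Classify a reference like 'A1:A5' as vertical, horizontal, or mixed."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", ref.replace(" ", ""))
    if match is None:
        raise ValueError(f"unsupported reference: {ref!r}")
    col1, row1, col2, row2 = match.groups()
    same_column = column_to_index(col1) == column_to_index(col2)
    same_row = int(row1) == int(row2)
    if same_column and same_row:
        return "single cell"
    if same_column:
        return "vertical"
    if same_row:
        return "horizontal"
    return "mixed"


print(classify_range("A1:A5"))   # vertical
print(classify_range("A2:E2"))   # horizontal
print(classify_range("A2:E10"))  # mixed
```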
Move a Range
By default, if you move a range of cells in Excel, the data moves from one location to another along
with its formatting, such as font, text or number format, cell borders, font colour, etc.
1. Select the range of cells you want to move from one location to another in your Excel spreadsheet.
2. As soon as you select the cells, you will notice that the entire selected range becomes active with
a green box around it.
3. Move your cursor to the green border, and you will see that the cursor changes to a four-headed
arrow icon.
4. With the four-headed arrow showing, drag the range to another location within the same Excel worksheet.
Here, for example, we move it to column E.
Copy/Paste a Range
By default, if you copy a range of cells in Excel, the data is copied from one location along with its
formatting, such as font, text or number format, cell borders, font colour, etc., and pasted to its new
location.
Follow the steps below to copy and paste a range of cells in Excel:
3. Select the cell where you want to start pasting the copied cells. Either right-click on the cell and
select the Paste option or press CTRL + V directly.
That's it: your data (along with its formatting) will be pasted to the new location in your Excel spreadsheet.
To add a named range in your Excel worksheet, follow the below steps:
1. Select the range of cells for which you want to define the name.
2. Go to the ribbon toolbar located at the top of your Excel window. Click on the Formulas tab -> Defined
Names group -> Define Name option.
3. The New Name window will open (as shown below). In the Name textbox, enter any suitable descriptive name for
the range. In our case, we have entered Student_Marks as the name for the selected range.
4. After specifying the name, specify the range of cells to which you want to apply the
name: in the "Refers to" box, select the range from your Excel worksheet.
7. Now that we have defined the range's name, we can use the name Student_Marks directly in formulas to
refer to the named range of cells. For example, type the formula below in your Excel worksheet.
Formula used:
=SUM(Student_Marks)
8. The SUM formula will calculate the sum of all numbers in the named range and give
you the following result.
Result: 55
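To make the idea of a named range concrete, here is a minimal Python sketch of what =SUM(Student_Marks) does conceptually: a name maps to a set of cells, and the formula adds up their values. The cell addresses and marks are hypothetical (the tutorial's own data is not reproduced here, so the total differs from the result above), and `sum_named_range` is an illustrative helper, not an Excel API.

```python
# Hypothetical worksheet: cell address -> value (these marks are made up).
worksheet = {"B2": 72, "B3": 85, "B4": 64, "B5": 91}

# A named range maps a descriptive name to the cells it refers to.
named_ranges = {"Student_Marks": ["B2", "B3", "B4", "B5"]}


def sum_named_range(name: str) -> int:
    """Mimic =SUM(name): add the values of every cell the name refers to."""
    return sum(worksheet[cell] for cell in named_ranges[name])


total = sum_named_range("Student_Marks")
print(total)  # 312
```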
Charts are another excellent way to present a narrative graphically. They summarise data so that data
sets are easier to grasp and analyze. Excel is well known for its ability to organize and compute numbers,
and a chart is a visual depiction of that data, using symbols such as bars in a Bar Chart or lines in a
Line Chart to represent it. Excel offers a variety of chart types to pick from, or you can use the
Recommended Charts option to examine charts tailored to your data and select one of those.
Excel charts assist with data analysis by directing emphasis to one or a few components of a
report. We can use them to filter out the unnecessary "noise" from the story we are trying to
tell and focus instead on the most important pieces of data. By navigating to the Insert tab and
selecting the Charts command group, you can quickly create pie, line, column, or bar charts. The process for
creating these fundamental charts is as follows:
Step 2: Select Insert > (choose desired chart type from icons).
Conditional Formatting
Conditional formatting can help highlight patterns and trends in your data. To use it, create rules that
determine the format of cells based on their values. Conditional formatting may be applied to a range of cells
(either a selection or a named range), an Excel table, and even a PivotTable report in Excel for Windows.
Follow the steps mentioned below to perform conditional formatting.
Step 1: Click Conditional Formatting on the Home tab. Perform one of the following:
1. To highlight values in specific cells, point to Highlight Cells Rules or Top/Bottom Rules, and then
choose the option that corresponds to your needs. Examples are dates after this week, numbers between
50 and 100, or the lowest 10% of scores.
2. To emphasize the relationship of values in a cell range, point to Color Scales and then click the desired
scale. The intensity of each cell's color reflects where its value falls between the top and the bottom of
the range. Sales distributions across regions are one example.
3. To emphasize the relationship of values in a cell range, point to Data Bars and then click the desired fill.
This creates a colored band across the cell. Price or population comparisons among the largest cities are two
examples.
4. To highlight a cell range containing three to five sets of values, each with its own threshold, point to Icon
Sets and then click a set. For example, you might use a set of three icons to emphasize cells with sales
below $80,000, below $60,000, and below $40,000. Alternatively, you may assign a 5-point rating system to
automobiles and use a set of five icons.
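The logic behind a Highlight Cells rule can be sketched in a few lines of Python. This is only an illustration of the "numbers between 50 and 100" example, assuming the bounds are inclusive; `highlight_between` is a hypothetical helper and the cell values are made up.

```python
def highlight_between(values, low, high):
    """Return one flag per value: True if a 'between low and high' rule matches it."""
    # Assumption: both bounds are inclusive.
    return [low <= value <= high for value in values]


scores = [42, 50, 75, 100, 120]  # hypothetical cell values
flags = highlight_between(scores, 50, 100)

for value, flagged in zip(scores, flags):
    print(value, "highlighted" if flagged else "unchanged")
```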
Concatenate
=CONCATENATE is one of the simplest yet most powerful formulas for data analysis. It combines text, numbers,
dates, and other data from several cells into one. This is a useful method for
generating API endpoints, product SKUs, and Java queries.
Formula:
=CONCATENATE(text1, [text2], ...)
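As a rough Python analogue of CONCATENATE, the sketch below joins several values end to end to build a product SKU. The helper name `concatenate` and the SKU parts are hypothetical; note that Excel applies its own number and date conversion when joining cells, which this sketch does not reproduce.

```python
def concatenate(*parts) -> str:
    """Join values end to end, converting each one to text first."""
    return "".join(str(part) for part in parts)


# Hypothetical SKU built from a category code, a year, and a size.
sku = concatenate("TSHIRT", "-", 2024, "-", "XL")
print(sku)  # TSHIRT-2024-XL
```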
LEN
=LEN quickly returns the number of characters in a given cell. For example, the =LEN
formula can be used to count the characters in a cell in order to distinguish two types of product
Stock Keeping Units (SKUs). LEN is especially useful when trying to distinguish between different
Unique Identifiers (UIDs), which are often long and not in the correct sequence.
Formula:
=LEN(SELECT CELL)
TRIM
=TRIM removes all spaces from a cell except for single spaces between words. It is most commonly
used to remove trailing spaces, which are typical when material is copied from
another source or when users enter spaces at the end of text.
Formula:
=TRIM(piece of text)
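The behaviour described for TRIM can be sketched in Python as follows. `excel_trim` is a hypothetical helper that handles only the ordinary space character, which is an assumption of this sketch; other whitespace such as tabs or non-breaking spaces is left untouched.

```python
def excel_trim(text: str) -> str:
    """Drop leading and trailing spaces and collapse runs of spaces to one."""
    # Splitting on a single space yields empty pieces for repeated spaces;
    # discarding those pieces leaves exactly one space between words.
    return " ".join(piece for piece in text.split(" ") if piece)


print(repr(excel_trim("  Data   analysis  in Excel ")))  # 'Data analysis in Excel'
```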
COUNTA
=COUNTA counts the cells in a range that are not empty. As a data analyst, you will encounter
incomplete data sets every day. COUNTA lets you check for gaps in the dataset without having to
restructure it.
Formula:
=COUNTA(SELECT CELL)
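A minimal Python sketch of the same idea: count the entries that hold something, then compare with the total to find the gaps. Here an empty cell is modelled as `None`, and an empty string still counts as a value; both are modelling assumptions of this sketch, and the column contents are hypothetical.

```python
def counta(cells) -> int:
    """Count the cells that are not empty (None stands for an empty cell)."""
    return sum(1 for cell in cells if cell is not None)


column = ["Alice", None, 0, "", None, 3.5]  # hypothetical column with gaps
filled = counta(column)
gaps = len(column) - filled
print(filled, gaps)  # 4 2
```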
AVERAGEIFS
AVERAGEIFS, like SUMIFS, lets you take an average based on one or more criteria.
Formula:
=AVERAGEIFS(average_range, criteria_range1, criteria1, [criteria_range2, criteria2], ...)
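The multi-criteria averaging that AVERAGEIFS performs can be sketched in Python as below. This is an illustration under stated assumptions: `averageifs` is a hypothetical helper, criteria are passed as Python predicates rather than Excel criteria strings, and the region, year, and sales columns are made up.

```python
def averageifs(average_range, *criteria_pairs):
    """Average the values whose row satisfies every (criteria_range, predicate) pair."""
    selected = [
        value
        for i, value in enumerate(average_range)
        if all(predicate(criteria_range[i]) for criteria_range, predicate in criteria_pairs)
    ]
    if not selected:
        # Excel reports a #DIV/0! error when no cells match.
        raise ZeroDivisionError("no rows match the criteria")
    return sum(selected) / len(selected)


# Hypothetical columns of equal length.
regions = ["North", "South", "North", "North"]
years = [2023, 2023, 2024, 2023]
sales = [100, 200, 300, 500]

result = averageifs(
    sales,
    (regions, lambda region: region == "North"),
    (years, lambda year: year == 2023),
)
print(result)  # 300.0
```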
FIND/SEARCH
=FIND and =SEARCH are effective ways to locate particular text inside a data source. Both are
mentioned here because =FIND is case-sensitive: if you look for "Big", only the exact capitalization
Big will match. A =SEARCH for "Big" will, however, match Big or big, broadening the query. This is
very helpful when looking for anomalies or unique identifiers.
Formula:
=FIND(find_text, within_text, [start_num])
=SEARCH(find_text, within_text, [start_num])
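The difference between the two functions can be sketched in Python. Both helpers below are hypothetical and return a 1-based position, with `None` standing in for the error Excel shows when nothing is found; the optional start position and the wildcard support of SEARCH are left out of this sketch.

```python
def find(find_text: str, within_text: str):
    """Case-sensitive lookup; returns the 1-based position or None."""
    position = within_text.find(find_text)
    return position + 1 if position != -1 else None


def search(find_text: str, within_text: str):
    """Case-insensitive lookup; returns the 1-based position or None."""
    position = within_text.lower().find(find_text.lower())
    return position + 1 if position != -1 else None


text = "a big Big deal"
print(find("Big", text))    # 7: only the capitalised match counts
print(search("Big", text))  # 3: 'big' already matches
```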
Sorting
Sorting the data in a spreadsheet lets you rearrange it so that values are quick to find. You can sort a
range or table of data on one or more columns. You can, for example, sort employees first by
department and then by last name.
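Sorting on two columns, as in the department-then-last-name example, can be sketched in Python with a tuple sort key. The employee records are hypothetical.

```python
# Hypothetical employee records.
employees = [
    {"department": "Sales", "last_name": "Young"},
    {"department": "Finance", "last_name": "Patel"},
    {"department": "Sales", "last_name": "Adams"},
    {"department": "Finance", "last_name": "Brown"},
]

# The tuple key sorts by department first, then by last name within each department.
ordered = sorted(employees, key=lambda e: (e["department"], e["last_name"]))

for employee in ordered:
    print(employee["department"], employee["last_name"])
```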
Filter
You can use the FILTER function to filter a set of data based on the criteria you provide. Please keep
in mind that this function is only available in newer versions of Excel, such as Microsoft 365.
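Conceptually, FILTER keeps the rows for which a matching include condition is true. A minimal Python sketch under that reading follows; `filter_rows` is a hypothetical helper and the item rows are made up.

```python
def filter_rows(rows, include):
    """Keep each row whose matching entry in `include` is True."""
    return [row for row, keep in zip(rows, include) if keep]


rows = [("Pen", 120), ("Ink", 80), ("Pad", 150)]   # hypothetical (item, quantity) rows
include = [quantity > 100 for _item, quantity in rows]

result = filter_rows(rows, include)
print(result)  # [('Pen', 120), ('Pad', 150)]
```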
Conditional Formatting
Conditional formatting in Excel allows you to highlight cells with a certain color based on the value of the
cell.
Charts
A simple Excel graphic may convey more information than a page of statistics. As you can see, making
charts is pretty simple.
Dataset
A dataset is a collection of contiguous cells on an Excel worksheet that contains data to be analyzed. To
make Analyse-it work with your data, you must follow a few simple guidelines when structuring data on
an Excel worksheet:
1. A title that adequately describes the data. If you do not supply a title, the dataset is referred to by its
cell range.
2. A header row with variable labels. Each variable should have a distinct name. Measurement units
can be included in the label by putting them in brackets after the name.
3. Rows containing the data for each case. The number of rows is limited only by Excel.
Sorting
Sorting data is an essential part of data analysis. You can sort your Excel data by a single column
or by multiple columns, in either ascending or descending order.
Single Column
First, click on any cell in the column that you want to sort.
Next, to sort in ascending order, click AZ, which is found on the Data tab, in the Sort & Filter group.
Note: To sort in descending order, click ZA.
Multiple Columns
You can also sort on multiple columns in your worksheet. Execute the following steps.
Click Sort, which can be found on the Data tab, in the Sort & Filter group.
Filtering
Use filtering when you want to display only the data that matches specific conditions.
COUNTIF
COUNTIF is a very commonly used Excel function that counts the cells in a range that satisfy a single
condition.
Syntax:
=COUNTIF(range, criteria)
Example: Let’s get the count of items that are over 100.
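The "over 100" count can be sketched in Python as follows; in Excel the equivalent criterion would be written as the text ">100". The item values are hypothetical.

```python
items = [95, 120, 100, 150, 60, 101]  # hypothetical item values

# Count only the values strictly greater than 100 (100 itself does not qualify).
count_over_100 = sum(1 for value in items if value > 100)
print(count_over_100)  # 3
```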
SUMIF
The Excel SUMIF function returns the sum of cells that meet a single condition.
Syntax:
=SUMIF(range, criteria, [sum_range])
Example:
Let’s use the SUMIF function to sum the cells whose numbers meet the criteria.
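A minimal Python sketch of conditional summing follows. `sumif` is a hypothetical helper that takes a predicate instead of an Excel criteria string, with an optional separate list of values to add; all numbers and labels are made up.

```python
def sumif(cells, predicate, sum_cells=None):
    """Sum the entries of sum_cells whose matching entry in cells passes the test."""
    if sum_cells is None:
        sum_cells = cells  # with no separate sum range, the tested cells are summed
    return sum(s for c, s in zip(cells, sum_cells) if predicate(c))


values = [95, 120, 100, 150, 60, 101]  # hypothetical values
total_over_100 = sumif(values, lambda v: v > 100)
print(total_over_100)  # 371

# Testing one column while summing another.
categories = ["A", "B", "A"]
amounts = [10, 20, 30]
print(sumif(categories, lambda c: c == "A", amounts))  # 40
```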
Pivot Tables
Pivot tables are known as one of the most useful and powerful features in Excel. We use
them to summarize the data stored in a table. They organize and rearrange (or
"pivot") statistics to bring crucial and valuable facts to attention. They help you take an
extremely large data set and see the relevant data you need in a crisp, easy, and manageable way.
Sample Data
The sample data that we are going to use contains 41 records with 5 fields of
buyer information. This data is well suited to understanding pivot tables.
To insert a pivot table in your sheet, follow the steps mentioned below:
Drag Fields
To get the total items bought by each buyer, drag the following fields to the following areas.
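The summary a pivot table produces here, total items per buyer, amounts to grouping rows by buyer and summing a quantity. A minimal Python sketch with hypothetical purchase records (not the tutorial's 41-record sample):

```python
from collections import defaultdict

# Hypothetical (buyer, item, quantity) records.
purchases = [
    ("Asha", "Pen", 3),
    ("Ben", "Ink", 2),
    ("Asha", "Pad", 5),
    ("Ben", "Pen", 1),
]

# Group by buyer and add up the quantities, as a pivot table would
# with the buyer as the row field and the quantity as a summed value.
totals = defaultdict(int)
for buyer, _item, quantity in purchases:
    totals[buyer] += quantity

for buyer in sorted(totals):
    print(buyer, totals[buyer])
```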
What-If Analysis is the process of changing the values to try out different values (scenarios) for formulas.
You can use several different sets of values in one or multiple formulas to explore all the different results.
Solver is a Microsoft Excel add-in program that is well suited to what-if analysis. You
can use it to find an optimal (maximum or minimum) value for a formula in one cell, which is
known as the objective cell. This is subject to some constraints, or limits, on the values of other formula
cells on a worksheet.
Solver works with a group of cells, called decision variables or simply variable cells, that are used in computing the
formulas in the objective and constraint cells. Solver adjusts the values in the decision variable cells to satisfy
the limits on the constraint cells and produce the desired result for the objective cell.
In this example, we will try to find the solution for a simple optimization problem.
Problem: Suppose you are the owner of a business and you want your income to be $3000.
Goal: Calculate the units to be sold and price per unit to achieve the target.
On the Data tab, in the Analysis group, click the Solver button.
In Set Objective, select the income cell and set its target value to $3000.
In By Changing Variable Cells, select the C3, C4, and C8 cells.
Click Solve.
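To illustrate the idea of adjusting an input until a formula reaches a target, here is a minimal one-variable goal-seek sketch in Python. It is not Solver's algorithm and ignores constraints: it uses plain bisection, assumes income is simply units multiplied by a fixed price, and the price of 15.0 is hypothetical.

```python
def goal_seek(func, target, low, high, tol=1e-9, max_iter=200):
    """Find x in [low, high] with func(x) close to target, using bisection.

    Assumes func is continuous and the target lies between func(low) and func(high).
    """
    f_low = func(low) - target
    f_high = func(high) - target
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = func(mid) - target
        if abs(f_mid) < tol:
            return mid
        if f_low * f_mid <= 0:
            high = mid            # the solution lies in the lower half
        else:
            low, f_low = mid, f_mid  # the solution lies in the upper half
    return (low + high) / 2


PRICE_PER_UNIT = 15.0  # hypothetical fixed price


def income(units: float) -> float:
    """Assumed income model: units sold times price per unit."""
    return units * PRICE_PER_UNIT


units_needed = goal_seek(income, 3000.0, 0.0, 1000.0)
print(round(units_needed, 6))  # 200.0
```

Solver itself can change several variable cells at once and honour constraints, which this sketch does not attempt.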
Click the File tab, click Options, and then click the Add-Ins category.
On the Data tab, in the Analysis group, you can now click on Data Analysis.
Descriptive Statistics
Descriptive statistics are among the fundamental ‘must know’ facts about any data set. They give you an
idea of:
Suppose we have a batsman's scores from his last 10 matches. To generate the descriptive analysis, follow
the steps mentioned below.
Select the range where you want the output to be displayed.
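Several of the usual descriptive measures can be reproduced with Python's standard `statistics` module. The ten scores below are hypothetical, and the selection of measures is an assumption of this sketch rather than a full copy of Excel's output table.

```python
import statistics

# Hypothetical scores from a batsman's last 10 matches.
scores = [45, 72, 10, 88, 33, 56, 72, 19, 64, 41]

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "sample_std_dev": statistics.stdev(scores),    # divides by n - 1
    "sample_variance": statistics.variance(scores),
    "minimum": min(scores),
    "maximum": max(scores),
    "range": max(scores) - min(scores),
    "sum": sum(scores),
    "count": len(scores),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```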
ANOVA (Analysis of Variance) in Excel is a statistical method used to test whether the means of
two or more groups differ.
Below you can find the scores of three batsmen for their last 8 matches.
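The F statistic at the heart of a single-factor ANOVA can be computed by hand, as in the minimal Python sketch below. The scores are hypothetical, `one_way_anova_f` is an illustrative helper, and the p-value is not computed because it needs the F distribution, which the standard library does not provide.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores of three batsmen over their last 8 matches.
batsman_a = [34, 56, 12, 78, 45, 60, 23, 51]
batsman_b = [40, 22, 67, 35, 58, 49, 31, 44]
batsman_c = [15, 80, 42, 39, 66, 27, 53, 48]

f_stat, df_between, df_within = one_way_anova_f([batsman_a, batsman_b, batsman_c])
print(f"F = {f_stat:.4f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F relative to the critical value for those degrees of freedom suggests that at least one mean differs from the others.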
Regression
In Excel, we use regression analysis to estimate the relationships between two or more variables.
Consider the following data, where we have the number of COVID cases and the number of masks sold in a particular month.
Go to the Data tab > Analysis group > Data Analysis.
Select the Input Y Range as the number of masks sold and the Input X Range as COVID cases. Check
Residuals and click OK.
R Square is the Coefficient of Determination, which is used as an indicator of the goodness of fit.
It shows the proportion of the variation in the dependent variable that the regression explains; the closer it is
to 1, the closer the points lie to the regression line.
Standard Error is another goodness-of-fit measure that shows the precision of your regression analysis.
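For a single X variable, the slope, intercept, R Square, and standard error can be computed directly, as in the minimal Python sketch below. The case and mask figures are hypothetical, `simple_linear_regression` is an illustrative helper, and the standard error is taken as the square root of the residual sum of squares divided by n - 2, which is the usual definition for one predictor.

```python
import math


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x with two fit measures."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_square = 1 - ss_res / ss_tot               # share of variation explained
    standard_error = math.sqrt(ss_res / (n - 2))  # typical size of a residual
    return slope, intercept, r_square, standard_error


# Hypothetical monthly data: COVID cases (X) and masks sold (Y).
covid_cases = [120, 150, 200, 260, 310, 400]
masks_sold = [900, 1100, 1350, 1800, 2050, 2700]

slope, intercept, r_square, standard_error = simple_linear_regression(covid_cases, masks_sold)
print(f"masks = {intercept:.2f} + {slope:.4f} * cases")
print(f"R Square = {r_square:.4f}, Standard Error = {standard_error:.2f}")
```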