Midterm Module 1a
1. Pivot Table:
Whenever you work with company data, you look for answers to
questions like “How much revenue is contributed by branches in the North
region?” or “What was the average number of customers for product A?”,
among many others.
Step-2: Now, you can see the PivotTable Field List panel, which contains the
Above, you can see that we have placed “Region” in the rows, “Product id” in
the columns, and the sum of “Premium” as the value. The pivot table is now
ready and shows the sum of premium by region and product. You can
also use count, average, min, max and other summary metrics.
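The same Region-by-Product summary can be sketched outside Excel. The example below is a minimal Python illustration, not a replacement for the PivotTable feature; the rows and the function name pivot_sum are made up for demonstration and are not taken from the module's data set.

```python
from collections import defaultdict

# Hypothetical rows: (region, product_id, premium). Values are illustrative only.
rows = [
    ("North", "A", 100),
    ("North", "B", 150),
    ("South", "A", 200),
    ("North", "A", 50),
    ("South", "B", 75),
]


def pivot_sum(records):
    """Sum premium for each (region, product_id) pair, like a pivot table
    with Region in rows, Product id in columns and Sum of Premium as value."""
    table = defaultdict(lambda: defaultdict(int))
    for region, product_id, premium in records:
        table[region][product_id] += premium
    # Convert nested defaultdicts to plain dicts for easier reading.
    return {region: dict(products) for region, products in table.items()}


pivot = pivot_sum(rows)
for region in sorted(pivot):
    print(region, pivot[region])
```

Swapping the += sum for a count or an average mirrors changing the summary metric in the Values area.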
Microsoft Excel offers several types of charts and graphs to help you
visualize your spreadsheet data. All you need to do is organize your data,
select it and insert a chart from the menu bar. You can also customize,
resize and reposition the chart.
Here are some common chart types available in Excel:
Column Chart: Displays data using vertical bars, with each bar
representing a category. It is useful for comparing values across
categories or showing trends over time.
Bar Chart: Similar to a column chart, but with horizontal bars
instead of vertical ones. Bar charts are great for comparing values
Module in Data Science Analytics – mid2023-fdl
across categories when the category names are long or there are
many categories.
Line Chart: Plots data points connected by lines, showing trends or
changes over time. Line charts are ideal for illustrating continuous
data, such as stock prices or temperature measurements.
Pie Chart: Represents data as slices of a circle, with each slice's
size proportional to the value it represents. Pie charts are suitable
for showing the proportion of each category within a whole.
Area Chart: Similar to a line chart, but with the area between the
line and the horizontal axis filled in. Area charts emphasize the
magnitude of change over time and are useful for showing
cumulative data or comparing multiple data series.
Scatter Plot: Displays data points on a Cartesian coordinate
system, with each axis representing a variable. Scatter plots are
used to show the relationship between two variables and identify
patterns or correlations.
Bubble Chart: A variation of the scatter plot that adds a third
variable represented by the size of the bubbles. Bubble charts can
help visualize the relationship between three continuous variables.
Radar Chart (Spider Chart): Displays multivariate data on multiple
axes that radiate from a central point. Radar charts are useful for
comparing multiple variables across different categories or entities.
Stock Chart: Designed specifically to show stock market data, stock
charts can display open, high, low and close values over a period of
time.
Surface Chart: Displays three-dimensional data on a grid, where
the color or shading indicates the value. Surface charts are useful
for visualizing complex data with two independent variables.
Waterfall Chart: Displays the cumulative effect of sequential
positive and negative values, typically used for financial data to show
the progression of a starting value to an ending value.
Treemap: Represents hierarchical data as nested rectangles, with
each rectangle's size and color representing specific values.
Treemaps are helpful for visualizing large data sets with multiple
categories and subcategories.
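Two of the chart types above rest on simple arithmetic: a pie chart shows each category's share of the total, and a waterfall chart shows a running total of sequential changes. The sketch below works through both calculations in Python; all figures and variable names are hypothetical, and the code computes the underlying numbers only, it does not draw a chart.

```python
from itertools import accumulate

# Hypothetical category totals for a pie chart.
sales = {"A": 50, "B": 30, "C": 20}
total = sum(sales.values())
# Each slice's size is the category's percentage of the whole.
shares = {name: value * 100 / total for name, value in sales.items()}

# Hypothetical starting value and sequential changes for a waterfall chart.
start = 1000
changes = [200, -150, 300, -50]
# Running totals from the starting value to the ending value.
running = list(accumulate(changes, initial=start))

print(shares)
print(running)
```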
Data Cleaning
Above, you can see that IDs A001 and A002 each appear more than once, but if
we select both columns “ID” and “Name”, there is only one duplicate row (A002,
2).
Follow these steps to remove duplicate values: Select data –> Go to Data
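The difference between checking one column and checking both columns can be sketched in Python. The rows below are hypothetical stand-ins for the table described above (the original snapshot is not reproduced here), and remove_duplicates is an illustrative helper that keeps the first occurrence of each key.

```python
# Hypothetical (ID, Name) rows; only (A002, 2) repeats in both columns.
rows = [
    ("A001", 1),
    ("A001", 3),
    ("A002", 2),
    ("A002", 2),
]


def remove_duplicates(records, key_columns):
    """Keep the first row for each distinct combination of the key columns."""
    seen = set()
    kept = []
    for row in records:
        key = tuple(row[i] for i in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


by_id = remove_duplicates(rows, key_columns=[0])              # compare ID only
by_id_and_name = remove_duplicates(rows, key_columns=[0, 1])  # compare ID and Name

print(by_id)
print(by_id_and_name)
```

Comparing on ID alone drops two rows, while comparing on both columns drops only the repeated (A002, 2) row.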
2. Text to Columns: Let’s say you have data stored in the column as shown
in below snapshot.
Above, you can see that the values are separated by a semicolon “;”. To split
these values into different columns, I recommend using the “Text to
Columns” feature in Excel. Follow the steps below to split the data into
separate columns:
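For comparison, the same delimiter-based split can be sketched in Python. This is an analogue of what Text to Columns does, not the Excel procedure itself; the cell values and the function name text_to_columns are assumptions made for illustration.

```python
# Hypothetical cells holding semicolon-separated values.
cells = ["A001;John;25", "A002;Mary;31"]


def text_to_columns(values, delimiter=";"):
    """Split each cell on the delimiter, giving one list of columns per row."""
    return [value.split(delimiter) for value in values]


table = text_to_columns(cells)
for row in table:
    print(row)
```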
Keyboard shortcuts are the best way to navigate cells or enter formulas more
quickly. We’ve listed our favorites below.
1. Ctrl + [Down|Up Arrow]: Moves to the bottom or top cell of the current
data region in the column; Ctrl with the Left|Right Arrow key moves to the
cell furthest left or right in the current row
2. Ctrl + Shift + Down/Up Arrow: Selects all the cells below or above the
current cell
3. Ctrl+ Home: Navigates to cell A1
4. Ctrl+End: Navigates to the last used cell on the worksheet
5. Alt+F1: Creates a chart based on the selected data set
6. Ctrl+Shift+L: Turns the auto filter on or off for the data table
7. Alt+Down Arrow: Opens the drop-down menu of the auto filter
8. Alt+D+S: Opens the Sort dialog for the data set
9. Ctrl+O: Opens an existing workbook
10. Ctrl+N: Creates a new workbook
11. F4: While editing a formula, select a cell reference and press F4
to cycle it through absolute, mixed and relative forms.
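The F4 behaviour in the last shortcut follows a fixed cycle for a single-cell reference: A1, $A$1, A$1, $A1, and back to A1. The sketch below models that cycle in Python purely as an illustration; toggle_reference is a hypothetical helper, and it handles only simple references such as A1, not ranges or sheet-qualified references.

```python
import re


def toggle_reference(ref):
    """Return the next reference style in the F4 cycle for a single cell."""
    match = re.fullmatch(r"(\$?)([A-Za-z]+)(\$?)(\d+)", ref)
    if match is None:
        raise ValueError(f"not a single-cell reference: {ref!r}")
    col_abs, col, row_abs, row = match.groups()
    state = (bool(col_abs), bool(row_abs))  # (column locked, row locked)
    # relative -> absolute -> row locked -> column locked -> relative
    next_state = {
        (False, False): (True, True),
        (True, True): (False, True),
        (False, True): (True, False),
        (True, False): (False, False),
    }[state]
    col_mark = "$" if next_state[0] else ""
    row_mark = "$" if next_state[1] else ""
    return f"{col_mark}{col}{row_mark}{row}"


ref = "A1"
for _ in range(4):
    ref = toggle_reference(ref)
    print(ref)
```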
You can use the Analysis ToolPak add-in to generate descriptive statistics. For example, you
may have the scores of 14 participants for a test.
To generate descriptive statistics for these scores, execute the following steps.
Note: if you can't find the Data Analysis button, load the Analysis ToolPak add-in first.
2. Select Descriptive Statistics and click OK.
6. Click OK.
Result:
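As a cross-check outside Excel, the usual summary measures can be sketched with Python's standard statistics module. The 14 scores below are hypothetical (the module's actual scores are not reproduced here), and describe is an illustrative helper. It uses the sample standard deviation and sample variance, on the assumption that this matches what the ToolPak reports.

```python
import statistics

# Hypothetical scores for 14 participants; not the module's real data.
scores = [82, 91, 67, 75, 88, 94, 70, 85, 79, 91, 63, 77, 89, 84]


def describe(values):
    """Compute common descriptive statistics for a list of numbers."""
    n = len(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "count": n,
        "sum": sum(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "sample_stdev": stdev,
        "sample_variance": statistics.variance(values),
        "standard_error": stdev / n ** 0.5,
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```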