Assignment 3
1) What is a scatter plot? For what type of data is a scatter plot usually used?
A) A scatter plot is a visual tool for showing the relationship between two variables. It plots
each observation as a point on a coordinate grid, which helps to detect trends or patterns in the
data. Scatter plots are generally used to show an association, such as a correlation or a trend,
between two continuous or numeric (quantitative) variables.
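The strength of the linear association that a scatter plot shows visually can also be expressed as a single number. The sketch below computes the Pearson correlation coefficient using only the standard library. The function name `pearson_r` and the sample data are made up for illustration and are not part of the assignment.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

A value of r close to +1 or -1 corresponds to points that fall close to a straight line on the scatter plot, while a value near 0 suggests no linear relationship.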
A) Pie charts are typically used when you have one categorical variable and one continuous numeric
variable to create a visualization.
They are not suitable for representing complex data with a large number of categories in the
categorical variable.
Without data labels indicating the angle or area of each segment, the segments can be difficult to
compare and interpret, especially when their sizes are similar.
Pie charts provide a sense of the proportion of each category in relation to one another but do not
provide precise numeric values for each category.
While it is possible to add numeric values as labels, doing so works against the chart's intended
purpose of showing proportions at a glance.
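To illustrate why similar segment sizes are hard to judge by eye, the sketch below computes the percentage share that each pie segment would represent. The function name `category_shares` and the sales figures are hypothetical and chosen only for illustration.

```python
def category_shares(totals):
    """Map each category to its percentage share of the overall total."""
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("total must be positive to compute shares")
    return {name: 100 * value / grand_total for name, value in totals.items()}


# Made-up illustrative data: units sold per product line.
sales = {"A": 26, "B": 25, "C": 24, "D": 25}
shares = category_shares(sales)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

The four shares differ by at most two percentage points, a difference that is easy to read from the printed numbers but hard to see in the angles of a pie chart.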
3) Name two charts that can be used for each of the following:
1. Bar Chart
2. Column Chart
1. Scatter Plot
2. Line Chart
1. Pie Chart
2. Donut Chart
d. Looking at how data is distributed:
1. Histogram
2. Box plot
1. Map Chart
2. 3D Map Chart
1. Line Chart
4) Differentiate between the following charts in terms of data supported, and application:
a. Comparing Box Plots and Histograms:
Purpose: Box Plots are well-suited for comparing data distributions and identifying outliers.
Histograms offer a visual summary of data distribution and are valuable for grasping the
spread of data.
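The values that a box plot draws can be computed directly. The sketch below derives the quartiles and flags outliers using the common 1.5 × IQR rule. That rule is an assumption here, since the assignment does not state which outlier convention is used, and the function name `box_plot_summary` and the data are made up for illustration.

```python
import statistics


def box_plot_summary(values):
    """Return the quartiles, extremes, and 1.5*IQR outliers of a sample."""
    data = sorted(values)
    # Quartiles via linear interpolation over the sample (inclusive method).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Assumed convention: points beyond the fences are treated as outliers.
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "outliers": outliers,
    }


# Made-up illustrative data with one unusually large value.
values = [12, 15, 14, 10, 13, 16, 14, 15, 48]
summary = box_plot_summary(values)
print(summary)
```

A histogram of the same values would show the overall shape of the distribution, whereas this summary isolates the single extreme value explicitly.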
b. Comparing Histograms and Bar Charts:
Data Utilization: Histograms represent the distribution of a single continuous variable. Bar charts
relate one qualitative (categorical) variable to one quantitative variable.
Application: Histograms are employed to visualize data frequency and distribution. Bar charts
are appropriate for comparing categories or discrete data points.
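The difference in the data each chart expects can be sketched in a few lines: a histogram first groups continuous values into numeric bins, while a bar chart counts or aggregates per category label. The function names, the bin width of 10, and the sample data below are all hypothetical.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count continuous values per bin [start, start + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def bar_counts(labels):
    """Count how often each category label occurs."""
    return dict(Counter(labels))


# Made-up illustrative data.
ages = [21, 23, 25, 31, 34, 38, 39, 42, 47]                  # continuous
departments = ["Sales", "IT", "Sales", "HR", "IT", "Sales"]  # categorical

age_bins = histogram_counts(ages, bin_width=10)
dept_counts = bar_counts(departments)
print(age_bins)
print(dept_counts)
```

The histogram bins have a natural order and their width is a choice that changes the picture, whereas the bar chart categories are fixed labels that can be shown in any order.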
5) Using Tableau and any chart of your choice, make a visual representation of the data from
salary.csv.
Preprocessing the data
A) To start, unnecessary cells that would have prevented each column from having a consistent data
type were removed.
After that, the salary range was replaced with the average value of the range to convert it into a
numerical variable.
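The range-to-average step can be sketched as follows. The exact format of the salary ranges in salary.csv is not shown here, so the sketch assumes strings such as "30,000-50,000" and the function name `range_midpoint` is hypothetical.

```python
import re


def range_midpoint(text):
    """Convert a salary range string to the average of its bounds.

    Assumed input format (not confirmed by the data set): one or two
    numbers, optionally with thousands separators, e.g. "30,000-50,000".
    """
    cleaned = text.replace(",", "")
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", cleaned)]
    if not numbers or len(numbers) > 2:
        raise ValueError(f"cannot interpret salary value: {text!r}")
    # A single number is returned unchanged; two numbers are averaged.
    return sum(numbers) / len(numbers)


# Made-up illustrative values.
print(range_midpoint("30,000-50,000"))
print(range_midpoint("45000"))
```

Using the midpoint treats every salary within a range as equally likely, which is a simplification but yields a single numeric value that can be aggregated and plotted.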
Furthermore, the qualification column was cleaned up by adding a category for "lack of a degree"
and merging cells where required. The final outcome is as follows:
A side-by-side column chart is an efficient visual representation for displaying this data. It enables
straightforward comparisons with other salary profiles. To enhance clarity, we have applied color
coding to the relevant experience columns, provided clear labels for the data, and included filtering
options for qualifications and relevant work experience to create customized visualizations based on
specific criteria.