Introduction to Data Science Module 1
DATA SCIENCE
Instructor
Abubakar Yussuf
EXPERIENCE
ANALYTICS
What is data?
• Data are raw facts about a thing or idea: anything that can yield useful
information.
• Data can be volunteered, observed, or inferred.
• Data can also be structured or unstructured.
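To make the structured/unstructured distinction concrete, here is a minimal Python sketch. The names, ages, and cities are invented for illustration. It stores the same information once as records with fixed fields (structured) and once as free text (unstructured).

```python
# Structured data: every record has the same named fields,
# so it can be queried and aggregated directly.
structured = [
    {"name": "Amina", "age": 29, "city": "Nairobi"},
    {"name": "Brian", "age": 34, "city": "Mombasa"},
]

# Unstructured data: the same facts as free text, with no fixed fields.
unstructured = "Amina, who is 29, lives in Nairobi. Brian (34) lives in Mombasa."

# With structured data, a question like "what is the average age?"
# is a one-line computation.
average_age = sum(record["age"] for record in structured) / len(structured)
print(f"Average age: {average_age}")

# With unstructured text, the ages would first have to be extracted
# (for example by parsing the sentences) before any calculation.
print(f"Unstructured text length: {len(unstructured)} characters")
```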
Data science
• Data science is the field of study that combines domain expertise,
programming skills, and knowledge of mathematics and statistics to
extract meaningful insights from data.
Where does data come from?
• Volunteered data are created and explicitly shared by individuals of their
own free will, such as social network profiles. This type of data might
include video files, pictures, text, or audio files.
• Line chart
• Column chart
• Bar chart
• Pie chart
• Scatter plot
Line Chart
• Line charts are a type of visualization that uses lines to connect data points.
• They are particularly useful for showing trends and changes over time.
• Tracking trends: Line charts are great for spotting trends in data,
such as growth in sales figures or changes in stock prices over time.
• Comparing data sets: You can use multiple lines on the same chart to
compare trends between different groups or categories.
• Highlighting seasonality: Line charts can reveal seasonal patterns in
data, such as fluctuations in website traffic or product sales throughout
the year.
Best practices when drawing Line Charts
• Add a title.
• Label the axes: a line chart has two axes, the X-axis and the Y-axis.
• Use solid lines to connect the data points.
• Limit the number of lines in one chart so that the chart is easier to
understand.
• Give each line a different color to distinguish which line represents
which trend.
• Add a legend that explains what each colored line represents.
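The practices above can be sketched in Python. This example assumes the matplotlib library, which the slides do not name, and uses invented monthly sales figures. It draws two solid lines in different colors and adds a title, axis labels, and a legend.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

# Hypothetical sales figures, invented purely for illustration.
sales_by_region = {
    "North": [120, 135, 150, 160, 175, 190],
    "South": [100, 98, 110, 125, 120, 140],
}
colors = {"North": "tab:blue", "South": "tab:orange"}

fig, ax = plt.subplots()

# One solid line per region, each in its own color.
for region, sales in sales_by_region.items():
    ax.plot(months, sales, linestyle="-", color=colors[region], label=region)

ax.set_title("Monthly sales by region (illustrative data)")  # title
ax.set_xlabel("Month")                                        # X-axis label
ax.set_ylabel("Sales (units)")                                # Y-axis label
ax.legend(title="Region")                                     # explains each line

fig.savefig("line_chart.png")
```

Only two lines are plotted here. With many more, the chart becomes harder to read, which is the reason for limiting the number of lines.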
Column Chart
• A column chart, also sometimes called a vertical bar chart, is a type of
visualization that uses vertical bars to represent data. These bars are
helpful for comparing different categories of data and their values.
• A column chart visually displays data by using rectangles (columns),
where the height of each column corresponds to the values being
plotted.
Key elements of Column Chart
Categories: These are the groups or types of data being represented.
Each category is typically displayed on the horizontal axis (X-axis) of
the chart.
Values: These are the numerical quantities being compared. The
values are typically represented on the vertical axis (Y-axis) of the
chart. The height of each column is proportional to the value it
represents.
Columns: Rectangles that extend upward from the X-axis, one for each
category. The height of each column reflects the value associated with
that category.
Axes:
• X-axis (horizontal): This axis lists the categories being compared.
• Y-axis (vertical): This axis displays the scale of the values being
measured. It usually starts at zero to accurately represent comparisons.
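As a rough illustration of how column height is proportional to value when the Y-axis starts at zero, the following Python sketch scales some invented category totals to text columns. The function name `column_heights` and the data are hypothetical. It is a simplified text rendering, not how a charting library draws columns.

```python
def column_heights(values, max_height=10):
    """Scale each value to a column height measured from a zero baseline."""
    largest = max(values)
    return [round(value * max_height / largest) for value in values]

# Hypothetical category totals, invented for illustration.
sales = {"Books": 40, "Games": 100, "Music": 70}
heights = column_heights(list(sales.values()))

# Draw the columns row by row, from the top of the tallest column down.
for level in range(max(heights), 0, -1):
    print("  ".join(" # " if height >= level else "   " for height in heights))

# Category labels along the X-axis (first three letters of each name).
print("  ".join(f"{name[:3]:^3}" for name in sales))
```

Because the baseline is zero, a category with twice the value gets a column that is twice as tall.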
Bar Charts
• Bar charts, also sometimes called bar graphs, are a type of
visualization that uses rectangular bars to represent data.
• They are a versatile tool for displaying comparisons among different
categories of data.
• They are drawn horizontally.
• In a bar chart, the Y-axis typically displays the categories (such as the
top-grossing movies of 2019 in the example below), while the X-axis
displays the numerical value of each category.
Best practices for Bar and Column Charts
• Label the axes.
• Consider ordering the bars so that the lengths go from longest to
shortest. The data type will most likely determine whether the longest
bar should be on the bottom or the top to best illustrate the intended
pattern or trend.
• Start the value axis (the X-axis for bar charts, the Y-axis for column
charts) at zero to accurately reflect the total value of the bars.
• The spacing between bars should be roughly half the width of a bar.
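The ordering and zero-baseline advice can be sketched in plain Python. The film titles and figures below are placeholders invented for illustration. The sketch sorts the categories from longest to shortest bar and prints a simple horizontal text chart in which every bar starts at zero.

```python
# Hypothetical gross figures (in millions), invented for illustration.
gross_millions = {"Film A": 320, "Film B": 858, "Film C": 515, "Film D": 1074}

# Order the bars so that lengths go from longest to shortest.
ordered = sorted(gross_millions.items(), key=lambda item: item[1], reverse=True)

# Scale so the longest bar spans 40 characters; bars start at zero,
# so each bar's length is proportional to its full value.
scale = 40 / ordered[0][1]

for title, value in ordered:
    bar = "#" * round(value * scale)
    print(f"{title:<8}|{bar} {value}")
```

Whether the longest bar belongs at the top or at the bottom depends on the data. Dropping `reverse=True` from the sort flips the order.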
Pie Charts
• A pie chart is a circular graph used to represent portions of a whole.
It's a great way to visualize data that represents categories and their
contribution to a total value.
Uses of Pie Charts:
• Showcasing proportions: Pie charts excel at highlighting the relative
sizes of different categories and their contribution to the whole.
• Simple comparisons: They are effective for comparing a few major
categories (ideally 4-6) at a glance.
Best practices for Pie Charts
• Limit the number of slices: Stick to 4-6 slices ideally. Too many slices
make the pie chart difficult to interpret and visually clutter the data.
• Focus on proportions: Ensure the pie chart is used to represent parts of a
whole, where all slices combined add up to 100%.
• Use different colors for each segment/slice
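Since a pie chart shows parts of a whole, each slice is a category's share of the total. The Python sketch below uses invented survey counts and a hypothetical helper named `slice_percentages`. It converts raw counts to percentages and slice angles, then checks that the slices add up to 100%.

```python
def slice_percentages(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {label: 100 * count / total for label, count in counts.items()}

# Hypothetical survey counts, invented for illustration.
responses = {"Email": 45, "Phone": 30, "Chat": 15, "In person": 10}

shares = slice_percentages(responses)
for label, share in shares.items():
    angle = 360 * share / 100  # portion of the full circle for this slice
    print(f"{label}: {share:.1f}% ({angle:.0f} degrees)")

# All slices combined should make up the whole pie.
total_share = sum(shares.values())
print(f"Total: {total_share:.1f}%")
```

With four categories, this example also stays within the suggested 4-6 slices.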
Scatter plot
• A scatter plot is a type of visual representation used to display the
relationship between two continuous variables.
Uses of Scatter Plots: