Course Code Course Title L T P C
10212EC151 DATA SCIENCE AND VISUALIZATION 2 0 2 3
a) Course Category
Program Elective
b) Preamble
Data science is among the most sought-after professions of the decade, and the demand for
data scientists who can analyze data and communicate results to inform data-driven decisions
has never been greater. This course helps students pursuing a career in data science or
machine learning to develop career-relevant skills and experience in visualizing data by
programming in Python and R.
c) Prerequisite
Data structures, Object Oriented Programming, Python.
d) Related Courses
Machine Learning, Deep Learning
e) Course Outcome
Upon successful completion of the course, students will be able to:
CO No. Course Outcomes Knowledge Level (Based on Revised Bloom's Taxonomy)
CO1 Understand the basics of Data Science K2
CO2 Analyze data by using various statistical and data mining approaches K3
CO3 Analyze and visualize the data using APIs and tools K3
CO4 Understand and create insights into the art of visualization K2
CO5 Analyze the implementation and management tools to organize the information system K3
f) Correlation of COs with POs (Program Outcomes defined by National Board of
Accreditation, India)
CO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2
CO1 H H - M M - - - - - - - - -
CO2 H H M - - - - - - - - - - -
CO3 M M M - M - - - - - - - - -
CO4 M L L L - - - - - - - - - -
CO5 H H H M M L - - - - - - - -
g) Course Content
UNIT I INTRODUCTION TO DATA SCIENCE 12
Definition – Why Data Science – Exploring Data Engineering Pipelines and Infrastructure –
Data Scientist – Data Science Process Overview – Defining goals – Retrieving data – Data
preparation – Data exploration – Data modeling – Presentation.
UNIT II EXTRACT KNOWLEDGE FROM DATA 12
Machine Learning: Learning from Data with Your Machine – Math, Probability, and Statistical
Modeling – Using Clustering to Subdivide Data – Modeling with Instances – Building Models
That Operate Internet-of-Things Devices.
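As an illustration of "Using Clustering to Subdivide Data", the following is a minimal sketch of k-means on one-dimensional data, written in plain Python. The function name `kmeans_1d`, the sample values, and the starting centers are assumptions made for this example and are not prescribed by the syllabus. In practice a library implementation would normally be used.

```python
def kmeans_1d(values, centers, iterations=10):
    """Very small k-means for 1-D data with caller-supplied starting centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]  # an empty cluster keeps its center
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing moved, so the result is stable
            break
        centers = new_centers
    return centers, clusters


# Hypothetical data: two visibly separate groups of measurements.
data = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(data, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

The result depends on the starting centers. With poorly chosen starts, k-means can settle on a different and worse grouping, which is why library versions typically try several initializations.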
UNIT III TOOLS FOR DATA SCIENCE 12
Python and R for Data Science – Rows and Columns, Creating Data Frames, Exploring Data
Frames, Accessing Columns in a Data Frame – Excel, KNIME and SQL in Data Science – Data
Munging: Reading a CSV Text File, Removing Rows and Columns, Renaming Rows and
Columns, Cleaning Up the Elements, Sorting Data Frames.
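The data munging steps listed in this unit can be sketched as follows. This is a minimal illustration that uses only Python's standard library instead of a data frame package. The CSV text, the column names, and the helper `munge` are hypothetical and were chosen only to show each step in order.

```python
import csv
import io

# Hypothetical CSV text standing in for a file on disk.
CSV_TEXT = """name,dept,marks,remarks
 Asha ,ECE,82,ok
Ravi,CSE,,absent
Meena,ECE,91,ok
"""


def munge(csv_text):
    """Read CSV text, then remove, rename, clean, and sort the records."""
    # Reading a CSV text file: each row becomes a dict keyed by column name.
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Removing rows: drop records whose 'marks' field is empty.
    rows = [r for r in rows if r["marks"].strip()]

    cleaned = []
    for r in rows:
        cleaned.append({
            # Renaming a column ('name' becomes 'student') and
            # cleaning up the elements (stray whitespace is stripped).
            "student": r["name"].strip(),
            "dept": r["dept"].strip(),
            "marks": int(r["marks"]),  # convert text to a number
            # Removing a column: 'remarks' is simply not copied over.
        })

    # Sorting: highest marks first.
    return sorted(cleaned, key=lambda r: r["marks"], reverse=True)


result = munge(CSV_TEXT)
for row in result:
    print(row)
```

The same steps map onto data frame operations in Python or R. The standard-library version is used here only so that the sketch runs without extra packages.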
UNIT IV DATA VISUALIZATION 10
Principles of Data Visualization Design – D3.js for Data Visualization – Web-Based Applications
for Visualization – Exploring Best Practices in Dashboard Design – Making Maps from Spatial
Data.
UNIT V DATA VISUALIZATION TOOLKIT 14
Basic principles, categorical and continuous variables, exploratory graphical analysis, creating
static graphs, animated visualizations, loops, GIFs and videos, data visualization in Python
and R, examples from Bokeh, Altair, ggPlot, ggplot2, gganimate, ImageMagick.
Total 60 Hrs
h) Learning Resources
Text Books
1. Jeffrey S. Saltz, Jeffrey M. Stanton, An Introduction to Data Science, SAGE Publications,
2017, ISBN: 9781506377537.
2. Lillian Pierson, Jake Porway, Data Science For Dummies, Wiley, 2017, ISBN:
978-1-119-32763-9.
3. Noah Iliinsky, Julie Steele, Designing Data Visualizations, O'Reilly Media, 2011.
Reference Books
1. Rafael A. Irizarry, Introduction to Data Science: Data Analysis and Prediction Algorithms
with R, CRC Press, 2020.
2. Davy Cielen, Arno D. B. Meysman, Mohamed Ali, Introducing Data Science, Manning
Publications, 1st edition, 2016.
Online Resources
1. http://www.python-course.eu/numpy.php
2. https://www.learnpython.org/en/Pandas_Basics
3. https://www.r-bloggers.com/2015/08/data-manipulation-with-dplyr/
4. https://towardsdatascience.com/intro-to-data-science-531079c38b22?gi=1fb573279fdb
5. https://www.simplilearn.com/tutorials/data-science-tutorial/introduction-to-data-science
6. https://www.udemy.com
List of Experiments
S.No Experiments CO Mapping
1 Generating random numbers using probability distributions CO1
a. Write an R program to create an ordered factor from data
consisting of the names of months.
b. Write an R program to get the statistical summary and nature
of the data of a given data frame.
c. Write a Python program that accepts the radius of a circle
from the user and computes the area.
d. Write a Python program to display your details like name, age,
address in three different lines.
2 Programs based on Data Aggregation, Filtering and Transformation CO2
3 Programs based on appending / merging data CO3
e. Create, manipulate and plot the time series data using R for
annual rainfall details.
f. Create a 2D NumPy array for a student database, index the
array by slicing, and perform basic operations on the
array.
4 Creation of Basic Visualizations CO4
• Bar chart
• Geographic map
• Crosstab report
• Scatter plot
• Line chart
5 Developing a project to visualize data using Tableau, Python and R CO5
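As a starting point for Experiment 1, the sketch below draws random numbers from three probability distributions and computes the area of a circle, using only Python's standard library. The function names, the seed, and the distribution parameters are assumptions chosen for illustration. The syllabus does not specify which distributions or parameters to use.

```python
import math
import random


def sample_distributions(n, seed=42):
    """Draw n samples each from uniform, normal, and exponential distributions."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    return {
        "uniform": [rng.uniform(0.0, 1.0) for _ in range(n)],     # flat on [0, 1]
        "normal": [rng.gauss(0.0, 1.0) for _ in range(n)],        # mean 0, std dev 1
        "exponential": [rng.expovariate(2.0) for _ in range(n)],  # rate 2, mean 0.5
    }


def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


samples = sample_distributions(1000)
for name, values in samples.items():
    print(name, "sample mean:", round(sum(values) / len(values), 3))

print("area for radius 2:", circle_area(2))
```

For the interactive version asked for in Experiment 1(c), the radius could be read with `float(input("Radius: "))` and passed to `circle_area`. It is kept as a plain function here so the sketch runs without user input.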