
Data Science Syllabus

The document outlines the Data Science course for the academic year 2024-25, aimed at students from the 2021-22 batch, detailing course objectives, structure, and content. It covers key topics such as predictive modeling, classification, clustering, and text mining, with a focus on applying data science techniques to real-world business problems. The course includes a total of 40 lecture hours, with assessments divided between continuous internal evaluation and final exams.

Academic Year : 2024-25 Batch : 2021-22

DATA SCIENCE
Contact Hours/Week: 3+0+0      Credits: 3
Total Lecture Hours: 40        CIE Marks: 50
Sub. Code: NECE25              SEE Marks: 50

Course objectives:
This course will enable students to:
1. Describe the concept of data science and its scope in business, and explain the
available techniques. (L1, L2)
2. Understand predictive modeling, explain supervised segmentation and, given a data
set, select (by solving) the attribute for segmentation using the available
techniques. (L2, L3)
3. Explain the concept of classification and classify (solve) a given data set. (L3)
4. Understand and describe the concepts of similarity, neighbors, and clustering, and
apply them to real-world data. (L3, L4)
5. Explain the concepts of mining text and other data science tasks and techniques.
(L2, L4)
UNIT I
Introduction: Data-Analytic Thinking: The Ubiquity of Data Opportunities, Example:
Hurricane Frances, Example: Predicting Customer Churn, Data Science, Engineering, and
Data-Driven Decision Making, Data Processing and “Big Data”, Data and Data Science
Capability as a Strategic Asset, Data-Analytic Thinking.
Business Problems and Data Science Solutions: From Business Problems to Data
Mining Tasks, Supervised Versus Unsupervised Methods, Data Mining and Its Results, The Data
Mining Process, Business Understanding, Data Understanding, Data Preparation, Modeling,
Evaluation, Deployment, Other Analytics Techniques and Technologies: Statistics, Database
Querying, Data Warehousing, Regression Analysis, Machine Learning and Data Mining.
8 Hours

Department of Electronics & Communication Engg., SIT, Tumakuru 94


UNIT II
Introduction to Predictive Modeling: From Correlation to Supervised Segmentation: Models,
Induction, and Prediction, Supervised Segmentation, Selecting Informative Attributes, Example:
Attribute Selection with Information Gain, Supervised Segmentation with Tree-Structured
Models, Visualizing Segmentations, Trees as Sets of Rules, Probability Estimation, Example:
Addressing the Churn Problem with Tree Induction.
8 Hours
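
Illustrative sketch for Unit II (not part of the prescribed text): attribute selection with information gain can be tried on a small data set as below. The function names, the customer records, and the attribute names are hypothetical and chosen only for illustration; entropy is computed in bits (log base 2).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Parent entropy minus the size-weighted entropy of the child segments."""
    parent = entropy([r[target] for r in rows])
    segments = {}
    for r in rows:
        segments.setdefault(r[attribute], []).append(r[target])
    children = sum(len(s) / len(rows) * entropy(s) for s in segments.values())
    return parent - children

# Hypothetical toy data: "contract" separates churners perfectly, "gender" does not.
customers = [
    {"contract": "monthly", "gender": "m", "churn": "yes"},
    {"contract": "monthly", "gender": "f", "churn": "yes"},
    {"contract": "yearly",  "gender": "m", "churn": "no"},
    {"contract": "yearly",  "gender": "f", "churn": "no"},
]

gain_contract = information_gain(customers, "contract", "churn")
gain_gender = information_gain(customers, "gender", "churn")
print(gain_contract, gain_gender)
```

The attribute with the larger gain ("contract" in this toy data) would be chosen for the first split of a tree-structured model.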
UNIT III
Fitting a Model to Data: Classification via Mathematical Functions: Linear Discriminant
Functions, Optimizing an Objective Function, An Example of Mining a Linear Discriminant
from Data, Linear Discriminant Functions for Scoring and Ranking Instances, Support Vector
Machines, Briefly, Regression via Mathematical Functions, Class Probability Estimation and
Logistic “Regression”, Logistic Regression: Some Technical Details, Example: Logistic
Regression versus Tree Induction, Nonlinear Functions, Support Vector Machines, and Neural
Networks. Overfitting and Its Avoidance: Fundamental Concepts, Exemplary Techniques,
Regularization, Generalization, Overfitting, Overfitting Examined.
8 Hours

UNIT IV
Similarity, Neighbors, and Clusters: Similarity and Distance, Nearest-Neighbor
Reasoning, Example: Whiskey Analytics, Nearest Neighbors for Predictive Modeling, How
Many Neighbors and How Much Influence?, Geometric Interpretation, Overfitting, and
Complexity Control, Issues with Nearest-Neighbor Methods, Some Important Technical Details
Relating to Similarities and Neighbors, Clustering, Example: Whiskey Analytics Revisited,
Hierarchical Clustering, Nearest Neighbors Revisited: Clustering Around Centroids,
Understanding the Results of Clustering.
8 Hours
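
Illustrative sketch for Unit IV (not part of the prescribed text): nearest-neighbor classification with Euclidean distance and majority voting. The function names, the sample points, and the labels "A" and "B" are hypothetical; other distance measures and weighted voting are equally valid choices.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature training set with two well-separated groups.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

label_near_a = knn_predict(train, (1.1, 1.0), k=3)
label_near_b = knn_predict(train, (5.1, 5.0), k=3)
print(label_near_a, label_near_b)
```

Larger values of k smooth the decision boundary, which relates to the complexity-control topic listed in this unit.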

UNIT V
Decision Analytic Thinking I: What Is a Good Model?: Evaluating Classifiers, Plain
Accuracy and Its Problems, The Confusion Matrix, Problems with Unbalanced Classes, Problems
with Unequal Costs and Benefits.
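
Illustrative sketch (not part of the prescribed text): the problem with plain accuracy on unbalanced classes can be seen by building a confusion matrix by hand. The function name, the class labels, and the 5-in-100 churn rate are hypothetical.

```python
def confusion_matrix(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Hypothetical unbalanced data: 5 churners out of 100 customers,
# and a trivial classifier that always predicts "stay".
actual = ["churn"] * 5 + ["stay"] * 95
predicted = ["stay"] * 100

cm = confusion_matrix(actual, predicted, positive="churn")
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
recall = cm["TP"] / (cm["TP"] + cm["FN"])
print(cm, accuracy, recall)
```

The trivial classifier scores 95% accuracy yet identifies none of the churners, which is why the confusion matrix is examined alongside accuracy.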
Representing and Mining Text: Why Text Is Important, Why Text Is Difficult,
Representation, Bag of Words, Term Frequency, Measuring Sparseness: Inverse Document
Frequency, Combining Them: TFIDF, Example: Jazz Musicians.
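
Illustrative sketch (not part of the prescribed text): a bag-of-words TFIDF score, assuming raw term counts for term frequency and the variant IDF(t) = 1 + ln(N / n_t), where N is the number of documents and n_t the number containing term t. Other weighting variants exist; the function name and the three short documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: TFIDF score} dictionary per document."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        scores.append({
            term: count * (1 + math.log(n_docs / doc_freq[term]))
            for term, count in term_freq.items()
        })
    return scores

# Hypothetical mini-corpus.
docs = ["jazz saxophone jazz", "jazz piano", "rock guitar"]
scores = tfidf(docs)
print(scores[0])
```

A term that is rare across the corpus ("saxophone") receives a higher IDF than one that appears in most documents ("jazz").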
Other Data Science Tasks and Techniques: Co-occurrences and Associations: Finding Items
That Go Together, Measuring Surprise: Lift and Leverage, Example: Beer and Lottery Tickets,
Associations Among Facebook Likes, Profiling: Finding Typical Behavior, Link Prediction and
Social Recommendation.
8 Hours
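
Illustrative sketch for the co-occurrence topics of Unit V (not part of the prescribed text): lift is taken here as p(A,B) / (p(A) p(B)) and leverage as p(A,B) - p(A) p(B), both estimated from transaction counts. The function name and the ten transactions are hypothetical.

```python
def lift_and_leverage(transactions, item_a, item_b):
    """Estimate lift and leverage of two items from a list of item sets.

    Assumes both items occur at least once; otherwise lift is undefined.
    """
    n = len(transactions)
    p_a = sum(item_a in t for t in transactions) / n
    p_b = sum(item_b in t for t in transactions) / n
    p_ab = sum(item_a in t and item_b in t for t in transactions) / n
    return p_ab / (p_a * p_b), p_ab - p_a * p_b

# Hypothetical store data: 30% of baskets contain beer, 40% lottery, 20% both.
transactions = (
    [{"beer", "lottery"}] * 2
    + [{"beer"}]
    + [{"lottery"}] * 2
    + [{"bread"}] * 5
)

lift, leverage = lift_and_leverage(transactions, "beer", "lottery")
print(round(lift, 2), round(leverage, 2))
```

A lift above 1 (or a leverage above 0) indicates that the two items occur together more often than would be expected if they were independent.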

TEXT BOOKS
1. Foster Provost and Tom Fawcett, Data Science for Business, O’Reilly Media, Inc., First Edition, July 2013.

REFERENCE BOOKS
1. Rachel Schutt & Cathy O’Neil, Doing Data Science, O’Reilly Media, First Edition, October 2013.
2. Hector Cuesta, Practical Data Analysis, PACKT Publishing, first published October 2013.
3. Michael R. Berthold, Christian Borgelt, Frank Höppner, Frank Klawonn, Guide to Intelligent Data Analysis, Springer-Verlag London Limited, 2010.
