Unit 1-FDS

The document outlines a Data Science course aimed at teaching fundamentals, machine learning algorithms, data collection, and evaluation techniques. It emphasizes the importance of data science in decision-making across various industries and highlights the ethical considerations, including data privacy and bias mitigation. Additionally, it lists essential skills and tools for aspiring data scientists, such as statistical skills, programming languages, and cloud services.

Uploaded by

thotakurilokesh29


Course Objectives:

This course aims to:

1. Understand the fundamental concepts of Data Science.
2. Demonstrate and analyze different data types and analytic techniques.
3. Learn about various machine learning algorithms.
4. Become familiar with data collection techniques.
5. Study different evaluation techniques.

Course Outcomes:
Upon completion of this course, students will be able to:

1. Explain the need for Data Science and analyze the skill sets of data scientists.
2. Describe the Data Science Process and how its components interact.
3. Apply basic machine learning algorithms for predictive modeling.
4. Simplify a real-world problem into mathematical terms.
5. Create effective visualization of given data.
UNIT-I Syllabus

Introduction: Introduction to Data Science,

Evolution of Data Science,
Data Science Roles,
Stages in a Data Science Project,
Information vs Data,
Computational Thinking,
Skills for Data Science,
Tools for Data Science,
Issues of Ethics, Bias, Privacy in Data Science.

Text Books:

1. Chirag Shah, A Hands-On Introduction to Data Science. Cambridge University Press, 2020.
2. Rafael A. Irizarry, Introduction to Data Science: Data Analysis and Prediction Algorithms with R. CRC Press, 2020.
What is Data Science?

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and
systems to extract knowledge and insights from data in various forms, both structured and
unstructured, similar to data mining.

Why Data Science?

• Organizations accumulate large amounts of data, such as financial records, reviews, customer data, and employee data.
• That data has to be kept clear and easy to understand before anyone can act on it, which is why data science is relevant.
• Data analysis lets people make decisions that are better, faster, or both.
Why is Data Science important?

Every company has data, and the business value of that data depends on how much insight the company can extract from it.
Data science has recently gained significance because it can help companies increase the business value of their available data
and thus gain a competitive advantage over their rivals.

It can help us know our customers better, refine our processes, and make better decisions.
In the age of information technology, data has become a vital asset.
Role of Data Scientist

• Data scientists help organizations understand and handle data, and address complex problems using
expertise from a range of technical disciplines.

• They typically have a background in computer science, modeling, statistics, analytics, and
mathematics, combined with a clear business sense.
How to do Data Science?

A typical data science process looks like this, and it can be modified for a specific use
case:

• Understand the business
• Collect & explore the data
• Prepare & process the data
• Build & validate the models
• Deploy & monitor the performance
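
The five stages above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the business goal, the synthetic data, and the helper names (`fit_line`, `predict`, `needs_attention`) are all hypothetical, and a real project would read data from an actual source and would likely use a dedicated modeling library.

```python
import random
import statistics

# 1. Understand the business.
#    Hypothetical goal: predict monthly sales (y) from advertising spend (x).

# 2. Collect & explore the data.
#    Synthetic data stands in for a real source (an assumption of this sketch).
random.seed(0)
raw = [(x, 3.0 * x + 5.0 + random.gauss(0, 1.0)) for x in range(1, 41)]
raw.append((None, 12.0))  # a deliberately incomplete record

# 3. Prepare & process the data: drop incomplete records, then split.
clean = [(x, y) for x, y in raw if x is not None and y is not None]
random.shuffle(clean)
split = int(0.75 * len(clean))
train, test = clean[:split], clean[split:]

# 4. Build & validate the model: a least-squares line, checked on held-out data.
def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(train)
mae = statistics.fmean([abs(y - (slope * x + intercept)) for x, y in test])

# 5. Deploy & monitor the performance.
def predict(x):
    """The 'deployed' model: apply the fitted line to a new input."""
    return slope * x + intercept

def needs_attention(new_points, threshold=3.0):
    """Flag the model when its mean absolute error on new data is too high."""
    errors = [abs(y - predict(x)) for x, y in new_points]
    return statistics.fmean(errors) > threshold

print(f"slope={slope:.2f} intercept={intercept:.2f} validation MAE={mae:.2f}")
```

The threshold of 3.0 in `needs_attention` is arbitrary; in practice it would come from the business requirement identified in the first stage.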
Evolution of Data Science

1962: American mathematician John W. Tukey first articulated the data science dream. In his now-famous article "The Future of Data Analysis," he
foresaw the inevitable emergence of a new field nearly two decades before the first personal computers.

1977: The theories and predictions of "pre" data scientists like Tukey and Naur became more concrete with the establishment of The International
Association for Statistical Computing (IASC), whose mission was "to link traditional statistical methodology, modern computer technology, and the
knowledge of domain experts in order to convert data into information and knowledge."

1980s and 1990s: Data science began taking more significant strides with the emergence of the first Knowledge Discovery in Databases (KDD)
workshop and the founding of the International Federation of Classification Societies (IFCS).

1994: BusinessWeek published a story on the new phenomenon of "Database Marketing." It described the process by which businesses were collecting
and leveraging enormous amounts of data to learn more about their customers, competition, or advertising techniques. The only problem at the time
was that these companies were flooded with more information than they could possibly manage.

1990s and early 2000s: Data science emerged as a recognized and specialized field.

2000s: Technology made enormous leaps by providing nearly universal access to internet connectivity, communication, and (of course) data collection.

2005: Big data enters the scene. With tech giants such as Google and Facebook accumulating large amounts of data, new technologies capable of
processing it became necessary. Hadoop rose to the challenge, and later on Spark and Cassandra made their debuts.

2014: Due to the increasing importance of data, and organizations' interest in finding patterns and making better business decisions, demand for data
scientists began to see dramatic growth in different parts of the world.

2015: Machine learning, deep learning, and Artificial Intelligence (AI) officially enter the realm of data science. These
technologies have driven innovations over the past decade, from personalized shopping and entertainment to self-driving
vehicles, along with the insights needed to bring these real-life applications of AI into our daily lives efficiently.

2018: New regulations in the field, such as the GDPR, become one of the most significant developments in the evolution of data science.

2020s: We are seeing additional breakthroughs in AI and machine learning, and an ever-increasing demand for qualified
professionals in Big Data.
Applications of Data Science

1. Data Science in Healthcare
2. Data Science in E-commerce
3. Data Science in Manufacturing
4. Data Science in Conversational Agents
5. Data Science in Transport
Top Data Science Skills to Learn in 2024
1. Statistical Skills
As a Data Scientist, your primary job is to collect, analyze, and interpret large amounts of data and produce actionable insights for a
company, so statistical skills are a big part of the job description.

2. Programming Languages
Python is widely used because of its capacity for statistical analysis and its readability. Python also has rich libraries and packages for
Machine Learning, data visualization, data analysis, etc. that make it well suited for data science.

3. Machine Learning
Machine Learning is all the rage in Data Science these days! It enables machines to learn a task from experience without being explicitly
programmed for it. This is done by training models on data using different algorithms. So you
need to be familiar with Supervised and Unsupervised Machine Learning algorithms like Linear Regression, Logistic Regression, K-means
Clustering, Decision Tree, and K Nearest Neighbor.
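
As one illustration of the supervised algorithms named above, here is a minimal K Nearest Neighbor classifier written with only the Python standard library. The function name `knn_predict`, the toy data, and the labels are hypothetical; it is a sketch of the idea (majority vote among the closest training points), not a production implementation.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy training set (made up for illustration): two features per example.
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.0), "low"),
    ((8.0, 8.0), "high"),
    ((8.5, 9.0), "high"),
    ((9.0, 8.0), "high"),
]

print(knn_predict(train, (2.0, 2.0)))  # nearest neighbors are all "low"
print(knn_predict(train, (7.5, 8.5)))  # nearest neighbors are all "high"
```

The model "learns from experience" only in the sense that its answers come from the labeled examples it is given; changing the training data changes the predictions without changing the code.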

4. Cloud Services
More and more companies are moving their databases to the cloud. This could be a move to a public, private, or hybrid
cloud, with the most popular contenders being Amazon Web Services and Microsoft Azure. Most companies are also moving big data and
analytics applications to the cloud, so data scientists need to understand these cloud services well enough to
perform data analytics effectively.

5. SQL
You should be able to write and execute complex SQL queries that carry out analytical functions and change the
database as required. As a Data Scientist, you need to be proficient in SQL so that you can access data easily and work on it.
Depending on your query, SQL can give you deep insights into a database.
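
The two uses of SQL described above, analytical queries and changes to the database, can be tried out with Python's built-in `sqlite3` module. The `orders` table, its columns, and its rows are invented for this sketch.

```python
import sqlite3

# In-memory database so the example leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("asha", "south", 120.0),
        ("ben", "north", 80.0),
        ("asha", "south", 60.0),
        ("chen", "north", 200.0),
        ("dia", "south", 40.0),
    ],
)

# Analytical query: order count, revenue, and average order value per region.
rows = conn.execute(
    """
    SELECT region,
           COUNT(*)    AS n_orders,
           SUM(amount) AS revenue,
           AVG(amount) AS avg_order
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
    """
).fetchall()
for row in rows:
    print(row)

# Changing the database: apply a 10% discount to one customer's orders.
conn.execute("UPDATE orders SET amount = amount * 0.9 WHERE customer = ?", ("dia",))
conn.commit()
discounted = conn.execute(
    "SELECT amount FROM orders WHERE customer = ?", ("dia",)
).fetchone()[0]
```

Passing values through `?` placeholders rather than formatting them into the query string keeps the example safe against SQL injection.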
Tools for Data Science

1. Python
2. R
3. SQL
4. Hadoop
5. Tableau
6. Weka
The Importance of Ethical Data Usage
Data scientists are at the heart of data work: they hold data that can drive powerful decisions and shape
the future. Because data is so valuable, maintaining ethical standards is not merely an obligation but
a fundamental part of a data scientist's role in ensuring responsible data usage.

Ethical data usage is the foundation of trust. When individuals provide their data to organizations or
platforms, they expect it to be handled with integrity and basic ethics. Respecting their privacy is the most
important part, and it also strengthens the organization's reputation.
Bias Mitigation
Identifying and mitigating biases in data and algorithms is critical for fair outcomes.

This includes:

1. Data Audits: Regularly auditing datasets for inherent biases based on demographics or historical
imbalances.

2. Algorithm Fairness: Assessing algorithms to detect and rectify biases in decision-making processes to
ensure fairness across diverse groups.

3. Diverse Representation: Actively seeking diverse perspectives and inclusivity in datasets and model
development to avoid reinforcing existing biases.
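
A data audit of the kind listed above can start with something as simple as comparing outcome rates across groups. The sketch below assumes a hypothetical list of records with a `group` field and a boolean `approved` outcome; the function name `selection_rates` and the data are made up, and the gap it reports is only one of several possible fairness measures.

```python
from collections import defaultdict

def selection_rates(records, group_key, outcome_key):
    """Return the share of positive outcomes for each group in `records`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[outcome_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

# Hypothetical decisions for two demographic groups.
records = [
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": True},
    {"group": "A", "approved": False},
    {"group": "B", "approved": True},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
    {"group": "B", "approved": False},
]

rates = selection_rates(records, "group", "approved")
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A large gap does not prove the data or the algorithm is unfair, but it marks where a closer review should begin.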
Data Privacy and Consent
Respecting data privacy laws and obtaining informed consent are foundational principles:

a. Informed Consent: Clearly communicating to individuals how their data will be used, ensuring they understand
and agree to its usage.

b. Anonymization: Stripping personally identifiable information whenever possible to protect individual identities.

c. Compliance: Adhering to legal frameworks such as GDPR, HIPAA, or CCPA to ensure lawful and ethical data
handling.
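
As a rough illustration of item (b), the sketch below drops direct identifiers from a record and replaces the user ID with a keyed hash. Strictly speaking this is pseudonymization rather than full anonymization, since anyone holding the key can recompute the tokens and the remaining fields may still identify a person in combination. The field names, the `pseudonymize` helper, and the key are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored outside the code.
SECRET_KEY = b"replace-with-a-real-secret"

# Fields treated as direct identifiers in this sketch (an assumption).
DIRECT_IDENTIFIERS = {"name", "email", "phone"}

def pseudonymize(record, id_field="user_id"):
    """Drop direct identifiers and replace the ID with a keyed-hash token."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(
        SECRET_KEY, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned[id_field] = digest[:16]  # shortened token, stable for the same ID
    return cleaned

record = {
    "user_id": 42,
    "name": "Example Person",
    "email": "person@example.com",
    "phone": "000-000-0000",
    "age": 31,
    "city": "Hyderabad",
}

safe = pseudonymize(record)
print(safe)
```

Because the same ID always maps to the same token, records belonging to one person can still be linked for analysis without exposing who that person is.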
