Data Science Unit-I Notes

Uploaded by

krishnareddyrpt


Definition: Data science is a multidisciplinary field that uses algorithms, processes, scientific methods, and systems to extract knowledge and insights from structured and unstructured data.

Characteristics of Data Science:

One of the key concepts in this field is model fitting.

This article delves deep into what model fitting in data science is, its importance, how it works, and its
application in various industries.

Defining model fitting in data science

Model fitting measures how well a statistical model describes a set of observations.

In data science, models are mathematical constructs that represent real-world processes.

These models are created using algorithms and are then ‘fitted’ to a set of data points.

The fitting process involves adjusting the model’s parameters to optimise the match between the model’s predictions and the actual data.

Model fitting in data science is a crucial step in data analysis.

It allows data scientists to make accurate predictions, identify patterns, and make informed decisions based on the data.

The quality of the model fit can significantly impact the validity and reliability of the conclusions drawn from the data.

The process of model fitting in data science

The process of model fitting in data science involves several steps.

These include data collection, model selection, parameter estimation, and evaluation.

Each step is crucial in ensuring that the model accurately represents the data and can make reliable predictions.

Data collection

The first step in the model fitting process is data collection.

This involves gathering relevant data that will be used to train the model.

The quantity and quality of the data collected can significantly influence the accuracy of the model fit.

Collecting diverse data is essential to ensure the model can generalise well to new, unseen data.

Model selection

Once the data has been collected, the next step is model selection.

This involves choosing the most appropriate statistical or machine-learning model for the data.

The choice of model depends on the nature of the data, the problem at hand, and the specific goals of the analysis.
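
One common way to make this choice concrete is to fit each candidate on part of the data and compare errors on a held-out part. The sketch below is only an illustration: the synthetic data, the two candidate models (a constant and a straight line), and the helper names `fit_mean`, `fit_line`, and `mse` are all assumptions, not something these notes prescribe.

```python
import random


def fit_mean(xs, ys):
    """Constant model: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Straight-line model fitted by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data with a linear trend plus noise (an assumption for this demo)
rng = random.Random(0)
xs = [rng.uniform(0, 5) for _ in range(60)]
ys = [1 + 2 * x + rng.gauss(0, 0.5) for x in xs]

# Fit on the first 40 points, validate on the remaining 20
train_x, val_x = xs[:40], xs[40:]
train_y, val_y = ys[:40], ys[40:]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: mse(fit(train_x, train_y), val_x, val_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "-> selected:", best)
```

Because the demo data follows a linear trend, the straight line should win here; with real data the candidate list and the error measure depend on the problem and the goals of the analysis.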

Parameter estimation

After selecting a model, the next step is parameter estimation.

This involves adjusting the model’s parameters to optimise the fit to the data.

This is often done using maximum likelihood estimation or least squares estimation.
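
As a minimal sketch of least squares estimation, the code below fits a straight line to a few made-up points using the closed-form solution. The function name `fit_line` and the data are hypothetical; the points lie exactly on y = 1 + 2x so the estimates are easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)  # prints: 1.0 2.0
```

With noisy data the same formula returns the line that minimises the sum of squared differences between predictions and observations.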

Model evaluation

The final step in the model fitting process is model evaluation.

This involves assessing the quality of the model fit and determining how well the model can predict new data.

This is typically done using techniques such as cross-validation or bootstrapping.
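
A minimal sketch of k-fold cross-validation follows, assuming a straight-line model and synthetic data (both are illustrative choices, as are the names `fit_line` and `k_fold_mse`). The data is split into k folds, the model is fitted on k-1 of them, and its error is measured on the fold that was left out.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average held-out mean squared error of a line fit over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        # Error is measured only on points the model did not see
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return sum(scores) / k


# Synthetic data: y = 1 + 2x plus Gaussian noise (standard deviation 0.5)
rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [1 + 2 * x + rng.gauss(0, 0.5) for x in xs]
cv_mse = k_fold_mse(xs, ys, k=5)
print(f"5-fold cross-validated MSE: {cv_mse:.3f}")
```

Every point is used for testing exactly once, so the averaged score is a less optimistic estimate of performance on new data than the error on the training set.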

Why does overfitting occur?

You only get accurate predictions if the machine learning model generalizes to all types of data
within its domain. Overfitting occurs when the model cannot generalize and fits too closely to
the training dataset instead. Overfitting happens for several reasons, such as:
• The training data size is too small and does not contain enough data samples to accurately
represent all possible input data values.
• The training data contains large amounts of irrelevant information, called noisy data.
• The model trains for too long on a single sample set of data.
• The model complexity is high, so it learns the noise within the training data.
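
The last point can be illustrated with a small experiment. The sketch below is an assumption-laden demo, not part of the original notes: it uses synthetic noisy data and compares a simple straight line with a 1-nearest-neighbour model, which memorises every training point, noise included. The helper names `fit_line`, `fit_nearest`, `mse`, and `sample` are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Simple model: straight line fitted by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def fit_nearest(xs, ys):
    """Highly flexible model: return the target of the closest training point."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given points."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sample(rng, n):
    """Draw n points from y = 1 + 2x plus Gaussian noise (std. dev. 1)."""
    xs = [rng.uniform(0, 5) for _ in range(n)]
    ys = [1 + 2 * x + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(1)
train_x, train_y = sample(rng, 30)   # small training set
test_x, test_y = sample(rng, 200)    # new, unseen data

for name, fit in [("line", fit_line), ("1-nearest-neighbour", fit_nearest)]:
    model = fit(train_x, train_y)
    print(name,
          "| train MSE:", round(mse(model, train_x, train_y), 3),
          "| test MSE:", round(mse(model, test_x, test_y), 3))
```

The memorising model reproduces the training targets exactly, so its training error is zero, yet its error on unseen data is expected to be clearly worse than the line's. That gap between training and test error is the usual symptom of overfitting.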

Overfitting examples
Consider a use case where a machine learning model has to analyze photos and identify the
ones that contain dogs. If the model was trained on a data set in which most photos
showed dogs outside in parks, it may learn to use grass as a feature for classification
and may fail to recognize a dog inside a room.
Another overfitting example is a machine learning algorithm that predicts a university student's
academic performance and graduation outcome by analyzing several factors like family income,
past academic performance, and academic qualifications of parents. However, the training data
only includes candidates from a specific gender or ethnic group. In this case, overfitting causes
the algorithm's prediction accuracy to drop for candidates whose gender or ethnicity is not
represented in the training dataset.
