Data Science Notes
Class X (10)
WHAT IS PROJECT CYCLE
The Project Cycle is a step-by-step process for solving problems using proven
scientific methods and drawing inferences about them.
Components of the project cycle are the steps that contribute to completing
the project. The components of the AI Project Cycle are:
PROBLEM SCOPING
The 4Ws of Problem Scoping are Who, What, Where, and Why. These Ws
help in identifying and understanding the problem in a better and more
efficient manner.
Who - The “Who” part helps us comprehend and categorize everyone who is
affected directly or indirectly by the problem. They are called the
Stakeholders.
Where - “Where” covers the situation, context, and location in which the
problem arises.
The Problem Statement Template helps us summarize all the key points in
one single template, so that in the future, whenever there is a need to look
back at the basis of the problem, we can take a look at the Problem Statement
Template and understand its key elements.
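The idea of collecting the 4Ws in one template can be sketched in a few lines of Python. This is only an illustration: the class name `ProblemStatement`, the one-line meanings given to What and Why (these notes do not define them), the layout of the summary, and the sample answers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ProblemStatement:
    """Holds the answers to the 4Ws of problem scoping."""
    who: str    # the stakeholders affected by the problem
    what: str   # assumed meaning: what the problem is
    where: str  # situation, context, and location of the problem
    why: str    # assumed meaning: why the problem is worth solving

    def summary(self) -> str:
        # One line per W, so the basis of the problem can be read at a glance.
        return (
            f"Who:   {self.who}\n"
            f"What:  {self.what}\n"
            f"Where: {self.where}\n"
            f"Why:   {self.why}"
        )


# Hypothetical answers, used only to show how the template is filled in.
statement = ProblemStatement(
    who="Students and teachers",
    what="Class notes are hard to find before exams",
    where="At school and at home during revision",
    why="Quick access to notes saves revision time",
)
print(statement.summary())
```

Keeping the four answers in one object means the same summary can be printed again whenever the team needs to look back at the basis of the problem.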
DATA ACQUISITION
The process of collecting accurate and reliable data to work with.
Data features - Refer to the type of data you want to collect.
2 Types of Data Sets - Training Set and Testing Set.
Cameras - A camera captures visual information, and that information, which
is called an image, is used as a source of data. Cameras are used to capture
raw visual data.
Observations - When we observe something carefully, we get some information.
For ex: Scientists observe creatures to study them. Observation is a
time-consuming data source.
API - Application Programming Interface. An API is a messenger which takes
requests, tells the system about them, and gives back the response.
Ex: Twitter API, Google Search API
Surveys - A survey is a method of gathering specific information from a
sample of people. Example: a census survey for analyzing the population.
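The "messenger" role of an API described above can be imitated with a toy example. This sketch does not call any real service: the function `api`, the dictionary `POPULATION_DATA`, the city names, and the numbers are all made up for illustration.

```python
# A toy "system" that holds the data. All values here are made up.
POPULATION_DATA = {"CityA": 120000, "CityB": 45000}


def api(request: dict) -> dict:
    """Act as the messenger: take a request, ask the system, give a response."""
    city = request.get("city")
    if city in POPULATION_DATA:
        return {"status": "ok", "city": city, "population": POPULATION_DATA[city]}
    # The system has no data for this city, so report that back.
    return {"status": "not_found", "city": city}


print(api({"city": "CityA"}))
print(api({"city": "CityZ"}))
```

The caller never touches `POPULATION_DATA` directly; it only sends a request and reads the response, which is the point of an API.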
DATA EXPLORATION
Data Exploration is the process of arranging the gathered data uniformly for a
better understanding. Data can be arranged by making a table, plotting a
chart, or building a database.
To analyse the data, you need to visualise it in some user-friendly format so
that you can:
Quickly get a sense of the trends, relationships and patterns
Define strategy for which model to use at a later stage
Communicate the same to others effectively
DATA VISUALISATION TOOLS
The tools used to visualise the acquired data are known as data visualisation
or exploration tools.
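As a rough stand-in for a real visualisation tool, the sketch below arranges raw answers into a frequency table and prints a simple text bar chart. The survey answers and the helper name `bar_chart` are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical survey answers to "How do you travel to school?"
answers = ["bus", "car", "bus", "walk", "bus", "car", "walk", "bus"]

# Arrange the gathered data uniformly: count how often each answer occurs.
counts = Counter(answers)


def bar_chart(counts: Counter) -> str:
    """Return one text bar per answer, most frequent answer first."""
    lines = []
    for label, count in counts.most_common():
        lines.append(f"{label:<5}| {'#' * count} ({count})")
    return "\n".join(lines)


chart = bar_chart(counts)
print(chart)
```

Even this plain chart makes the trend easy to spot and easy to show to others, which is the purpose of exploring data visually.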
AI MODELLING
There are 2 ways/approaches to AI Modelling:
→ Rule Based Approach
→ Learning Based Approach
Ex: You trained your model with 100 labeled images of apples and bananas. Now, if
you test it by showing it an apple, it will figure out and tell you whether it is an
apple or not. Here, labeled images of apples and bananas were fed to the model,
which is why it could detect the fruit.
*Labeled Images: Simply, images for which the model is told what the image is.
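The difference between the two approaches can be shown with a small sketch. Real images are replaced here by two made-up measurements per fruit (length and width in cm), and the "model" is a very simple nearest-average classifier. The measurements, the rule, and the function names are assumptions for illustration, not the method used by any particular tool.

```python
# Labeled examples: (length_cm, width_cm) with the correct label. Values are made up.
training_data = [
    ((7.0, 7.5), "apple"),
    ((8.0, 7.0), "apple"),
    ((6.5, 6.8), "apple"),
    ((18.0, 3.5), "banana"),
    ((20.0, 4.0), "banana"),
    ((17.0, 3.8), "banana"),
]


def rule_based(length, width):
    """Rule Based Approach: the developer writes the rule by hand."""
    return "banana" if length > 2 * width else "apple"


def train(examples):
    """Learning Based Approach: learn the average measurements of each label."""
    sums = {}
    for (length, width), label in examples:
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += length
        total[1] += width
        total[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


def predict(model, length, width):
    """Return the label whose learned average is closest to the new fruit."""
    def squared_distance(center):
        return (center[0] - length) ** 2 + (center[1] - width) ** 2
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training_data)
print(rule_based(7.2, 7.0), predict(model, 7.2, 7.0))
print(rule_based(19.0, 3.9), predict(model, 19.0, 3.9))
```

In `rule_based` the knowledge comes from the developer, while in `train` it comes from the labeled data, so the learned model can change simply by feeding it different examples.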
→ Classification
Here, data is categorized under different labels
according to some parameters given in the input,
and then the labels are predicted for the data.
→ Regression
Regression is a type of supervised learning
which is used to predict a continuous value.
→ Clustering
It is an algorithm which can cluster unknown
data according to the patterns or trends identified
in it.
The patterns observed may already be known to the
developer, or they may be new and unique.
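Regression and clustering, as described above, can both be tried out in plain Python. The sketch below fits a straight line to predict a continuous value and then splits unlabeled numbers into two groups. The data values, the function names `fit_line` and `two_means`, and the choice of a least-squares line and a two-group k-means are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def two_means(values, rounds=10):
    """Clustering: split unlabeled numbers into two groups (k-means, k = 2).

    Assumes the values contain at least two different numbers.
    """
    centers = [min(values), max(values)]  # simple starting guess
    for _ in range(rounds):
        groups = [[], []]
        for v in values:
            # Put each value in the group whose center is nearest.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the average of its group.
        centers = [sum(g) / len(g) for g in groups]
    return groups, centers


# Made-up data: hours studied and marks scored.
hours = [1, 2, 3, 4, 5]
marks = [52, 54, 56, 58, 60]
slope, intercept = fit_line(hours, marks)
predicted_marks = slope * 6 + intercept  # continuous prediction for 6 hours
print(slope, intercept, predicted_marks)

# Made-up data: heights in cm with no labels attached.
heights = [150, 152, 149, 180, 178, 182]
groups, centers = two_means(heights)
print(groups)
```

Note that `fit_line` needs the correct answers (`marks`) to learn from, while `two_means` is given only the raw numbers and finds the grouping on its own.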
Reinforcement Learning
Training Set vs Testing Set
Training Set
Use - Used for training the model.
Size - Is a lot bigger than the testing data and constitutes about 70% to 80%.
Testing Set
Use - Used for testing the model after it is trained.
Size - Is smaller than the training set and constitutes about 20% to 30%.
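A split in these proportions can be sketched as follows. The function name `split_dataset`, the 80% fraction (one value inside the 70% to 80% range), the fixed seed, and the use of the numbers 0 to 99 as stand-in records are assumptions for this example.

```python
import random


def split_dataset(records, train_fraction=0.8, seed=0):
    """Shuffle the records and cut them into a training set and a testing set."""
    shuffled = list(records)               # copy, so the original order is kept
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(100))  # 100 stand-in records
training_set, testing_set = split_dataset(records)
print(len(training_set), len(testing_set))
```

Shuffling before cutting keeps both sets representative, and no record appears in both sets, so the model is tested on data it has not seen during training.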
EVALUATION