9 AI Project Cycle Notes
The project cycle is a step-by-step process for solving a problem using proven scientific methods and
drawing inferences from the results.
Example: creating a birthday card
Consider the budget.
Make a list of what is needed and gather the data.
Create the model based on the collected data.
Show it to someone else to evaluate it.
DATA EXPLORATION: - interpreting the acquired data to find useful information (patterns & trends) in it.
Explore data
Arrange it uniformly
Validate/verify data
Is the data gathered according to the requirements (the specifications decided)?
Is the gathered data free from errors?
Does it meet our needs?
DATA CLEANING: - getting rid of commonly found errors & mistakes in data set such as:
Outliers: data points existing out of range/ distant from other observations
Missing data
Erroneous data: incorrect data points, which should be corrected or rejected
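The three kinds of problems listed above can be illustrated with a minimal Python sketch. The function name clean, the sample marks and the accepted range 0 to 100 are assumptions made for this example only; real projects choose their own rules for what counts as an outlier.

```python
def clean(values, low, high):
    """Keep only numeric values that lie inside the accepted range [low, high]."""
    cleaned = []
    for v in values:
        if v is None:                        # missing data
            continue
        if not isinstance(v, (int, float)):  # erroneous data (wrong type)
            continue
        if v < low or v > high:              # outlier / out-of-range value
            continue
        cleaned.append(v)
    return cleaned


# Hypothetical marks out of 100, containing one problem of each kind.
marks = [45, 67, None, 82, "abc", 910, 58, -3]
cleaned_marks = clean(marks, low=0, high=100)
print(cleaned_marks)  # [45, 67, 82, 58]
```

Here the bad entries are simply dropped; depending on the project, missing or wrong values might instead be corrected or re-collected.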
DATA VISUALISATION: - graphical representation of data & information
NEED FOR DATA VISUALISATION: -
Graphical representation makes data easier to understand and communicate. It gives a sense of
trends, patterns and relationships, and can also reveal hidden patterns.
Helps us to make sense out of huge data.
Helps in strategy building
DATA VISUALISATION TECHNIQUES: -
LINE GRAPH: - a graph that uses lines to connect individual data points. It is most frequently
used for quantitative values over continuous intervals (e.g.: - daily profit earned by a
shopkeeper)
AREA GRAPH: - displays the development of quantitative values over intervals. It is an extension of
the line chart, with the area under the line filled in. (e.g.: - comparing trends over time, such as
the most preferred shows to watch on Netflix)
BAR CHART: - horizontal or vertical bars that display comparisons among discrete (non-continuous)
categories or periods (e.g.: - rise in population in 5 yrs)
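HISTOGRAM: - bars that show how many values of a continuous variable fall into each interval. Unlike
a bar chart, the intervals are continuous, so the bars touch each other.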
PIE CHART: - a circular representation of data that shows proportions and percentages
between categories by dividing the circle (100%) into segments (each showing the relative size of
its category). This gives a quick idea of the distribution of data. (e.g.: - share of students in a
class falling in each marks range)
BUBBLE CHART (BUBBLE MAP): - combination of bubbles and a map. It visualizes location
and proportions using circles over geographical regions, with the area of each circle proportional
to its value in the data set (e.g.: - no. of tigers in national parks in India)
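Before a chart is drawn, the raw data has to be turned into the numbers the chart displays. The sketch below, in plain Python with no plotting library, shows this for two of the techniques above: the percentage of each category for a pie chart and the count in each interval for a histogram. The function names and the sample data (show genres, heights in cm) are hypothetical.

```python
from collections import Counter


def pie_percentages(categories):
    """Return the percentage share of each category (the pie-chart segments)."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


def histogram_counts(values, bin_width):
    """Count how many values fall into each interval of width bin_width."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


# Hypothetical survey: favourite genre of 8 viewers.
shows = ["Drama", "Comedy", "Drama", "Sci-fi", "Drama", "Comedy", "Drama", "Sci-fi"]
shares = pie_percentages(shows)
print(shares)  # {'Drama': 50.0, 'Comedy': 25.0, 'Sci-fi': 25.0}

# Hypothetical heights in cm, grouped into 10 cm intervals.
heights = [141, 148, 152, 155, 159, 163, 167, 171]
bins = histogram_counts(heights, 10)
print(bins)  # {140: 2, 150: 3, 160: 2, 170: 1}
```

Each key in bins is the lower edge of an interval, so 140 stands for heights from 140 up to (but not including) 150.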
DATA MODELLING: -
AI MODELS
    RULE-BASED
    LEARNING-BASED
        MACHINE LEARNING
        DEEP LEARNING
RULE BASED APPROACH: - based on rules and facts defined by the developer, which are fed to the machine so
that it performs the task and generates output accordingly. It is therefore a static model, i.e. the machine,
once trained, does not take into account any changes made to the original training dataset. (e.g.: - weather
forecast AI model)
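A minimal sketch of the rule-based idea, using the weather example: every rule is written by the developer, and the program never changes them on its own. The function name forecast and the threshold values (80% humidity, 70% cloud cover) are made up for illustration and are not real forecasting rules.

```python
def forecast(humidity_percent, cloud_cover_percent):
    """Return a forecast using fixed rules written by the developer."""
    # Rule 1: very humid and very cloudy -> rain is likely.
    if humidity_percent >= 80 and cloud_cover_percent >= 70:
        return "rain likely"
    # Rule 2: cloudy but not very humid.
    if cloud_cover_percent >= 70:
        return "cloudy"
    # Default rule: none of the above conditions matched.
    return "clear"


print(forecast(90, 80))  # rain likely
print(forecast(50, 80))  # cloudy
print(forecast(50, 10))  # clear
```

However much new weather data arrives, the output for a given input stays the same until a developer edits the rules, which is what makes the model static.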
LEARNING BASED APPROACH: - a dynamic approach to AI modelling in which the machine learns by itself.
Relationships or patterns are not defined by the developer. Random data is fed into the machine and the
machine identifies patterns or trends in the data on its own. The machine tries to extract similar features &
then clusters data points with similar features into the same groups.
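The clustering idea can be sketched with a very small grouping routine. This is a simplified one-dimensional version of k-means with two groups, chosen here only as an illustration; the function name two_means and the sample numbers are assumptions, and the sketch assumes the data contains at least two different values.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two groups by repeatedly updating two centres."""
    low, high = min(values), max(values)  # start with the extremes as centres
    for _ in range(iterations):
        # Assign every value to the nearer centre.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each centre to the average of its group.
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# No rule such as "below 10 is small" is written anywhere;
# the split is found from the data itself.
data = [2, 3, 4, 20, 22, 25]
small, large = two_means(data)
print(small, large)  # [2, 3, 4] [20, 22, 25]
```

If different data were fed in, the groups would change without any edit to the code, in contrast to a rule-based model.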
ARTIFICIAL INTELLIGENCE | MACHINE LEARNING | DEEP LEARNING
Mimics human intelligence | Can learn through data and solve complex problems | Helps in discovering patterns & trends
Subset of data science | Subset of AI | Subset of machine learning
It is the simulation of intelligence in machines. | It is the training of machines to take decisions with experience. | It is the process of using artificial neural networks for solving complex problems.
DECISION TREE - RULE BASED APPROACH: - decision trees are tools that follow the rule-based approach. A
decision tree is a kind of flowchart where the flow starts at the root node & ends with a decision made at
the leaves. It is used to depict conditions and their outcomes.
Root node: - the first node; it represents the entire set of data and splits it on a condition with two
outcomes (yes/no).
Branching: - dividing a node at one level into 2 or more sub-nodes at the next level. The forks or
diversions are known as branches of the tree.
Decision node: - a node that is divided further into sub-nodes at another level.
Leaf node: - a node that does not split further.
Parent node: - a node that is one level above a sub-node.
Child node: - a sub-node that falls under another node.
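The terms above can be seen in a small sketch that stores a decision tree as nested Python dictionaries and walks it from the root to a leaf. The questions, the decisions and the function name decide are hypothetical examples, not part of any library.

```python
# Each node is a dictionary. A node with a "question" splits further;
# a node with a "decision" is a leaf.
tree = {
    "question": "Is it raining?",               # root node
    "yes": {"decision": "Stay indoors"},        # leaf node
    "no": {
        "question": "Is homework finished?",    # decision node (child of root)
        "yes": {"decision": "Go out and play"},  # leaf node
        "no": {"decision": "Finish homework"},   # leaf node
    },
}


def decide(node, answers):
    """Follow the branches chosen by the answers until a leaf is reached."""
    while "decision" not in node:
        answer = answers[node["question"]]  # "yes" or "no"
        node = node[answer]                 # move down one branch
    return node["decision"]


print(decide(tree, {"Is it raining?": "yes"}))
# Stay indoors
print(decide(tree, {"Is it raining?": "no", "Is homework finished?": "yes"}))
# Go out and play
```

The root is the parent of the "Is homework finished?" node, which in turn is the parent of two leaves.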
The following are some important points to consider while designing a decision tree.
A single dataset may allow multiple decision trees that all lead to correct predictions. The simplest
one should be chosen.
A dataset might contain redundant data at times, so only those parameters that affect the output
directly should be included.
Try selecting any one output and, on its basis, find the common links that all the similar outputs
share.
AI PROJECT EVALUATION: - Evaluation is the process of understanding the reliability of an AI model by
feeding the test dataset into the model and comparing its outputs with the actual answers. Evaluating an AI
model for its efficiency and accuracy enables continuous improvement of the model to achieve the
project goals. The model must be tested with varied data to ensure that the results are satisfactory. Once
the model has been evaluated and found satisfactory, it can be deployed.
The efficiency of the model is calculated on the basis of the parameters mentioned below:
Accuracy: - percentage of correct predictions out of all the observations.
Precision: - percentage of true positive cases out of all the cases where the prediction is positive.
Recall: - fraction of actual positive cases that are correctly identified.
F1 Score: - a number between 0 and 1; it is the harmonic mean of precision and recall.
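The four parameters above can be computed by hand from a list of actual answers and a list of predictions. The sketch below assumes labels of 1 (positive) and 0 (negative); the function name evaluate and the sample lists are made up for illustration. The results are returned as fractions between 0 and 1; multiply by 100 for a percentage.

```python
def evaluate(actual, predicted):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test set of 10 observations.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = evaluate(actual, predicted)
print(scores)
# {'accuracy': 0.8, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

In this example there are 3 true positives, 5 true negatives, 1 false positive and 1 false negative, so 8 of the 10 predictions are correct.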
AI PROJECT DEPLOYMENT: - Deployment is the process of integrating a newly created AI model into an
existing production environment so that the model can be used in practice, with actual data taken as
input to give the desired output. It requires certain hardware and software settings to be configured
so that the AI model can be put to use efficiently by the end users.