Project Cycle - Key Points
AI Project Cycle
The AI Project Cycle is a framework used to design an AI project. It provides an organized plan for breaking down the task of program development into manageable modules. The project cycle consists of five stages: Problem Scoping, Data Acquisition, Data Exploration, Modelling and Evaluation.
Problem Scoping-
Problem scoping refers to understanding a problem and finding out the various factors that affect it. In this stage of the AI project cycle, the 4Ws Problem Canvas is used, which helps the user answer questions related to the problem and thereby arrive at a definite problem statement. The 4Ws are Who, What, Where and Why. The answers to these questions lead to a problem statement.
Reference: https://sdgs.un.org/goals
Data Acquisition- Data acquisition refers to acquiring authentic data, crucial for the AI model, from reliable sources. The acquired data can then be divided into two categories: training data and testing data. The AI model is trained on the training data and evaluated on the testing data (a short sketch of this split follows the list below). There are various ways in which students can collect data. Some of them are:
• Surveys
• Cameras
• Sensors
• Observations
• Web Scraping (e.g., data.gov.in, kaggle.com)
• Application Programming Interface (API)
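A minimal sketch of the training/testing split described above, assuming scikit-learn is available; the pass/fail dataset below is made up for illustration:

```python
# Split acquired data into training data and testing data.
# The features (hours studied) and labels (pass/fail) are hypothetical.
from sklearn.model_selection import train_test_split

features = [[1], [2], [3], [4], [5], [6], [7], [8]]
labels = [0, 0, 0, 1, 1, 1, 1, 1]

# Hold back 25% of the data for testing; the rest trains the model.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.25, random_state=42
)
print(len(X_train), "training samples,", len(X_test), "testing samples")
```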
Loopy is an open-source tool for understanding the concept of system maps. A system map shows the components and boundaries of a system, and the components of its environment, at a specific point in time. With the help of system maps, one can easily define the relationships among the different elements of a system. The map shows the cause-and-effect relationships between elements with the help of arrows. The arrowhead depicts the direction of the effect, and a sign (+ or -) shows the nature of the relationship: a + sign indicates a positive relationship and a - sign indicates a negative relationship between the elements. Considering the data features of the problem to be solved, a system map can be drawn.
Reference: https://ncase.me/loopy/
Data Exploration- This is the process of refining the gathered data, which needs to be arranged uniformly for better understanding.
After acquiring the data comes the need to analyse it. For this, the data must be visualized in some user-friendly format so that one can:
● Quickly get a sense of the trends, relationships and patterns contained within the data.
● Define a strategy for which model to use at a later stage.
● Communicate the same to others effectively.
Data exploration basically refers to visualizing the data to determine the patterns, relationships between elements and trends in the dataset, giving a clear meaning and understanding of the dataset. Data exploration is important as it helps the user select an AI model in the next stage of the AI project cycle. To visualize the data, various types of visual representations can be used, such as diagrams, charts, graphs, flows and so on.
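A minimal sketch of such visualization, assuming matplotlib is available; the monthly sales figures are made up:

```python
# Plot made-up monthly sales as a bar chart to spot the overall trend.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
sales = [120, 135, 150, 145, 170, 190]

plt.bar(months, sales)
plt.xlabel("Month")
plt.ylabel("Units sold")
plt.title("Monthly sales (hypothetical data)")
plt.show()
```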
Rule-Based Approach: Refers to AI modelling where the rules are defined by the developer. The machine follows the rules or instructions given by the developer and performs its task accordingly.
Decision Tree is a rule-based AI model for solving classification or regression problems; it helps the machine predict an element with the help of the various rules fed to it. A decision tree looks like an inverted tree where the root is at the top and the tree further divides into branches, nodes and leaves. The root is the starting point of a decision tree. Depending on the rules, the tree splits further into various branches, each of which leads to an end point known as a leaf. Each leaf of the tree is labelled with a class.
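A minimal sketch of such a rule-based decision tree, using a hypothetical grading rule set (the mark boundaries are assumptions, not an official scheme): each if/elif branch is a rule defined by the developer, and each returned grade is the class label at a leaf.

```python
# Hypothetical rule-based decision tree for grading marks.
def grade(marks: int) -> str:
    if marks >= 90:        # split at the root
        return "A"         # leaf labelled with class "A"
    elif marks >= 75:      # next branch
        return "B"
    elif marks >= 50:
        return "C"
    else:
        return "D"

print(grade(82))  # -> "B"
```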
Learning-Based Approach: Refers to AI modelling where the machine learns by itself. Under the learning-based approach, the AI model gets trained on the data fed to it and is then able to adapt to changes in the data. That is, if the model is trained on X type of data and the machine designs its algorithm around it, the model modifies itself according to the changes that occur in the data, so that exceptions are handled.
1) Supervised Learning: In a supervised learning model, the dataset fed to the machine is labelled. In other words, the dataset is known to the person training the machine; only then is he/she able to label the data. A label is some information which can be used as a tag for the data. For example, students get grades according to the marks they secure in examinations; these grades are labels which categorize the students according to their marks.
There are two types of Supervised Learning models:
a) Classification: Where the data is classified according to the labels. For example, in the grading system, students are classified on the basis of the grades they obtain with respect to their marks in the examination. This model works on a discrete dataset, which means the data need not be continuous.
b) Regression: Such models work on continuous data. For example, if you wish to predict your next salary, then you
would put in the data of your previous salary, any increments, etc., and would train the model. Here, the data which
has been fed to the machine is continuous.
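A minimal sketch of such a regression model, assuming scikit-learn is available; the salary history below is made up:

```python
# Fit a line to hypothetical past salaries and predict the next one.
from sklearn.linear_model import LinearRegression

years = [[1], [2], [3], [4], [5]]               # years of experience
salaries = [30000, 33000, 36500, 40000, 43500]  # hypothetical figures

model = LinearRegression().fit(years, salaries)
print(model.predict([[6]]))  # continuous output, roughly 47000
```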
2) Unsupervised Learning:
An unsupervised learning model works on an unlabelled dataset. This means that the data fed to the machine is random, and there is a possibility that the person training the model has no information regarding it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. They help the user understand what the data is about and what major features the machine identifies in it.
For example, suppose you have random data of 1,000 dog images and you wish to find some pattern in it. You would feed this data into an unsupervised learning model and train the machine on it. After training, the machine would come up with the patterns it was able to identify. The machine might come up with patterns already known to the user, such as colour, or it might come up with something unusual, such as the size of the dogs.
Unsupervised learning models can be further divided into two categories:
a) Clustering: Refers to the unsupervised learning algorithm which can cluster unknown data according to the patterns or trends identified in it. The patterns observed might be ones already known to the developer, or the algorithm might come up with some unique patterns of its own.
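A minimal sketch of clustering, assuming scikit-learn's KMeans; the 2-D points below are made up so that two groups are easy to see:

```python
# Group unlabelled points into clusters without telling the model anything.
from sklearn.cluster import KMeans

points = [[1, 1], [1.5, 2], [1, 0.5],   # one loose group
          [8, 8], [8.5, 9], [9, 8]]     # another loose group

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)  # e.g. [0 0 0 1 1 1]: two clusters found
```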
b) Dimensionality Reduction: We humans are able to visualise up to three dimensions only, but according to many theories and algorithms, various entities exist beyond three dimensions. For example, in Natural Language Processing, words are considered to be N-dimensional entities, which means we cannot visualise them as they exist beyond our visualisation ability. Hence, to make sense of them, we need to reduce their dimensions; this is where a dimensionality reduction algorithm is used.
As we reduce the dimensions of an entity, the information it contains starts getting distorted. For example, a ball in our hand is three-dimensional. But if we click its picture, the data transforms to 2-D, as an image is a two-dimensional entity. As soon as we drop one dimension, at least 50% of the information is lost, since we no longer know about the back of the ball: was it the same colour at the back, or was it just a hemisphere? If we reduce the dimensions further, more and more information is lost. Hence, to reduce the dimensions and still be able to make sense of the data, we use dimensionality reduction.
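A minimal sketch of this idea using PCA (Principal Component Analysis) from scikit-learn; the 3-D points are made up, and PCA is one common technique (the text above does not name a specific algorithm). PCA projects the data to fewer dimensions while keeping as much of the original variation (information) as possible.

```python
# Reduce made-up 3-D points to 2-D with PCA.
from sklearn.decomposition import PCA

points_3d = [[2.5, 2.4, 0.5],
             [0.5, 0.7, 2.2],
             [2.2, 2.9, 0.9],
             [1.9, 2.2, 1.1],
             [3.1, 3.0, 0.4]]

pca = PCA(n_components=2)
points_2d = pca.fit_transform(points_3d)  # 3 dimensions reduced to 2
print(points_2d.shape)                    # (5, 2)
print(pca.explained_variance_ratio_)      # share of variation kept per axis
```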
3) Reinforcement Learning: This is a reward-based AI model. It is a self-teaching system that essentially learns by trial and error, performing actions with the aim of maximizing positive rewards.
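A minimal sketch of this trial-and-error idea on a made-up two-choice problem (the win probabilities are assumptions): the agent tries both actions, tracks the rewards earned, and gradually prefers the action that pays more.

```python
# Reward-based learning by trial and error on a hypothetical two-armed bandit.
import random

true_reward = {"A": 0.3, "B": 0.7}   # hidden win probability of each action
totals = {"A": 0.0, "B": 0.0}        # total reward earned per action
counts = {"A": 0, "B": 0}            # times each action was tried

for step in range(1000):
    if step < 10 or random.random() < 0.1:   # explore occasionally
        action = random.choice(["A", "B"])
    else:                                    # otherwise exploit best estimate
        action = max(totals, key=lambda a: totals[a] / max(counts[a], 1))
    reward = 1 if random.random() < true_reward[action] else 0
    totals[action] += reward
    counts[action] += 1

print(counts)  # action "B" ends up chosen far more often
```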
References:
https://research.google.com/semantris/
https://teachablemachine.withgoogle.com/
https://thing-translator.appspot.com/
https://ed.ted.com/lessons/exploring-other-dimensions-alex-rosenthal-and-george-zaidan
Evaluation- Evaluation is the stage in the AI project cycle where the performance of the model is measured using certain metrics such as accuracy, precision, recall and F1 score. This gives the user a clear idea of how the actual results compare with the expectations.
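A minimal sketch of these metrics, computed from a hypothetical confusion matrix (the TP/FP/FN/TN counts are made up):

```python
# Compute accuracy, precision, recall and F1 score from assumed counts.
TP, FP, FN, TN = 40, 10, 5, 45   # hypothetical confusion-matrix entries

accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1_score = 2 * precision * recall / (precision + recall)

print(f"Accuracy:  {accuracy:.2f}")   # 0.85
print(f"Precision: {precision:.2f}")  # 0.80
print(f"Recall:    {recall:.2f}")     # 0.89
print(f"F1 score:  {f1_score:.2f}")   # 0.84
```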
References: https://youtu.be/4IOfSQc6I7c
https://youtu.be/49PbBi9aG-Q