AI Project Cycle Notes and Questions - 250605 - 112457
What is an AI Project Cycle?
1. Problem Scoping
Data Features
o Refer to the type of data you want to collect.
o E.g.: Salary amount, increment percentage, increment period, bonus etc.
Big Data
o It includes data whose size exceeds the capacity of traditional software to process within an acceptable time.
o The main focus is on unstructured data.
Big Data is characterized by the 3Vs:
o Volume – amount of data produced
o Variety – types of data produced
o Velocity – speed at which data is produced
Data Exploration
o Web Scraping
o API
o Outliers
2. By using an Imputer to find the best possible substitute to replace missing values.
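As a rough sketch of what an imputer does (pure Python, with made-up salary figures; library imputers such as scikit-learn's SimpleImputer work on the same principle), missing values can be replaced with the mean of the known values:

```python
# Minimal mean-imputation sketch: replace None entries with the
# mean of the values that are present.
def impute_mean(values):
    """Return a copy of `values` with each None replaced by the mean."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

salaries = [30000, None, 50000, 40000]  # hypothetical salary data
print(impute_mean(salaries))  # the None becomes the mean, 40000.0
```

The mean is only one possible substitute; an imputer may also use the median or the most frequent value, depending on the data.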
3. Erroneous Data:
Erroneous data is test data that falls outside of what is acceptable and should be rejected by the system. For example, in the class register below, the class entry 57 falls outside the acceptable values (XA, XB):

Student Name          Class
RIYA GEORGE           XA
JOSHUA SAM            XA
APARNA BINU           XA
SIDHARDH V R          XA
NITHILA M             57
ATHULYA M S           XA
ANUJA MS              XB
KEERTHI KRISHNANATH   XB
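A minimal sketch of rejecting erroneous records, using the class register above (the set of valid classes is our assumption for the example):

```python
# Records whose class label is outside the accepted set are rejected,
# mirroring the "should be rejected by the system" rule above.
VALID_CLASSES = {"XA", "XB"}  # assumed valid labels for this example

def split_records(records):
    """Separate acceptable (name, class) records from erroneous ones."""
    good = [r for r in records if r[1] in VALID_CLASSES]
    bad = [r for r in records if r[1] not in VALID_CLASSES]
    return good, bad

records = [("RIYA GEORGE", "XA"), ("NITHILA M", "57"), ("ANUJA MS", "XB")]
good, bad = split_records(records)
print(bad)  # [('NITHILA M', '57')] -- the out-of-range class is rejected
```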
1) Data Visualization
Area Graphs
Area Graphs are Line Graphs with the area below the line filled in with a certain colour or texture. Like Line Graphs, Area Graphs are used to display the development of quantitative values over an interval or time period. They are most commonly used to show trends rather than to convey specific values.
Bar Charts
The classic Bar Chart uses either horizontal or
vertical bars (column chart) to show discrete,
numerical comparison across categories. Bars
Charts are distinguished from Histograms, as
they do not display continuous
developments over an interval. Bar Chart's
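As a toy illustration of the idea (a text-based sketch, not a real plotting library; with a plotting library you would typically call a bar-chart function on the same data), each category gets a bar whose length encodes its value:

```python
# Text-based bar chart: one horizontal bar of '#' per category,
# bar length proportional to the category's value.
def bar_chart(data, scale=1):
    lines = []
    for label, value in data.items():
        lines.append(f"{label:<10} {'#' * (value // scale)} {value}")
    return "\n".join(lines)

counts = {"XA": 6, "XB": 2}  # hypothetical counts per class
print(bar_chart(counts))
```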
Histogram
Scatterplots
Scatterplots use dots to show whether a relationship or correlation between two variables exists.
Flow Charts
This type of diagram is used to show the sequential steps of a process. Flow Charts map out a process using a series of connected symbols, which makes the process easy to understand and aids in communicating it to other people. Flow Charts are useful for explaining how a complex and/or abstract procedure, system, concept or algorithm works. Drawing a Flow Chart can also help in planning and developing a process, or in improving an existing one.
7. Pie Charts
Rule Based Approach
In this approach, the rules are defined by the developer. The machine follows the rules or instructions mentioned by the developer and performs its task accordingly. So it is a static model, i.e. the machine, once trained, does not take into consideration any changes made in the original training dataset.

Data + Rules → Answer
Machine learning is thus introduced as an extension to this: the machine adapts to changes in data and rules and follows the updated path, while a rule-based model only does what it was taught once.
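A minimal rule-based model can be sketched as plain if/else logic; the rule and its thresholds below are hypothetical, written by the "developer" and never updated by data:

```python
# Rule-based approach: Data + Rules -> Answer.
# The rule is fixed by the developer; the model is static and does not
# change no matter what new data arrives.
def approve_increment(salary, years_of_service):
    """Developer-written rule with hypothetical thresholds."""
    if years_of_service >= 2 and salary < 50000:
        return "increment"
    return "no increment"

print(approve_increment(40000, 3))  # increment
print(approve_increment(60000, 3))  # no increment
```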
Learning Based Approach
It is a type of AI modelling where the machine learns by itself. Under the Learning Based approach, the AI model gets trained on the data fed to it and is then able to design a model which is adaptive to changes in data. That is, if the model is trained with X type of data and the machine designs the algorithm around it, the model would modify itself according to the changes which occur in the data, so that all the new cases are handled as well.

Data + Answers → Rules
After training, the machine is fed with testing data. The testing data might not have similar images to the ones on which the model was trained. So the model adapts to the features on which it has been trained and accordingly predicts the output.
In this way, the machine learns by itself by adapting to the new data which is flowing in. This is the machine learning approach, which introduces dynamicity into the model.
Generally, learning based models can be classified as follows:
Machine Learning Models
o Supervised Learning – Regression, Classification
o Unsupervised Learning – Clustering, Dimensionality Reduction
o Reinforcement Learning
Clustering
It refers to an unsupervised learning algorithm which can cluster unknown data according to the patterns or trends identified in it. The patterns observed might be ones already known to the developer, or the algorithm might even come up with some unique patterns of its own.
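A tiny unsupervised sketch (not a full k-means; the heights are made up and carry no labels): the values are split into two clusters at the largest gap between sorted neighbours:

```python
# Simple 1-D clustering: sort the unlabeled values, find the largest
# gap between adjacent values, and split the data there into two groups.
def two_clusters(values):
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    split = gaps.index(max(gaps)) + 1  # index just after the widest gap
    return ordered[:split], ordered[split:]

heights = [150, 152, 151, 180, 178, 182]  # hypothetical, unlabeled data
print(two_clusters(heights))  # ([150, 151, 152], [178, 180, 182])
```

No labels were given; the groups emerge purely from the structure of the data, which is the defining trait of clustering.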
Dimensionality Reduction
We humans are able to visualize only up to 3 dimensions, but according to many theories and algorithms, various entities exist beyond 3 dimensions. For example, in Natural Language Processing, words are considered to be N-dimensional entities, which means we cannot visualize them as they exist beyond our visualization ability. Hence, to make sense of them, we need to reduce their dimensions. This is where a dimensionality reduction algorithm is used.
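The usual algorithm here is PCA; as a much simpler sketch of the same idea (the data below is invented), we can drop the dimensions whose values barely vary, keeping only the informative ones:

```python
# Crude dimensionality reduction: rank columns by variance and keep
# only the `keep` most informative (highest-variance) ones.
def variance(col):
    mean = sum(col) / len(col)
    return sum((x - mean) ** 2 for x in col) / len(col)

def reduce_dims(rows, keep):
    """Keep the `keep` highest-variance columns of a row-major dataset."""
    cols = list(zip(*rows))
    ranked = sorted(range(len(cols)), key=lambda i: variance(cols[i]),
                    reverse=True)
    chosen = sorted(ranked[:keep])  # preserve original column order
    return [[row[i] for i in chosen] for row in rows]

data = [[1.0, 5.0, 0.1], [2.0, 5.0, 9.9], [3.0, 5.0, 0.2]]
print(reduce_dims(data, 2))  # the constant middle column is dropped
```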
Reinforcement Learning
5. Evaluation
Evaluation is the process of understanding the reliability of an AI model, based on its outputs: the test dataset is fed into the model and its predictions are compared with the actual answers. That is, once a model has been made and trained, it needs to go through proper testing so that one can calculate the efficiency and performance of the model. Hence, the model is tested with the help of the Testing Data (which was separated out of the acquired dataset at the Data Acquisition stage).
The efficiency of the model is calculated on the basis of the parameters mentioned below:
1. Accuracy
Accuracy is defined as the percentage of correct predictions out of all the observations.
2. Precision
Precision is defined as the percentage of true positive cases out of all the cases where the model predicted a positive.
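The two definitions can be written directly as formulas over the prediction counts (TP = true positives, TN = true negatives, FP = false positives, FN = false negatives; the counts used below are invented):

```python
# Accuracy: correct predictions (TP + TN) out of all observations.
def accuracy(tp, tn, fp, fn):
    return 100 * (tp + tn) / (tp + tn + fp + fn)

# Precision: true positives out of all cases predicted positive (TP + FP).
def precision(tp, fp):
    return 100 * tp / (tp + fp)

print(accuracy(40, 40, 10, 10))  # 80.0 (percent)
print(precision(40, 10))         # 80.0 (percent)
```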
(b) Hidden layer
(c) Input layer
(d) Data layer
5. How can you identify problem scoping in a project?
a. Understand why the project was started
b. Define the project’s primary objectives
c. Outline the project’s work statement.
d. All of the above
6. Identify the algorithm based on the given graph
12. ___________ refers to why we need to address the problem and what the advantages will be for the stakeholders once the problem is solved.
a. Who
b. What
c. Where
d. Why
24. What is a System Map?
a. Helps to show relationships between multiple elements
b. Only one element will be responsible
c. Indicates the relationships using + or –
d. Both a) and c)
25.Data analysts utilize data visualization and statistical tools to convey dataset
characterizations, such as ___________.
a. size
b. amount
c. accuracy
d. All of the above
26.Data exploration is a technique used to visualize data in the form of statistical methods or
using graphs.
a. Statistical methods
b. Graphical methods
c. Both a) and b)
d. None of the above
27.Data Exploration helps you gain a better understanding of a _________.
a. Dataset
b. Database
c. accuracy
d. None of the above
28._____________helps to represent graphical data that use symbols to convey a story and
help people understand large volumes of information.
a. Dataset
b. Data visualization
c. Data Exploration
d. None of the above
29. A machine that works and reacts like a human is known as ____________.
a. Artificial Intelligence
b. Machine Learning
c. Deep Learning
d. None of the above
30. Machines have the ability to learn from experience or data.
a. Artificial Intelligence
b. Machine Learning
c. Deep Learning
d. None of the above
31._________ is a program that has been trained to recognize patterns using a set of data.
a. AI model
b. Dataset
c. Visualization
d. None of the above
32. Types of AI models are _____________.
a. Lesson Based and Rood Based
b. Learning Based and Rule Based
c. Machine Learning and Visualization
d. None of the above
33. ___________ refers to AI modelling in which the developer hasn't specified the relationship or patterns in the data.
a. Learning Based
b. Rule Based
c. Decision Tree
d. None of the above
34. After a model has been created and trained, it must be thoroughly tested in order to
determine its efficiency and performance; this is known as ___________.
a. Evaluation
b. Learning
c. Decision
d. None of the above
35.Which of the following is the first and the crucial stage of AI Project development which
focuses on identifying and understanding problems?
(i) Problem Scoping (ii) Data Acquisition (iii) Data Exploration (iv) Modelling
36.…………………… refer to the type of data to be collected.
(i) Data security (ii) Data policy (iii) Data quality (iv) Data features
37.Which of the following uses dots to represent the relationship between two different
numeric variables represented on the x and y axis?
(i) Histogram (ii) Scatter plot (iii) Bullet Graphs (iv) Tree Diagram
38.Statement A: Neural networks are made up of layers of neurons.
Statement B: Human brain consists of millions of neurons.
(i) Only Statement A is correct (ii) Only Statement B is correct
(iii) Both the statements are correct (iv) None of the statements is correct
39. The process of developing AI machines has different stages that are collectively known as AI …………………… .
(i) Project status (ii) Project cycle (iii) Both (i) and (ii) (iv) None of these
1) What is an AI Project Cycle?
Ans) Project Cycle is a step-by-step process to solve problems using proven scientific methods and drawing inferences about them. The AI Project Cycle provides us with an appropriate framework which can lead us towards the goal.
The AI Project Cycle mainly has 5 stages:
a) Problem Scoping b) Data Acquisition c) Data Exploration d) Modelling e) Evaluation.
2) Name the 4Ws of problem canvases under the problem scoping stage of the AI
Project Cycle.
Ans) a. Who, b. What c. Where d. Why
3) What is a problem statement template and what is its significance?
Ans) The problem statement template gives a clear idea about the basic framework required to achieve the goal. It is the 4Ws canvas, which segregates: who is affected, what is the problem, where does it arise, and why is it a problem? It takes us straight to the goal.
4) What is the need of an AI Project Cycle? Explain.
Ans) A project cycle is the process of planning, organizing, coordinating, and finally developing a project effectively throughout its phases, from planning through execution, then completion and review, to achieve pre-defined objectives. Our mind makes up a plan for every task we have to accomplish, which is why things become clearer in our mind. Similarly, if we have to develop an AI project, the AI Project Cycle provides us with an appropriate framework which can lead us towards the goal. The major role of the AI Project Cycle is to distribute the development of an AI project into various stages so that the development becomes easier and clearly understandable, and the steps/stages become specific enough to efficiently get the best possible output. It mainly has 5 ordered stages which divide the entire development into specific and clear steps: Problem Scoping, Data Acquisition, Data Exploration, Modelling and Evaluation.
5) What is Sustainable development?
ANS – Sustainable development is development that satisfies the needs of the present without compromising the ability of future generations to meet their own needs.
This was a warning to all countries about the effects of globalization and economic growth on the environment.
6) How many goals are there in Sustainable Development? Mention any two goals.
ANS – In 2015, the General Assembly of the UN adopted the 2030 Agenda for Sustainable Development, based on the principle of "Leaving No One Behind". The 17 Sustainable Development Goals are –
1. No poverty
2. Zero Hunger
3. Good Health and Well Being
4. Quality Education
5. Gender Equality
6. Clean water and Sanitation
7. Affordable and Clean Energy
8. Decent Work and Economic Growth
9. Industry Innovation and Infrastructure
10. Reduced Inequalities
11. Sustainable Cities and Communities
12. Responsible Consumption and Production
13. Climate Action
14. Life Below Water
15. Life on Land
16. Peace, Justice and Strong Institutions
17. Partnership for the Goals
Ans) Data should be collected from an authentic source, and should be accurate. The
redundant and irrelevant data should not be a part of prediction.
9) Explain the Data Exploration Stage.
Ans) In this stage of the project cycle, we try to interpret some useful information out of the data we have acquired. For this purpose, we need to explore the data and try to put it in a uniform form for a better understanding. This stage deals with validating or verifying the collected data and analysing that:
o The data is according to the specifications decided.
o The data is free from errors.
o The data is meeting our needs.
Ans) Any Artificial Neural Network, irrespective of the style and logic of implementation, has a few basic features, as given below.
o Artificial Neural Network systems are modelled on the human brain and nervous system.
o They are able to automatically extract features without input from the programmer.
o Every node of a layer in a Neural Network is compulsorily a machine learning algorithm.
o It is very useful to implement when solving problems for very huge datasets.
Ans) a) Supervised learning is an approach to creating artificial intelligence (AI) where the program is given labelled input data and the expected output results. OR Supervised learning is learning in which we teach or train the machine using data which is well labelled, meaning some data is already tagged with the correct answer. After that, the machine is provided with a new set of examples (data) so that the supervised learning algorithm analyses the training data (set of training examples) and produces a correct outcome from the labelled data. In a supervised learning model, the dataset fed to the machine is labelled, i.e. some data is already tagged with the correct answer. In other words, the dataset is known to the person who is training the machine; only then is he/she able to label the data.
14) Explain the Unsupervised Learning
Ans) Classification: The Classification model works on labelled data. For example, if we have 3 coins of different denominations which are labelled according to their weight, the model would look for the labelled features to predict the output. This model works on a discrete dataset, which means the data need not be continuous.
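The coin example can be sketched as a nearest-weight classifier trained on labelled data (the weights and denominations below are hypothetical):

```python
# Labelled training data: weight in grams -> denomination (discrete label).
LABELLED_COINS = {3.0: "1 rupee", 4.5: "2 rupee", 6.0: "5 rupee"}

def classify(weight):
    """Predict the denomination whose labelled weight is closest."""
    nearest = min(LABELLED_COINS, key=lambda w: abs(w - weight))
    return LABELLED_COINS[nearest]

print(classify(4.4))  # 2 rupee -- a discrete label, not a number
```

The output is always one of the discrete labels seen in training, which is what distinguishes classification from regression.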
16) Draw the graphical representation of Regression AI model.
Regression: These models work on continuous data to predict the output based on patterns. For example, if you wish to predict your next salary, you would put in the data of your previous salary, any increments, etc., and train the model. Here, the data fed to the machine is continuous.
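The salary example can be sketched as a least-squares line fit on continuous data (the salary figures are invented for illustration):

```python
# Ordinary least-squares fit of a line y = slope*x + intercept,
# then use it to predict the next (continuous) value.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

years = [1, 2, 3, 4]
salaries = [30000, 34000, 38000, 42000]  # rises by 4000 each year
slope, intercept = fit_line(years, salaries)
print(slope * 5 + intercept)  # predicted salary for year 5: 46000.0
```

Unlike the classification example, the prediction here is a continuous number, not one of a fixed set of labels.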