
AI project cycle notes

what is a project cycle?


if we have to develop an AI project, the AI project cycle provides us with an appropriate
framework which can lead us towards the goal. the AI project cycle mainly has 5 stages:
problem scoping, data acquisition, data exploration, modelling, evaluation.

steps of a project cycle


Under problem scoping, we look at various parameters which affect the problem we wish to
solve so that the picture becomes clearer. To proceed,
● You need to acquire data which will become the base of your project, as it helps you understand the parameters related to the problem.
● You go for data acquisition by collecting data from various reliable and authentic sources. Since the data you collect would be in large quantities, you can represent it visually through different types of representations like graphs, databases, flow charts, maps, etc. This makes it easier for you to interpret the patterns which your acquired data follows.
● After exploring the patterns, you can decide upon the type of model you would build to
achieve the goal. For this, you can research online and select various models which give a
suitable output.
● You can test the selected models and figure out which is the most efficient one.
● The most efficient model is now the base of your AI project and you can develop your
algorithm around it.
● Once the modelling is complete, you now need to test your model on some newly fetched
data. The results will help you in evaluating your model and improving it.
● Finally, after evaluation, the project cycle is now complete and what you get is your AI project.

problem identification depends upon:


● determining if the AI system will create any real value for the organisation or not.
● determining if the selected AI system will be feasible on the parameters of data, money,
workforce, time, etc.
● these things can help the organisation identify the fields where implementing the AI
technology will create the highest possible impact with the least possible input.

problem scoping
problem scoping is the stage that begins after identifying the problem while developing an AI project. this is where we define the problem and set the goals that we want our project to achieve. all the other stages of the AI project cycle come after this phase, and without proper problem scoping everything may turn out to be useless.

steps of problem scoping


● identifying the problem
● goal setting
● identification of the stakeholders
● identifying the existing measures
● identifying ethical concerns

sustainable development goals (SDGs)


sustainable development goals are a set of seventeen goals adopted by the member nations of the united nations in 2015. the idea for them emerged at the united nations conference on sustainable development held in rio de janeiro in 2012. the SDGs were adopted to meet the urgent challenges of the world concerning the following three factors: environment, social, economy.

issues covered under SDGs:


● terminating global poverty
● ensuring that all children receive a good education, and achieve equal opportunities
● supporting better practices for consumption and production that will help make the planet
cleaner and healthier.

4Ws problem canvas


the 4Ws problem canvas helps in identifying key elements related to the problem. the 4Ws canvas includes the following four questions: who? what? where? why?

who canvas
the “who” block helps in analysing the people getting affected directly or indirectly by the problem. under this, we find out who the ‘stakeholders’ to this problem are and what we know about them. stakeholders are the people who face this problem and would benefit from the solution.

what canvas
under the “what” block, you need to look into what you have on hand. at this stage, you need to
determine the nature of the problem. what is the problem and how do you know that it is a
problem? you also gather evidence to prove that the problem you have selected actually exists,
for example, newspaper articles, media announcements, etc.

where canvas
this focuses on the context/situation/location of the problem. this block will help you look into the
situation in which the problem arises, the context of it, and the location where it is prominent.

why canvas
in the “why” canvas, we think about the benefits which the stakeholders would get from the solution and how it will benefit them as well as society.

problem statement template


the problem statement template helps to summarise all the key points into one single template
so that in the future, whenever there is need to look back at the basis of the problem, we can
take a look at the problem statement template and understand the key elements of it.
the [stakeholders]                                                       who?
have a problem of [issue, problem]                                       what?
when/while [context, situation]                                          where?
an ideal solution would [how the solution will help the stakeholders]   why?

data acquisition
as the name clearly mentions, this stage is about acquiring data for the project. data can be a
piece of information or facts and statistics collected together for reference or analysis. whenever
we want an AI project to be able to predict an output, we need to train it first using data.

data classification
● basic
i) textual
made up of characters and numbers, but we cannot perform calculations on these numbers. for
example: mickey, donald, grapes, ad43et, etc.
ii) numerical - discrete and continuous
involves the use of numbers. this is the data on which we can perform mathematical operations
such as addition, subtraction, multiplication, etc.
discrete data: contains only whole numbers
continuous data: contains fractions and decimals

● based on structure
i) structured data
this type of data has a predefined data model, and it is organised in a predefined manner. train
schedules, mark sheets are common examples of this form of data.
ii) unstructured data
this type of data does not have any predefined structure. it can take any form. videos, audio files, emails, documents, etc. are some examples of this form of data.
iii) semi-structured data
this type of data has qualities of both structured and unstructured data. it is not organised in a rigid, predefined manner, but it contains tags or markers that give it some structure. for example: emails, XML and JSON files.

data features
data features refer to the type of data that needs to be collected. for example: salary amount, increment percentage, increment period, etc. ways in which you can collect data: surveys, web scraping, sensors, cameras, observations, application programming interface (API).
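
a minimal sketch of collecting data through an API in python; the endpoint below is hypothetical, and a real project would substitute its own source:

import requests

url = "https://example.com/api/records"   # hypothetical endpoint
response = requests.get(url, timeout=10)
response.raise_for_status()               # stop early if the request failed

records = response.json()                 # e.g. a list of records as JSON
print(f"collected {len(records)} records")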

data exploration
data exploration refers to techniques and tools that are used for identifying important patterns
and trends. we can do it through data visualisation or by adopting sophisticated statistical
methods. to analyse data, you need to visualise it in some user-friendly format so that you can:
● quickly get a sense of the trends, relationships, and patterns contained within the data.
● define the strategy for which model to use at a later stage
● communicate the findings to others effectively. to visualise data, we can use various types of
visual representation.
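
a minimal sketch of visualising acquired data with matplotlib; the grade counts are invented for illustration:

import matplotlib.pyplot as plt

# hypothetical data: number of students per grade
grades = ["A", "B", "C", "D"]
counts = [12, 25, 18, 5]

plt.bar(grades, counts)            # bar chart to spot the pattern at a glance
plt.xlabel("grade")
plt.ylabel("number of students")
plt.title("distribution of grades")
plt.show()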

modelling
based on the trends and patterns identified during data exploration, the next important thing is to use the data for making predictions or future forecasts. we can do this through different modelling methods. for example: decision trees.

decision tree
for the classification of data, one of the simplest methods used today is the ‘decision tree’. we
can use a tree-like model of decisions along with all the possible consequences.
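
a minimal sketch of a decision tree in python with scikit-learn; the marks-to-grade data is invented for illustration:

from sklearn.tree import DecisionTreeClassifier

# toy dataset: [marks] -> grade label (invented for illustration)
X = [[35], [48], [62], [75], [88], [95]]
y = ["C", "C", "B", "B", "A", "A"]

model = DecisionTreeClassifier()
model.fit(X, y)                      # the tree learns split points on marks
print(model.predict([[70]]))         # e.g. ['B']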

rule based approach


refers to the AI modelling where the rules are defined by the developer. the machine follows the rules or instructions mentioned by the developer and performs its task accordingly. a drawback/feature of this approach is that the learning is static. the machine, once trained, does not take into consideration any changes made in the original training dataset. once trained, the model cannot improve itself on the basis of feedback.
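
a minimal sketch of the rule-based idea, with invented grade thresholds; the rules are fixed by the developer and never change with data:

# the developer hard-codes the rules; nothing is learnt from data,
# and the thresholds below are invented for illustration
def grade(marks: int) -> str:
    if marks >= 80:
        return "A"
    elif marks >= 60:
        return "B"
    else:
        return "C"

print(grade(70))   # 'B' -- the rule never changes unless the developer edits it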

learning based approach


refers to the AI modelling where the machine learns by itself. under the learning based
approach, the AI model gets trained on the data fed to it and then is able to design a model
which is adaptive to the change in data. the learning-based approach can further be divided into
three parts: supervised, unsupervised and reinforcement learning.

supervised learning
in a supervised learning model, the dataset which is fed to the machine is labelled. in other words, the dataset is known to the person who is training the machine, and that is what makes it possible for him/her to label the data. for example: students get grades according to the marks they secure in examinations. these grades are labels which categorise the students according to their marks. there are two types of supervised learning models: classification and regression.

classification
where the data is classified according to the labels. for example, in the grading system, students are classified on the basis of the grades they obtain with respect to their marks in the examination. classification works on a discrete dataset.

regression
here, the data which has been fed to the machine is continuous. for example, if you wish to
predict your next salary, then you would put in the data for your previous salary, any increments,
etc., and would train the model.
the main difference between regression and classification algorithms is that regression algorithms are used to predict continuous values such as price, salary, age, etc., while classification algorithms are used to predict/classify discrete values such as male or female, true or false, spam or not spam, etc.

regression algorithm vs classification algorithm
● output variable: in regression, the output must be of a continuous nature or a real value; in classification, the output must be a discrete value.
● task: a regression algorithm maps the input value (x) to a continuous output variable (y); a classification algorithm maps the input value (x) to a discrete output variable (y).
● data: regression is used with continuous data; classification is used with discrete data.
● problems solved: regression is used for problems such as weather prediction, house price prediction, etc.; classification is used for problems such as identification of spam emails, speech recognition, identification of cancer cells, etc.
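
a rough sketch of this difference in python with scikit-learn; the experience/salary/promotion numbers below are invented for illustration:

from sklearn.linear_model import LinearRegression, LogisticRegression

# toy data: years of experience -> salary (continuous) / promoted (discrete)
X = [[1], [2], [3], [4], [5]]
salary = [30000, 35000, 41000, 46000, 52000]   # continuous target -> regression
promoted = [0, 0, 0, 1, 1]                     # discrete target -> classification

reg = LinearRegression().fit(X, salary)
clf = LogisticRegression().fit(X, promoted)

print(reg.predict([[6]]))   # a continuous value, e.g. around 57000
print(clf.predict([[6]]))   # a discrete label, e.g. [1]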

unsupervised learning
an unsupervised learning model works on an unlabelled dataset. this means that the data which is fed to the machine is random, and there is a possibility that the person who is training the model does not have any information regarding it. unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. unsupervised learning can be further divided into clustering and dimensionality reduction.

clustering
refers to the unsupervised learning algorithm which can cluster the unknown data according to
the patterns or trends identified out of it. The patterns observed might be the ones which are
known to the developer or it might even come up with some unique patterns out of it.
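
a minimal sketch of clustering with scikit-learn's k-means; the unlabelled points are invented, and the algorithm groups them on its own:

from sklearn.cluster import KMeans

# unlabelled toy points (invented for illustration)
X = [[1, 2], [1, 4], [1, 0],
     [10, 2], [10, 4], [10, 0]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)    # e.g. [1 1 1 0 0 0] -- two clusters found without labels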

applications of clustering:
● netflix has used clustering in implementing movie recommendations for its users.
● news summarization can be performed using clustering analysis where articles can be divided
into a group of related topics.
● document clustering is being effectively used in preventing the spread of fake news on social
media.

reinforcement learning
for every right action or decision, the algorithm is rewarded with positive reinforcement. on the other hand, for every wrong action, it is given negative reinforcement. in this way, it learns about the nature of actions that need to be performed and those which should be avoided. this type of learning can assist in industrial automation.
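
a toy sketch of the reward/penalty idea; the two actions and their success rates are invented, and the agent's value estimate for each action is nudged up on a reward and down on a penalty:

import random

# two possible actions; their true success rates are hidden from the agent
true_chance = {"left": 0.2, "right": 0.8}
value = {"left": 0.0, "right": 0.0}   # the agent's running value estimates

for step in range(1000):
    action = random.choice(["left", "right"])           # explore both actions
    reward = 1 if random.random() < true_chance[action] else -1
    value[action] += 0.01 * (reward - value[action])    # nudge the estimate

print(value)   # 'right' ends up with the higher estimated value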

dimensionality reduction
the number of input features, variables, or columns present in a given dataset is known as
dimensionality, and the process to reduce these features is called dimensionality reduction. it is
a way of converting the higher dimensions dataset into lesser dimensions dataset ensuring that
it provides similar information. it is commonly used in the fields that deal with high-dimension
data, such as speech recognition, data visualization, noise reduction, cluster analysis,
bioinformatics, etc.
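
a minimal sketch of dimensionality reduction using PCA (one common technique, shown here on scikit-learn's built-in iris dataset):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data          # 4 features per flower
pca = PCA(n_components=2)     # keep the 2 directions with the most variance
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)         # (150, 4) -> (150, 2)
print(pca.explained_variance_ratio_.sum())    # most of the information is kept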

why dimensionality reduction is important


● fewer features means less complexity
● you will need less storage space because you have less data
● fewer features require less computation time
● model accuracy improves due to less misleading data
● algorithms train faster thanks to less data
● reducing the data set’s feature dimensions helps visualise the data faster
● it removes noise and redundant features

evaluation
once a model has been made and trained, it needs to go through proper testing so that one can calculate the efficiency and performance of the model. hence, the model is tested with the help of testing data, and the efficiency of the model is calculated on the basis of the following parameters: accuracy, precision, recall and F1 score.
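
a minimal sketch of computing these parameters with scikit-learn, assuming hypothetical true labels and model predictions (1 = spam):

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# hypothetical true labels vs model predictions on testing data
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1 score :", f1_score(y_true, y_pred))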

neural networks
neural networks are loosely modelled after how neurons in the human brain behave. the key advantage of neural networks is that they can extract data features automatically, without the features having to be defined by hand. it is a fast and efficient way to solve problems for which the dataset is very large, such as images.
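
as a rough illustration (one of many ways to build one), a small neural network classifier can be trained with scikit-learn on its built-in digit images:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# small images of handwritten digits; the network learns the features itself
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
net.fit(X_train, y_train)
print("test accuracy:", net.score(X_test, y_test))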
