Data Science Using ML and AI at Learn N Build Yashvardhan 19evj

The document summarizes Yashvardhan Pabari's 45-day internship at Learn N Build from June 27th to August 11th 2022. It includes an acknowledgement, table of contents, introduction, implementation details of a project on data science using machine learning and AI, and references. The internship was conducted in partial fulfillment of the requirements for a Bachelor's degree in Computer Science Engineering from Vivekananda Institute of Technology.

“Data Science using ML and AI at Learn N Build”

Seminar report

Submitted

in partial fulfillment

for the award of the degree of

Bachelor of Technology in Computer Science Engineering

2022-2023

Submitted to:
Mayuree Katara
Assistant Professor
Dept. of Computer Science Engg.

Submitted by:
Yashvardhan Pabari
19EVJCS097
B.Tech. VII Semester

==================================================================================================

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING


VIVEKANANDA INSTITUTE OF TECHNOLOGY

SISYAWAS,SECTOR-36,NRI ROAD,JAGATPURA,JAIPUR,RAJASTHAN
================================================================================================

VIVEKANANDA INSTITUTE OF TECHNOLOGY
(Approved by AICTE, New Delhi | Affiliated to RTU Kota, Rajasthan)

Candidate's Declaration

I, Yashvardhan Pabari, hereby declare that I have undertaken 45 days of industrial
training at Learn N Build from 27 June 2022 to 11 August 2022 in partial fulfilment of
the requirements for the award of the degree of B.Tech (Computer Science) at
VIVEKANANDA INSTITUTE OF TECHNOLOGY, JAIPUR. The work presented in this training
report, submitted to the Department of Computer Science Engineering at
VIVEKANANDA INSTITUTE OF TECHNOLOGY, JAIPUR, is an authentic record of my training
work. It has not been submitted anywhere else for the award of any degree, diploma,
or fellowship of any University or Institution.

Name of the Candidate : Yashvardhan Pabari


RTU Roll No. : 19EVJCS097

INTERNSHIP CERTIFICATE

ACKNOWLEDGEMENT

This report pertains to the vocational training undertaken in partial fulfilment of the
requirements for the BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE ENGINEERING from the
VIVEKANANDA INSTITUTE OF TECHNOLOGY. The main purpose of the training was to acquaint
myself with practical experience of the actual working conditions in which we will be
required to work in future. It helped me develop the habit of critically analysing the
various aspects of a problem at the time of decision making.

I would like to thank Ms Mayuree Katara, who gave me clear details and guidelines on my
project and presentation during the training period.

Finally, I would like to express my gratitude to all the technical and non-technical people
for their co-operation and valuable guidance during my training period.

- Yashvardhan Pabari

Table of Contents

CHAPTER – 1 INTRODUCTION
1.1 WHAT IS DATA SCIENCE?
1.2 DATA SCIENCE LIFE CYCLE
1.3 NEED OF DATA SCIENCE

CHAPTER – 2 INTRODUCTION TO AI
2.1 WHAT IS AI?
2.2 TYPES OF AI

CHAPTER – 3 INTRODUCTION TO ML
3.1 WHAT IS ML?
3.2 WORKING OF ML
3.3 CLASSIFICATION OF MACHINE LEARNING

CHAPTER – 4 PROJECT
IMPLEMENTATION

REFERENCES

Chapter – 1

Introduction

1.1 What Is Data Science?


Data science is the field of study that combines domain expertise, programming skills, and
knowledge of mathematics and statistics to extract meaningful insights from data. Data
science practitioners apply machine learning algorithms to numbers, text, images, video,
audio, and more to produce artificial intelligence (AI) systems to perform tasks that ordinarily
require human intelligence. In turn, these systems generate insights which analysts and
business users can translate into tangible business value.

1.2 Data Science Life Cycle

1. Capture: Data Acquisition, Data Entry, Signal Reception, Data Extraction. This stage
involves gathering raw structured and unstructured data.
2. Maintain: Data Warehousing, Data Cleansing, Data Staging, Data Processing, Data
Architecture. This stage covers taking the raw data and putting it in a form that can be used.
3. Process: Data Mining, Clustering/Classification, Data Modeling, Data Summarization. Data
scientists take the prepared data and examine its patterns, ranges, and biases to determine how
useful it will be in predictive analysis.
4. Analyze: Exploratory/Confirmatory, Predictive Analysis, Regression, Text Mining,
Qualitative Analysis. Here is the real meat of the lifecycle. This stage involves performing the
various analyses on the data.
5. Communicate: Data Reporting, Data Visualization, Business Intelligence, Decision Making.
In this final step, analysts prepare the analyses in easily readable forms such as charts, graphs,
and reports.

1.3 Need of Data Science
To understand the need for data science, we first need to understand the term at its centre:
“data”.

Data is a set of values of qualitative or quantitative variables. This definition is short and
focuses on what data entails, so each of its parts is explained briefly below.

Set of values - The first term to concentrate on is “a set of values”: to have data, we need
a set of items to measure. In statistics, this set is known as the population. For example,
the set needed to answer a question might be all websites or applications. In general, it is
the set of things on which measurements are made.

Variables - The next thing to focus on is “variables”: variables are measurements or
characteristics of an item. For example, you could be measuring the weight of a person, or
estimating the amount of time a person spends on a website or app. It may also be a more
qualitative characteristic, such as what a person clicks on a website, or whether you think
the person visiting is male or female.

Qualitative and quantitative variables - Finally, we have both “qualitative and quantitative
variables”. Qualitative variables are information about qualities. They are usually
represented by words, not numbers, and they are not necessarily ordered. Quantitative
variables, on the other hand, are information about quantities. They are normally represented
by numbers and are measured on a continuous, ordered scale; weight and height are examples.
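
The distinction matters in practice because the two kinds of variable are summarized differently. The following minimal Python sketch uses made-up website-visit records (the field names `seconds_on_site` and `clicked` are hypothetical) to show an average for a quantitative variable and a frequency count for a qualitative one:

```python
from collections import Counter
from statistics import mean

# Hypothetical website-visit records (made-up values for illustration only).
visits = [
    {"seconds_on_site": 120, "clicked": "pricing"},
    {"seconds_on_site": 45, "clicked": "blog"},
    {"seconds_on_site": 300, "clicked": "pricing"},
    {"seconds_on_site": 75, "clicked": "signup"},
]

# Quantitative variable: numbers on an ordered scale, so an average makes sense.
average_seconds = mean(v["seconds_on_site"] for v in visits)

# Qualitative variable: category labels, so we count occurrences instead.
click_counts = Counter(v["clicked"] for v in visits)
most_clicked, _ = click_counts.most_common(1)[0]

print(average_seconds)
print(most_clicked)
```

Averaging the `clicked` values would be meaningless, which is why a count is used for that column.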

Big Data literally means large amounts of data. Big data is the pillar behind the idea that
one can draw useful inferences from a large body of data that were not possible before with
smaller datasets. Extremely large data sets can thus be analyzed computationally to reveal
patterns and trends.

Chapter – 2

Introduction to AI

2.1 What is AI?


Artificial Intelligence is a method of making a computer, a computer-controlled robot, or
a piece of software think intelligently, like the human mind. AI is accomplished by studying
the patterns of the human brain and by analyzing the cognitive process. The outcome of these
studies is used to develop intelligent software and systems.

2.2 Types of AI
Artificial Intelligence can be divided into various types. There are two main categorizations:
one based on the capabilities of AI and one based on its functionality.

[Figure: flow diagram of the types of AI]

AI TYPE – 1 (BASED ON CAPABILITIES)

1. Weak AI or Narrow AI –

o Narrow AI is a type of AI that can perform a dedicated task with intelligence. It is the
most common form of AI currently available.
o Narrow AI cannot perform beyond its field or limitations, as it is trained for only one
specific task. Hence it is also termed weak AI. Narrow AI can fail in unpredictable
ways if it is pushed beyond its limits.
o Apple Siri is a good example of Narrow AI: it operates within a limited, pre-defined
range of functions.
o Other examples of Narrow AI include chess programs, purchase suggestions on e-commerce
sites, self-driving cars, speech recognition, and image recognition.

2. General AI –

o General AI is a type of intelligence that could perform any intellectual task as
efficiently as a human.
o The idea behind General AI is to build a system that could be smarter and think like a
human on its own.
o Currently, no system exists that qualifies as General AI and can perform any task as
well as a human.
o Researchers worldwide are now focused on developing machines with General AI.

3. Super AI –

o Super AI is a level of system intelligence at which machines could surpass human
intelligence and perform any task better than a human, with cognitive properties. It
is an outcome of General AI.
o Some key characteristics of Super AI include the ability to think, reason, solve
puzzles, make judgments, plan, learn, and communicate on its own.
o Super AI is still a hypothetical concept of Artificial Intelligence. Developing such
systems in reality remains a world-changing task.

AI TYPE – 2 (BASED ON FUNCTIONALITY)

1. Reactive Machines –

o Purely reactive machines are the most basic type of Artificial Intelligence.
o Such AI systems do not store memories or past experiences for future actions.
o These machines focus only on the current scenario and react to it with the best
possible action.
o Google's AlphaGo is an example of a reactive machine.

2. Limited Memory

o Limited memory machines can store past experiences or some data for a short period
of time.
o These machines can use the stored data for a limited time period only.
o Self-driving cars are one of the best examples of Limited Memory systems. These
cars can store the recent speed of nearby cars, the distance to other cars, the speed
limit, and other information needed to navigate the road.

3. Theory of Mind

o Theory of Mind AI should understand human emotions, people, and beliefs, and be
able to interact socially like humans.
o AI machines of this type have not yet been developed, but researchers are making
considerable efforts towards developing them.

4. Self – Awareness

o Self-aware AI is the future of Artificial Intelligence. These machines will be
super intelligent and will have their own consciousness, sentiments, and
self-awareness.
o These machines will be smarter than the human mind.

Chapter – 3

Introduction to ML

3.1 What is ML?

Machine learning is a growing technology that enables computers to learn automatically
from past data. It uses various algorithms to build mathematical models and make
predictions from historical data or information. Currently, it is used for tasks such as
image recognition, speech recognition, email filtering, Facebook auto-tagging,
recommender systems, and many more.

Machine learning is a subset of artificial intelligence. With the help of sample
historical data, known as training data, machine learning algorithms build
a mathematical model that helps in making predictions or decisions without being explicitly
programmed. Machine learning brings computer science and statistics together to create
predictive models. It constructs or uses algorithms that learn from historical data. In
general, the more relevant data we provide, the better the performance tends to be.

3.2 Working of ML

A machine learning system learns from historical data, builds prediction models, and,
whenever it receives new data, predicts the output for it. The accuracy of the predicted
output depends in part on the amount of data: a large amount of representative data
generally helps to build a better model that predicts the output more accurately.

Suppose we have a complex problem in which we need to make some predictions. Instead of
writing code for it, we feed the data to generic algorithms; with the help of these
algorithms, the machine builds the logic from the data and predicts the output. Machine
learning has changed the way we think about such problems.

[Figure: block diagram of the working of a machine learning algorithm]
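
The learn-then-predict loop can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the data (hours studied against exam score) is made up, and ordinary least squares on a straight line stands in for whichever algorithm a real system would use. The names `fit_line` and `predict` are hypothetical helpers, not part of any library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up historical data: hours studied -> exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

model = fit_line(hours, scores)      # learn from historical data
new_prediction = predict(model, 6)   # predict the output for new data
print(new_prediction)
```

No rule relating hours to scores is written by hand; the relationship is estimated from the examples and then reused for an unseen input.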

3.3 Classification of Machine Learning
At a broad level, machine learning can be classified into four types:

1. Supervised Machine Learning

Supervised learning is a type of machine learning in which we provide sample labeled data
to the machine learning system in order to train it, and on that basis it predicts the
output.

The system creates a model using labeled data to understand the dataset and learn about each
data point. Once training and processing are done, we test the model by providing sample
data to check whether it predicts the correct output.

The primary objective of the supervised learning technique is to map the input variable (a)
to the output variable (b). Supervised machine learning is further classified into two broad
categories:

Classification: Classification algorithms address problems where the output variable is
categorical; for example, yes or no, true or false, male or female, etc. Real-world
applications of this category include spam detection and email filtering.

Some known classification algorithms include the Random Forest Algorithm, Decision Tree
Algorithm, Logistic Regression Algorithm, and Support Vector Machine Algorithm.

Regression: Regression algorithms handle problems where the output variable is continuous,
such as a price or a temperature. Linear regression, for example, assumes a linear
relationship between the input and output variables. Examples include weather prediction,
market trend analysis, etc.
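
To make the classification case concrete, here is a minimal sketch of learning from labelled examples and predicting a categorical output. It uses a k-nearest-neighbours vote, which is not one of the algorithms listed above and was chosen only because it fits in a few lines. The data (link and exclamation-mark counts per email) and the helper name `knn_predict` are assumptions for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest labelled points."""
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Made-up labelled data: (number of links, number of exclamation marks) -> label.
emails = [(0, 0), (1, 0), (0, 1), (7, 5), (8, 6), (6, 7)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(emails, labels, (7, 6)))
print(knn_predict(emails, labels, (1, 1)))
```

The labels play the role of the output variable, and the model's only knowledge comes from the labelled pairs it was given.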

2. Unsupervised Machine Learning

Unsupervised learning refers to a learning technique that works without supervision. Here,
the machine is trained on an unlabeled dataset and finds structure in it without any
supervision. An unsupervised learning algorithm aims to group the unsorted dataset based on
the input’s similarities, differences, and patterns.

For example, consider an input dataset of images of a container filled with fruit. Here, the
contents of the images are not known to the machine learning model. When we input the dataset
into the ML model, its task is to identify patterns in the objects, such as colour, shape, or
other differences seen in the input images, and categorize them. After categorization, the
model assigns new images to these categories when it is tested with a test dataset.

Unsupervised machine learning is further classified into two types:

Clustering: The clustering technique refers to grouping objects into clusters based on
parameters such as similarities or differences between objects. For example, grouping
customers by the products they purchase.

Some known clustering algorithms include the K-Means Clustering Algorithm, Mean-Shift
Algorithm, and DBSCAN Algorithm. Principal Component Analysis and Independent Component
Analysis are related unsupervised techniques, used for dimensionality reduction and signal
separation rather than for clustering itself.
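
As an illustration of clustering, the sketch below implements plain k-means in pure Python on made-up customer data. The starting centroids are picked by hand (an assumption made to keep the run deterministic; real implementations usually choose them randomly or with a seeding heuristic), and the function name `k_means` is hypothetical.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and updating centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up customers: (purchases per month, average basket value).
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (11, 82)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
print(clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between the points.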

Association: Association learning refers to identifying typical relations between the
variables of a large dataset. It determines the dependency between data items and maps
associated variables. Typical applications include web usage mining and market data analysis.

Popular algorithms for association rule mining include the Apriori Algorithm, Eclat
Algorithm, and FP-Growth Algorithm.
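
Association rules are usually scored with two quantities, support and confidence. The sketch below computes both for a rule on made-up shopping baskets; it shows only the scoring step that algorithms such as Apriori build on, not a full rule-mining algorithm, and the function names are hypothetical.

```python
# Made-up shopping baskets for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)


print(support({"bread", "butter"}))
print(confidence({"bread"}, {"butter"}))
```

A rule such as "bread implies butter" is considered interesting when both its support and its confidence exceed thresholds chosen by the analyst.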

3. Semi – Supervised Learning

Semi-supervised learning is a type of machine learning that lies between supervised and
unsupervised machine learning. It represents the intermediate ground between supervised
learning (with labelled training data) and unsupervised learning (with no labelled training
data) and uses a combination of labelled and unlabelled data during training.

Although semi-supervised learning operates on data that contains a few labels, the data is
mostly unlabelled. Labels are costly to obtain, so in practice an organization may have only
a few of them. Semi-supervised learning therefore differs from both supervised and
unsupervised learning, which are defined by the presence and absence of labels respectively.

We can picture these approaches with an example. Supervised learning is where a student is
under the supervision of an instructor both at home and at college. If the student analyses
the same concept on their own, without any help from the instructor, that is unsupervised
learning. Under semi-supervised learning, the student revises the concept on their own after
analysing it under the guidance of an instructor at college.

4. Reinforcement Learning

Reinforcement learning works on a feedback-based process in which an AI agent (a software
component) automatically explores its surroundings by trial and error: taking actions,
learning from experience, and improving its performance. The agent is rewarded for each
good action and punished for each bad action; hence the goal of a reinforcement learning
agent is to maximize its rewards.

In reinforcement learning there is no labelled data as in supervised learning; agents learn
from their experiences only.

The reinforcement learning process is similar to how a human being learns; for example, a
child learns various things through experience in day-to-day life. An example of
reinforcement learning is playing a game, where the game is the environment, the moves of the
agent at each step define the states, and the goal of the agent is to get a high score. The
agent receives feedback in the form of punishments and rewards.

Because of the way it works, reinforcement learning is employed in fields such as game
theory, operations research, information theory, and multi-agent systems.

A reinforcement learning problem can be formalized as a Markov Decision Process (MDP). In an
MDP, the agent constantly interacts with the environment and performs actions; after each
action, the environment responds with a reward and a new state.
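
The agent and environment loop of an MDP can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption made for illustration: the environment is a line of five states with a reward of 1 for reaching the last one, and the learning rate, discount factor, and exploration rate are arbitrary choices rather than recommended settings.

```python
import random

N_STATES = 5            # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)      # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration rate


def step(state, action):
    """Environment: return (next_state, reward) for taking `action` in `state`."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Epsilon-greedy: explore sometimes (and on ties), otherwise exploit.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update toward the reward plus the discounted best future value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

The `step` function plays the role of the environment, returning a reward and a new state for each action, while the table `q` is the agent's learned estimate of how good each action is in each state.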

Positive reinforcement learning: This refers to adding a reinforcing stimulus after a specific
behaviour of the agent, which makes it more likely that the behaviour will occur again in the
future, e.g., adding a reward after a behaviour.

Negative reinforcement learning: This refers to strengthening a specific behaviour because
it avoids a negative outcome.

Real-world Use Cases of Reinforcement Learning

Video Games:

o RL algorithms are very popular in gaming applications, where they are used to achieve
super-human performance. Well-known systems built on RL include AlphaGo and AlphaGo Zero,
which play the board game Go.

Resource Management:

o The "Resource Management with Deep Reinforcement Learning" paper showed how RL can be
used to automatically learn to allocate and schedule computer resources among waiting
jobs in order to minimize average job slowdown.

Chapter – 4

Project

During this internship, I worked on several projects based on Data Science using
Artificial Intelligence and Machine Learning. In these projects, new and existing concepts
were implemented practically to meet their objectives, and the projects were completed on
time.

One of the projects was “Human Activity Recognition with Machine Learning”.

Objectives
Human activity recognition (HAR) aims to classify a person's actions from a series of
measurements captured by sensors.

These are some of the data you could use:

o Body acceleration.
o Gravity acceleration.
o Body angular speed.
o Body angular acceleration.

Implementation
Now let’s start the task of Human Activity Recognition with machine learning by importing
the necessary Python libraries:

Now let’s import the smartphone data and have a look at the first five rows from the dataset:

There are a total of 7352 records in the training dataset, and it contains no null values.
The test dataset contains 2947 records for testing our models, and it has no null values
either.

I can see that the dataset consists of accelerometer and gyro sensor values for each record. In
addition, the last two columns are the subject which refers to the subject number and the
activity which defines the type of activity. The Activity column will be represented by the
label y and all other columns will be represented by X:
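
The original code listing is not reproduced here, so the following is only a sketch of the step using pandas. A tiny in-memory table stands in for the training file (in practice it would be read with `pd.read_csv` from wherever the data is stored), the column names are illustrative, and dropping the `subject` column from X is an assumption rather than something the text states.

```python
import pandas as pd

# Tiny stand-in for the training table; the real data set has 7352 rows and many
# more sensor columns. Column names here are assumptions for illustration.
train = pd.DataFrame({
    "tBodyAcc-mean()-X": [0.28, 0.27, 0.29],
    "tBodyGyro-mean()-X": [-0.03, -0.02, -0.04],
    "subject": [1, 1, 2],
    "Activity": ["STANDING", "WALKING", "SITTING"],
})

print(train.head())                      # first rows of the table
print(train.isnull().values.any())       # are there any null values?

# Activity is the label y; the remaining sensor columns form X.
# Dropping "subject" as well is an assumption: it identifies a person, not a measurement.
y = train["Activity"]
X = train.drop(columns=["Activity", "subject"])
```

Keeping the person identifier out of X avoids a model that memorizes individuals instead of learning from the sensor readings.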

Data Visualization

Now let’s visualize the data to understand some hidden features of the dataset:

The percentage of values shows that the amount of data for each activity is comparable, so
the dataset is fairly evenly distributed across activities. Inspecting the dataset, I can see
that there are a lot of features. It is easy to identify accelerometer values, gyroscope
values, and other values in the data set. I can check each type's share by plotting a bar
graph of the counts. Accelerometer values have Acc in their names, Gyroscope values have
Gyro, and the rest can be treated as others:
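
The counting behind such a bar graph can be sketched as follows. The column names below are a small illustrative sample in the style of the data set, not the full feature list, and `sensor_type` is a hypothetical helper.

```python
from collections import Counter


def sensor_type(column_name):
    """Classify a feature column by the sensor named in its label."""
    if "Acc" in column_name:
        return "Accelerometer"
    if "Gyro" in column_name:
        return "Gyroscope"
    return "Others"


# A handful of illustrative column names; the real data set has many more.
columns = [
    "tBodyAcc-mean()-X",
    "tGravityAcc-std()-Z",
    "tBodyGyro-mean()-Y",
    "fBodyAccJerk-energy()-X",
    "angle(X,gravityMean)",
]

counts = Counter(sensor_type(name) for name in columns)
print(counts)
```

The resulting counts can be passed to any plotting library to draw the bar graph.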

The accelerometer contributes the largest number of features, followed by the gyroscope. The
remaining features are far fewer.

The data was collected as a continuous time series for each individual and was recorded at
the same rate. So I can simply assign time values to each activity, starting from 0 whenever
the subject changes:
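
A minimal sketch of that indexing, assuming the counter restarts whenever the subject (the person being recorded) changes and that rows are already in recording order. The rows below are made up.

```python
from itertools import groupby

# Made-up (subject, activity) rows in recording order.
rows = [
    (1, "WALKING"), (1, "WALKING"), (1, "SITTING"),
    (2, "WALKING"), (2, "STANDING"),
]

timed_rows = []
# groupby collects consecutive rows with the same subject, so the
# counter restarts at 0 every time the subject changes.
for subject, group in groupby(rows, key=lambda row: row[0]):
    for t, (_, activity) in enumerate(group):
        timed_rows.append((subject, activity, t))

print(timed_rows)
```

Because every record was captured at the same rate, the integer index can stand in for elapsed time when plotting.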

Looking more closely at the graph, I can see that the values in each row vary, on average,
within a maximum range of about 0.2 to 0.3. This is the expected behaviour, as slight
variations can be attributed to minor human errors.

Human Activity Recognition Model with Python:

Now I will train machine learning models for the task of recognizing human activity. Here I
will be using various machine learning algorithms available in the Scikit-Learn library in
Python that I have already imported. For each algorithm, I’ll calculate the accuracy of the
prediction and identify the most accurate algorithm:
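
The comparison loop can be sketched with Scikit-Learn as shown below. This is not the original listing: synthetic data from `make_classification` stands in for the sensor data, the three models are an assumed selection, and the accuracies it prints say nothing about the results on the real data set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 600 samples, 20 features, 3 classes (not the HAR data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

best_model = max(accuracies, key=accuracies.get)
print(accuracies, best_model)
```

With the real data, the separate training and test files would take the place of `train_test_split`.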

The results show that the Logistic Regression model performs best for the task of Human
Activity Recognition with Machine Learning.
