
Machine Learning Report PDF

1. The document provides a summary of an internship training report on machine learning. It discusses the technologies learned including artificial intelligence, machine learning algorithms, and techniques like supervised and unsupervised learning. 2. Math concepts reviewed include linear algebra, calculus, probability, and statistics. Supervised learning methods like regression and classification are covered along with specific algorithms. 3. The future of machine learning is discussed, with the prediction that it will become more prevalent for technological progress due to increasing data availability. Machine learning is seen as a competitive advantage for companies.

Uploaded by

Ishak gauri
Copyright
© All Rights Reserved

SUMMER TRAINING REPORT

On

Machine Learning Internship Certification Training

Submitted by

Ishak Gauri
Department of Computer Science and Technology,
Quantum University, Roorkee

Under the Guidance of

Asst. Prof. Bhanu Pratap
Department of Computer Science and Engineering

School of Computer Science & Engineering
Quantum University, Roorkee, 2023
ACKNOWLEDGEMENT
The success and final outcome of learning Machine Learning required a lot of
guidance and assistance from many people, and I am extremely privileged to have
received this throughout my course and the few projects I completed. All that I have
done is due to such supervision and assistance, and I would not forget to thank
them.
I respect and thank Codsoft for providing me an opportunity to do the course and project
work, and for giving me all the support and guidance that made me complete the course duly.
I am extremely thankful to the course advisor.
I am thankful for, and fortunate enough to have received, constant encouragement, support
and guidance from all the teaching staff of Codsoft, which helped me in successfully
completing my course and project work.

………………………………………
(Signature of Student)
Name of Student: - Ishak gauri

Date:………………..
SUMMER TRAINING CERTIFICATE
About Codsoft
Who We Are
CodSoft is an IT services and IT consultancy company that specializes in creating
innovative solutions for businesses. We are passionate about technology and believe in
the power of software to transform the world. Our internship program is just one of the
ways in which we are investing in the future of the industry.
At CodSoft, we believe practical knowledge is the key to success in the tech
industry. Our aim is to help students lacking basic skills by offering hands-on
learning through live projects and real-world examples.

We Provide Exclusive Services for Clients

We build websites and web applications.
Today, every business should have a website.
No matter how small or large your business is, having a website is a must at this time.
Having a website helps you maintain your online presence.

INTERNSHIP POSITION
Machine learning intern
Gain mastery in machine learning from the comfort of your home and
open doors to amazing job opportunities with our certification
program. Enroll in our intensive 4-week internship, where you'll
acquire knowledge in web application development and deployment.
Establish a strong base for your career and real-world implementation
within a supportive and collaborative setting.
TABLE OF CONTENTS
1. Introduction
1.1. A Taste of Machine Learning
1.2. Relation to Data Mining
1.3. Relation to Optimization
1.4. Relation to Statistics
1.5. Future of Machine Learning
2. Technology Learnt
2.1. Introduction to Artificial Intelligence and Machine Learning
2.1.1. Definition of Artificial Intelligence
2.1.2. Definition of Machine Learning
2.1.3. Machine Learning Algorithms
2.1.4. Applications of Machine Learning
2.2. Techniques of Machine Learning
2.2.1. Supervised Learning
2.2.2. Unsupervised Learning
2.2.3. Semi-supervised Learning
2.2.4. Reinforcement Learning
2.2.5. Some Important Considerations in Machine Learning
2.3. Data Preprocessing
2.3.1. Data Preparation
2.3.2. Feature Engineering
2.3.3. Feature Scaling
2.3.4. Datasets
2.3.5. Dimensionality Reduction with Principal Component Analysis
2.4. Math Refresher
2.4.1. Concept of Linear Algebra
2.4.2. Eigenvalues, Eigenvectors, and Eigendecomposition
2.4.3. Introduction to Calculus
2.4.4. Probability and Statistics
2.5. Supervised Learning
2.5.1. Regression
2.5.1.1. Linear Regression
2.5.1.2. Multiple Linear Regression
2.5.1.3. Polynomial Regression
2.5.1.4. Decision Tree Regression
2.5.1.5. Random Forest Regression
2.5.2. Classification
2.5.2.1. Linear Models
2.5.2.1.1. Logistic Regression
2.5.2.1.2. Support Vector Machines
2.5.2.2. Nonlinear Models
2.5.2.2.1. K-Nearest Neighbors (KNN)
2.5.2.2.2. Kernel Support Vector Machines (SVM)
2.5.2.2.3. Naïve Bayes
2.5.2.2.4. Decision Tree Classification
1. Introduction
1.1. A Taste of Machine Learning
✓ Arthur Samuel, an American pioneer in the field of computer gaming and
artificial intelligence, coined the term "Machine Learning" in 1959.
✓ Over the past two decades Machine Learning has become one of the mainstays
of information technology.
✓ With the ever-increasing amounts of data becoming available there is good
reason to believe that smart data analysis will become even more pervasive as
a necessary ingredient for technological progress.
1.2. Relation to Data Mining

• Data mining uses many machine learning methods, but with different goals; on the
other hand, machine learning also employs data mining methods as "unsupervised
learning" or as a preprocessing step to improve learner accuracy.
1.3. Relation to Optimization
Machine learning also has intimate ties to optimization: many learning problems
are formulated as minimization of some loss function on a training set of examples.
Loss functions express the discrepancy between the predictions of the model being
trained and the actual problem instances.
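
As a rough illustration of the idea of a loss function, the sketch below computes the mean squared error of a linear model on a tiny training set. The choice of mean squared error, the function names, and the numbers are assumptions made for this example and are not taken from the training material.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared discrepancy between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def predict(w, b, xs):
    """Linear model y = w * x + b applied to each input."""
    return [w * x + b for x in xs]


# Toy training set (made-up numbers) generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

loss_good = mean_squared_error(ys, predict(2.0, 1.0, xs))  # parameters that match the data
loss_bad = mean_squared_error(ys, predict(1.0, 0.0, xs))   # parameters that do not
print(loss_good, loss_bad)
```

Training, viewed as optimization, is then the search for the parameters (here w and b) that make this number as small as possible.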
1.4. Relation to Statistics

Michael I. Jordan suggested the term "data science" as a placeholder name for the
overall field.
Leo Breiman distinguished two statistical modelling paradigms: the data model and the
algorithmic model, wherein "algorithmic model" means more or less machine
learning algorithms such as random forests.
1.5. Future of Machine Learning

Machine Learning can be a competitive advantage to any company, be it a top MNC
or a startup, as things that are currently done manually will be done by machines
tomorrow.

The Machine Learning revolution will stay with us for a long time, and so will the
future of Machine Learning.

2. Technology Learnt
2.1. Introduction to AI & Machine Learning

2.1.1. Definition of Artificial Intelligence

❖ Data Economy
✓ The world is witnessing a real-time flow of all types of structured and unstructured
data from social media, communication, transportation, sensors, and devices.
✓ International Data Corporation (IDC) forecasts that 180 zettabytes of data will
be generated by 2025.
✓ This explosion of data has given rise to a new economy known as the Data
Economy.
✓ Data is the new oil that is precious but useful only when cleaned and processed.
✓ There is a constant battle for ownership of data between enterprises to derive
benefits from it.
❖ Define Artificial Intelligence
Artificial intelligence refers to the simulation of human intelligence in machines
that are programmed to think like humans and mimic their actions. The term may
also be applied to any machine that exhibits traits associated with a human mind
such as learning and problem-solving.

2.1.2. Definition of Machine Learning

❖ Relationship between AI and ML
Machine Learning is an approach or subset of Artificial Intelligence that is based on the
idea that machines can be given access to data along with the ability to learn from it.
❖ Define Machine Learning
Machine learning is an application of artificial intelligence (AI) that provides
systems the ability to automatically learn and improve from experience without
being explicitly programmed. Machine learning focuses on the development of
computer programs that can access data and use it to learn for themselves.
❖ Features of Machine Learning
✓ Machine Learning is computing-intensive and generally requires a large
amount of training data.
✓ It involves repetitive training to improve the learning and decision making
of algorithms.
✓ As more data gets added, Machine Learning training can be automated for
learning new data patterns and adapting its algorithm.

2.1.3. Machine Learning Algorithms

Traditional Programming vs. Machine Learning Approach

❖ Traditional Approach
Traditional programming relies on hard-coded rules.
❖ Machine Learning Approach
Machine Learning relies on learning patterns based on sample data.
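
The contrast between the two approaches can be sketched with a small, made-up example: converting Celsius to Fahrenheit. In the traditional version the rule is typed in by the programmer; in the learning version the same rule is estimated from sample pairs. The least-squares fit and all names here are illustrative assumptions, not part of the training material.

```python
def fahrenheit_rule(celsius):
    """Traditional approach: the rule is hard-coded by the programmer."""
    return celsius * 9.0 / 5.0 + 32.0


def fit_line(xs, ys):
    """Machine learning approach: estimate slope and intercept from
    sample data using ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Sample data: Celsius readings with their Fahrenheit equivalents.
celsius = [-10.0, 0.0, 15.0, 30.0, 100.0]
fahrenheit = [14.0, 32.0, 59.0, 86.0, 212.0]

slope, intercept = fit_line(celsius, fahrenheit)
learned = slope * 25.0 + intercept  # prediction from the learned rule
print(round(slope, 3), round(intercept, 3), round(learned, 1), fahrenheit_rule(25.0))
```

Both routes arrive at the same conversion here because the samples follow the rule exactly; with noisy or incomplete samples the learned rule would only approximate it.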

❖ Machine Learning Techniques
✓ Machine Learning uses a number of theories and techniques
from Data Science.
✓ Machine Learning can learn from labelled data (known as supervised
learning) or unlabeled data (known as unsupervised learning).
2.1.4. Applications of Machine Learning
❖ Image Processing
✓ Optical Character Recognition (OCR)
✓ Self-driving cars
✓ Image tagging and recognition
❖ Robotics
✓ Industrial robotics
✓ Human simulation
❖ Data Mining
✓ Association rules
✓ Anomaly detection
✓ Grouping and Predictions
❖ Video games
✓ Pokémon
✓ PUBG
❖ Text Analysis
✓ Spam Filtering
✓ Information Extraction
✓ Sentiment Analysis
❖ Healthcare
✓ Emergency Room & Surgery
✓ Research
✓ Medical Imaging & Diagnostics
2.2. Techniques of Machine Learning
2.2.1. Supervised Learning
❖ Define Supervised Learning
Supervised learning is the machine learning task of learning a function that maps an input
to an output based on example input-output pairs. It infers a function from labeled
training data consisting of a set of training examples.
In supervised learning, each example is a pair consisting of an input object (typically a vector)
and a desired output value (also called the supervisory signal).
❖ Supervised Learning Flow
✓ Data Preparation
➢ Clean data
➢ Label data (x, y)
➢ Feature Engineering
➢ Reserve 80% of data for Training (Train_X) and 20% for Evaluation (Train_E)
✓ Training Step
➢ Design algorithmic logic
➢ Train the model with Train_X
➢ Derive the relationship between x and y, that is, y = f(x)
✓ Evaluation or Test Step
➢ Evaluate or test with Train_E
➢ If accuracy score is high, you have the final learned algorithm y = f(x)
➢ If accuracy score is low, go back to training step
✓ Production Deployment
➢ Use the learned algorithm y = f(x) to predict production data.
➢ The algorithm can be improved by more training data, capacity, or algo redesign.
❖ Testing the Algorithms
✓ Once the algorithm is trained, test it with test data (a set of data instances that
do not appear in the training set).

A well-trained algorithm can predict well for new test data.

If the learning is poor, we have an underfitted situation. The algorithm will not
work well on test data. Retraining may be needed to find a better fit.


If learning on training data is too intensive, it may lead to overfitting, a situation
where the algorithm is not able to handle new testing data that it has not seen
before. The technique used to keep the model generic is called regularization.
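
The supervised learning flow described in this section (prepare the data, hold out 20% for evaluation, train, then test on unseen examples) can be sketched as follows. This is a minimal illustration under stated assumptions: the data are synthetic, the model is a straight line fitted by least squares, and mean squared error stands in for the "accuracy score" because the target is a continuous number.

```python
import random


def fit_line(xs, ys):
    """Least-squares estimate of w and b in y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    return w, mean_y - w * mean_x


def mean_squared_error(w, b, pairs):
    """Average squared gap between predictions and labels."""
    return sum((w * x + b - y) ** 2 for x, y in pairs) / len(pairs)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Data preparation: made-up labelled pairs (x, y), roughly y = 3x + 2 plus noise.
data = [(float(x), 3.0 * x + 2.0 + rng.uniform(-0.5, 0.5)) for x in range(50)]
rng.shuffle(data)
split = int(0.8 * len(data))
train_x, train_e = data[:split], data[split:]  # 80% training, 20% evaluation

# Training step: derive y = f(x) from the training portion only.
w, b = fit_line([x for x, _ in train_x], [y for _, y in train_x])

# Evaluation step: score the model on examples it has not seen.
train_error = mean_squared_error(w, b, train_x)
eval_error = mean_squared_error(w, b, train_e)
print(len(train_x), len(train_e), round(train_error, 3), round(eval_error, 3))
```

An evaluation error far above the training error would hint at overfitting, while large errors on both sets would hint at underfitting.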
❖ Examples of Supervised Learning
✓ Voice Assistants
✓ Gmail Filters
✓ Weather Apps
❖ Types of Supervised Learning

✓ Classification
➢ Answers “What class?”
➢ Applied when the output has finite and discrete values. Example: social
media sentiment analysis has three potential outcomes: positive,
negative, or neutral
✓ Regression
➢ Answers “How much?”
➢ Applied when the output is a continuous number
➢ A simple regression algorithm: y = wx + b. Example: relationship
between environmental temperature (y) and humidity levels (x)
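
To make the simple regression model y = wx + b concrete, the sketch below fits w and b by gradient descent on mean squared error. The humidity and temperature values are invented for illustration, and gradient descent with this learning rate is only one assumed way of fitting such a line.

```python
def fit_by_gradient_descent(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit y = w * x + b by repeatedly stepping against the gradient
    of the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up readings: relative humidity as a fraction (x) and temperature in °C (y).
humidity = [0.2, 0.4, 0.6, 0.8]
temperature = [34.0, 30.0, 26.0, 22.0]

w, b = fit_by_gradient_descent(humidity, temperature)
predicted = w * 0.5 + b  # "How much?" for a humidity of 0.5
print(round(w, 3), round(b, 3), round(predicted, 3))
```

The output is a continuous number, which is what separates this from classification, where the answer would be one of a fixed set of classes.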
2.2.2. Unsupervised Learning
❖ Define Unsupervised Learning
Unsupervised learning is the training of a machine using information that is neither
classified nor labeled, allowing the algorithm to act on that information without
guidance.
Here the task of the machine is to group unsorted information according to similarities,
patterns and differences without any prior training on the data.

Types of Unsupervised Learning


Clustering
The most common unsupervised learning method is cluster analysis. It is used to
find data clusters so that each cluster has the most closely matched data.

Visualization Algorithms
Visualization algorithms are unsupervised learning algorithms that accept unlabeled
data and display this data in an intuitive 2D or 3D format. The data is separated into
somewhat clear clusters to aid understanding.

Anomaly Detection
This algorithm detects anomalies in data without any prior training.
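
As one small illustration of anomaly detection without labels, the sketch below flags values that lie far from the mean in units of standard deviation (a z-score rule). The threshold of 2.0 and the sensor readings are assumptions chosen for this example; the report does not name a specific anomaly detection algorithm.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Made-up, unlabeled sensor readings with one unusual value.
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0, 10.1]
anomalies = find_anomalies(readings)
print(anomalies)
```

No example was ever marked as "normal" or "abnormal"; the rule relies only on how the data are distributed.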

2.2.3. Semi-supervised Learning
❖ Define Semi-supervised Learning

Semi-supervised learning is a class of machine learning tasks and techniques that
also make use of unlabeled data for training, typically a small amount of labeled
data with a large amount of unlabeled data.

Semi-supervised learning falls between unsupervised learning (without any labeled training
data) and supervised learning (with completely labeled training data).

Example of Semi-supervised Learning
many degrees of freedom (such as a high-degree polynomial model) is likely to
have high variance and thus overfit the training data.
❖ Bias & Variance Dependencies
➢ Increasing a model’s complexity will reduce its bias and increase its variance.
➢ Conversely, reducing a model’s complexity will increase its bias and reduce its
variance. This is why it is called a tradeoff.
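
The tradeoff is often summarized by the standard decomposition of the expected squared error. The notation is an assumption introduced here, not taken from the report: the data follow $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and $\hat{f}$ is the model learned from a random training set.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A more complex model typically shrinks the first term and grows the second, so the total error is smallest somewhere in between.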
❖ What is Representational Learning
In Machine Learning, Representation refers to the way the data is presented. This
often makes a huge difference in understanding.

2.3. Data Preprocessing


2.3.1. Data Preparation
❖ Data Preparation Process
✓ Machine Learning depends largely on test data.
✓ Data preparation involves data selection, filtering, transformation, etc.
✓ Data preparation is a crucial step to make data suitable for ML.
✓ A large amount of data is generally required for the most common forms of
ML.
❖ Types of Data
✓ Labelled Data or Training Data
✓ Unlabeled Data
✓ Test Data
✓ Validation Data
2.3.2. Feature Engineering
❖ Define Feature Engineering
The transformation stage in the data preparation process includes an important step
known as Feature Engineering.

Feature Engineering refers to selecting and extracting the right features from the data
that are relevant to the task and model in consideration.

❖ Aspects of Feature Engineering
✓ Feature Selection
Most useful and relevant features are selected from the available data

✓ Feature Addition
New features are created by gathering new data

✓ Feature Extraction
Existing features are combined to develop more useful ones

✓ Feature Filtering
Filter out irrelevant features to make the modelling step easy
2.3.3. Feature Scaling
❖ Define Feature Scaling

✓ Feature scaling is an important step in the data transformation stage
of the data preparation process.
✓ Feature Scaling is a method used in Machine Learning for
standardization of independent variables of data features.
❖ Techniques of Feature Scaling

✓ Standardization
▪ Standardization is a popular feature scaling method, which gives data
the property of a standard normal distribution (also known as Gaussian
distribution).
▪ All features are standardized on the normal distribution (a mathematical
model).
▪ The mean of each feature is centered at zero, and the feature column has
a standard deviation of one.
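
Standardization as described above amounts to subtracting the column mean and dividing by the column standard deviation. The sketch below shows this for a single feature column; the function name and the height values are hypothetical, and the population standard deviation is an assumed choice.

```python
import statistics


def standardize(column):
    """Rescale a feature column to zero mean and unit standard deviation."""
    mean = statistics.mean(column)
    stdev = statistics.pstdev(column)  # population standard deviation
    if stdev == 0:
        return [0.0 for _ in column]   # constant column carries no spread
    return [(value - mean) / stdev for value in column]


# Made-up feature column: heights in centimetres.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = standardize(heights_cm)
print([round(v, 3) for v in scaled])
```

After scaling, features measured in very different units (for example centimetres and kilograms) contribute on a comparable scale.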
2.3.4. Datasets
➢ Machine Learning problems often need training or testing datasets.
➢ A dataset is a large repository of structured data.
➢ In many cases, it has input and output labels that assist in Supervised Learning.

2.3.5. Dimensionality Reduction with Principal Component Analysis

❖ Define Dimensionality Reduction
✓ Dimensionality reduction involves transformation of data to new dimensions in
a way that facilitates discarding of some dimensions without losing any key
information.

❖ Define Principal Component Analysis (PCA)
✓ Principal component analysis (PCA) is a technique for dimensionality reduction
that helps in arriving at better visualization models.
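
A minimal sketch of the idea, restricted to two-dimensional data so that it needs no linear algebra library: the direction of largest variance is found in closed form from the 2x2 covariance matrix, and each point is projected onto it, reducing two dimensions to one. The function name and the sample points are assumptions for illustration; general PCA uses an eigendecomposition of the full covariance matrix.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix of the centered data.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n

    # Angle of the axis along which the variance is largest.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # One coordinate per point: its position along that axis.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected


# Made-up points lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projected = pca_2d_to_1d(points)
print(direction, projected)
```

Because these points lie on a single line, the one retained coordinate keeps all of the variance; with real data some variance is discarded along with the dropped dimension.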
➢Random Forests use an ensemble of decision trees to perform regression
tasks.
2.5.2. Classification
It specifies the class to which data elements belong.
It predicts a class for an input variable.
It is best used when the output has finite and discrete values.

There are 2 types of classification: binomial and multi-class.


2.6. Unsupervised learning
2.6.1. Clustering
2.6.1.1. Clustering Algorithms

❖ Clustering means
✓ Clustering is a Machine Learning technique that involves the grouping
of data points.

❖ Prototype-Based Clustering
▪ Prototype-based clustering assumes that most data is located near
prototypes, for example centroids (the average of the points in a cluster) or
medoids (the most representative point of a cluster).
▪ K-means, a prototype-based method, is the most popular method for
clustering. It involves:
• Training data that gets assigned to the matching cluster based on
similarity
• An iterative process to get data points into the best clusters possible
2.6.1.2. K-means Clustering

❖ K-means Clustering Algorithm
Step 1: randomly pick k centroids
Step 2: assign each point to the nearest centroid
Step 3: move each centroid to the center of the respective cluster
Step 4: calculate the distance of the centroids from each point again
Step 5: move points across clusters and re-calculate the distance from
the centroid
Step 6: keep moving the points across clusters until the Euclidean
distance is minimized
❖ Elbow Method
➢ One could plot the Distortion against the number of clusters K. Intuitively, if K
increases, distortion should decrease. This is because the samples will be closer
to their assigned centroids. This plot is called the Elbow method.
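
The steps above, together with the distortion used by the Elbow method, can be sketched in plain Python. This is an illustrative version under assumptions: points are tuples, distortion is taken to be the sum of squared distances to the assigned centroids, and the function names, seed, and sample points are invented for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, iterations=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # Step 1: pick k starting centroids
    assignments = None
    for _ in range(iterations):
        # Steps 2, 4, 5: (re)assign each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # Step 6: no point changed cluster, so stop
        assignments = new_assignments
        # Step 3: move each centroid to the mean of its cluster.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    distortion = sum(
        squared_distance(p, centroids[a]) for p, a in zip(points, assignments)
    )
    return centroids, assignments, distortion


# Made-up data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels, distortion = k_means(points, k=2)

# Elbow method input: distortion for several values of K.
distortions = {k: k_means(points, k)[2] for k in (1, 2, 3)}
print(labels, round(distortion, 3), distortions)
```

Plotting `distortions` against K would show a sharp drop from K = 1 to K = 2 and little gain afterwards; that bend is the "elbow" that suggests two clusters for this data.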
Scientists have figured out that high-performing graphics processing units (GPU)
can be used for deep learning.

ML Vs Deep Learning

2.7.2. Artificial Neural Networks

✓ Deep learning relies on multiple layers of training.
✓ An Artificial Neural Network is a computing system made up of a number of
simple, highly interconnected processing elements which process information
by their dynamic state response to external inputs.
✓ It is an interconnected group of nodes akin to the vast network of layers of
neurons in a brain.
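
To give a feel for "simple, interconnected processing elements", the sketch below pushes one input through a tiny two-layer network. Each node computes a weighted sum of its inputs plus a bias and passes it through a sigmoid. The layer sizes, weights, and biases are hand-picked, hypothetical values; no training takes place here.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """Output of one layer; weights[j] holds the incoming weights of node j."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Hypothetical parameters: 2 inputs -> 2 hidden nodes -> 1 output node.
inputs = [1.0, 0.0]
hidden = dense_layer(inputs, weights=[[0.5, -0.5], [-1.0, 1.0]], biases=[0.0, 0.0])
output = dense_layer(hidden, weights=[[1.0, 1.0]], biases=[-1.0])
print(hidden, output)
```

Training a network means adjusting these weights and biases so that the output moves closer to the desired value, which libraries such as TensorFlow automate.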
2.7.3. TensorFlow
❖ TensorFlow is the open source Deep Learning library provided by Google.
❖ It allows development of a variety of neural network applications such as computer
vision, speech processing, or text recognition.
❖ It uses data flow graphs for numerical computations.

3. Reason for choosing Machine Learning

➢ Learning machine learning brings in better career opportunities
✓ Machine learning is the shining star of the moment.
✓ With every industry looking to apply AI in its domain, studying machine learning
opens a world of opportunities to develop cutting-edge machine learning
applications in various verticals, such as cyber security, image recognition,
medicine, or face recognition.
✓ With several machine learning companies on the verge of hiring skilled ML
engineers, it is becoming the brain behind business intelligence.

➢ Machine Learning jobs on the rise
✓ Major hiring is happening in all top tech companies, which are in search of
those special kind of people (machine learning engineers) who can build a hammer
(machine learning algorithms).
✓ The job market for machine learning engineers is not just hot, it is sizzling.
✓ Machine Learning jobs on Indeed.com: 2,500+ (India) and 12,000+ (US)

4. Learning Outcome
➢ Have a good understanding of the fundamental issues and challenges of machine
learning: data, model selection, model complexity, etc.
➢ Have an understanding of the strengths and weaknesses of many popular machine
learning approaches.
➢ Appreciate the underlying mathematical relationships within and across Machine
Learning algorithms and the paradigms of supervised and unsupervised learning.
➢ Be able to design and implement various machine learning algorithms in a range of
real-world applications.
➢ Ability to integrate machine learning libraries and mathematical and statistical tools
with modern technologies.
➢ Ability to understand and apply scaling up machine learning techniques and
associated computing techniques and technologies.
5. Gantt Chart

6. Bibliography
6.1. All content used in this report is from
✓ https://www.simplilearn.com/
✓ https://www.wikipedia.org/
✓ https://towardsdatascience.com/
✓ https://www.expertsystem.com/
✓ https://www.coursera.org/
✓ https://www.edureka.co/
✓ https://subhadipml.tech/
✓ https://www.forbes.com/
✓ https://medium.com/
✓ https://www.google.com/
6.2. All pictures are from
✓ https://www.simplilearn.com/
✓ https://www.google.com/
✓ https://www.wikipedia.org/
✓ https://www.youtube.com/
✓ https://www.edureka.co/
6.3. Books I referred to are
✓ Hands-on Machine Learning with Scikit-learn & Tensorflow by Aurelien Geron
✓ Python Machine Learning by Sebastian Raschka
