
MACHINE LEARNING

Machine learning is a part of artificial intelligence that helps computers learn
from data and make decisions. Instead of being programmed for every task,
these systems improve over time by finding patterns in data. This makes them
useful in many different industries.
APPLICATIONS OF ML

1. Image Recognition – Identifies objects in images, like detecting tumors
in medical scans.
2. NLP (Natural Language Processing) – Helps computers understand
human language, like in chatbots or translation.
3. Medical Diagnosis – Predicts diseases based on patient data for early
detection.
4. Predictive Maintenance – Forecasts equipment failures to prevent
breakdowns.
5. Financial Fraud Detection – Spots unusual transactions to prevent
fraud.
6. Supply Chain Optimization – Improves inventory and logistics for
efficient operations.

Types of Machine Learning:

1. Supervised Learning:

 Meaning: Supervised learning involves training a model on a labeled
dataset, where each input data point is paired with the correct output. The
model learns to predict the output for new, unseen inputs by finding
patterns in the training data.
 Types:
o Classification: Assigning inputs into predefined categories.
o Regression: Predicting continuous values based on input data.
 Common Algorithms:
o Linear Regression: Models the relationship between input
variables and a continuous output variable by fitting a linear
equation.
o Support Vector Machines (SVM): Classifies data by finding the
optimal hyperplane that separates data points of different classes.
o Decision Trees: Uses a tree-like model of decisions and their
possible consequences to classify data.
o k-Nearest Neighbors (k-NN): Classifies a data point based on the
majority class among its k nearest neighbors.
 Typical Steps:
o Data Collection: Gather a labeled dataset relevant to the problem.
o Data Preprocessing: Clean and format the data, handling missing
values and normalizing features.
o Model Selection: Choose an appropriate algorithm based on the
problem type and data characteristics.
o Training: Feed the training data into the model to learn the
mapping from inputs to outputs.
o Evaluation: Assess the model's performance using metrics like
accuracy or mean squared error on a separate validation set.
o Prediction: Use the trained model to make predictions on new,
unseen data.
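The steps above (collect, train, evaluate, predict) can be sketched end-to-end with one of the listed algorithms, k-Nearest Neighbors. This is a minimal illustration in plain Python, not a production implementation; the toy dataset, labels, and k = 3 are invented for the example:

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # train: list of (features, label) pairs; distance is Euclidean
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy labeled dataset: two well-separated clusters, "A" and "B"
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.8), "B"), ((4.9, 5.1), "B")]

print(knn_predict(train, (1.1, 1.0)))  # "A" -- all 3 nearest neighbors are in cluster A
print(knn_predict(train, (5.1, 5.0)))  # "B" -- all 3 nearest neighbors are in cluster B
```

Because the model is just the stored training data, there is no separate training step here; the "learning" happens at prediction time, which is why k-NN is often called a lazy learner.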

2. Unsupervised Learning:

 Meaning: Unsupervised learning deals with unlabeled data. The model
seeks to identify inherent structures, patterns, or groupings within the data
without predefined labels.
 Types:
o Clustering: Grouping a set of objects such that those within the
same group are more similar to each other than to those in other
groups.
o Dimensionality Reduction: Reducing the number of random
variables under consideration by obtaining a set of principal
variables.
 Common Algorithms:
o k-Means Clustering: Partitions data into k clusters by minimizing
the variance within each cluster.
o Hierarchical Clustering: Creates a hierarchy of clusters through
iterative merges or splits.
o Principal Component Analysis (PCA): Transforms data into a set
of linearly uncorrelated variables called principal components,
ordered by the amount of variance they capture.
 Typical Steps:
o Data Collection: Gather unlabeled data relevant to the analysis.
o Data Preprocessing: Clean and standardize the data to ensure
consistency.
o Algorithm Selection: Choose an unsupervised learning algorithm
suited to the data and desired insights.
o Model Training: Apply the algorithm to the data to uncover
patterns or groupings.
o Interpretation: Analyze the output to draw meaningful
conclusions about the data's structure.
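These steps can be sketched with k-Means, one of the clustering algorithms listed above. This is a bare-bones illustration on invented 2-D points; for simplicity it initializes centroids deterministically from the first k points, whereas real implementations usually use random or k-means++ initialization:

```python
import math

def kmeans(points, k, iters=20):
    """Partition points into k clusters by alternating assignment and update steps."""
    # Deterministic initialization for reproducibility: first k points as centroids
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(c) / len(cluster) for c in zip(*cluster))
    return centroids, clusters

points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (8.5, 8), (8, 9)]
centroids, clusters = kmeans(points, k=2)
print(clusters)  # the points separate into a low cluster and a high cluster
```

Note that no labels were provided anywhere: the grouping emerges purely from distances between points, which is the defining trait of unsupervised learning.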

3. Reinforcement Learning:

 Meaning: Reinforcement learning involves an agent that learns to make
decisions by performing actions and receiving feedback in the form of
rewards or penalties. The goal is to develop a strategy that maximizes
cumulative rewards over time.
 Types:
o Model-Based: The agent builds a model of the environment and
uses it to plan actions.
o Model-Free: The agent learns a policy or value function directly
from interactions with the environment without an explicit model.
 Common Algorithms:
o Q-Learning: A model-free algorithm that learns the value of
actions in states to derive an optimal policy.
o Deep Q Networks (DQN): Combines Q-learning with deep neural
networks to handle high-dimensional input spaces.
o Policy Gradient Methods: Optimize the policy directly by
gradient ascent on expected reward.
 Typical Steps:
o Environment Definition: Specify the environment, including
states, actions, and rewards.
o Agent Initialization: Initialize the agent's policy or value function.
o Interaction: The agent interacts with the environment by taking
actions and observing the resulting states and rewards.
o Learning: Update the agent's policy or value function based on the
received rewards to improve future actions.
o Iteration: Repeat the interaction and learning steps until the
agent's performance converges to an optimal or satisfactory level.
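The loop above can be sketched with tabular Q-Learning on a tiny made-up environment: a corridor of states 0–4 where the agent moves left or right and earns reward +1 for reaching state 4. The environment, learning rate, and episode count are all invented for this illustration:

```python
import random

# States 0..4 along a corridor; actions: 0 = left, 1 = right.
# Reaching state 4 gives reward +1 and ends the episode.
N_STATES = 5
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Hypothetical environment dynamics for the corridor."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action], initialized to zero
for _ in range(300):  # episodes
    s, done = 0, False
    while not done:
        a = random.choice((0, 1))  # explore randomly; Q-learning is off-policy
        nxt, r, done = step(s, a)
        # Update: move Q(s, a) toward r + gamma * max over a' of Q(s', a')
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[nxt]) - Q[s][a])
        s = nxt

policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1] -- the greedy policy moves right in every state
```

Even though the agent behaves randomly while exploring, the learned Q-table yields the optimal "always move right" policy, illustrating the model-free, off-policy character of Q-learning.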

Dataset Formats:

Datasets can be structured in various formats, including:

 Tabular Data: Organized in tables with rows and columns, commonly
stored in CSV (Comma-Separated Values) files. Each row represents an
observation, and each column represents a feature.
 Images: Stored in formats like JPEG or PNG, where each image can be
treated as an array of pixel values.
 Text: Unstructured data that may require preprocessing techniques like
tokenization and vectorization to be used in machine learning models.

Features and Observations in Machine Learning

1. Features:
o Features are the characteristics or properties of the data.
o In a dataset, they are the columns that describe different aspects of
an object.
o Example: In a house price dataset, features can be size, number of
bedrooms, and location; the price to be predicted is the target (label).
2. Observations:
o Observations are individual data points or records in the dataset.
o Each observation is a row in the dataset, containing values for all
features.
o Example: One house with 1,500 sq. ft., 3 bedrooms, New York,
$500,000 is an observation.
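The distinction between features (columns) and observations (rows) can be shown with a small CSV snippet. This is a sketch only: the inline text stands in for a file, and the column names are made up for the example:

```python
import csv
import io

# Inline CSV text standing in for a file on disk; columns are hypothetical.
raw = """size_sqft,bedrooms,city,price
1500,3,New York,500000
900,2,Austin,300000
"""

rows = list(csv.DictReader(io.StringIO(raw)))
print(list(rows[0].keys()))  # feature names: ['size_sqft', 'bedrooms', 'city', 'price']
print(len(rows))             # number of observations: 2
print(rows[0]["city"])       # New York
```

Note that `csv` loads every value as a string, so numeric features like `size_sqft` must still be converted (e.g. with `int` or `float`) before a model can use them — a typical data preprocessing step.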
SUPERVISED LEARNING

Linear regression aims to find the best-fitting linear equation that predicts the
dependent variable (Y) based on the independent variable(s) (X). The
general form of this equation is:

Y = mX + c

where:

 m represents the slope, indicating the change in Y for a unit change in X.
 c is the intercept, the value of Y when X is zero.

Types:

1. Simple Linear Regression: Involves one independent variable predicting
a dependent variable.
2. Multiple Linear Regression: Involves multiple independent variables
predicting a dependent variable. The equation extends to:

Y = m1X1 + m2X2 + … + mnXn + c

The cost function J(m, c) for Linear Regression is given by:

J(m, c) = (1 / 2N) × Σ (Yi − Ŷi)², summed over i = 1 to N

where:

 N = Number of data points
 Yi = Actual value of the dependent variable
 Ŷi = mXi + c (Predicted value using the regression model)
 (Yi − Ŷi)² = Squared error for each data point

Explanation:

 The function calculates the difference (error) between actual and


predicted values.
 It squares the errors to avoid negative values and give more weight to
larger errors.
 The sum of squared errors is averaged over all data points.
 The 1/2 factor is used to simplify derivative calculations when applying
Gradient Descent.

Example Calculation:

Consider a dataset:

X (Years of Experience) | Y (Actual Salary) | Ŷ (Predicted Salary) | Error (Y − Ŷ) | Squared Error (Y − Ŷ)²
1 | 30 | 30 | 0 | 0
2 | 35 | 35 | 0 | 0
3 | 40 | 40 | 0 | 0
4 | 45 | 45 | 0 | 0
5 | 50 | 50 | 0 | 0
If the predictions are perfect, the cost function value is 0 (best case). However,
if predictions have errors, the cost function value increases.
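The cost for the table above can be checked directly. A minimal sketch, using the line Ŷ = 5X + 25 implied by the table as the perfect model, and a deliberately worse intercept for comparison:

```python
def cost(m, c, xs, ys):
    """J(m, c) = (1 / 2N) * sum of squared errors between actual and predicted values."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / (2 * n)

xs = [1, 2, 3, 4, 5]       # years of experience
ys = [30, 35, 40, 45, 50]  # actual salary

print(cost(5, 25, xs, ys))  # 0.0 -- the line Y = 5X + 25 predicts every row exactly
print(cost(5, 20, xs, ys))  # 12.5 -- a worse intercept raises the cost
```

With the intercept shifted down by 5, every prediction is off by 5, so each squared error is 25 and J = 125 / (2 × 5) = 12.5, matching the formula above.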

Goal of Linear Regression:

The objective of Linear Regression is to minimize the cost function J(m, c).
This is achieved using Gradient Descent, which iteratively adjusts m and c to
find the best-fit line.
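Gradient Descent itself can be sketched in a few lines. This is an illustrative implementation: the learning rate and step count are arbitrary choices that happen to work for the toy dataset from the table, not tuned values:

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Minimize J(m, c) = (1/2N) * sum((y - (m*x + c))**2) by following its gradient."""
    m = c = 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of J with respect to m and c
        errors = [(m * x + c) - y for x, y in zip(xs, ys)]
        grad_m = sum(e * x for e, x in zip(errors, xs)) / n
        grad_c = sum(errors) / n
        # Step each parameter opposite to its gradient
        m -= lr * grad_m
        c -= lr * grad_c
    return m, c

xs = [1, 2, 3, 4, 5]
ys = [30, 35, 40, 45, 50]
m, c = gradient_descent(xs, ys)
print(round(m, 2), round(c, 2))  # approaches the perfect fit m = 5, c = 25
```

The 1/2 factor in J cancels the 2 from differentiating the square, which is why the gradients above have no stray constant; each step moves m and c downhill on the cost surface until the line fits the data.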

Conclusion:

The Cost Function quantifies how well the model fits the data. Lower cost
means a better fit. It is minimized during training to improve prediction
accuracy.
