
Introduction to Machine Learning

Machine Learning (ML) is a subset of artificial intelligence that focuses on building systems that can
learn from and make decisions based on data. It involves various algorithms and statistical models to
perform tasks without explicit instructions, relying on patterns and inference instead.

Key Concepts and Algorithms

Supervised Learning:

Linear Regression: A technique to model the relationship between a dependent variable and one or
more independent variables by fitting a linear equation to observed data.
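
For illustration, a minimal sketch of fitting a straight line with scikit-learn (assuming it is installed), using a tiny made-up dataset:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one independent variable
    y = np.array([2.1, 4.0, 6.2, 7.9])           # dependent variable

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)         # fitted slope and intercept
    print(model.predict([[5.0]]))                # prediction for a new input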

Logistic Regression: Used for binary classification problems, it models the probability that a given
input belongs to a particular category.
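
A hedged sketch of binary classification with scikit-learn's LogisticRegression, again on made-up data:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X = np.array([[0.5], [1.5], [2.5], [3.5]])
    y = np.array([0, 0, 1, 1])                   # two classes

    clf = LogisticRegression().fit(X, y)
    print(clf.predict_proba([[2.0]]))            # probability of each class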

Decision Trees: A tree-like model used for classification and regression. It splits the data into subsets
based on feature values, creating branches for each possible outcome.
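
A minimal decision tree sketch with scikit-learn, using the bundled iris dataset as example data:

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)   # limit depth to keep the tree small
    print(tree.predict(X[:5]))                             # predicted classes for five samples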

Random Forests: An ensemble learning method that constructs multiple decision trees and merges
them to get a more accurate and stable prediction.
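
The same dataset can illustrate a random forest; this sketch relies on scikit-learn's defaults for how the individual trees are built:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_iris(return_X_y=True)
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(forest.score(X, y))   # accuracy on the training data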

Support Vector Machines (SVM): A classification method that finds the hyperplane which maximizes the margin between the classes in the feature space.
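
A minimal SVM sketch with scikit-learn's SVC and a linear kernel, again on the iris data:

    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    svm = SVC(kernel='linear').fit(X, y)   # fit a maximum-margin separator
    print(svm.predict(X[:5]))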

Unsupervised Learning:

K-means Clustering: A method to partition data into K distinct, non-overlapping subsets (clusters)
based on feature similarity.
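
A small K-means sketch with scikit-learn on made-up two-dimensional points, with K = 2:

    import numpy as np
    from sklearn.cluster import KMeans

    X = np.array([[1.0, 1.0], [1.5, 2.0], [0.5, 1.5],
                  [8.0, 8.0], [8.5, 9.0], [9.0, 8.0]])
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print(kmeans.labels_)           # cluster assignment for each point
    print(kmeans.cluster_centers_)  # the two centroids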

Hierarchical Clustering: Builds a hierarchy of clusters either agglomeratively (bottom-up) or divisively (top-down).
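
An agglomerative (bottom-up) example with scikit-learn's AgglomerativeClustering, on the same kind of toy data:

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [8.5, 9.0]])
    agg = AgglomerativeClustering(n_clusters=2, linkage='ward').fit(X)
    print(agg.labels_)   # which cluster each point ends up in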

Principal Component Analysis (PCA): A dimensionality reduction technique that transforms data into
a set of orthogonal components, preserving as much variability as possible.
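
A minimal PCA sketch with scikit-learn, reducing the four iris features to two components:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA

    X, _ = load_iris(return_X_y=True)
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)         # 4 features -> 2 orthogonal components
    print(pca.explained_variance_ratio_)     # share of variability kept by each component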

Model Evaluation:

Confusion Matrix: A table used to evaluate the performance of a classification model. It summarizes
true positives, false positives, true negatives, and false negatives.
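
A sketch of building a confusion matrix with scikit-learn from hypothetical labels and predictions:

    from sklearn.metrics import confusion_matrix

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual classes
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # model predictions
    print(confusion_matrix(y_true, y_pred))
    # rows are the actual classes, columns the predicted classes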

Precision, Recall, F1-Score: Metrics derived from the confusion matrix. Precision is the ratio of true
positives to all positive predictions, recall is the ratio of true positives to all actual positives, and F1-
score is the harmonic mean of precision and recall.
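
The same hypothetical labels can illustrate the three metrics with scikit-learn:

    from sklearn.metrics import precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(precision_score(y_true, y_pred))   # TP / (TP + FP)
    print(recall_score(y_true, y_pred))      # TP / (TP + FN)
    print(f1_score(y_true, y_pred))          # harmonic mean of precision and recall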

ROC-AUC: The Receiver Operating Characteristic (ROC) curve plots the true positive rate against the false positive rate across classification thresholds. The Area Under the Curve (AUC) summarizes overall performance in a single number.
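
A small sketch computing the ROC curve and AUC with scikit-learn from hypothetical predicted probabilities:

    from sklearn.metrics import roc_curve, roc_auc_score

    y_true   = [0, 0, 1, 1, 1, 0]
    y_scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]    # predicted probabilities for class 1
    fpr, tpr, thresholds = roc_curve(y_true, y_scores)
    print(roc_auc_score(y_true, y_scores))        # area under the ROC curve
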
Natural Language Processing (NLP):

Text Preprocessing: Involves cleaning and preparing text data, including tokenization, stop-word
removal, and stemming/lemmatization.
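
An illustrative preprocessing sketch in plain Python (real pipelines typically use libraries such as NLTK or spaCy; the stop-word list below is a made-up toy set):

    import re

    STOP_WORDS = {"the", "is", "a", "on", "of", "and"}    # toy stop-word list

    def preprocess(text):
        tokens = re.findall(r"[a-z]+", text.lower())       # tokenization
        return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal

    print(preprocess("The cat sat on a mat of straw."))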

TF-IDF: Term Frequency-Inverse Document Frequency is a statistical measure used to evaluate the
importance of a word in a document relative to a collection of documents.
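
A minimal TF-IDF sketch with scikit-learn's TfidfVectorizer (recent versions) on a toy corpus:

    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["the cat sat on the mat",
            "the dog chased the cat",
            "dogs and cats are pets"]
    vec = TfidfVectorizer()
    tfidf = vec.fit_transform(docs)          # documents x vocabulary sparse matrix
    print(vec.get_feature_names_out())
    print(tfidf.toarray().round(2))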

Word Embeddings: Techniques like Word2Vec and GloVe map words to continuous vector representations that capture semantic meaning.
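
A minimal Word2Vec sketch using the gensim library (assumed installed, 4.x API); real embeddings are trained on far larger corpora than this toy example:

    from gensim.models import Word2Vec

    sentences = [["machine", "learning", "is", "fun"],
                 ["deep", "learning", "uses", "neural", "networks"]]
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=20)
    print(model.wv["learning"][:5])                   # first few entries of the vector
    print(model.wv.most_similar("learning", topn=2))  # nearest words in the embedding space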

Neural Networks and Deep Learning:

Basic Neural Networks: Composed of layers of neurons, neural networks are used for various tasks
like classification and regression. Each neuron applies a weighted sum of inputs, passes it through an
activation function, and produces an output.
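
The forward pass of a single neuron can be sketched directly in NumPy: a weighted sum of the inputs plus a bias, passed through a sigmoid activation:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.array([0.5, -1.2, 3.0])      # inputs
    w = np.array([0.4, 0.1, -0.6])      # weights
    b = 0.2                             # bias

    output = sigmoid(np.dot(w, x) + b)  # weighted sum -> activation
    print(output)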

Convolutional Neural Networks (CNN): Specially designed for processing grid-like data such as
images. CNNs use convolutional layers to automatically and adaptively learn spatial hierarchies of
features.
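
The core convolution operation can be sketched in NumPy (frameworks such as PyTorch or TensorFlow provide the optimized, trainable version); the kernel below is a made-up vertical-edge filter:

    import numpy as np

    def conv2d(image, kernel):
        kh, kw = kernel.shape
        oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                # sum of the element-wise product of the kernel and the patch
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    image = np.arange(25, dtype=float).reshape(5, 5)
    kernel = np.array([[1.0, 0.0, -1.0]] * 3)   # simple vertical-edge filter
    print(conv2d(image, kernel))                # 3x3 feature map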

Recurrent Neural Networks (RNN): Suitable for sequential data, RNNs have loops allowing
information to persist. They are used in tasks like time series prediction and language modelling.
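
A vanilla RNN cell can be sketched in NumPy: the hidden state is updated from the previous state and the current input, which is how information persists across time steps (the weights here are random and untrained):

    import numpy as np

    rng = np.random.default_rng(0)
    W_h = rng.normal(size=(4, 4)) * 0.1   # hidden-to-hidden weights
    W_x = rng.normal(size=(4, 3)) * 0.1   # input-to-hidden weights
    h = np.zeros(4)                       # initial hidden state

    sequence = rng.normal(size=(5, 3))    # 5 time steps, 3 features each
    for x_t in sequence:
        h = np.tanh(W_h @ h + W_x @ x_t)  # recurrent update
    print(h)                              # final hidden state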

Mathematics for ML:

Linear Algebra: Essential for understanding ML algorithms, involving vectors, matrices, and their
operations.
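
A few of these operations in NumPy, for reference:

    import numpy as np

    v = np.array([1.0, 2.0, 3.0])
    A = np.array([[1.0, 0.0, 2.0],
                  [0.0, 3.0, 1.0]])

    print(np.dot(v, v))   # inner product
    print(A @ v)          # matrix-vector multiplication
    print(A.T)            # transpose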

Probability and Statistics: Fundamental for dealing with uncertainties in data and model predictions.
Includes concepts like probability distributions, mean, variance, and hypothesis testing.
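
A small sketch estimating the mean and variance of simulated normally distributed data with NumPy:

    import numpy as np

    samples = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=1000)
    print(samples.mean())   # sample mean (close to 5)
    print(samples.var())    # sample variance (close to 4)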

Optimization Techniques: Methods like gradient descent are used to minimize the loss function and
improve model performance.
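
A sketch of gradient descent on the toy loss L(w) = (w - 3)^2, using the update w <- w - learning_rate * dL/dw:

    w = 0.0
    learning_rate = 0.1
    for _ in range(100):
        grad = 2 * (w - 3)          # derivative of the loss
        w -= learning_rate * grad   # descent step
    print(w)                        # converges toward the minimum at w = 3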
