ECE3025 AI With Python

The document outlines the course structure for 'Artificial Intelligence with Python,' detailing program outcomes, course outcomes, and assessment types. It covers essential topics in AI, Python programming, and machine learning, including rule-based and data-driven systems, libraries, and mathematical foundations. Additionally, it emphasizes the importance of Python in AI development due to its simplicity, extensive libraries, and community support.


Dr. R. Sekar
Associate Professor
Department of Electronics and Communication Engineering

There are TWELVE program outcomes; these are the program-level requirements that this course contributes to.

There are FOUR program-specific outcomes; these are the program-specific requirements that this course contributes to.

TABLE 1: COURSE OUTCOMES

CO Number | Course Outcome | Expected Bloom's Level
1 | Explain the basic principles of AI and the Python programming language | Understand
2 | Understand the mathematical formulation and computational models of classification and regression using supervised learning, and predictive analytics with ensemble learning | Understand
3 | Interpret building recommender systems | Apply
4 | Interpret reinforcement learning | Apply



The course outcomes listed for ARTIFICIAL INTELLIGENCE WITH PYTHON cover a range of cognitive levels according to Bloom's Taxonomy, which classifies learning objectives by increasing cognitive complexity.



TABLE 2a: CO-PO MAPPING (ARTICULATION MATRIX)
CO No.  PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1     M   M   L   L   L   L   M   L
CO2     H   H   M   M   L   M   M   L
CO3     H   H   L   L   L   M   M   L
CO4     M   M   L   L   L   L   M   L

TABLE 2b: CO-PSO MAPPING (ARTICULATION MATRIX)
CO No.  PSO1 PSO2 PSO3 PSO4
CO1     L    L    L
CO2     M    L    M
CO3     L    L    L
CO4     L    L    L

By the end of the course, the students will be able to



Sl. No | Assessment Type | Contents | CO No. | Duration | Marks | Weightage
1 | Quiz 1 | Module 1 | CO1 | 20 min | 10 | 5%
2 | Mid-term | Modules 1, 2 | CO1, CO2 | 1 hr | 50 | 25%
3 | Assignment | Module 3 | CO2, CO3 | 1 hr | 30 | 10%
4 | Seminar | Module 3 | CO1, CO3 | 20 min | 10 | 10%
5 | End term | Entire course | CO1, CO2, CO3, CO4 | 3 hr | 100 | 50%



• AI systems can be rule-based or data-driven, and
• they range from simple algorithms to complex neural networks that mimic the human brain's functioning.
Rule-based Systems:
• Rule-based systems rely on explicitly defined rules or logic that dictate how the system should behave in different situations.
• Developers manually create a set of rules (if-then statements) that the AI follows to make decisions. For example, in a medical diagnosis system, a rule might be, "If the patient has a fever and a cough, then consider the possibility of the flu." (A minimal code sketch of this idea follows the pros and cons below.)
Pros:
• Easy to understand and implement.
• Effective for well-defined problems with clear rules.
Cons:
• Difficult to scale or adapt to complex problems.
• Cannot learn from new data unless rules are manually updated.
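As referenced above, here is a minimal sketch of the rule-based idea, with the flu rule written as an explicit if-then function; the function name and messages are hypothetical and only meant to illustrate that the "knowledge" lives entirely in hand-written rules.

import sys

# Hypothetical rule-based diagnosis: the logic is fixed by the developer.
def diagnose(has_fever: bool, has_cough: bool) -> str:
    if has_fever and has_cough:
        return "Consider the possibility of the flu"
    return "No rule matched; further examination needed"

print(diagnose(has_fever=True, has_cough=True))
print(diagnose(has_fever=False, has_cough=True))

Adding a new condition (say, a new symptom) requires editing the rules by hand, which is exactly the scaling limitation listed in the cons above.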
Data-Driven Systems:
•Data-driven systems rely on large amounts of data to learn patterns, make predictions, or make
decisions. These systems use machine learning algorithms to automatically identify patterns and improve
over time.
•Instead of manually coding rules, data-driven systems analyze data and learn the underlying patterns.
For example, a spam email filter might be trained on thousands of examples of spam and non-spam
emails, allowing it to recognize spam based on learned patterns rather than predefined rules.
Pros:
• Can handle complex and dynamic environments.
• Learns and adapts as more data becomes available.
Cons:
• Requires large amounts of data for training.
• The underlying model might be less transparent, making it harder to understand how
decisions are made.
Why Python for AI?
Popularity: Python is one of the most popular programming languages for AI due to its simplicity and readability.
Rich Ecosystem: It has a vast collection of libraries and frameworks (e.g., TensorFlow, Keras, Scikit-learn) that
make AI development easier.
Community Support: Large community, extensive documentation, and resources available.

Skills needed to develop Python code and AI models


Basics of Python:
Syntax and Structure: Understanding Python's syntax, indentation, variables, and basic data types (integers,
floats, strings, lists, dictionaries).
Control Flow: Mastery of loops (for, while) and conditionals (if, else, elif).
Functions: Defining and using functions, understanding parameters and return values.
Error Handling: Using try-except blocks to manage errors and exceptions.
File Handling: Reading from and writing to files, understanding file operations.
Data Structures and Algorithms
Lists, Tuples, Sets, and Dictionaries: Operations like adding, removing, and accessing elements.
Comprehensions: List, dictionary, and set comprehensions for concise code.
Basic Algorithms: Sorting, searching, and understanding algorithm complexity.
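As a quick illustration of several of the basics listed above (variables, control flow, a function, error handling, and a list comprehension), here is a small self-contained sketch; the data values are made up for the example.

# Function with parameters and a return value
def average(values):
    if not values:                      # conditional
        return 0.0
    return sum(values) / len(values)

scores = [45, 75, 60, 82, 39, 55]       # list
labels = {"pass": 0, "fail": 0}         # dictionary

# Control flow: loop plus conditional
for s in scores:
    if s >= 50:
        labels["pass"] += 1
    else:
        labels["fail"] += 1

# Error handling with try/except
try:
    ratio = labels["pass"] / labels["fail"]
except ZeroDivisionError:
    ratio = float("inf")

# List comprehension for concise code
passed = [s for s in scores if s >= 50]

print("average:", average(scores), "pass/fail ratio:", ratio, "passed:", passed)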

Libraries and Frameworks


NumPy: Understanding arrays, matrix operations, and mathematical functions.
Pandas: Data manipulation using DataFrames, data cleaning, and basic analysis.
Matplotlib and Seaborn: Data visualization techniques, creating plots, histograms, and scatter plots.
Scikit-learn: Machine learning basics, including models like logistic regression, Naive Bayes, decision trees,
random forests, and clustering algorithms.
Advanced Python Topics
Object-Oriented Programming (OOP): Classes, objects, inheritance, and polymorphism.
Decorators and Generators: Advanced Python features for more efficient coding.
Working with APIs: Understanding how to interact with web services using Python.

Note: API stands for Application Programming Interface. In the context of APIs, an application is any software with a distinct function, and the interface can be thought of as a contract of service between two applications. This contract defines how the two communicate with each other using requests and responses.
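As a brief illustration of decorators and generators (two of the advanced features named above), here is a minimal, self-contained sketch; the function names are invented for the example.

import time

# Decorator: wraps an existing function to add behaviour (here, timing)
# without modifying the function itself.
def timed(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        print(f"{func.__name__} took {time.perf_counter() - start:.6f} s")
        return result
    return wrapper

# Generator: yields values one at a time instead of building a whole list in memory.
def squares(n):
    for i in range(n):
        yield i * i

@timed
def sum_of_squares(n):
    return sum(squares(n))

print(sum_of_squares(100000))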
Some important Libraries

TensorFlow
Flexible Architecture: TensorFlow allows you to deploy computations to one or more CPUs or GPUs in a
desktop, server, or mobile device with a single API.
Wide Range of Applications: TensorFlow is versatile and can be used for various tasks such as speech
recognition, image recognition, neural networks, and more.
TensorFlow Serving: Ideal for deploying machine learning models in production environments.
Keras
User-Friendly: Keras is designed with user experience in mind, offering a high-level, user-friendly interface for
building and training deep learning models.
Runs on Top of TensorFlow: Keras originally could run on various backends like Theano, TensorFlow, and
CNTK(Microsoft Cognitive Toolkit), but it has been tightly integrated with TensorFlow as its official high-level API.
Modularity and Extensibility: Keras is modular, allowing users to create neural layers, cost functions,
optimizers, and activation functions easily.
Some important Libraries

PyTorch: Developed by Facebook's AI Research lab, PyTorch is another open-source machine learning framework
that’s particularly popular for its dynamic computational graph and ease of use.

Scikit-learn: A versatile library for machine learning in Python, Scikit-learn is ideal for classical machine learning
algorithms like regression, clustering, and more. It’s not as powerful for deep learning but is perfect for many
general-purpose ML tasks.

MXNet: An open-source deep learning framework developed by the Apache Software Foundation, MXNet supports
flexible and efficient operations, making it suitable for both research and production environments.
Mathematics for Machine Learning
Linear Algebra: Vectors, matrices, and operations like dot products and eigenvectors.
Probability and Statistics: Understanding distributions, probability theory, and statistical measures (mean,
variance, etc.).
Calculus: Differentiation and integration, particularly in the context of optimization problems.

Problem-Solving Skills
Algorithm Design: Structuring and breaking down complex problems into smaller, manageable tasks.
Debugging and Testing: Writing test cases, using debugging tools, and ensuring code reliability.
Module 1: Introduction to Artificial Intelligence (topic map)
• Introduction to AI: What is AI, AI vs ML vs DL, AI applications
• Algorithms: classifier/regression, logistic regression classifier, a probabilistic machine-learning classifier, Support Vector Machine (SVM)
• Preprocessing: binarization, mean removal, normalization, label encoding
• Evaluation: confusion matrix
• Python for data science: data analysis using Python, SQL, and Excel; data visualization



What is AI??
AI Vs ML Vs DL
AI Applications
Machine Learning
Machine Learning
Machine Learning
TRAINING DATA



TRAINING DATA



TESTING DATA



Machine Learning
Supervised Learning
Supervised Learning
Types of supervised Machine learning Algorithms:

Regression
Types of supervised Machine learning Algorithms:

Classification
Unsupervised Learning
Unsupervised Learning
Types of Unsupervised Machine learning Algorithms:
Unsupervised learning Algorithms
Supervised Vs Unsupervised learning
Supervised Learning | Unsupervised Learning
• Algorithms are trained using labeled data. | Algorithms are trained using unlabeled data.
• The model takes direct feedback to check whether it is predicting the correct output. | The model does not take any feedback.
• The model predicts the output. | The model finds the hidden patterns in data.
• Input data is provided to the model along with the output. | Only input data is provided to the model.
• The goal is to train the model so that it can predict the output when given new data. | The goal is to find the hidden patterns and useful insights from the unknown dataset.
• Needs supervision to train the model. | Does not need any supervision to train the model.
• Can be categorized into Classification and Regression problems. | Can be classified into Clustering and Association problems.
• Used where we know the inputs as well as the corresponding outputs. | Used where we have only input data and no corresponding output data.
• Produces an accurate result. | May give less accurate results compared to supervised learning.
• Not close to true artificial intelligence, because we first train the model on the data and only then can it predict the correct output. | Closer to true artificial intelligence, because it learns in the way a child learns daily routines from experience.
• Algorithms include Linear Regression, Logistic Regression, Support Vector Machine, Multi-class Classification, Decision Tree, Bayesian Logic, etc. | Algorithms include Clustering, KNN, and the Apriori algorithm.
Semi supervised Learning
Semi supervised Learning- Assumptions



Semi supervised Learning How does it work



Semi supervised Learning -Applications
Reinforcement Learning
Reinforcement Learning
Why Is Python?
Python is a very popular general-purpose interpreted, interactive, object-oriented, and high-level programming
language
An interpreted language is a type of programming language where the source code is executed line by line by an
interpreter at runtime, as opposed to a compiled language where the source code is translated into machine code
by a compiler before execution
No Compilation Step: In an interpreted language, you don't need to explicitly compile your code before running it.
Instead, you provide the source code directly to the interpreter, which then reads and executes the code line by
line.
Dynamic Typing: Interpreted languages often feature dynamic typing, which means that variable types are
determined at runtime. This allows for more flexibility but may lead to runtime errors if type mismatches occur.
Platform Independence: Since the code is executed by an interpreter, it can be more platform-independent. As
long as there is an interpreter available for a particular platform, the same code can run on different operating
systems without modification.
Easier Debugging: Interpreted languages often provide better error messages and debugging tools since they
can analyze and report issues as they occur during execution.
Why Is Python?
Python's interactive shell allows you to interact with your code in real-time. You can define variables, test
functions, and experiment with different algorithms on the fly. This helps developers quickly prototype and test
ideas without the need to create a separate script or compile code.
Immediate Feedback: Python provides clear and immediate feedback through error messages and
exceptions. If there's an issue in your code, Python will raise an exception with a descriptive error message,
helping you identify and fix problems quickly.

Extensive Standard Library: Python's extensive standard library offers a wide range of built-in modules and
functions that you can use interactively. You can import and utilize these modules in your code without
needing to write everything from scratch.

Third-Party Packages: Python has a vast ecosystem of third-party packages and libraries that can be
easily installed and integrated into your interactive coding sessions. This allows you to leverage specialized
tools and functionality without reinventing the wheel.
Why Is Python?
OOP is a way of structuring your code to model real-world entities and their interactions in a more intuitive and
organized manner.
Objects: Objects are the fundamental building blocks of OOP. An object is a self-contained unit that combines
both data (attributes) and behavior (methods). Think of objects as real-world entities or nouns, such as a car, a
person, or a bank account.
Classes: A class is a blueprint or template for creating objects. It defines the attributes (data members) and
methods (functions) that objects of that class will have. Classes act as a way to organize and encapsulate
related data and behavior. You can think of a class as a description or specification for an object.
For example, if we're modeling cars in a car rental system, we might have a Car class that defines attributes
like make, model, and year, and methods like start engine and accelerate.
Attributes: Attributes are the data members or properties of an object. They represent the state or
characteristics of an object. For instance, in a Person class, attributes might include name, age, and address.
These attributes store information about the object.
Methods: Methods are functions defined within a class that define the behavior or actions that an object can
perform. They operate on the attributes of the object and can be used to change the object's state or perform
specific actions.
Why Is Python?
Encapsulation: Encapsulation is the concept of bundling data (attributes) and the methods that operate on that
data into a single unit (a class). It hides the internal details of how a class works and exposes a clean interface
for interacting with objects. Encapsulation helps maintain data integrity and makes code more modular and
maintainable.
Inheritance: Inheritance is a mechanism that allows you to create a new class (a subclass or derived class)
based on an existing class (a parent class or base class). The subclass inherits the attributes and methods of
the parent class. It promotes code reuse and the creation of class hierarchies.
For example, you can have a Vehicle class as a base class, and then create subclasses like Car and
Motorcycle that inherit attributes and methods from the Vehicle class.
Polymorphism: Polymorphism means "many shapes" and refers to the ability of different objects to respond to
the same method or function call in a way that is appropriate for their specific class. It allows you to write code
that can work with objects of different classes as long as they share a common interface.
For instance, you can have a Shape class with a method called area, and then create subclasses like Circle
and Rectangle that implement their own area method. You can call area on any shape object, and the
appropriate method will be executed.
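A minimal sketch tying together the concepts described above (objects, classes, attributes, methods, encapsulation, inheritance, and polymorphism), using the Vehicle/Car and Shape/Circle/Rectangle examples mentioned in the text; the attribute values are illustrative.

class Vehicle:
    """Base class: encapsulates shared attributes and behaviour."""
    def __init__(self, make, model, year):
        self.make = make          # attributes store the object's state
        self.model = model
        self.year = year

    def start_engine(self):       # method: behaviour operating on the attributes
        return f"{self.make} {self.model}: engine started"

class Car(Vehicle):               # inheritance: Car reuses Vehicle's attributes and methods
    def accelerate(self):
        return f"{self.model} is accelerating"

class Shape:
    def area(self):
        raise NotImplementedError

class Circle(Shape):              # polymorphism: each subclass supplies its own area()
    def __init__(self, r):
        self.r = r
    def area(self):
        return 3.14159 * self.r ** 2

class Rectangle(Shape):
    def __init__(self, w, h):
        self.w, self.h = w, h
    def area(self):
        return self.w * self.h

car = Car("Toyota", "Corolla", 2021)
print(car.start_engine())
print(car.accelerate())
for shape in (Circle(2), Rectangle(3, 4)):   # same call, class-specific behaviour
    print(shape.area())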
Python for Data Science
Data science is an interdisciplinary field that combines various techniques, algorithms, processes, and systems to
extract meaningful insights and knowledge from structured and unstructured data. It encompasses a wide range of
activities, from data collection and cleaning to data analysis and visualization
Structured data refers to data that is organized in a predefined manner, usually in tabular formats like
spreadsheets or databases. It includes data that can be easily stored, accessed, and analyzed, such as sales
records, customer information, and financial data.
Unstructured data, on the other hand, does not have a specific format or organization. This type of data includes
text documents, images, videos, social media posts, and emails. Unstructured data is more complex to process
and analyze but often contains valuable information that can be uncovered through advanced techniques like
natural language processing (NLP) and machine learning



Python for Data Science
Data science is an interdisciplinary field that combines various techniques, algorithms, processes, and systems to
extract meaningful insights and knowledge from structured and unstructured data. It encompasses a wide range of
activities, from data collection and cleaning to data analysis and visualization
Data Collection: Data science starts with the collection of relevant data from various sources, which can include
databases, APIs, sensors, websites, social media, and more. This data can be in the form of text, numbers,
images, or any other data type.
Data Cleaning and Preprocessing: Raw data is often messy, incomplete, or inconsistent. Data scientists clean
and preprocess the data to remove errors, missing values, and outliers. This step is crucial to ensure that the data
is suitable for analysis.
Exploratory Data Analysis (EDA): EDA involves the initial analysis of the data to understand its characteristics,
such as distributions, patterns, and correlations. Data visualization plays a significant role in EDA, helping to
identify trends and insights.
Feature Engineering: In machine learning, feature engineering is the process of selecting and transforming
relevant variables (features) from the data to improve the performance of predictive models.
Machine Learning: Machine learning is a subset of data science that focuses on building predictive models and
algorithms. It involves training models on historical data to make predictions or classifications on new data.
Common machine learning techniques include regression, classification, clustering, and deep learning.



Python for Data Science

Which is more crucial in the following domains?


Statistical Analysis: Statistical methods are used to test hypotheses, estimate parameters, and make
inferences about the data. Statistics helps data scientists draw meaningful conclusions and make data-driven
decisions.
Data Visualization: Data scientists use charts, graphs, and interactive visualizations to communicate their
findings effectively. Visualization tools like Matplotlib, Seaborn, and D3.js help convey complex information in a
visually appealing manner.
Big Data: Data scientists often deal with large datasets, known as big data. Tools like Hadoop, Spark, and
distributed databases are used to handle and process these massive amounts of data efficiently.



Python for Data Science
Domain Knowledge: Understanding the specific domain or industry is essential in data science. Domain
knowledge helps data scientists frame problems, interpret results, and generate actionable insights that are
relevant to the context.
Data Ethics and Privacy: Data scientists must consider ethical and privacy concerns when working with data.
They need to ensure that data collection and analysis adhere to legal and ethical guidelines, protect individuals'
privacy, and prevent biases in the results.
Data-Driven Decision Making: The ultimate goal of data science is to inform decision-making processes. Insights
derived from data can be used to optimize business operations, improve products, and make informed strategic
choices.
Continuous Learning and Iteration: Data science is an evolving field. Data scientists need to stay up-to-date
with the latest techniques, tools, and technologies and continuously iterate on their analyses and models to
improve results.



Python Libraries

Python is considered one of the best programming languages for artificial intelligence (AI). Python boasts a vast
and mature ecosystem of libraries and frameworks specifically tailored for AI and machine learning (ML)
development.

Normally, a library is a collection of books or is a room or place where many books are stored to be used later.
Similarly, in the programming world, a library is a collection of precompiled codes that can be used later on in a
program for some specific well-defined operations. Other than pre-compiled codes, a library may contain
documentation, configuration data, message templates, classes, and values, etc.

A Python library is a collection of related modules. It contains bundles of code that can be used repeatedly in different programs. It makes Python programming simpler and more convenient, as we don't need to write the same code again and again for different programs. Python libraries play a very vital role in fields such as Machine Learning, Data Science, and Data Visualization.

Most of the Python Libraries are written in the C programming language. The Python standard library consists
of more than 200 core modules.
Data Preprocessing
Data preprocessing is a crucial step in data analysis and machine learning. It involves cleaning, transforming, and
organizing raw data into a format suitable for analysis or modeling
1. Data quality assessment
2. Data cleaning
3. Data transformation
4. Data reduction

1. Data quality assessment
It's important to check the quality and consistency of data, as it often contains anomalies and inherent issues that
can affect analysis.
Mismatched data types: When you collect data from many different sources, it may come to you in different
formats. While the ultimate goal of this entire process is to reformat your data for machines, you still need to begin
with similarly formatted data. For example, if part of your analysis involves family income from multiple countries,
you’ll have to convert each income amount into a single currency.
Data Preprocessing
Mixed data values: Perhaps different sources use different descriptors for features – for example, man or male.
These value descriptors should all be made uniform.
Data outliers: Outliers can have a huge impact on data analysis results. For example if you're averaging test
scores for a class, and one student didn’t respond to any of the questions, their 0% could greatly skew the results.
Missing data: Take a look for missing data fields, blank spaces in text, or unanswered survey questions. This
could be due to human error or incomplete data. To take care of missing data, you’ll have to perform data cleaning.
2. Data cleaning
Data cleaning is the process of adding missing data and correcting, repairing, or removing incorrect or irrelevant
data from a data set. Data cleaning is the most important step of preprocessing because it will ensure that your
data is ready to go for your downstream needs.
Data cleaning will correct all of the inconsistent data you uncovered in your data quality assessment.
Depending on the kind of data you’re working with, there are a number of possible cleaners you’ll need to run your
data through.
Data Preprocessing
Ignore the tuples: A tuple is an ordered list or sequence of numbers or entities. If multiple values are missing
within tuples, you may simply discard the tuples with that missing information. This is only recommended for
large data sets, when a few ignored tuples won’t harm further analysis.
Manually fill in missing data: This can be tedious, but is definitely necessary when working with smaller
data sets.
Noisy data
Data cleaning also includes fixing “noisy” data. This is data that includes unnecessary data points, irrelevant data,
and data that’s more difficult to group together.
Binning: Binning sorts data of a wide data set into smaller groups of more similar data. It’s often used
when analyzing demographics. Income, for example, could be grouped: $35,000-$50,000, $50,000-$75,000, etc.
Regression: Regression is used to decide which variables will actually apply to your analysis.
Regression analysis is used to smooth large amounts of data. This will help you get a handle on your data, so
you’re not overburdened with unnecessary data.
Clustering: Clustering algorithms are used to properly group data, so that it can be analyzed with like
data. They’re generally used in unsupervised learning, when not a lot is known about the relationships within your
data.
Data Preprocessing

Binning, in more detail: Binning is a technique used in data cleaning to group continuous numerical data into smaller, more manageable categories or "bins." This process helps to reduce noise, handle outliers, and make the data easier to analyze. Binning is especially useful when dealing with large datasets where minor variations in data can obscure meaningful patterns.
Data Preprocessing

Binning works as follows:
Grouping Data:
Binning involves dividing continuous data into a set number of intervals or ranges. Each data point is then
assigned to one of these bins based on its value.
Smoothing Data:
By grouping similar data points together, binning smooths out minor variations, making trends and patterns more
apparent.
Handling Outliers:
Extreme values that might skew the analysis can be grouped into broader categories, reducing their impact on
the overall analysis.
Data Preprocessing

Example of Binning in Data Cleaning:


Suppose you have a dataset of customer ages, and you want to categorize them into age groups.
Original Data:
Ages: [22, 25, 28, 35, 40, 45, 50, 60, 75]
After Binning:
Categories: ["Young Adult", "Middle-Aged", "Senior"]
Binned Ages: [20-30 -> "Young Adult", 31-50 -> "Middle-Aged", 51-80 -> "Senior"]
Benefits of Binning in Data Cleaning:
Simplifies Analysis: Reduces the complexity of data, making it easier to analyze.
Enhances Patterns: Helps reveal underlying trends by grouping similar data points together.
Improves Data Quality: Helps manage outliers and irregularities in the data, leading to more reliable analysis.
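A possible implementation of the age-binning example above, using pandas; the exact bin edges (19, 30, 50, 80) are an assumption chosen to reproduce the 20-30 / 31-50 / 51-80 groups shown above.

import pandas as pd

# Customer ages from the example above
ages = pd.Series([22, 25, 28, 35, 40, 45, 50, 60, 75])

# Bin edges and labels reproducing the groups "Young Adult", "Middle-Aged", "Senior"
bins = [19, 30, 50, 80]
labels = ["Young Adult", "Middle-Aged", "Senior"]

# pd.cut assigns each age to the bin whose interval contains it
age_groups = pd.cut(ages, bins=bins, labels=labels)
print(pd.DataFrame({"age": ages, "group": age_groups}))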
Data Preprocessing
After data cleaning, you may realize you have insufficient data for the task at hand. At this point you can also
perform data wrangling or data enrichment to add new data sets and run them through quality assessment
and cleaning again before adding them to your original data.

3. Data transformation
With data cleaning, we’ve already begun to modify our data, but data transformation will begin the process of
turning the data into the proper format(s) you’ll need for analysis and other downstream processes.
This generally happens in one or more of the below:
Aggregation: Data aggregation combines all of your data together in a uniform format.
Normalization: Normalization scales your data into a regularized range so that you can compare it more
accurately.
Feature selection: Feature selection is the process of deciding which variables (features, characteristics,
categories, etc.) are most important to your analysis. These features will be used to train ML models.
Discretization: Discretization pools data into smaller intervals. It is somewhat similar to binning. Example:
0-15 minutes, 15-30 minutes, etc.
Concept hierarchy generation : Concept hierarchy generation can add a hierarchy within and between your
features that wasn’t present in the original data.
Data Preprocessing
4. Data reduction
The more data you’re working with, the harder it will be to analyze, even after cleaning and transforming it.
Depending on your task at hand, you may actually have more data than you need. Especially when working
with text analysis, much of regular human speech is superfluous or irrelevant to the needs of the researcher.
Data reduction not only makes the analysis easier and more accurate, but cuts down on data storage.

Attribute selection: Similar to discretization, attribute selection can fit your data into smaller pools. It,
essentially, combines tags or features, so that tags like male/female and professor could be combined
into male professor/female professor.
Numerosity reduction: This will help with data storage and transmission. You can use a regression model,
for example, to use only the data and variables that are relevant to your analysis.
Dimensionality reduction: This, again, reduces the amount of data used to help facilitate analysis and
downstream processes. Algorithms like K-nearest neighbors use pattern recognition to combine similar data
and make it more manageable.
• Less Code: Implementing AI involves tons and tons of algorithms. Thanks to Python's support for pre-defined packages, we don't have to code algorithms from scratch. And to make things easier, Python provides a "check as you code" methodology that reduces the burden of testing the code.
• Prebuilt Libraries: Python has hundreds of pre-built libraries to implement various Machine Learning and Deep Learning algorithms. So every time you want to run an algorithm on a data set, all you have to do is install and load the necessary packages with a single command. Examples of pre-built libraries include NumPy, Keras, TensorFlow, PyTorch, and so on.
There are several pre-processing techniques.

1. Binarization
Binarization converts numerical values into Boolean values. We use a built-in method to binarize the input data using a threshold (values above the threshold become 1 and values below it become 0).
Simple Example of Binarization
Suppose you have a dataset of exam scores, and you want to categorize students as "Pass" or "Fail" based on
a threshold score.
Original Data:
Scores: [45, 75, 60, 82, 39, 55]
Threshold: 50
After Binarization:
Binary Scores: [0, 1, 1, 1, 0, 1]
Here, 1 represents "Pass" (scores ≥ 50), and 0 represents "Fail" (scores < 50).
import numpy as np
from sklearn.preprocessing import Binarizer

# Sample data: exam scores


scores = np.array([[45], [75], [60], [82], [39], [55]])

# Define the binarizer with a threshold of 50


binarizer = Binarizer(threshold=50)

# Apply binarization
binary_scores = binarizer.fit_transform(scores)

print("Original Scores:", scores.flatten())


print("Binary Scores:", binary_scores.flatten())
Mean Removal
Mean removal is a data preprocessing technique used to normalize data by subtracting the mean of each
feature (or column) from the dataset. This centers the data around zero, which is especially useful in machine
learning algorithms like Support Vector Machines (SVM) and Principal Component Analysis (PCA), where it's
important to remove biases and ensure that all features contribute equally.

Simple Example of Mean Removal


Suppose you have a dataset of students' test scores in three subjects. The scores might have different scales or
ranges, so mean removal is applied to standardize them.
Original Data:
Math Scores: [50, 60, 70, 80, 90]
English Scores: [40, 50, 60, 70, 80]
Science Scores: [30, 45, 60, 75, 90]
After Mean Removal:
Each score will have the mean of its respective subject subtracted from it, resulting in data centered around
0.
import numpy as np
from sklearn.preprocessing import StandardScaler

# Sample data: Students' test scores


scores = np.array([
[50, 40, 30],
[60, 50, 45],
[70, 60, 60],
[80, 70, 75],
[90, 80, 90]
])

# Displaying original data


print("Original Scores:")
print(scores)

# Applying mean removal using StandardScaler


scaler = StandardScaler(with_mean=True, with_std=False)
mean_removed_scores = scaler.fit_transform(scores)

# Displaying mean-removed data


print("\nScores After Mean Removal:")
print(mean_removed_scores)
Benefits of Mean Removal:
Centers Data: Aligns the data around zero, removing bias and making the dataset more balanced.
Improves Algorithm Performance: Many machine learning algorithms perform better on data that has been
centered, as it ensures that all features are on a similar scale.
Prepares Data for Further Processing: Mean removal is often a preliminary step before applying techniques like
PCA, which rely on mean-centered data.
Mean removal is a crucial preprocessing step that helps in standardizing data, making it ready for more advanced
machine learning algorithms and ensuring consistent performance across different features.
Scaling:
When working with machine learning models, it's common to have data where the values of different features vary
widely. For example, one feature might represent age in years (e.g., 25, 40, 60), while another feature might
represent income in thousands of dollars (e.g., 50, 100, 200). These features are on very different scales, which
can cause problems for many machine learning algorithms.
• In a feature vector, the value of each feature can vary over a wide range, so it becomes very important to scale those values.
• We don't want values that are artificially extreme, neither very high nor very low.
• That means the data should be scaled to the range 0 to 1.
Why Scale Feature Vectors?
Consistency Across Features: When features have widely different scales, it can lead to some features
dominating others in algorithms like gradient descent. Scaling helps in giving all features an equal
opportunity to influence the model.
Avoiding Artificial Bias: Without scaling, a feature with a larger range of values might be considered
more important by the model just because of its scale, even if it's not.
Improving Model Performance: Many machine learning algorithms perform better or converge faster
when the features are on a similar scale. This is especially true for algorithms that rely on distance calculations,
like k-Nearest Neighbors (k-NN) and Support Vector Machines (SVM).
Feature Scaling Example
Let's consider a simple example with two features: height in centimeters and weight in kilograms.
Original Data:
Height: [150, 160, 170, 180, 190]
Weight: [50, 60, 70, 80, 90]
Since height is measured in centimeters and weight in kilograms, their ranges are quite different. Scaling these
features will bring them to a common scale, typically between 0 and 1.
Scaled Data:
Height: [0, 0.25, 0.5, 0.75, 1]
Weight: [0, 0.25, 0.5, 0.75, 1]



import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Sample data: Heights (in cm) and Weights (in kg)


data = np.array([
[150, 50],
[160, 60],
[170, 70],
[180, 80],
[190, 90]
])

# Displaying original data


print("Original Data:")
print(data)



# Applying Min-Max scaling
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

# Displaying scaled data
print("\nScaled Data:")
print(scaled_data)

Output:
Original Data:
[[150  50]
 [160  60]
 [170  70]
 [180  80]
 [190  90]]

Scaled Data:
[[0.   0.  ]
 [0.25 0.25]
 [0.5  0.5 ]
 [0.75 0.75]
 [1.   1.  ]]



Normalization is a technique used to scale the values of a feature vector so that they fall within a common range,
typically between 0 and 1 or -1 and 1. The purpose of normalization is to bring all the features onto the same scale,
which ensures that no single feature dominates the others due to its range of values. This is particularly important
for machine learning algorithms that rely on distance metrics, such as k-Nearest Neighbors (k-NN) and neural
networks.

Why Normalize Data?


Consistency: Different features may have different units or scales (e.g., age in years, income in dollars).
Normalization puts them on a common scale, making them easier to compare.
Improved Performance: Algorithms that rely on gradient descent or distance calculations can perform better and
converge faster when the data is normalized.
Eliminating Bias: Without normalization, features with larger scales could unduly influence the results, leading to
biased models.



L1 Normalization
L1 normalization scales the feature vector by dividing each element by the sum of the absolute values of the vector, so that the sum of the absolute values of the vector elements becomes 1. It is useful when you want to emphasize the sparsity of the data or when feature vectors have many zero values.

L2 Normalization
L2 normalization scales the feature vector by dividing each element by the Euclidean norm of the vector (the square root of the sum of squared values), so that each vector has unit length. It is more commonly used in machine learning, especially in algorithms that rely on the Euclidean distance between data points.
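For reference, the two normalizations can be written compactly as follows (standard definitions, consistent with the worked example below):

x_i^{(L1)} = \frac{x_i}{\sum_{j} |x_j|}, \qquad
x_i^{(L2)} = \frac{x_i}{\sqrt{\sum_{j} x_j^{2}}}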



Normalization Example
Let's say you have a dataset with two features: height in centimeters and income in thousands of dollars.
Original Data:
Height: [150, 160, 170, 180, 190]
Income: [50, 60, 70, 80, 90]
These features are on different scales, with height values much larger than income values.
Explanation:
Original Data: This is the initial dataset with height and income values.
L1 Normalization:
The sum of the absolute values of each row is calculated.
Each element in the row is divided by this sum.
For example, for the first row [150, 50], the sum of the absolute values is 150 + 50 = 200. The L1-normalized
vector is [150/200, 50/200] = [0.75, 0.25].
L2 Normalization:
The Euclidean norm (square root of the sum of squared values) of each row is calculated.
Each element in the row is divided by this norm.
For example, for the first row [150, 50], the L2 norm is sqrt(150^2 + 50^2) ≈ 158.11. The L2-normalized
vector is [150/158.11, 50/158.11] ≈ [0.9487, 0.3162].



import numpy as np
from sklearn.preprocessing import Normalizer

# Sample data: Heights (in cm) and Incomes (in thousands of dollars)
data = np.array([
[150, 50],
[160, 60],
[170, 70],
[180, 80],
[190, 90]
])

print("Original Data:")
print(data)



# Applying L1 normalization
l1_normalizer = Normalizer(norm='l1')
l1_normalized_data = l1_normalizer.fit_transform(data)

print("\nL1 Normalized Data:")


print(l1_normalized_data)

# Applying L2 normalization
l2_normalizer = Normalizer(norm='l2')
l2_normalized_data = l2_normalizer.fit_transform(data)

print("\nL2 Normalized Data:")


print(l2_normalized_data)



Original Data:
[[150  50]
 [160  60]
 [170  70]
 [180  80]
 [190  90]]

L1 Normalized Data:
[[0.75       0.25      ]
 [0.72727273 0.27272727]
 [0.70833333 0.29166667]
 [0.69230769 0.30769231]
 [0.67857143 0.32142857]]

L2 Normalized Data (rounded to four decimals):
[[0.9487 0.3162]
 [0.9363 0.3511]
 [0.9247 0.3808]
 [0.9138 0.4061]
 [0.9037 0.4281]]



Label encoding
Label encoding is a technique used to convert categorical data (i.e., data that can take on a limited number of
values, often representing categories or labels) into numerical form. This is important because most machine
learning algorithms work with numerical data rather than categorical data. Label encoding assigns each unique
category in a feature a unique integer value.

Why Use Label Encoding?


Machine Learning Algorithms: Many machine learning algorithms cannot work with categorical data directly.
Label encoding transforms these categorical values into a numerical format that the algorithms can process.
Efficiency: By converting categories to numbers, label encoding reduces the complexity of the data, making it
easier to analyze and model.



Simple Example of Label Encoding
Imagine you have a dataset with a feature that represents the color of cars: Red, Blue, and Green. These are
categorical values, and you want to convert them into numerical values.
Steps to Implement Label Encoding in Python
Original Data: Let's create a simple dataset with car colors.
Apply Label Encoding: We'll use the LabelEncoder from sklearn.preprocessing to convert the color labels into
numerical values.
Explanation:
Original Data: The dataset contains car colors, which are categorical values (Red, Blue, Green).
car_colors = ['Red', 'Blue', 'Green', 'Blue', 'Red', 'Green']
Label Encoding:
The LabelEncoder object is used to transform these categorical values into numerical labels.
Each unique color is assigned a unique integer value:
Blue is encoded as 0
Green is encoded as 1
Red is encoded as 2
The original list of colors is then transformed into an array of these numerical labels: [2, 0, 1, 0, 2, 1].



from sklearn.preprocessing import LabelEncoder

# Sample data: Car colors
car_colors = ['Red', 'Blue', 'Green', 'Blue', 'Red', 'Green']

# Displaying original data
print("Original Data:")
print(car_colors)

# Creating a LabelEncoder object


label_encoder = LabelEncoder()

# Fitting the label encoder and transforming the data


encoded_colors = label_encoder.fit_transform(car_colors)

# Displaying the encoded labels


print("\nEncoded Labels:")
print(encoded_colors)

Parametric vs. Non-Parametric Estimation

In statistics and machine learning, parametric and non-parametric estimation methods refer to different
approaches for modeling data and making predictions. The distinction between these two lies primarily in the
assumptions they make about the underlying data distribution and the number of parameters involved.

Parametric: Assumes a specific form (a mapping function between input and output) for the data distribution (e.g., normal distribution, linear relationship).
Fixed Number of Parameters: The model is defined by a fixed number of parameters, regardless of the size
of the data set.
Examples: Linear regression, logistic regression, Gaussian distribution, and ARIMA models in time series.



Parametric vs. Non-Parametric Estimation

Advantages:
Efficiency: Often computationally efficient because of the fixed number of parameters.
Interpretability: The models are usually easier to interpret because of the simplicity in the functional form.
Less Data Needed: Since the model form is specified, parametric methods may require less data to estimate the
parameters accurately.
Disadvantages:
Model Bias: If the assumptions about the data distribution are incorrect, the model may not fit the data well,
leading to biased estimates.
Inflexibility: Less flexible in modeling complex relationships or distributions that don't fit the assumed form.

Example of Parametric Estimation: Linear Regression


Linear regression assumes a linear relationship between the input variables (features) and the output (target).
The model estimates the parameters (slope and intercept) by minimizing the difference between the
predicted and actual values.



Parametric vs. Non-Parametric Estimation

Non-Parametric Estimation
Definition: Non-parametric estimation does not assume any specific form for the data distribution. Instead, it uses
the data to model the underlying distribution directly, without assuming a predetermined number of parameters.
Key Characteristics:
No Assumptions: Makes fewer assumptions about the underlying data distribution.
Flexible: The complexity of the model can grow with the size of the data, as more data allows for capturing
more complex patterns.
Examples: K-nearest neighbors (KNN), kernel density estimation (KDE), decision trees, and support vector
machines (SVM).



Parametric vs. Non-Parametric Estimation

Advantages:
Flexibility: Can model a wide range of data distributions and relationships because it doesn't assume a specific
form.
Adaptability: The model can adapt to the data's complexity, potentially leading to better performance when the
underlying relationship is unknown or complex.
Disadvantages:
Computational Cost: Often more computationally intensive, especially with large datasets, because the model
may need to consider many parameters.
Overfitting: There's a higher risk of overfitting, especially with small datasets, as the model may become too
complex.
More Data Needed: Non-parametric methods often require more data to achieve reliable estimates, especially as
the model's complexity increases.



Parametric vs. Non-Parametric Estimation

Example of Non-Parametric Estimation: K-Nearest Neighbors (KNN)


K-Nearest Neighbors is a non-parametric method where the prediction for a new data point is based on the
majority vote or average of the K nearest points in the training set. The model does not assume a specific form for
the data distribution and instead relies on the structure of the data itself.
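A minimal sketch of KNN as a non-parametric classifier, using the study-hours dataset that appears in the worked logistic-regression example later in this module; scikit-learn is assumed to be available.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hours studied (feature) and pass (1) / fail (0) labels
X = np.array([[29], [15], [33], [28], [39]])
y = np.array([0, 0, 1, 1, 1])

# Non-parametric: no functional form is assumed; the model keeps the training
# points and predicts by majority vote among the K nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

print(knn.predict([[30]]))   # vote among the 3 closest training points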



Regression Model

Model Parameters
Model parameters are the internal coefficients or weights that a model learns from the training data during the learning process. These parameters are used to make predictions on new data.
Example: In linear regression, the model tries to find the best-fit line for the data. The equation of a line is y = mx + b, where m (slope) and b (intercept) are the model parameters. These parameters are learned by the model during training by minimizing the error between the predicted values and the actual values.

Hyperparameters
Hyperparameters are the external settings or configurations that must be set before the training process begins. Unlike model parameters, they are not learned from the data but are set by the user to control the training process.
Example: In linear regression, if you add L2 regularization (Ridge regression), a hyperparameter called alpha (or λ) controls the strength of the regularization. This hyperparameter is not learned from the data but is set before training. In k-nearest neighbors (KNN), the number of neighbors k is a hyperparameter that defines how many neighboring data points are considered when making a prediction.
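A small sketch contrasting the two; Ridge and KNeighborsClassifier from scikit-learn and the toy data values are assumptions used only for illustration.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.1, 5.9, 8.2])

# Hyperparameter: alpha (regularization strength) is set by the user before training
ridge = Ridge(alpha=1.0)
ridge.fit(X, y)

# Model parameters: the slope m and intercept b are learned from the data during fit()
print("m =", ridge.coef_[0], ", b =", ridge.intercept_)

# Hyperparameter: k (n_neighbors) is likewise chosen by the user, not learned
knn = KNeighborsClassifier(n_neighbors=3)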



Role of the Regression Model



Regression Model
Regression algorithms are used if there is a relationship between the input variable and the output variable. It is
used for the prediction of continuous variables, such as Weather forecasting, Market Trends, etc.



Linear Regression Model
Linear Regression
Linear regression is used for predicting continuous values. It assumes a linear relationship between the input
variables (features) and the output variable (target). The goal is to find the line that best fits the data points by
minimizing the error between the predicted and actual values.

Example :The weight of the person is linearly related to their height. So, this shows a linear relationship
between the height and weight of the person. According to this, as we increase the height, the weight of the
person will also increase.

It is not necessary that one variable is dependent on others, or one causes the other, but there is some critical
relationship between the two variables.
In such cases, we use a scatter plot to simplify the strength of the relationship between the variables. If there
is no relation or linking between the variables then the scatter plot does not indicate any increasing or
decreasing pattern
In such cases, linear regression is not beneficial for the given data. Another example is predicting house prices based on features such as square footage, number of bedrooms, and location.



Linear Regression Model

Correlation Coefficients:
The measure of the relationship between two variables is shown by the correlation coefficient. The range of the
coefficient lies between -1 to +1. This coefficient shows the strength of the association of the observed data
between two variables.
Statistical measure that quantifies the strength and direction of the linear relationship between two variables.
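For reference, the standard formula for the (Pearson) correlation coefficient between x and y is:

r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^{2}\,\sum_{i}(y_i - \bar{y})^{2}}}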



Linear Regression Model



Linear Regression Model

Least Square Regression Line or Linear Regression Line


The most popular method to fit a regression line in the 'XY' plot is least squares. This process determines the best-fitting line for the given data by minimizing the sum of the squares of the vertical deviations from each data point to the line.
The idea behind it is to minimize the sum of the vertical distance between all of the data points and the
line of best fit.
Let us consider there are three regression lines, out of which , we need to find best fit line



They all look like they could be a fair line of best fit, but in fact Diagram 3 is the most accurate, as its regression line has been calculated using the least-squares method.

If a point rests exactly on the fitted line, its vertical deviation is 0. The deviations are squared before being added, so their positive and negative values will not cancel. Linear regression determines the straight line known as the Least-Squares Regression Line, or LSRL.
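In symbols, the least-squares line \hat{y} = b + m x has slope and intercept (standard formulas):

m = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i}(x_i - \bar{x})^{2}}, \qquad b = \bar{y} - m\,\bar{x}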



Performance Metrics of Regression



Problem Solving
Find a linear regression equation for the following two sets of data:
x: 2  4  6  8
y: 3  7  5  10



Note that the sum of the deviations of 'x' (or 'y') from its respective mean is always zero.



We shall use these values to predict the values of ‘y’ for the given values of ‘x’. The performance of the model can
be analyzed by calculating the root mean square error and R2 value.
Calculations are shown below
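Working through the least-squares formulas for this data (these numbers are consistent with the Python output further below):

\bar{x} = 5, \quad \bar{y} = 6.25
\sum_i (x_i-\bar{x})(y_i-\bar{y}) = 19, \quad \sum_i (x_i-\bar{x})^2 = 20
m = 19/20 = 0.95, \quad b = 6.25 - 0.95 \times 5 = 1.5, \quad \hat{y} = 1.5 + 0.95x
\text{Predictions: } 3.4,\ 5.3,\ 7.2,\ 9.1 \quad \text{Residuals: } -0.4,\ 1.7,\ -2.2,\ 0.9
\text{MSE} = \frac{0.16 + 2.89 + 4.84 + 0.81}{4} = 2.175 \approx 2.17
R^2 = 1 - \frac{8.7}{26.75} \approx 0.67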

Python Coding
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Dataset
X = np.array([2, 4, 6, 8]).reshape(-1, 1) # Independent variable
Y = np.array([3, 7, 5, 10]) # Dependent variable

# Create a linear regression model


model = LinearRegression()

# Fit the model


model.fit(X, Y)



# Get the regression coefficients
intercept = model.intercept_
slope = model.coef_[0]
# Predict the Y values
Y_pred = model.predict(X)

# Calculate performance metrics


mse = mean_squared_error(Y, Y_pred)
r2 = r2_score(Y, Y_pred)

# Output the regression equation and performance metrics


print(f"Regression Equation: Y = {intercept:.2f} + {slope:.2f}X")
print(f"Mean Squared Error (MSE): {mse:.2f}")
print(f"R-squared: {r2:.2f}")



Classification: Classification algorithms are used when the output variable is categorical, which means there are
two classes such as Yes-No, Male-Female, True-false, etc. Example: Spam Filtering

Linear regression

• Linear regression and logistic regression are both data analytics that use independent variables to predict the
dependent variables. However, it differs in how it will handle the dependent variable. In linear regression, the
dependent variable is continuous, For example, predicting the price of the house.

• Logistic regression is used for binary classification problems. It predicts the probability that a given input
belongs to a certain class. The output is transformed using the logistic (sigmoid) function, which maps any real-
valued number into a value between 0 and 1.

Example: Determining whether an email is spam or not spam.



Logistic Regression from Linear Regression

Modeling logistic regression from linear regression involves transforming the linear regression model to handle
binary outcomes (0 or 1) instead of continuous outcomes
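Concretely, the linear combination is passed through the sigmoid (logistic) function so that the output is a probability between 0 and 1:

p = \sigma(b + m x) = \frac{1}{1 + e^{-(b + m x)}}, \qquad
\log\!\left(\frac{p}{1-p}\right) = b + m x \quad \text{(the log-odds, or logit)}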

Problem Solving:
The dataset of pass or fail in an exam for 5 students is given in the table below. Use logistic regression as the classifier to answer the following questions.
I. Calculate the probability of passing for the student who studied 33 hours.
II. At least how many hours should a student study so that he will pass the course with a probability of more than 95%?
Assume the model suggested by the optimizer for the odds of passing the course is Log(Odds) = -64 + 2 × hours.

Hours of Study | Pass(1)/Fail(0)
29 | 0
15 | 0
33 | 1
28 | 1
39 | 1

Solution
We use the sigmoid function in logistic regression.
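Using the given model, the two parts can be worked out by hand (matching the Python output further below):

\text{Part I (33 hours): } \log(\text{odds}) = -64 + 2 \times 33 = 2, \quad p = \frac{1}{1 + e^{-2}} \approx 0.8808
\text{Part II } (p > 0.95): \ \ln\!\left(\frac{0.95}{0.05}\right) = \ln 19 \approx 2.944, \quad \text{hours} \geq \frac{2.944 + 64}{2} \approx 33.47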

import numpy as np

# Logistic regression model parameters


intercept = -64
slope = 2

# Function to calculate probability of passing


def calculate_probability(hours):
log_odds = intercept + slope * hours
odds = np.exp(log_odds)
probability = odds / (1 + odds)
return probability

# Part I: Calculate the probability of passing for a student who studied 33 hours
hours_studied = 33
prob_pass_33_hours = calculate_probability(hours_studied)
print(f"Probability of passing for a student who studied {hours_studied} hours: {prob_pass_33_hours:.4f}")



# Part II: Calculate minimum study hours to achieve more than 95% probability of passing
target_probability = 0.95

# Solve for hours using the logistic regression equation


# We need to find hours such that:
# target_probability = 1 / (1 + exp(-(-64 + 2 * hours)))
# Rearrange to solve for hours:
# target_probability / (1 - target_probability) = exp(-64 + 2 * hours)
# log(target_probability / (1 - target_probability)) = -64 + 2 * hours
# (log(target_probability / (1 - target_probability)) + 64) / 2 = hours

log_odds_target = np.log(target_probability / (1 - target_probability))


hours_needed = (log_odds_target + 64) / 2
print(f"Minimum hours needed to pass with a probability greater than 95%: {hours_needed:.2f}")
Output:
Probability of passing for a student who studied 33 hours: 0.8808
Minimum hours needed to pass with a probability greater than 95%: 33.47



2. How would you find the slope and intercept coefficients in logistic regression for the given dataset, and implement the same using Python?

Hours of Study | Pass(1)/Fail(0)
29 | 0
15 | 0
33 | 1
28 | 1
39 | 1




Choosing a learning rate is crucial in optimization problems, including logistic regression. The learning rate determines the size of the steps we take towards reaching the minimum (or maximum) of our objective function. Here is why we chose a learning rate of 0.001: a learning rate of 0.01 is too large, leading to instability and inaccurate coefficients, while a learning rate of 0.0001 is too small, resulting in very slow convergence. A learning rate of 0.001 strikes a good balance between stability and speed, achieving accurate parameter values, so it is chosen to ensure stable and effective convergence to the optimal parameters for this logistic regression problem.
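A minimal gradient-descent sketch for the question above, using the learning rate of 0.001 discussed here; the iteration count and zero initialization are assumptions and affect how close the estimates get to the optimizer's values.

import numpy as np

# Data: hours studied vs. pass (1) / fail (0)
X = np.array([29, 15, 33, 28, 39], dtype=float)
y = np.array([0, 0, 1, 1, 1], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

b0, b1 = 0.0, 0.0        # intercept and slope, initialised at zero (assumption)
lr = 0.001               # learning rate chosen in the discussion above
for _ in range(200000):  # iteration count is a tunable assumption
    p = sigmoid(b0 + b1 * X)          # predicted pass probabilities
    b0 -= lr * np.sum(p - y)          # gradient of the cost w.r.t. the intercept
    b1 -= lr * np.sum((p - y) * X)    # gradient of the cost w.r.t. the slope

print(f"Estimated intercept: {b0:.2f}, slope: {b1:.2f}")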

Insufficient Iterations: The algorithm might not have had enough iterations to converge to the new optimal values
after the data change.
Small Learning Rate: A very small learning rate can make the convergence process extremely slow, potentially
making it appear as though the values haven't changed.
Small Impact of Change: If the change in data was minor or if the new data point was not significantly different
from existing points, the impact on the slope and intercept might be minimal.

Steps to Ensure Changes are Reflected


Increase the Number of Iterations: Ensure the algorithm runs for enough iterations to fully converge.
Check Learning Rate: Verify that the learning rate is appropriate for the dataset size and complexity.
Verify Data Change: Ensure the data change is substantial enough to impact the model.



A question then comes up: how can we determine the number of iterations needed to get the best values
for the slope and intercept?
Determining the optimal number of iterations for convergence in logistic regression involves monitoring the model's
performance during training. Here are steps and methods to help you identify the appropriate number of iterations:
1. Monitor the Cost Function
The cost function (or loss function) in logistic regression should decrease and eventually stabilize as the model
converges. By plotting the cost function value over iterations, you can visually inspect when the values start to
plateau, indicating convergence.
2. Check Gradient Magnitude
The gradients should approach zero as the model converges. You can monitor the magnitude of the gradients;
when they become very small (below a certain threshold), the model has likely converged.
3. Early Stopping
Early stopping involves halting the training process once the improvement in the cost function falls below a certain
threshold for a specified number of consecutive iterations. This prevents overfitting and ensures you do not run
unnecessary iterations.
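A self-contained sketch of early stopping based on monitoring the cost (negative log-likelihood); the tolerance, patience, and iteration cap are assumptions chosen only to illustrate the idea.

import numpy as np

X = np.array([29, 15, 33, 28, 39], dtype=float)
y = np.array([0, 0, 1, 1, 1], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(b0, b1):
    p = np.clip(sigmoid(b0 + b1 * X), 1e-12, 1 - 1e-12)   # avoid log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

b0, b1, lr = 0.0, 0.0, 0.001
tol, patience, stall = 1e-8, 100, 0      # early-stopping settings (assumptions)
prev = cost(b0, b1)
for i in range(1000000):
    p = sigmoid(b0 + b1 * X)
    b0 -= lr * np.sum(p - y)
    b1 -= lr * np.sum((p - y) * X)
    c = cost(b0, b1)
    stall = stall + 1 if prev - c < tol else 0
    if stall >= patience:                # cost has stopped improving
        print(f"Stopped after {i + 1} iterations; cost = {c:.4f}")
        break
    prev = c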



You can observe a plot showing how the cost function
decreases over iterations and the number of iterations
required to reach convergence. This helps in
understanding and ensuring the model's optimal training
duration.

Note:
In logistic regression, the cost function (often referred to as
the log-likelihood function) and its gradient provide a way
to evaluate and optimize the model parameters.
Cost Function (Log-Likelihood Function)
The cost function in logistic regression measures how well
the model's predicted probabilities align with the actual
class labels. It's derived from the likelihood of the observed
data given the model parameters. Specifically, for logistic
regression, we often use the negative log-likelihood as the
cost function, which we aim to minimize.
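In symbols, for the single-feature model used here, the negative log-likelihood and its gradients take the standard form:

J(b_0, b_1) = -\sum_{i=1}^{m}\Big[y_i \log \sigma(z_i) + (1 - y_i)\log\big(1 - \sigma(z_i)\big)\Big], \qquad z_i = b_0 + b_1 x_i

\frac{\partial J}{\partial b_0} = \sum_{i=1}^{m}\big(\sigma(z_i) - y_i\big), \qquad
\frac{\partial J}{\partial b_1} = \sum_{i=1}^{m}\big(\sigma(z_i) - y_i\big)\,x_i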
