Lecture 5: Encoding of Data
Machine Learning: Course Objectives
The course aims to:
1. Understand and apply various data handling and visualization techniques.
2. Understand basic learning algorithms and techniques and their applications, as well as
general questions related to analysing and handling large data sets.
3. Develop skills in supervised and unsupervised learning techniques and implement them
to solve real-life problems.
4. Develop basic knowledge of machine learning techniques to build an intelligent machine
that makes decisions on behalf of humans.
5. Develop skills for selecting an algorithm and model parameters and apply them to
design optimized machine learning applications.
COURSE OUTCOMES
CO2: Understand basic learning algorithms and analyse their applications, as well as
general questions related to analysing and handling large data sets.
CO5: Analyse the performance of a machine learning model and apply optimization
techniques to improve the performance of the model.
Unit-1 Syllabus
Data Visualization: Different types of plots, Plotting fundamentals using Matplotlib, Plotting
fundamentals using Seaborn.
Index
• Categorical Data
• Encoding
• Categorical Encoding
• Types of Categorical Encoding
• Label Encoding
• One-Hot Encoding
• Ordinal Encoding
• Why?
• Transform it.
• How to transform?
• For instance, if a categorical variable has six different classes, we
encode them as the integers 0, 1, 2, 3, 4, and 5.
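The integer mapping described above can be sketched in plain Python. This is only a minimal illustration, not the lecture's own code: the `colors` list and the `label_encode` helper are hypothetical, and numbering the classes in sorted order is an assumption (it matches the default behaviour of scikit-learn's `LabelEncoder`).

```python
def label_encode(values):
    """Map each distinct class to an integer 0..k-1 (classes taken in sorted order)."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], mapping

# Hypothetical categorical variable with six different classes
colors = ['red', 'green', 'blue', 'yellow', 'black', 'white', 'red', 'blue']
codes, mapping = label_encode(colors)

print(mapping)  # class -> integer code
print(codes)    # the encoded column
```

Because the codes are plain integers, a model may read them as ordered quantities, so this kind of mapping is best reserved for cases where that is acceptable.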
import pandas as pd

# create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29]})
# view DataFrame
print(df)
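import pandas as pd
from sklearn.preprocessing import OneHotEncoder
# create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C'],
                   'points': [25, 12, 15, 14, 19, 23, 25, 29]})
# one-hot encode 'team' and join the result to df (step reconstructed from the output below)
encoder_df = pd.DataFrame(OneHotEncoder().fit_transform(df[['team']]).toarray())
final_df = df.join(encoder_df)
# view final df
print(final_df)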
By: Prof. (Dr.) Vineet Mehan
Output
  team  points    0    1    2
0    A      25  1.0  0.0  0.0
1    A      12  1.0  0.0  0.0
2    B      15  0.0  1.0  0.0
3    B      14  0.0  1.0  0.0
4    B      19  0.0  1.0  0.0
5    B      23  0.0  1.0  0.0
6    C      25  0.0  0.0  1.0
7    C      29  0.0  0.0  1.0
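To make the output above concrete, here is a library-free sketch of what one-hot encoding does to the `team` column: one new column per class, with 1 marking the row's class and 0 everywhere else. The `one_hot` helper is hypothetical and written only for illustration; in practice a library routine such as `pandas.get_dummies` or scikit-learn's `OneHotEncoder` would normally be used.

```python
def one_hot(values):
    """Return (classes, rows): one 0/1 column per class, in sorted class order."""
    classes = sorted(set(values))
    rows = [[1 if v == c else 0 for c in classes] for v in values]
    return classes, rows

# Same 'team' values as in the DataFrame above
team = ['A', 'A', 'B', 'B', 'B', 'B', 'C', 'C']
classes, rows = one_hot(team)

print(classes)
for t, row in zip(team, rows):
    print(t, row)
```

Each row contains exactly one 1, so no artificial order is implied between the teams.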
• For example, if we are encoding rankings of 1st place, 2nd place, etc.,
there is an inherent order.
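from numpy import asarray
from sklearn.preprocessing import OrdinalEncoder
# define data
data = asarray([['red'], ['green'], ['blue']])
print(data)
# define the encoder (assumed to be OrdinalEncoder; its definition is missing from the slide)
encoder = OrdinalEncoder()
# transform data
result = encoder.fit_transform(data)
print(result)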
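When the order matters, as with rankings, it is safer to state the order explicitly than to rely on the order an encoder infers (scikit-learn's `OrdinalEncoder`, for instance, sorts the categories by default). A minimal plain-Python sketch, assuming hypothetical ranking labels `'1st'`, `'2nd'`, `'3rd'` and a hypothetical `ordinal_encode` helper:

```python
def ordinal_encode(values, order):
    """Encode values as integers according to an explicit category order."""
    rank = {label: i for i, label in enumerate(order)}
    return [rank[v] for v in values]  # raises KeyError for an unknown label

# Hypothetical ranking data with an inherent order
order = ['1st', '2nd', '3rd']
places = ['2nd', '1st', '3rd', '1st']
encoded = ordinal_encode(places, order)

print(encoded)  # [1, 0, 2, 0]
```

With scikit-learn, the same explicit order can be supplied through the `categories` argument of `OrdinalEncoder`.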
Task
• Apply the ordinal encoding technique on a suitable dataset and get
the required result. (BT-Level3)