
Practical # 9

Uploaded by

Alishba Aleem
Copyright
© © All Rights Reserved

Department of Software Engineering

Mehran University of Engineering and Technology, Jamshoro

Course: SWE – Data Analytics and Business Intelligence


Instructor: Ms Sana Faiz        Practical/Lab No.: 09
Date:                           CLOs: 04
Signature:                      Assessment Score:

Topic: To understand the basics of Python

Objectives: To become familiar with Python libraries for machine learning
(scikit-learn)

Lab Discussion: Theoretical concepts and Procedural steps

Machine learning in data science


 Machine learning is a discipline concerned with programming systems so that they
automatically learn and improve with experience. Here, learning means recognizing and
understanding the input data and making informed decisions based on it.
 It is very difficult to hand-code a decision for every possible input. To solve this
problem, algorithms are developed that build knowledge from specific data and past
experience by applying the principles of statistics, probability, logic, mathematical
optimization, reinforcement learning, and control theory.
 The main purpose of machine learning is to explore and construct algorithms that can learn
from previous data and make predictions on new input data.
 The input to a learning algorithm is training data, representing experience, and the output is
expertise, which usually takes the form of another algorithm (a model) that can perform a task. The
input data to a machine learning system can be numerical, textual, audio, visual, or
multimedia. The corresponding output can be a floating-point number (a continuous
value) or an integer representing a category or class, for example a pigeon or a sunflower
in image recognition.

Applications of Machine Learning

 Vision processing
 Language processing
 Forecasting, for example stock market trends or weather
 Pattern recognition
 Games
 Data mining
 Expert systems
 Robotics

Concept of learning
 Learning is the process of converting experience into expertise or knowledge.
 Learning can be broadly classified into three categories, as mentioned below, based on the
nature of the learning data and interaction between the learner and the environment.
 Supervised Learning
 Unsupervised Learning
 Semi-supervised Learning

Machine learning Algorithms


 Similarly, there are four categories of machine learning algorithms as shown below −
 Supervised learning algorithm
 Unsupervised learning algorithm
 Semi-supervised learning algorithm
 Reinforcement learning algorithm

Supervised Learning
 Supervised learning is commonly used in real-world applications, such as face and speech
recognition, product or movie recommendations, and sales forecasting.
 Supervised learning can be further classified into two types: regression and classification.
o Regression trains on and predicts a continuous-valued response, for example
predicting real estate prices.
o Classification attempts to find the appropriate class label, for example
positive or negative sentiment, male or female persons, benign or malignant
tumors, secure or unsecure loans, etc.
 In supervised learning, the learning data comes with descriptions, labels, targets, or desired
outputs, and the objective is to find a general rule that maps inputs to outputs. This kind of
learning data is called labeled data. The learned rule is then used to label new data with unknown
outputs.
 Example
o Supervised learning involves building a machine learning model that is based
on labeled samples. For example, if we build a system to estimate the price of a
plot of land or a house based on various features, such as size, location, and so on,
we first need to create a database and label it. We need to teach the algorithm
what features correspond to what prices. Based on this data, the algorithm will
learn how to calculate the price of real estate using the values of the input
features.
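
The house price example above can be sketched with scikit-learn as follows. This is only a minimal illustration: the feature choice (size and number of rooms), the data values, and the pricing rule used to generate the labels are all made up for the example, and LinearRegression is just one possible regression algorithm.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical labeled data: each row is [size in square metres, number of rooms].
X_train = np.array([[50, 2], [80, 3], [120, 4], [200, 6], [65, 3]])

# Made-up labels generated from an invented rule: price = 1000*size + 5000*rooms.
y_train = 1000 * X_train[:, 0] + 5000 * X_train[:, 1]

# "Teach" the algorithm which features correspond to which prices.
model = LinearRegression()
model.fit(X_train, y_train)

# Use the learned rule to estimate the price of a house it has not seen.
new_house = np.array([[100, 3]])
predicted_price = model.predict(new_house)[0]
print("Predicted price:", round(predicted_price))
```

Because the made-up labels follow an exact linear rule, the model recovers it; real price data is noisy, so predictions would only be approximate.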

Supervised learning algorithms and examples


 There are many supervised learning algorithms such as Logistic Regression, Neural
networks, Support Vector Machines (SVMs), and Naive Bayes classifiers.
 Common examples of supervised learning include classifying e-mails into spam and not-
spam categories, labeling webpages based on their content, and voice recognition.
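
As a rough sketch of how some of the algorithms listed above are used for classification, the following code trains three of them on the breast cancer dataset bundled with scikit-learn (benign versus malignant tumors). The dataset choice, the 75/25 split, and the scaling step are assumptions made for this illustration, not part of the lab text.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Labeled data: X holds the measurements, y holds the class label of each sample.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Logistic regression and SVM are wrapped with a scaler because they are
# sensitive to the scale of the features.
classifiers = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Naive Bayes": GaussianNB(),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)                  # learn from labeled samples
    scores[name] = clf.score(X_test, y_test)   # accuracy on unseen samples
    print(f"{name}: {scores[name]:.3f}")
```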

Unsupervised Learning
 Unsupervised learning is used to detect anomalies and outliers, such as fraud or defective
equipment, or to group customers with similar behaviors for a sales campaign. It is the
opposite of supervised learning: there is no labeled data here.
 When learning data contains only some indications without any description or labels, it is up
to the coder or to the algorithm to find the structure of the underlying data, to discover hidden
patterns, or to determine how to describe the data. This kind of learning data is
called unlabeled data.

 Suppose that we have a number of data points, and we want to divide them into several
groups. We may not know exactly what the grouping criteria should be. So, an
unsupervised learning algorithm tries to divide the given dataset into a certain number of
groups in an optimum way.
 Unsupervised learning algorithms are extremely powerful tools for analyzing data and for
identifying patterns and trends. They are most commonly used for clustering similar input
into logical groups. Unsupervised learning algorithms include k-means, hierarchical
clustering, DBSCAN and so on.
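
A minimal clustering sketch, assuming synthetic two-dimensional points generated around three made-up centres. No labels are given to the algorithm; k-means only receives the points and the number of groups to look for.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic, unlabeled data: three blobs of 50 points each (values are invented).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))
group_b = rng.normal(loc=[5, 5], scale=0.5, size=(50, 2))
group_c = rng.normal(loc=[0, 5], scale=0.5, size=(50, 2))
X = np.vstack([group_a, group_b, group_c])

# Ask k-means to split the points into 3 groups.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("Cluster sizes:", np.bincount(labels))
print("Cluster centres:")
print(kmeans.cluster_centers_)
```

The cluster numbers themselves are arbitrary; only the grouping matters.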

Semi-supervised Learning
 If some learning samples are labeled but others are not, then it is semi-supervised
learning. It trains on a small amount of labeled data together with a large amount of
unlabeled data.
 Semi-supervised learning is applied in cases where it is expensive to acquire a fully labeled
dataset but practical to label a small subset. For example, it often requires skilled
experts to label certain remote sensing images, and lots of field experiments to locate oil at a
particular location, while acquiring unlabeled data is relatively easy.
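
The idea can be sketched with scikit-learn's LabelSpreading, which treats samples labeled -1 as unlabeled. The iris dataset, the choice of 10 labeled samples per class, and the kernel settings below are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelSpreading

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Pretend that only 10 samples per class are labeled; -1 marks "unlabeled".
y_partial = np.full_like(y, -1)
for cls in np.unique(y):
    idx = rng.choice(np.flatnonzero(y == cls), size=10, replace=False)
    y_partial[idx] = cls

# Train on the labeled and unlabeled samples together.
model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, y_partial)

# transduction_ holds the label inferred for every training sample.
unlabeled = y_partial == -1
accuracy = (model.transduction_[unlabeled] == y[unlabeled]).mean()
print(f"Accuracy on the {unlabeled.sum()} unlabeled samples: {accuracy:.3f}")
```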

Reinforcement Learning

 Here the learning data provides feedback so that the system adjusts to dynamic conditions in
order to achieve a certain objective. The system evaluates its performance based on the feedback
responses and reacts accordingly. The best-known instances include self-driving cars and
AlphaGo, the program that plays the board game Go at master level.
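
The feedback loop can be illustrated with a very small toy problem, written here with NumPy only. It is a sketch of an epsilon-greedy strategy on a three-option "bandit" whose reward probabilities are invented for the example; it is not how self-driving cars or AlphaGo are built.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: three actions with hidden reward probabilities.
true_reward_prob = np.array([0.2, 0.5, 0.8])
n_actions = len(true_reward_prob)

estimates = np.zeros(n_actions)         # learner's current estimate per action
counts = np.zeros(n_actions, dtype=int)
epsilon = 0.1                           # fraction of steps spent exploring

for step in range(2000):
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))   # explore: try a random action
    else:
        action = int(np.argmax(estimates))      # exploit: use the best estimate
    # Feedback from the environment: reward 1.0 or 0.0.
    reward = float(rng.random() < true_reward_prob[action])
    # Adjust the estimate for this action (running average of rewards).
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = int(np.argmax(estimates))
print("Estimated reward per action:", np.round(estimates, 2))
print("Action the learner prefers:", best_action)
```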

Scikit-learn
o Simple and efficient tools for data mining and data analysis
o Accessible to everybody, and reusable in various contexts
o Built on NumPy, SciPy, and matplotlib
o Open source, commercially usable - BSD license
o Scikit-learn is a machine learning library for Python. It features several
regression, classification and clustering algorithms including SVMs, gradient
boosting, k-means, random forests and DBSCAN.
Scikit-learn features
o Classification
 Identifying which category an object belongs to.
 Applications: Spam detection, Image recognition.
 Algorithms: SVM, nearest neighbors, Naïve Bayes
o Regression
 Predicting a continuous-valued attribute associated with an object.
 Applications: Drug response, Stock prices.
 Algorithms: SVR, ridge regression, Lasso
o Clustering
 Automatic grouping of similar objects into sets.
 Applications: Customer segmentation, Grouping experiment outcomes
 Algorithms: k-Means, spectral clustering, mean-shift

o Dimensionality reduction
 Reducing the number of random variables to consider.
 Applications: Visualization, Increased efficiency
 Algorithms: PCA, feature selection, non-negative matrix factorization.

o Model selection
 Comparing, validating and choosing parameters and models.
 Goal: Improved accuracy via parameter tuning
 Modules: grid search, cross validation, metrics.
o Preprocessing
 Feature extraction and normalization.
 Application: Transforming input data such as text for use with machine
learning algorithms.
 Modules: preprocessing, feature extraction.
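
Several of these features can be combined in one short script. The sketch below chains preprocessing (StandardScaler), dimensionality reduction (PCA), and classification (SVC) in a pipeline, then uses grid search with cross-validation for model selection. The iris dataset, the step names, and the parameter grid are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

pipeline = Pipeline([
    ("scale", StandardScaler()),       # preprocessing: normalize the features
    ("reduce", PCA(n_components=2)),   # dimensionality reduction: 4 -> 2 features
    ("classify", SVC()),               # classification
])

# Model selection: try every combination with 5-fold cross-validation.
param_grid = {
    "classify__C": [0.1, 1, 10],
    "classify__kernel": ["linear", "rbf"],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Test accuracy: {test_accuracy:.3f}")
```

Parameters of a pipeline step are addressed as step name, two underscores, then the parameter name, which is why the grid keys start with classify__.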

Steps Involved in Machine Learning

 A machine learning project involves the following steps −


o Defining a Problem
o Preparing Data
o Evaluating Algorithms
o Improving Results
o Presenting Results
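
The five steps can be walked through in a compact script. This is a sketch under assumptions: the iris dataset stands in for a real problem, the three candidate algorithms are arbitrary choices, and "improving results" is reduced to keeping the best candidate rather than real tuning.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# 1. Defining a problem: predict the iris species from four measurements.
iris = load_iris(as_frame=True)
df = iris.frame                       # pandas DataFrame: 4 features + "target"

# 2. Preparing data: separate features from labels, hold out a test set.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# 3. Evaluating algorithms: 5-fold cross-validation on the training set.
candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "tree": DecisionTreeClassifier(random_state=0),
}
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in candidates.items()
}

# 4. Improving results: here, simply keep the best-scoring candidate.
best_name = max(cv_scores, key=cv_scores.get)
best_model = candidates[best_name].fit(X_train, y_train)

# 5. Presenting results.
results = pd.Series(cv_scores).sort_values(ascending=False)
test_accuracy = best_model.score(X_test, y_test)
print(results)
print(f"Chosen model: {best_name}, test accuracy: {test_accuracy:.3f}")
```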

 Installing pandas and scikit-learn


o Go to Project Settings / Project Interpreter
o Click the + button
o Search for scikit-learn and pandas
o Install both libraries
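
To check that the installation worked, a short script such as the following can be run in the project. Note that the package is installed under the name scikit-learn but imported as sklearn.

```python
import pandas as pd
import sklearn

# If either import fails, the library is not installed in the
# interpreter that the project is using.
print("pandas version:", pd.__version__)
print("scikit-learn version:", sklearn.__version__)
```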
