Assignment Part A

Uploaded by

Mamoona Jabbar

Assignment

Final Exams

Muhammad Assad Jabbar

Roll No. 67011
Registration No. 2020-GCUF-09067
MS Computer Science (1st Semester)

DEPARTMENT OF COMPUTER SCIENCE,

GOVERNMENT COLLEGE UNIVERSITY, FAISALABAD
PAKISTAN
ASSIGNMENT PART A

SECTION-1

Q 1: What is linear regression?

Linear regression is a linear model that assumes a linear relationship between the input
variables (independent variables ‘x’) and a single output variable (dependent variable ‘y’),
such that ‘y’ can be calculated from a linear combination of the input variables (x).
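As an illustration of the definition above, here is a minimal sketch that fits y = intercept + slope * x by ordinary least squares for a single input variable. The function name `fit_line` and the data are made up for this example; they are not part of the assignment.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = (co-variation of x and y) / (variation of x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data that lies exactly on the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)

# Predict y for a new x as a linear combination of the input
prediction = intercept + slope * 4.0
```

Because the sample points lie exactly on a line, the fit should recover that line and the prediction for x = 4 should fall on it as well.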

Q 5: What is multiple regression?

Multiple regression is an extension of simple linear regression. It is used when we want to
predict the value of a variable based on the value of two or more other variables. The variable
we want to predict is called the dependent variable. The variables we are using to predict the
value of the dependent variable are called the independent variables.
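A small sketch, assuming exactly two independent variables, may make this concrete. It solves the 2x2 normal equations on centered data; the function name `fit_two_predictors` and the data are hypothetical.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the 2x2 normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero when the two predictors are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Made-up data generated from y = 1 + 2*x1 + 3*x2
x1 = [0.0, 1.0, 0.0, 1.0, 2.0]
x2 = [0.0, 0.0, 1.0, 1.0, 1.0]
y = [1.0, 3.0, 4.0, 6.0, 8.0]
b0, b1, b2 = fit_two_predictors(x1, x2, y)
```

With more than two independent variables the same idea applies, but the normal equations become a larger linear system that is usually solved with a linear algebra library.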

Q 3: What is polynomial regression?

Polynomial regression is a special case of linear regression in which we fit a polynomial
equation to data that has a curvilinear relationship between the dependent and independent
variables. It still counts as linear regression because the model is linear in its
coefficients, even though it is not linear in the independent variable.

Polynomial regression does not require the relationship between the independent and
dependent variables to be linear in the data set.
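The following sketch shows one way to fit a polynomial by treating the powers of x as the features of a linear model and solving the normal equations. The helper names `solve` and `fit_polynomial` and the data are assumptions made for this example.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    # Each design-matrix row is [1, x, x^2, ...]; the model is linear in these features.
    rows = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)

# Made-up curved data generated from y = 1 + 0.5x + 2x^2
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1 + 0.5 * x + 2 * x ** 2 for x in xs]
coefs = fit_polynomial(xs, ys, degree=2)
```

Solving the normal equations directly is fine for a low degree like this one; for higher degrees it can become numerically unstable, so a library routine would be the safer choice.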

SECTION-2

Q 5: What is Supervised learning?

Supervised learning provides you with a powerful tool to classify and process data using
machine learning. With supervised learning you use “labeled” data, which is a data set
that has already been classified, to train a learning algorithm. The trained model is then
used as the basis for predicting the classification of other, unlabeled data. Supervised
learning is good at classification and regression problems, such as determining what
category a news article belongs to or predicting the volume of sales for a given future
date. In supervised learning, the aim is to make sense of data within the context of a
specific question.

Q 6: What is Unsupervised Learning?

Unsupervised learning is a machine learning technique in which you do not need to supervise
the model. Instead, you allow the model to work on its own to discover information.
It mainly deals with unlabeled data.

Q 1: What is the difference between user and item based collaborative filtering?

1. User Based Collaborative Filtering:

The recommendation rating of an item for a user is calculated from the ratings given to that
item by other, similar users.

• The ratings are predicted using the ratings of neighboring users.

• Neighborhoods are defined by similarities among users.

• Pearson correlation is a common similarity measure here and is often reported to give good
results.

2. Item Based Collaborative Filtering:

The rating of an item is predicted from how that user has rated similar items.

• The ratings are predicted using the user’s own ratings on neighboring (closely related)
items.

• Neighborhoods are defined by similarities among items.

• Adjusted cosine similarity is a common similarity measure here and is often reported to give
good results.
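The two approaches can be sketched side by side on a tiny, made-up ratings table. Everything here is an assumption for illustration: the users, items and ratings, the function names, the use of all neighbours in the user-based prediction, and the choice to ignore negatively similar items in the item-based prediction.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale: user -> {item: rating}
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 4},
    "cat": {"A": 1, "B": 5, "D": 2},
}

def user_mean(u):
    return sum(ratings[u].values()) / len(ratings[u])

def pearson(u, v):
    """Pearson correlation between two users over the items both have rated."""
    common = sorted(set(ratings[u]) & set(ratings[v]))
    if len(common) < 2:
        return 0.0
    mu = sum(ratings[u][i] for i in common) / len(common)
    mv = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)) * \
          sqrt(sum((ratings[v][i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def adjusted_cosine(i, j):
    """Cosine similarity between two items after subtracting each user's mean rating."""
    users = [u for u in ratings if i in ratings[u] and j in ratings[u]]
    num = sum((ratings[u][i] - user_mean(u)) * (ratings[u][j] - user_mean(u)) for u in users)
    den = sqrt(sum((ratings[u][i] - user_mean(u)) ** 2 for u in users)) * \
          sqrt(sum((ratings[u][j] - user_mean(u)) ** 2 for u in users))
    return num / den if den else 0.0

def predict_user_based(u, item):
    """Offset the user's mean by the similarity-weighted deviations of neighbouring users."""
    neighbours = [v for v in ratings if v != u and item in ratings[v]]
    weights = {v: pearson(u, v) for v in neighbours}
    norm = sum(abs(w) for w in weights.values())
    if norm == 0:
        return user_mean(u)
    offset = sum(w * (ratings[v][item] - user_mean(v)) for v, w in weights.items())
    return user_mean(u) + offset / norm

def predict_item_based(u, item):
    """Weighted average of the user's own ratings on positively similar items."""
    weights = {j: max(adjusted_cosine(item, j), 0.0) for j in ratings[u]}
    norm = sum(weights.values())
    if norm == 0:
        return user_mean(u)
    return sum(w * ratings[u][j] for j, w in weights.items()) / norm

# Predict how "ann" would rate item "D", which she has not rated yet
user_pred = predict_user_based("ann", "D")
item_pred = predict_item_based("ann", "D")
```

The user-based prediction looks across rows of the table (other users who rated D), while the item-based prediction looks across columns (other items that ann herself rated).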

SECTION-3

Q 1: What is k fold cross validation?

Cross-validation is a resampling procedure used to evaluate machine learning models on a
limited data sample.

The procedure has a single parameter called k that refers to the number of groups that a given
data sample is to be split into. As such, the procedure is often called k-fold cross-validation.
When a specific value for k is chosen, it may be used in place of k in the reference to the
model, such as k=10 becoming 10-fold cross-validation.

Cross-validation is primarily used in applied machine learning to estimate the skill of a
machine learning model on unseen data. That is, to use a limited sample in order to estimate
how the model is expected to perform in general when used to make predictions on data not
used during the training of the model.

It is a popular method because it is simple to understand and because it generally results in a
less biased or less optimistic estimate of the model skill than other methods, such as a simple
train/test split.

The general procedure is as follows:

• Shuffle the dataset randomly.
• Split the dataset into k groups.
• For each unique group:
   1. Take the group as a hold-out or test data set.
   2. Take the remaining groups as a training data set.
   3. Fit a model on the training set and evaluate it on the test set.
   4. Retain the evaluation score and discard the model.
• Summarize the skill of the model using the sample of model evaluation scores.

Importantly, each observation in the data sample is assigned to an individual group and stays
in that group for the duration of the procedure. This means that each sample is given the
opportunity to be used in the hold out set 1 time and used to train the model k-1 times.
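The procedure can be sketched in a few lines. As a stand-in for a real model, this example uses a baseline that always predicts the mean of the training targets and scores it by mean absolute error; the function names, the seed and the data are assumptions for illustration only.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k groups of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(ys, k):
    """k-fold cross-validation of a baseline that always predicts the training mean."""
    folds = k_fold_indices(len(ys), k)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train = [ys[i] for i in range(len(ys)) if i not in held_out]
        model = sum(train) / len(train)  # "fit" on the remaining k-1 groups
        mae = sum(abs(ys[i] - model) for i in test_idx) / len(test_idx)
        scores.append(mae)  # keep the score; the model is discarded
    return folds, scores

# Made-up target values
ys = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 5.0, 9.0, 6.0, 4.0]
folds, scores = cross_validate(ys, k=5)

# Summarize the skill of the model across the k evaluation scores
mean_score = sum(scores) / len(scores)
```

Each index lands in exactly one fold, so every observation is held out once and used for training in the other k-1 rounds.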

SECTION-4

Q 1: What is Bayesian method?

Bayes' theorem, named after 18th-century British mathematician Thomas Bayes, is a
mathematical formula for determining conditional probability. Conditional probability is the
likelihood of an outcome occurring, given that a previous outcome has occurred. Bayes' theorem
provides a way to revise existing predictions or theories (update probabilities) given new or
additional evidence. In finance, Bayes' theorem can be used to rate the risk of lending money
to potential borrowers.
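A short worked example in the lending setting may help. The theorem states P(A|B) = P(B|A) * P(A) / P(B). All probabilities below are invented for illustration, and the function name `posterior` is hypothetical.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)."""
    # Total probability of the evidence B, over A and not-A
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Invented numbers:
p_default = 0.05               # prior: P(borrower defaults)
p_late_given_default = 0.80    # P(a late payment on record | defaults)
p_late_given_ok = 0.10         # P(a late payment on record | does not default)

# Updated probability of default after observing a late payment
p_default_given_late = posterior(p_default, p_late_given_default, p_late_given_ok)
```

With these numbers, observing a late payment raises the estimated risk of default from 5% to roughly 30%, which is the kind of update the theorem describes.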

Q 2: What is k mean clustering?

K-means clustering is a type of unsupervised learning, which is used when you have
unlabeled data (data without defined categories or groups). The goal of this algorithm is to
find groups in the data, with the number of groups represented by the variable K. The
algorithm works iteratively to assign each data point to one of K groups based on the features
that are provided. Data points are clustered based on feature similarity. The results of the
K-means clustering algorithm are:

• The centroids of the K clusters, which can be used to label new data
• Labels for the training data (each data point is assigned to a single cluster)

Rather than defining groups before looking at the data, clustering allows you to find and
analyze the groups that have formed organically. The number of groups K is an input to the
algorithm and has to be chosen before it is run.

Each centroid of a cluster is a collection of feature values which define the resulting groups.
Examining the centroid feature weights can be used to qualitatively interpret what kind of
group each cluster represents.
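The iterative assignment described above can be sketched as follows. This is a minimal version under stated assumptions: the points are made up, the starting centroids are picked by hand rather than at random, and the names `nearest` and `k_means` are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
    )

def k_means(points, initial_centroids, max_iter=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid
        new_labels = [nearest(pt, centroids) for pt in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Made-up 2-D data with two visibly separate groups
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = k_means(points, initial_centroids=[points[0], points[3]])

# The centroids can then be used to label a new data point
new_label = nearest((7.5, 8.0), centroids)
```

The result depends on the starting centroids, so practical implementations usually run the algorithm several times from different starting points and keep the best result.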

Q 4: What is ensemble learning?

Ensemble learning is the process by which multiple models, such as classifiers or experts, are
strategically generated and combined to solve a particular computational intelligence
problem. Ensemble learning is primarily used to improve the performance of a model (in
classification, prediction, function approximation, etc.) or to reduce the likelihood of an
unfortunate selection of a poor one. Other applications of ensemble learning include
assigning a confidence to the decision made by the model, selecting optimal (or near-optimal)
features, data fusion, incremental learning, nonstationary learning and error correction.
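One simple way to combine classifiers is majority voting, sketched below. The labels and the three sets of model predictions are invented, and they are deliberately idealized so that each model errs on a different example.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine several models' predictions by taking the most common label per example."""
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented ground truth and predictions from three hypothetical classifiers
truth   = [1, 1, 0, 0, 1, 0]
model_a = [0, 1, 0, 0, 1, 0]  # wrong on example 0
model_b = [1, 1, 1, 0, 1, 0]  # wrong on example 2
model_c = [1, 1, 0, 0, 0, 0]  # wrong on example 4

ensemble = majority_vote([model_a, model_b, model_c])
individual_scores = [accuracy(truth, m) for m in (model_a, model_b, model_c)]
ensemble_score = accuracy(truth, ensemble)
```

Voting helps here because the models' mistakes do not overlap; when models make the same mistakes, combining them brings little or no improvement.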

SECTION-5

Q 2 : What are classification metrics?

In binary classification, there are two possible output classes. In multi-class classification,
there are more than two possible classes.

There are many ways of measuring classification performance:

• Accuracy
• Confusion matrix
• Log-loss
• Precision and Recall
• F-Scores
• Receiver operating characteristic (ROC) curve
• Area under curve (AUC) ("curve" corresponds to the ROC curve)
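Several of the metrics listed above can be computed directly from the four cells of a binary confusion matrix. The sketch below uses invented labels and predictions, with 1 as the positive class; the function name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

# Invented labels and predictions
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + fp + fn + tn)   # share of all predictions that are correct
precision = tp / (tp + fp)                   # share of predicted positives that are correct
recall = tp / (tp + fn)                      # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
```

Log-loss, the ROC curve and AUC need predicted probabilities or scores rather than hard labels, so they are not covered by this sketch.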

Q 4 : What are ranking metrics?

Ranking metrics are used to evaluate ranking models. Ranking is a fundamental problem in
machine learning: it tries to order a list of items by their relevance to a particular task
(e.g. ranking pages on Google by their relevance to a given query).

Q 1: How can the output of different algorithms be measured?

Time efficiency - A measure of the amount of time an algorithm takes to execute.

Space efficiency - A measure of the amount of memory an algorithm needs to execute.

Complexity theory - A study of algorithm performance.

Function dominance - A comparison of cost functions.
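To make the idea of comparing cost functions concrete, the sketch below counts loop iterations for two search algorithms on the same sorted list. Counting iterations is used here as a stand-in for running time; the function names and the input size are assumptions for illustration.

```python
def linear_search_steps(items, target):
    """Number of loop iterations a linear scan needs to find target."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps

def binary_search_steps(items, target):
    """Number of loop iterations a binary search needs on a sorted list."""
    steps, lo, hi = 0, 0, len(items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

items = list(range(1024))  # sorted input of size n = 1024
linear = linear_search_steps(items, 1023)  # target is the last element
binary = binary_search_steps(items, 1023)
```

The linear scan's cost grows in proportion to n, while the binary search's cost grows with log n, so for large inputs the first cost function dominates the second.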

SECTION-6

Q 1: What is uniform distribution?

A uniform distribution is a type of probability distribution in which all outcomes are equally
likely; each outcome has the same probability of occurring. A deck of cards contains uniform
distributions, because the probability of drawing a heart, club, diamond, or spade is the
same.
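The card example can be checked with a short sketch that builds a standard 52-card deck and computes the probability of each suit. The variable names are chosen for this illustration only.

```python
from collections import Counter
from fractions import Fraction

suits = ["hearts", "clubs", "diamonds", "spades"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = [(rank, suit) for suit in suits for rank in ranks]

# Probability of drawing each suit when every card is equally likely
suit_counts = Counter(suit for _, suit in deck)
suit_probs = {suit: Fraction(count, len(deck)) for suit, count in suit_counts.items()}
```

Every suit ends up with the same probability, which is what makes the distribution over suits uniform.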

Q 3: What is percentile?

The most common definition of a percentile is a number where a certain percentage of scores
fall below that number. You might know that you scored 67 out of 90 on a test. But that
figure has no real meaning unless you know what percentile you fall into. If you know that
your score is in the 90th percentile, that means you scored better than 90% of people who
took the test.
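Using the definition above (the percentage of scores that fall below a given score), a percentile rank can be computed as in this sketch. The list of test scores is invented, and `percentile_rank` is a hypothetical helper name.

```python
def percentile_rank(scores, score):
    """Percentage of scores that fall strictly below the given score."""
    below = sum(1 for s in scores if s < score)
    return 100.0 * below / len(scores)

# Invented scores (out of 90) for ten people who took the same test
scores = [41, 48, 52, 55, 58, 60, 62, 64, 67, 71]
rank = percentile_rank(scores, 67)
```

Other definitions of percentile exist, for example ones that count scores at or below the value, and they can give slightly different answers on small samples.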

Q 4: What are moments?

For a random variable x, its Nth moment is the expected value of the Nth power of x, where
N is a positive integer. The Nth moment of the deviation of x from its mean is called "the Nth
central moment".

The 1st moment is the mean, and the 2nd central moment is the variance.
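As a small check of these definitions, the sketch below computes moments of a fair six-sided die using exact fractions. The die is an example chosen for this illustration, and the function name `moment` is hypothetical.

```python
from fractions import Fraction

def moment(outcomes, n, central=False):
    """Nth moment of a variable whose listed outcomes are all equally likely."""
    p = Fraction(1, len(outcomes))
    mean = sum(p * x for x in outcomes)
    shift = mean if central else 0  # central moments measure deviation from the mean
    return sum(p * (x - shift) ** n for x in outcomes)

die = [1, 2, 3, 4, 5, 6]
mean = moment(die, 1)                    # 1st moment
variance = moment(die, 2, central=True)  # 2nd central moment
```

For the fair die this gives a mean of 7/2 and a variance of 35/12.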
