SML - Question Bank-20.2.25

310503: Statistics and Machine Learning

Q. No. Question
1 Define and explain Central Tendency. Explain its importance in manufacturing. (8 marks)
Central Tendency Definition: A statistical measure that identifies the typical or central value in a dataset, representing where data tends to cluster.
Three Main Measures:
1. Mean (Average)
- Sum of values divided by count
- Used for normally distributed data
- Example: Average diameter of parts
2. Median (Middle)
- Middle value when data is ordered
- Best when outliers are present
- Example: Middle value of component weights
3. Mode (Most Frequent)
- Most common value
- Used for categorical data
- Example: Most common defect type
Importance in Manufacturing:
1. Quality Control
- Setting specifications
- Monitoring processes
- Detecting deviations
- Measuring consistency
2. Process Optimization
- Setting target values
- Adjusting machines
- Reducing variation
- Improving efficiency
3. Decision Making
- Production planning
- Maintenance scheduling
- Resource allocation
- Product specifications
4. Cost Management
- Reducing waste
- Controlling inventory
- Optimizing materials
- Planning production
Practical Benefits:
- Improved product quality
- Reduced defects
- Lower costs
- Better process control
- Enhanced customer satisfaction
- More efficient operations
This understanding helps manufacturers maintain quality while optimizing their processes and reducing costs.
2 Calculate the values for Mean, Median, Mode, and Mid-range for the given dataset. (8 marks)

Mean = Σ(Score × Frequency) / ΣFrequency

Score (X) Frequency (f) Score × Frequency (X × f)


60 3 180
65 2 130
70 5 350
75 7 525
80 3 240

Σ(Score × Frequency) = 180 + 130 + 350 + 525 + 240 = 1425


ΣFrequency = 3 + 2 + 5 + 7 + 3 = 20
Mean = 1425 / 20 = 71.25
Median:
Score (X)   Frequency (f)   Cumulative Frequency
60          3               3
65          2               5
70          5               10
75          7               17
80          3               20
Total frequency (N) is 20 (even).
Median Position = N / 2 = 20 / 2 = 10
The 10th value falls within the group with a score of 70.
Median = 70
Mode
The mode is the score with the highest frequency.
Highest frequency is 7, corresponding to a score of 75.
Mode = 75
Mid-range
Mid-range = (Max Score + Min Score) / 2
Mid-range = (80 + 60) / 2 = 140 / 2 = 70.0
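A quick way to check these results is with a few lines of plain Python. The sketch below recomputes all four measures from the frequency table, using the same position-N/2 rule for the median that the worked solution uses.

```python
scores = [60, 65, 70, 75, 80]
freqs  = [3, 2, 5, 7, 3]
n = sum(freqs)                                         # ΣFrequency = 20

mean = sum(x * f for x, f in zip(scores, freqs)) / n   # 1425 / 20 = 71.25

# Median via the cumulative-frequency rule used above: take the score
# whose cumulative frequency first reaches position N/2 (the 10th value).
pos, cum, median = n // 2, 0, None
for x, f in zip(scores, freqs):
    cum += f
    if cum >= pos:
        median = x                                     # 70
        break

mode = scores[freqs.index(max(freqs))]                 # 75 (highest frequency)
mid_range = (max(scores) + min(scores)) / 2            # 70.0
print(mean, median, mode, mid_range)
```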
3 Find the mean deviation about the mean, the variance, and the standard deviation for the given sample data x = (15, 22, 27, 11, 9, 21, 14, 9). (7 marks)
Data Point (X)   Deviation (X - μ)   |X - μ|   (X - μ)²
15               15 - 16 = -1        1         1
22               22 - 16 = 6         6         36
27               27 - 16 = 11        11        121
11               11 - 16 = -5        5         25
9                9 - 16 = -7         7         49
21               21 - 16 = 5         5         25
14               14 - 16 = -2        2         4
9                9 - 16 = -7         7         49
ΣX = 128         Σ = 0               Σ = 44    Σ = 310
N = 8
Mean (μ):
μ = ΣX / N = 128 / 8 = 16
Mean Deviation:
Σ|X - μ| / N = 44 / 8 = 5.5
Variance:
Σ(X - μ)² / N = 310 / 8 = 38.75
Standard Deviation (σ):
σ = √Variance = √38.75 ≈ 6.22
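These values can be verified in Python; the sketch below uses the population formulas (dividing by N), exactly as in the worked solution.

```python
import math

x = [15, 22, 27, 11, 9, 21, 14, 9]
n = len(x)
mu = sum(x) / n                                   # 128 / 8 = 16.0

mean_dev = sum(abs(v - mu) for v in x) / n        # 44 / 8 = 5.5
variance = sum((v - mu) ** 2 for v in x) / n      # 310 / 8 = 38.75 (population)
std_dev = math.sqrt(variance)                     # ≈ 6.22
print(mu, mean_dev, variance, std_dev)
```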
4 Explain Inferential Statistics. What are its types? Explain them with examples. (7 marks)
Inferential Statistics uses sample data to draw conclusions about a population,
enabling generalizations, hypothesis testing, and predictions based on
probability theory.
Types of Inferential Statistics
1. Estimation
o Involves predicting population parameters.
o Point Estimation: Provides a single value as the estimate (e.g.,
average income is $50,000).
o Interval Estimation: Provides a confidence interval for the
parameter (e.g., 95% confidence interval for income is $48,000–
$52,000).
2. Hypothesis Testing
o Tests assumptions about population parameters using statistical
tests.
o Example: Testing whether the mean test score of two groups is
different using a t-test.
Examples
- Estimation: A survey estimates the average height of adults as 5'8" with a 95% confidence interval of 5'7"–5'9". (A 95% confidence interval is a range of values within which we are 95% confident that the true population parameter, such as the mean, lies.)
- Hypothesis Testing: A company tests if a new drug is more effective than the old one (H0: no difference, H1: the new drug is better).
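To make interval estimation concrete, the sketch below computes a point estimate and a 95% confidence interval for a mean using SciPy; the height sample (in inches) is made up purely for illustration.

```python
from scipy import stats

# Hypothetical sample of adult heights in inches.
heights = [66, 68, 67, 70, 69, 68, 67, 71, 66, 68]
n = len(heights)
mean = sum(heights) / n
sem = stats.sem(heights)                 # standard error of the mean

# 95% CI: mean ± t_crit × SEM, using the t distribution with df = n - 1.
t_crit = stats.t.ppf(0.975, df=n - 1)
lo, hi = mean - t_crit * sem, mean + t_crit * sem
print(f"point estimate = {mean:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```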
5 Explain the importance of statistical inference in Machine Learning. (7 marks)
Statistical inference is essential in Machine Learning, enabling conclusions,
model validation, and accurate predictions.
1. Generalization
- Role: Machine Learning models aim to generalize from sample data to unseen data, with statistical inference quantifying the uncertainty in this process.
- Example: Confidence intervals assess the reliability of model predictions.
2. Feature Selection and Importance
- Role: Identifies significant features by testing their relationship with the target variable, e.g., using p-values in linear regression to find key predictors for house prices.
3. Model Validation
- Role: Validates that model performance isn't due to chance, using statistical tests like Chi-Square or t-tests.
- Example: Uses statistical significance tests to compare classifier accuracy and determine if one is truly better.
4. Understanding Data Distributions
- Role: Statistical inference tests data distribution assumptions (e.g., normality in linear regression) and aids in preprocessing.
- Example: Checks whether a regression model's residuals (a residual is the difference between the observed value and the predicted value) follow a normal distribution.
5. Hyperparameter Tuning
- Role: Statistical inference aids in selecting the best hyperparameters (like the learning rate) by validating performance differences across configurations.
- Example: Evaluating cross-validation results to confirm that the accuracy gain from a specific parameter is statistically significant.
6. A/B Testing: compares two versions of a product to identify which performs better based on specific metrics.
- Role: Used in ML-driven product optimization to compare two versions of a system (e.g., webpage designs, recommendation algorithms).
- Example: Conducting an A/B test to infer which recommendation algorithm increases user engagement.
7. Error Analysis
- Role: Statistical inference helps evaluate model errors to ensure they are unbiased and not caused by overfitting.
- Example: Analyzing confusion matrices with metrics like sensitivity and specificity (how well the model correctly identifies positives and negatives) to enhance classifier performance.
8. Decision-Making under Uncertainty
- Role: It offers probabilistic measures such as confidence intervals and p-values to inform decision-making in critical ML applications.
- Example: In medical diagnosis systems, statistical inference quantifies the confidence in predicted diseases.
In machine learning, statistical inference ensures reliable models, bridging theory and practice for accurate predictions and decisions.
6 Differentiate between Descriptive Statistics and Inferential Statistics. (7 marks)
Definition:
- Descriptive: Summarizes and describes the main features of a dataset.
- Inferential: Makes predictions or inferences about a population based on a sample.
Purpose:
- Descriptive: Provides a clear overview of the data.
- Inferential: Draws conclusions and tests hypotheses about the population.
Data Scope:
- Descriptive: Focuses only on the data at hand.
- Inferential: Generalizes findings to a larger population.
Techniques:
- Descriptive: Measures of central tendency (mean, median, mode) and variability (range, variance, standard deviation).
- Inferential: Hypothesis testing, confidence intervals, regression, ANOVA, t-tests.
Output:
- Descriptive: Graphs, tables, summaries (e.g., mean age = 30 years).
- Inferential: Statistical conclusions (e.g., the drug is effective, p < 0.05).
Population:
- Descriptive: Deals with the entire dataset or sample.
- Inferential: Infers about the population based on the sample data.
Example:
- Descriptive: Average sales in a month: $10,000.
- Inferential: Predicting annual sales based on monthly sales data.

7 Explain the concept of each test in brief: Chi-Square Test, T-Test, ANOVA Test, ANCOVA Test. Explain each type with a proper example and applications. (8 marks)
Statistical Tests and Their Types
1. Chi-Square Test
o Concept: Used to test the association between categorical variables or the goodness of fit between observed and expected frequencies.
o Types:
- Test of Independence: Examines relationships between two categorical variables.
- Goodness-of-Fit Test: Checks if observed data matches an expected distribution.
o Example:
- Test of Independence: Are gender and product preference related?
- Goodness-of-Fit: Do dice rolls follow a uniform distribution?
o Applications: Market research, genetics (e.g., testing Mendelian ratios).
2. T-Test
o Concept: Compares the means of two groups to see if they are significantly different.
o Types:
- Independent Samples t-Test: Compares means of two independent groups.
- Paired t-Test: Compares means of the same group at different times.
o Example:
- Independent: Test if male and female students differ in math scores.
- Paired: Compare pre-test and post-test scores of students after training.
o Applications: Clinical trials, education analysis.
3. ANOVA (Analysis of Variance)
o Concept: Tests differences in means across multiple groups.
o Types:
- One-Way ANOVA: Tests one independent variable with multiple levels.
- Two-Way ANOVA: Tests two independent variables and their interaction.
o Example:
- One-Way: Compare average exam scores across three schools.
- Two-Way: Compare scores across schools and genders.
o Applications: Education, agriculture, and quality control.
4. ANCOVA (Analysis of Covariance)
o Concept: Combines ANOVA and regression to test group differences while controlling for covariates.
o Example:
- Compare productivity across departments while accounting for experience levels.
o Applications: Experimental designs, social sciences, medical research.
Summary Table
Chi-Square
- Purpose: Tests relationships or distributions.
- Example: Gender vs product preference.
- Applications: Market research, genetics.
T-Test
- Purpose: Compares means of two groups.
- Example: Pre- and post-training test scores.
- Applications: Clinical trials, education analysis.
ANOVA
- Purpose: Compares means of multiple groups.
- Example: Test scores of students across three schools.
- Applications: Agriculture, education, quality control.
ANCOVA
- Purpose: Tests group differences controlling for covariates.
- Example: Productivity by department, controlling for experience.
- Applications: Social sciences, medical research.

8 Define regression. Explain bi-variate regression and multi-variate regression with examples. (7 marks)
Ans: Regression is a statistical method that predicts continuous values by modeling the relationship between a dependent variable and one or more independent variables based on historical data.
Bi-variate Regression:
Bi-variate (simple linear) regression involves only one independent variable (X) and one dependent variable (Y).
Mathematical form: Y = β₀ + β₁X + ε, where β₀ is the y-intercept, β₁ is the slope, and ε is the error term.
Example of Bi-variate Regression: Predicting house prices based on square footage:
- X (Independent): House size in square feet
- Y (Dependent): House price
- Relationship: Price = 50,000 + 100(Square_Feet)
- This means: Base price (β₀) = $50,000
- Each additional square foot (β₁) increases the price by $100
Multi-variate Regression:
Multi-variate regression involves multiple independent variables (X₁, X₂, ..., Xₙ) predicting one dependent variable (Y).
Mathematical form: Y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ + ε
Example of Multi-variate Regression:
Predicting student performance:
- Dependent (Y): Final exam score
- Independent variables:
o X₁: Study hours per week
o X₂: Previous GPA
o X₃: Attendance percentage
- Relationship: Final_Score = 20 + 2(Study_Hours) + 15(GPA) + 0.3(Attendance)
This means:
- Base score (β₀) = 20
- Each study hour increases the score by 2 points
- Each GPA point increases the score by 15 points
- Each percentage point of attendance increases the score by 0.3 points
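Coefficients like these can be recovered by least squares in a few lines of NumPy. In the sketch below the five training rows are hypothetical, generated exactly from the relationship above (20 + 2·hours + 15·GPA + 0.3·attendance), so the solver returns those coefficients.

```python
import numpy as np

# Hypothetical training data: [study_hours, gpa, attendance] per student.
X = np.array([[10, 3.0, 80],
              [15, 3.5, 90],
              [5,  2.5, 70],
              [20, 3.8, 95],
              [8,  3.2, 85]])
y = np.array([109.0, 129.5, 88.5, 145.5, 109.5])

A = np.hstack([np.ones((len(X), 1)), X])      # add an intercept column
beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # solves for [β0, β1, β2, β3]
print(beta.round(2))                          # [20.   2.  15.   0.3]

new_student = np.array([1, 12, 3.4, 88])      # 12 hrs, GPA 3.4, 88% attendance
print(new_student @ beta)                     # predicted score ≈ 121.4
```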
9 Calculate the lower quartile, upper quartile, quartile deviation, and coefficient of quartile deviation for the given data set. (8 marks)

Class Frequency (f) Cumulative Frequency (cf)


30-40 99 99
40-50 65 164
50-60 79 243
60-70 75 318
70-80 87 405
Total frequency (N) = 405
L = Lower boundary of the quartile class
N = Total frequency
F = Cumulative frequency before the quartile class
f = Frequency of the quartile class
h = Class interval
Position of Quartiles:
Q₁ position = N/4 = 405/4 = 101.25
Q₃ position = 3N/4 = 3(405)/4 = 303.75
Lower Quartile (Q₁):
The cumulative frequency first reaches 101.25 in class 40-50 (cf before it is 99), so Q₁ lies in class 40-50.
Q₁ = 40 + ((101.25 - 99)/65) × 10 = 40 + 0.35 = 40.35
Upper Quartile (Q₃):
Q₃ lies in class 60-70 (cf before it is 243).
Q₃ = 60 + ((303.75 - 243)/75) × 10 = 60 + 8.1 = 68.1
Quartile Deviation (Q.D.):
Q.D. = (Q₃ - Q₁)/2
Q.D. = (68.1 - 40.35)/2 = 13.875
Coefficient of Quartile Deviation:
Coefficient of Q.D. = (Q₃ - Q₁)/(Q₃ + Q₁) × 100
= (68.1 - 40.35)/(68.1 + 40.35) × 100
= 27.75/108.45 × 100
≈ 25.59%
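The grouped-data interpolation can be written as a short Python helper; the function below applies the formula L + ((qN − F)/f) × h used above.

```python
def grouped_quantile(classes, freqs, q):
    """Quantile for grouped data: L + ((qN - F) / f) * h."""
    n = sum(freqs)
    target = q * n                       # position of the quantile
    cum = 0                              # cumulative frequency F so far
    for (low, high), f in zip(classes, freqs):
        if cum + f >= target:            # quantile class found
            return low + (target - cum) / f * (high - low)
        cum += f

classes = [(30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [99, 65, 79, 75, 87]

q1 = grouped_quantile(classes, freqs, 0.25)   # ≈ 40.35 (class 40-50)
q3 = grouped_quantile(classes, freqs, 0.75)   # 68.1 (class 60-70)
print(q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1) * 100)
```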
10 Explain the different types of Naïve Bayes classifier. (10 marks)
Naïve Bayes classifiers are categorized based on the type of data they handle. Each type is tailored for different data distributions and offers specific advantages for various applications.
1. Gaussian Naïve Bayes
- Data Type: Continuous
- Assumption: Features follow a Gaussian (normal) distribution.
- Applications: Sensor data analysis, medical diagnostics, where features are continuous measurements like temperature, weight, or sensor readings.
2. Multinomial Naïve Bayes
- Data Type: Discrete
- Assumption: Features represent counts or frequencies.
- Applications: Text classification (spam filtering, sentiment analysis), document categorization, where features are word frequencies in documents.
3. Bernoulli Naïve Bayes
- Data Type: Binary
- Assumption: Features are binary (0 or 1), indicating the presence or absence of a feature.
- Applications: Spam detection, sentiment analysis, where features are binary, like the presence or absence of certain words in a document.
4. Complement Naïve Bayes
- Data Type: Discrete
- Assumption: Similar to Multinomial NB, but focuses on the absence of features in the complement of each class.
- Applications: Text classification, where it can improve performance over Multinomial NB in some cases.
5. Categorical Naïve Bayes
- Data Type: Categorical
- Assumption: Features belong to a specific set of categories.
- Applications: Categorical data classification, where features are categorical variables like colors, brands, or categories.
Choosing a variant depends on:
- Data Type: Is your data continuous, discrete, binary, or categorical?
- Data Distribution: Do your features follow a Gaussian distribution, or are they better represented by other distributions?
- Problem Domain: The kind of problem to solve (text classification, image recognition, etc.)
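As a small illustration of matching the variant to the data type, the sketch below (using scikit-learn, with made-up toy data) applies GaussianNB to continuous features and MultinomialNB to count features.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])                      # two classes, toy labels

# Continuous features (e.g., hypothetical sensor readings) -> Gaussian NB.
X_cont = np.array([[20.1, 0.5], [21.3, 0.7], [35.2, 2.1], [34.8, 1.9]])
print(GaussianNB().fit(X_cont, y).predict([[22.0, 0.6]]))    # -> [0]

# Count features (e.g., word frequencies per document) -> Multinomial NB.
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 2], [0, 3, 3]])
print(MultinomialNB().fit(X_counts, y).predict([[1, 0, 0]]))  # -> [0]
```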

11 What is the importance of Prior probability, Evidence, Likelihood, and Posterior probability in Bayes' theorem? (8 marks)
Prior Probability (P(H))
1. Prior probability represents initial belief or knowledge about a
hypothesis before observing new evidence
2. It captures existing knowledge, historical data, or expert opinions
3. Important because it:
o Provides a starting point for probability calculations
o Helps incorporate domain knowledge into statistical analysis
o Can be updated as new information becomes available
o Prevents overfitting to new data by considering historical context
Evidence/Normalizing Constant (P(E))
1. Represents the probability of observing the evidence regardless of the
hypothesis
2. Acts as a normalizing constant to ensure posterior probabilities sum to 1
3. Important because it:
o Makes probabilities comparable across different hypotheses
o Helps to scale the results to meaningful probability values
o Accounts for all possible ways evidence could occur
Likelihood (P(E|H))
1. Represents the probability of observing the evidence given that the
hypothesis is true
2. Measures how well the hypothesis explains the observed evidence
3. Important because it:
o Links hypothesis to actual observations
o Quantifies how well different hypotheses predict the data
o Helps evaluate competing explanations
o Updates beliefs based on new evidence
Posterior Probability (P(H|E))
1. Final probability of the hypothesis after considering both prior beliefs
and new evidence
2. Represents updated knowledge after incorporating all available
information
3. Important because it:
o Provides the final updated belief after considering all factors
o Enables evidence-based decision making
o Can become the new prior for future updates
o Balances prior knowledge with new evidence
The relationship between these components through Bayes' theorem allows for:
- Systematic updating of beliefs based on evidence
- Combining subjective prior knowledge with objective data
- Making probabilistic predictions and decisions
- Learning from experience by iteratively updating probabilities
P(H|E) = P(E|H) × P(H) / P(E)
Posterior = Likelihood × Prior / Evidence
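A minimal numeric sketch of this update rule, with assumed numbers for a diagnostic-test scenario (1% prevalence, 95% sensitivity, 10% false-positive rate):

```python
p_h = 0.01                      # prior: P(disease)
p_e_given_h = 0.95              # likelihood: P(positive | disease)
p_e_given_not_h = 0.10          # false-positive rate: P(positive | no disease)

# Evidence: total probability of a positive test (law of total probability).
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Posterior: P(disease | positive) = likelihood x prior / evidence.
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))    # ≈ 0.088, i.e., still under 9% despite a positive test
```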
12 Explain measures of dispersion. (8 marks)
Measures of dispersion: They describe the variability of a dataset, showing how much data values differ from the central value (mean or median). They are:
1. Range: The difference between the maximum and minimum values in the dataset. It provides a simple indication of spread.
Range = Max Value − Min Value
2. Variance: The average of the squared differences from the mean. It
shows the extent to which each data point deviates from the mean.
3. Standard Deviation: The square root of the variance. It gives a measure
of the average distance between each data point and the mean, making it
easier to interpret than variance.

4. Interquartile Range (IQR): The range between the 25th percentile (Q1)
and the 75th percentile (Q3), capturing the spread of the middle 50% of
the data.

These measures reveal data variability and distribution, aiding analysis and
decision-making.
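All four measures are one-liners in NumPy; the sketch below reuses the sample from Q.3 (note that np.percentile interpolates, so the quartiles follow NumPy's default convention rather than a textbook grouping rule).

```python
import numpy as np

data = np.array([15, 22, 27, 11, 9, 21, 14, 9])   # sample from Q.3 above

rng = data.max() - data.min()                 # range = 27 - 9 = 18
var = data.var()                              # population variance = 38.75
std = data.std()                              # standard deviation ≈ 6.22
q1, q3 = np.percentile(data, [25, 75])        # quartiles (interpolated)
iqr = q3 - q1                                 # spread of the middle 50%
print(rng, var, std, iqr)
```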
13 What is hypothesis testing? Elaborate on the steps of hypothesis testing and the types of hypothesis testing. (7 marks)
Hypothesis testing is a statistical method used to determine if there is enough
evidence in a sample to support or reject a hypothesis about a population. The
process involves:
1. Stating Hypotheses: Define the null (H₀) and alternative (H₁)
hypotheses.
2. Choosing Significance Level (α): Set a threshold (e.g., 0.05) for
rejecting the null hypothesis.
3. Selecting the Test: Choose the appropriate test (e.g., t-test, chi-square)
and calculate the test statistic.
4. Analyzing Data: Compute the test statistic and p-value.
5. Making a Decision: Reject the null hypothesis if p ≤ α, otherwise fail to
reject it.
6. Conclusion: Based on the test result, conclude whether there is
sufficient evidence to support the alternative hypothesis.
Types of Hypothesis Testing:
1. One-tailed vs. Two-tailed: Tests if a parameter is greater or less than a value (one-tailed) or simply different (two-tailed).
E.g., One-tailed: A company claims that their new product lasts longer than 100 hours. A one-tailed test would check if the product lasts greater than 100 hours.
Two-tailed: A university claims that their new teaching method improves scores, but you want to test if it results in either a higher or lower average score than the previous method (i.e., different from 75%).
2. Parametric vs. Non-parametric: Parametric tests assume data follows a known distribution; non-parametric tests do not.
Parametric Test:
- Example: You want to test whether the average salary of employees in a company is 50,000 dollars. For this, you assume that the salary distribution follows a normal distribution. A t-test is a parametric test because it assumes normality of the data.
- Key point: Parametric tests assume that the data follows a specific distribution (like the normal distribution).
Non-parametric Test:
- Example: You want to compare the median salary of employees in two different departments, without assuming the salaries follow a normal distribution.
- Key point: Non-parametric tests do not assume any specific distribution for the data.
3. One-sample vs. Two-sample: One-sample tests compare a sample to a known value, while two-sample tests compare two independent samples.
One-sample Test:
- Example: You want to test if the average height of a group of students is equal to 5.5 feet. You compare the sample mean of your group to a known value (5.5 feet). You would use a one-sample t-test for this comparison.
- Key point: A one-sample test compares one sample against a known or hypothesized value (e.g., a population mean or a target value).
Two-sample Test:
- Example: You want to compare the average scores of two different classes on an exam. The two groups (classes) are independent of each other. You would use a two-sample t-test to check if there is a significant difference between their means.
- Key point: A two-sample test compares two independent samples to determine if their means are different from each other.
4. Chi-square Tests: Used for categorical data to test associations or goodness of fit.
Hypothesis testing helps researchers make decisions based on data, offering a structured way to assess the validity of assumptions.
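As a concrete illustration of the steps above, the sketch below runs the one-sample height test (two-tailed) with SciPy on a hypothetical sample and applies the p ≤ α decision rule.

```python
from scipy import stats

# Hypothetical sample: is the mean height 5.5 feet?
heights = [5.4, 5.6, 5.7, 5.3, 5.8, 5.5, 5.9, 5.6]
t_stat, p_value = stats.ttest_1samp(heights, popmean=5.5)

alpha = 0.05
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")   # t ≈ 1.41, p ≈ 0.20
if p_value <= alpha:
    print("Reject H0: the mean differs from 5.5 feet.")
else:
    print("Fail to reject H0.")                 # here p > 0.05, so H0 stands
```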
14 With the help of examples, explain different types of probabilistic models and give their applications. (7 marks)
1. Bayesian Networks:
o Description: A model that shows how variables are connected and how one affects another using probability.
o Example: In medical diagnosis, it helps determine the likelihood of diseases based on symptoms.
o Application: Used in healthcare, decision-making systems, and machine learning.
2. Markov Models:
o Description: A model where the future depends only on the
current state, not the past.
o Example: Predicting tomorrow's weather based on today’s
weather.
o Application: Weather forecasting, stock market prediction, and
speech recognition.
3. Hidden Markov Models (HMMs):
o Description: Similar to Markov Models, but with hidden
(unseen) states that affect observable data.
o Example: In speech recognition, it infers phonemes (hidden
states) from audio signals (observable data).
o Application: Used in speech recognition, part-of-speech tagging,
and genetics.
4. Gaussian Mixture Models (GMMs):
o Description: A model that assumes data comes from multiple
normal distributions.
o Example: Grouping customers into different categories based on
buying behavior.
o Application: Used in clustering, anomaly detection, and image
segmentation.
5. Naive Bayes Classifier:
o Description: A classifier that uses Bayes' Theorem, assuming
features are independent.
o Example: Classifying emails as spam or not based on certain
words.
o Application: Text classification, spam detection, and sentiment
analysis.
6. Poisson Processes:
o Description: A model for events happening randomly over time
or space at a constant rate.
o Example: Counting customer arrivals at a store in an hour.
o Application: Used in traffic analysis, telecommunications, and queuing systems.
7. Linear and Logistic Regression:
o Description: Models to predict values; linear for continuous
data, logistic for binary data.
o Example: Predicting house prices (linear) or the likelihood of a
purchase (logistic).
o Application: Used in forecasting, risk analysis, and customer
behavior.
8. Monte Carlo Simulation:
o Description: A method that uses random sampling to simulate
outcomes and estimate probabilities.
o Example: Estimating pi by randomly placing points inside a
square.
o Application: Used in finance, risk analysis, and optimization.
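The Monte Carlo example in item 8 fits in a few lines of Python: random points are drawn in the unit square, and the fraction landing inside the quarter circle estimates π/4.

```python
import random

# Monte Carlo estimate of pi: a point (x, y) with x, y ~ Uniform(0, 1)
# lies inside the quarter circle when x^2 + y^2 <= 1.
n = 1_000_000
inside = sum(1 for _ in range(n)
             if random.random() ** 2 + random.random() ** 2 <= 1.0)
print(4 * inside / n)   # ≈ 3.14, improving as n grows
```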
Summary:
Probabilistic models are used in various fields like healthcare, finance, and
machine learning to predict outcomes and make decisions based on uncertain
data.
15 What are hidden variables in a probabilistic model? (8 marks)
Hidden Variables (Latent Variables):
Definition: Hidden variables are unobservable factors that influence measurable outcomes.
Inference: These unobserved variables are inferred from known variables in the model.
Purpose:
Capturing Underlying Factors: Hidden variables represent unobservable influences that reveal the underlying structure of the data.
Improving Predictions: By incorporating hidden variables, models can make more accurate predictions and gain a better understanding of complex relationships between variables.
Common Techniques Using Hidden Variables:
Hidden Markov Models (HMMs):
Example: In speech recognition, hidden states like phonemes (distinct sounds in speech) influence the audio signals that are observed. While we can't directly measure the phoneme itself, we infer the sequence of phonemes (hidden variables) from the audio signals (observed data).
Hidden Variables: Phonemes (e.g., "ah", "eh", "oo").
Observed Variables: Audio signals, such as frequency or sound patterns.
Inference: The phoneme sequence is inferred from audio data using algorithms like the Viterbi algorithm, a dynamic programming technique to find the most likely sequence of hidden states in HMMs.
Latent Variable Models:
Example: In customer behavior modeling, latent variables like preferences influence purchases and are inferred from data such as past purchases or browsing history.
Hidden Variables: Customer preferences (e.g., interest in electronics, fashion).
Observed Variables: Purchases, product clicks, or search history.
Inference: These preferences are inferred from the observable data patterns.
Benefits:
Rich Understanding: Hidden variables enhance understanding by accounting for unobservable influences on the data.
Accurate Predictions: Incorporating unobservable factors improves model accuracy and explains complex systems more effectively.
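To make the HMM inference step concrete, here is a minimal Viterbi sketch in plain Python. The weather/activity model and all probabilities are made-up illustrative numbers, not taken from the text above.

```python
states = ("Rainy", "Sunny")                       # hidden states
start = {"Rainy": 0.6, "Sunny": 0.4}              # assumed initial probabilities
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},   # assumed transition matrix
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},  # assumed emissions
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def viterbi(obs):
    # best[s] = (probability, path) of the most likely state sequence
    # ending in hidden state s after the observations seen so far.
    best = {s: (start[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        best = {s: max((p * trans[prev][s] * emit[s][o], path + [s])
                       for prev, (p, path) in best.items())
                for s in states}
    return max(best.values())

prob, path = viterbi(["walk", "shop", "clean"])
print(path, prob)   # ['Sunny', 'Rainy', 'Rainy'] 0.01344
```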
16 What is a Z-score? Find the Z-scores for x = {3, 13, 8, 21, 17, 11}. (8 marks)
The Z-score measures how many standard deviations a data point is from the
mean, helping compare data points across distributions and identify outliers.
Calculate the Mean (μ):
- Sum of all data points: 3 + 13 + 8 + 21 + 17 + 11 = 73
- Number of data points: 6
- Mean (μ) = 73 / 6 ≈ 12.17
Calculate the Standard Deviation (σ):
- Using the sample formula (divide by n - 1): σ = √(Σ(X - μ)² / (n - 1)) = √(204.83 / 5) ≈ 6.40
Data Point (X)   Mean (μ)   Deviation (X - μ)     Std. Dev. (σ)   Z-Score (Z)
3                12.17      3 - 12.17 = -9.17     6.40            -9.17 / 6.40 = -1.43
13               12.17      13 - 12.17 = 0.83     6.40            0.83 / 6.40 = 0.13
8                12.17      8 - 12.17 = -4.17     6.40            -4.17 / 6.40 = -0.65
21               12.17      21 - 12.17 = 8.83     6.40            8.83 / 6.40 = 1.38
17               12.17      17 - 12.17 = 4.83     6.40            4.83 / 6.40 = 0.75
11               12.17      11 - 12.17 = -1.17    6.40            -1.17 / 6.40 = -0.18

The table includes all the necessary components for calculating the Z-score:
- Data Point (X): The individual values in the dataset.
- Mean (μ): The average of all the data points.
- Deviation from Mean (X - μ): The difference between each data point and the mean.
- Standard Deviation (σ): A measure of the spread of the data.
- Z-Score (Z): The standardized score representing how many standard deviations a data point is from the mean.
Interpretation of Z-Scores:
- A positive Z-score indicates that the data point is above the mean.
- A negative Z-score indicates that the data point is below the mean.
- The magnitude of the Z-score indicates how far the data point is from the mean in terms of standard deviations.
Example:
- The data point 3 has a Z-score of -1.43, meaning it is 1.43 standard deviations below the mean.
- The data point 21 has a Z-score of 1.38, meaning it is 1.38 standard deviations above the mean.
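The whole table can be reproduced with Python's statistics module; statistics.stdev uses the n − 1 (sample) formula, matching the σ ≈ 6.40 above. Tiny discrepancies in the last digit (e.g., 0.76 vs. 0.75 for x = 17) come from rounding the deviations before dividing.

```python
import statistics

x = [3, 13, 8, 21, 17, 11]
mu = statistics.mean(x)            # 73 / 6 ≈ 12.17
sigma = statistics.stdev(x)        # sample standard deviation ≈ 6.40

z = [(v - mu) / sigma for v in x]
print([round(v, 2) for v in z])    # [-1.43, 0.13, -0.65, 1.38, 0.76, -0.18]
```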
17 What is a Naive Bayes classifier? Explain with an example and give its applications. (7 marks)
A Naive Bayes classifier is a probabilistic machine learning algorithm based on
Bayes' theorem with a strong (naive) independence assumption between the
features. It is a popular choice for classification tasks due to its simplicity and
effectiveness, especially in text classification.
1. Bayes' Theorem:
o The core of the algorithm lies in Bayes' theorem, which calculates the probability of a class given the observed features:
P(Class|Features) = (P(Features|Class) × P(Class)) / P(Features)
o Where:
- P(Class|Features): Posterior probability (probability of the class given the features)
- P(Features|Class): Likelihood (probability of the features given the class)
- P(Class): Prior probability (probability of the class)
- P(Features): Evidence (probability of the features)
2. Naive Independence Assumption:
o The "naive" part comes from the assumption that all features are independent of each other given the class. This simplifies the calculation of the likelihood:
P(Features|Class) = P(Feature1|Class) × P(Feature2|Class) × ... × P(FeatureN|Class)
Applications:
Text Classification:
o Spam filtering
o Sentiment analysis
o Topic categorization
o Document classification
Image Recognition:
o Object recognition
o Facial recognition
Medical Diagnosis:
o Disease prediction
o Diagnosis of medical conditions
Advantages:
- Simple and easy to implement
- Fast training and prediction
- Effective in high-dimensional data
Disadvantages:
- Naive independence assumption may not always hold true
- Can be sensitive to irrelevant features
18 Find the line of regression for the following data: X = [2, 4, 6, 8], Y = [3, 5, 7, 10]. (7 marks)

x y x^2 xy
2 3 4 6
4 5 16 20
6 7 36 42
8 10 64 80
Calculate the sums of each column:
- ∑x = 20
- ∑y = 25
- ∑x² = 120
- ∑xy = 148
Calculate the slope (m) and intercept (b) using the following formulas:
- m = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)
- b = (∑y − m∑x) / n
where n is the number of data points (here, n = 4).
Substitute the values into the formulas:
- m = (4 × 148 − 20 × 25) / (4 × 120 − 20²) = 92 / 80 = 1.15
- b = (25 − 1.15 × 20) / 4 = 0.5
The equation of the regression line is y = 1.15x + 0.5.
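NumPy's polyfit performs the same least-squares calculation; a quick check:

```python
import numpy as np

x = np.array([2, 4, 6, 8])
y = np.array([3, 5, 7, 10])

m, b = np.polyfit(x, y, 1)     # degree-1 least-squares fit
print(m, b)                    # 1.15, 0.5
print(m * 5 + b)               # predicted y at x = 5 -> 6.25
```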
19 Explain the significance of the Chi-Square test in industrial engineering applications. (6 marks)
Quality Control: A core function of industrial engineering is optimizing quality in manufacturing processes. The Chi-Square test helps identify whether certain machines are associated with higher defect rates.
Process Improvement: If the test shows a relationship, it signals a need to
investigate why a specific machine might be producing more defects. This
could lead to:
- Machine maintenance or adjustments
- Operator training
- Redesign of the manufacturing process
Resource Allocation: The results can inform decisions on how to best allocate
resources (maintenance, operator time, etc.) to different machines.
20 Chi-Square test
A manufacturing company wants to assess whether there's a relationship
between the type of machine used in a production process (Machine A,
Machine B, Machine C) and the occurrence of defects. Data collected over a
week is as follows:

Machine Type   No Defect   Minor Defect   Major Defect
Machine A      280         15             5
Machine B      250         30             20
Machine C      265         25             10
Perform a Chi-Square test to determine if the type of machine and the occurrence of defects are independent at a significance level of 0.01.
1. State the Hypotheses:
- Null Hypothesis (H0): Machine type and defect occurrence are independent.
- Alternative Hypothesis (H1): Machine type and defect occurrence are not independent.
2. Set the Significance Level (α): α = 0.01
3. Calculate Expected Frequencies:
- Expected Frequency = (Row Total × Column Total) / Grand Total
4. Compute the Chi-Square Test Statistic:
- χ² = Σ [ (Observed Frequency − Expected Frequency)² / Expected Frequency ]
5. Determine Degrees of Freedom:
- df = (Number of Rows − 1) × (Number of Columns − 1)
6. Find the Critical Chi-Square Value:
- Use a Chi-Square distribution table with the calculated df and α.
7. Compare and Make a Decision:
- If the calculated Chi-Square statistic is greater than the critical value, reject the null hypothesis. Otherwise, fail to reject the null hypothesis.
Each row total is 300, the column totals are 795 / 70 / 35, and the grand total is 900, so every machine has the same expected frequencies: No Defect = 265, Minor = 23.33, Major = 11.67.
Machine Type   Observed (No / Minor / Major)   Chi-Square Contribution
Machine A      280 / 15 / 5                    0.85 + 2.98 + 3.81 = 7.64
Machine B      250 / 30 / 20                   0.85 + 1.90 + 5.95 = 8.70
Machine C      265 / 25 / 10                   0.00 + 0.12 + 0.24 = 0.36
Total Chi-Square = 7.64 + 8.70 + 0.36 = 16.70
Degrees of Freedom (df): (3 − 1) × (3 − 1) = 4
Critical Chi-Square Value (α = 0.01): 13.28
Since the calculated Chi-Square (16.70) exceeds the critical value (13.28), we reject the null hypothesis. There is significant evidence (at α = 0.01) of a relationship between machine type and defect occurrence.
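SciPy reproduces this entire computation, including the expected frequencies, from the observed table alone:

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[280, 15, 5],     # Machine A
                     [250, 30, 20],    # Machine B
                     [265, 25, 10]])   # Machine C

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
# chi2 ≈ 16.70 with dof = 4; p < 0.01, so reject H0 at α = 0.01.
```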
21 T-Test: Example (8 marks)
A manufacturing plant is implementing a new process for producing a specific
component. They want to compare the production rate of the new process to the
old process to determine if the new process is significantly faster.
To test this, they randomly select 6 workers and record their production rates
(units per hour) using both the old and new processes. The data collected is as
follows:
Perform a paired t-test to determine if the new process results in a significantly
higher production rate than the old process at a significance level of 0.05.
Worker Old Process (units/hour) New Process (units/hour)

1 12 14

2 10 13

3 11 12

4 13 15

5 10 11

6 12 13
Let's perform a paired t-test to determine if the new process results in a
significantly higher production rate than the old process.
Assumptions:
- The data is paired, meaning the production rates for each worker are measured under both the old and new processes.
- The differences in production rates are normally distributed.
Hypotheses:
- Null Hypothesis (H0): The mean difference in production rates between the new and old processes is zero (i.e., no significant difference).
- Alternative Hypothesis (H1): The mean difference in production rates between the new and old processes is greater than zero (i.e., the new process is significantly faster).
Calculations:
1. Calculate the difference in production rates for each worker:
o Worker 1: 14 - 12 = 2
o Worker 2: 13 - 10 = 3
o Worker 3: 12 - 11 = 1
o Worker 4: 15 - 13 = 2
o Worker 5: 11 - 10 = 1
o Worker 6: 13 - 12 = 1
2. Calculate the mean of the differences:
o Mean = (2 + 3 + 1 + 2 + 1 + 1) / 6 = 1.67
3. Calculate the squared deviations from the mean:
- Worker 1: (2 - 1.67)² = 0.1089
- Worker 2: (3 - 1.67)² = 1.7689
- Worker 3: (1 - 1.67)² = 0.4489
- Worker 4: (2 - 1.67)² = 0.1089
- Worker 5: (1 - 1.67)² = 0.4489
- Worker 6: (1 - 1.67)² = 0.4489
4. Calculate the sum of squared deviations:
- Sum = 0.1089 + 1.7689 + 0.4489 + 0.1089 + 0.4489 + 0.4489 = 3.3334
5. Calculate the sample variance:
- Variance = Sum of squared deviations / (n - 1) = 3.3334 / (6 - 1) = 0.6667
6. Calculate the sample standard deviation:
- Standard Deviation = √Variance = √0.6667 ≈ 0.82
7. Calculate the t-statistic:
t = (Observed Mean Difference − Hypothesized Mean Difference) / (Standard Deviation / √n)
Here:
- Observed Mean Difference: The average difference calculated from the data (1.67 in this case).
- Hypothesized Mean Difference: The value assumed under the null hypothesis (0 in this case).
t = (1.67 − 0) / (0.82 / √6) = 5.0
The t-statistic measures how many standard errors the sample mean difference is from the hypothesized mean difference (here, 0).
8. Determine the degrees of freedom:
o Degrees of freedom = n - 1 = 6 - 1 = 5
9. Find the critical t-value:
o Using a t-distribution table with 5 degrees of freedom and a one-tailed significance level of 0.05, the critical t-value is 2.015.
Decision:
- Since the calculated t-statistic (5.0) is greater than the critical t-value (2.015), we reject the null hypothesis.
Conclusion:
There is sufficient evidence to conclude that the new process results in a significantly higher production rate than the old process at a significance level of 0.05.
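SciPy reproduces this result directly; ttest_rel with alternative='greater' (available in SciPy 1.6+) performs the one-tailed paired test:

```python
from scipy import stats

old = [12, 10, 11, 13, 10, 12]
new = [14, 13, 12, 15, 11, 13]

# Paired, one-tailed test: H1 is that the new process is faster.
t_stat, p_value = stats.ttest_rel(new, old, alternative='greater')
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# t = 5.00; p ≈ 0.002 < 0.05, so reject H0.
```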
22 ANOVA (Analysis of Variance) Test Example: One-Way ANOVA
Variance is a statistical measure that quantifies the spread or dispersion of a set
of data points around the mean. It tells us how much the values in a dataset differ
from the average value.
A researcher wants to compare the effectiveness of three different teaching
methods on student performance.
Objective:
To test the null hypothesis that there are no differences between the group
means against the alternative hypothesis that at least one group mean is
different from the others.
Hypotheses:
 Null Hypothesis (H₀): μ₁ = μ₂ = μ₃ = ... = μₖ (All group means are equal)
 Alternative Hypothesis (H₁): At least one group mean is different
Data:
Suppose the researcher collects the following data on student test scores from
three different teaching methods:
Teaching Method Test Scores
Method A 85, 88, 82, 90, 87
Method B 78, 82, 75, 80, 79
Method C 92, 95, 89, 91, 94
Ans. Steps to Perform One-Way ANOVA:
1. Calculate the Group Means:
o Mean of Method A (μ₁)
o Mean of Method B (μ₂)
o Mean of Method C (μ₃)
2. Calculate the Overall Mean (Grand Mean, μ):
o The mean of all the data points combined.
3. Calculate the Sum of Squares Between Groups (SSB):
o Measures the variability between the group means.
4. Calculate the Sum of Squares Within Groups (SSW):
o Measures the variability within each group.
5. Calculate the Total Sum of Squares (SST):
o SST = SSB + SSW
6. Calculate the Degrees of Freedom:
o Degrees of Freedom Between Groups (dfB) = k - 1 (where k is
the number of groups)
o Degrees of Freedom Within Groups (dfW) = N - k (where N is
the total number of observations)
o Total Degrees of Freedom (dfT) = N - 1
7. Calculate the Mean Squares:
o Mean Square Between Groups (MSB) = SSB / dfB
o Mean Square Within Groups (MSW) = SSW / dfW
8. Calculate the F-Statistic:
o F = MSB / MSW
9. Determine the Critical Value:
o Use the F-distribution table to find the critical value for the given
degrees of freedom and significance level (commonly α = 0.05).
10. Make a Decision:
o If the calculated F-statistic is greater than the critical value,
reject the null hypothesis.
o If the calculated F-statistic is less than or equal to the critical
value, fail to reject the null hypothesis.
Based on the F-statistic and the critical value, the researcher can conclude
whether there are statistically significant differences between the means of the
groups. If the null hypothesis is rejected, it suggests that at least one teaching
method leads to significantly different student performance compared to the
others.
Hence,
Step 1: Calculate Group Means and Grand Mean
First, compute the mean for each group and the overall grand mean.
Group      Scores               Sum of Scores                  n    Group Mean (μᵢ)
Method A   85, 88, 82, 90, 87   85 + 88 + 82 + 90 + 87 = 432   5    432 / 5 = 86.4
Method B   78, 82, 75, 80, 79   78 + 82 + 75 + 80 + 79 = 394   5    394 / 5 = 78.8
Method C   92, 95, 89, 91, 94   92 + 95 + 89 + 91 + 94 = 461   5    461 / 5 = 92.2
Total                           432 + 394 + 461 = 1287         15   Grand Mean (μ) = 1287 / 15 = 85.8
Step 2: Calculate Sum of Squares Between Groups (SSB)
SSB measures the variability between the group means and the grand mean. The formula is: SSB = Σ nᵢ(μᵢ − μ)²
Group      nᵢ   μᵢ     (μᵢ - μ)             (μᵢ - μ)²        nᵢ(μᵢ - μ)²
Method A   5    86.4   86.4 - 85.8 = 0.6    (0.6)² = 0.36    5 × 0.36 = 1.8
Method B   5    78.8   78.8 - 85.8 = -7.0   (-7.0)² = 49.0   5 × 49.0 = 245.0
Method C   5    92.2   92.2 - 85.8 = 6.4    (6.4)² = 40.96   5 × 40.96 = 204.8
Total: SSB = 1.8 + 245.0 + 204.8 = 451.6

Step 3: Calculate Sum of Squares Within Groups (SSW)
SSW measures the variability within each group. The formula is: SSW = Σ (xᵢⱼ − μᵢ)²
Group      Deviations (x - μᵢ)        (x - μᵢ)²                              Sum
Method A   -1.4, 1.6, -4.4, 3.6, 0.6  1.96 + 2.56 + 19.36 + 12.96 + 0.36     37.2
Method B   -0.8, 3.2, -3.8, 1.2, 0.2  0.64 + 10.24 + 14.44 + 1.44 + 0.04     26.8
Method C   -0.2, 2.8, -3.2, -1.2, 1.8 0.04 + 7.84 + 10.24 + 1.44 + 3.24      22.8
Total: SSW = 37.2 + 26.8 + 22.8 = 86.8
Step 4: Calculate Total Sum of Squares (SST)
SST measures the total variability in the data. The formula is:
SST = SSB + SSW
SST = 451.6 + 86.8 = 538.4


Step 5: Degrees of Freedom
- Degrees of Freedom Between Groups (dfB): dfB = k − 1 = 3 − 1 = 2
- Degrees of Freedom Within Groups (dfW): dfW = N − k = 15 − 3 = 12
- Total Degrees of Freedom (dfT): dfT = N − 1 = 15 − 1 = 14
Step 6: Mean Squares and F-Statistic
- MSB = SSB / dfB = 451.6 / 2 = 225.8
- MSW = SSW / dfW = 86.8 / 12 = 7.233
- F = MSB / MSW = 225.8 / 7.233 = 31.22
Step 7: Critical Value and Decision
- For dfB = 2, dfW = 12, and α = 0.05, the critical value from the F-distribution table is 3.89.
- Since the calculated F-statistic (31.22) > critical value (3.89), we reject the null hypothesis: at least one teaching method leads to significantly different scores.
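SciPy's f_oneway gives the same F-statistic directly from the raw scores:

```python
from scipy.stats import f_oneway

method_a = [85, 88, 82, 90, 87]
method_b = [78, 82, 75, 80, 79]
method_c = [92, 95, 89, 91, 94]

f_stat, p_value = f_oneway(method_a, method_b, method_c)
print(f"F = {f_stat:.2f}, p = {p_value:.6f}")
# F ≈ 31.22; p << 0.05, so reject H0: the group means differ.
```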
23 Sample Problem for Two-Way ANOVA: Example
Scenario:
A researcher is studying the effect of two factors, Teaching
Method and Gender, on student test scores. The researcher wants to determine
if there are significant differences in test scores based on the teaching method,
gender, and whether there is an interaction effect between teaching method and
gender.

Factors:

Teaching Method (Factor A): Method 1, Method 2, Method 3

Gender (Factor B): Male, Female

Objective:To test the following hypotheses:

Main Effect of Teaching Method: Are there differences in test scores based on
the teaching method?

Main Effect of Gender: Are there differences in test scores based on gender?

Data:
The researcher collects the following test scores:
Teaching Method Gender Test Scores
Method 1 Male 85, 88, 82
Method 1 Female 90, 87, 91
Method 2 Male 78, 82, 75
Method 2 Female 80, 79, 83
Method 3 Male 92, 95, 89
Method 3 Female 91, 94, 93
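No worked solution is given for this problem, but a two-way ANOVA with interaction can be run on the data above using statsmodels (a sketch, assuming pandas and statsmodels are available); it prints F-tests for both main effects and the interaction.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Build the dataset from the table above.
data = pd.DataFrame({
    'method': ['M1'] * 6 + ['M2'] * 6 + ['M3'] * 6,
    'gender': (['Male'] * 3 + ['Female'] * 3) * 3,
    'score':  [85, 88, 82, 90, 87, 91,
               78, 82, 75, 80, 79, 83,
               92, 95, 89, 91, 94, 93],
})

# Two-way ANOVA: main effects for method and gender, plus their interaction.
model = ols('score ~ C(method) + C(gender) + C(method):C(gender)', data).fit()
print(sm.stats.anova_lm(model, typ=2))
```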
24 ANCOVA (Analysis of Covariance)
https://colab.research.google.com/drive/1qqVeaQEUwQM7S67cH0glb2Ji6ecZN04z?usp=sharing

Covariance is a statistical measure that indicates the degree to which two variables change together. It helps determine whether an increase in one variable corresponds to an increase or decrease in another variable.
ANCOVA is a statistical technique that combines ANOVA (Analysis of
Variance) and regression. It is used to compare the means of different groups
while controlling for the effects of one or more continuous covariates that might
influence the dependent variable.
Key Points about ANCOVA:
1. Purpose: Adjusts the dependent variable for the effects of covariates
before comparing group means.
2. Assumptions:
o Linearity: The relationship between the covariate and dependent
variable should be linear.
o Homogeneity of Regression Slopes: The effect of the covariate
should be the same across all groups.
o Normality & Homoscedasticity: The residuals should be
normally distributed and have equal variances.
o Independence: Observations should be independent of each
other.
3. Example Use Case: Comparing the effectiveness of three different
teaching methods on student test scores while controlling for students'
prior knowledge (covariate).
An example:
Scenario:
Suppose we want to test the effectiveness of three different teaching methods on students' final exam scores, while controlling for their midterm exam scores (covariate).
Step 1: Define Variables
 Independent Variable (Categorical): Teaching Method (Method A,
Method B, Method C)
 Dependent Variable (Continuous): Final Exam Score
 Covariate (Continuous): Midterm Exam Score (to control for prior
knowledge)
Step 2: Data Example
Student   Teaching Method   Midterm Score (Covariate)   Final Exam Score (Dependent)
1         A                 75                          80
2         A                 82                          85
3         B                 78                          82
4         B                 79                          83
5         C                 74                          78
6         C                 80                          81
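A minimal ANCOVA sketch for this data using statsmodels: the model adjusts final scores for the midterm covariate before testing the method effect. With only six observations this is purely illustrative (the linked Colab notebook may use a different dataset).

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Data from the table above.
data = pd.DataFrame({
    'method':  ['A', 'A', 'B', 'B', 'C', 'C'],
    'midterm': [75, 82, 78, 79, 74, 80],
    'final':   [80, 85, 82, 83, 78, 81],
})

# ANCOVA: final ~ categorical group effect + continuous covariate.
model = ols('final ~ C(method) + midterm', data).fit()
print(sm.stats.anova_lm(model, typ=2))   # F-test for method, adjusted for midterm
```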
