
Machine Learning

MBA (Artificial Intelligence) E27-24

Ülvi

Ticket 2
1. Evaluation Metrics: You are building a machine learning model to classify defective parts
in a manufacturing process. The cost of missing a defective part is high. Which
evaluation metric would you prioritize, and why?

2. Regularization: You are working on a model for predicting customer churn, but you find
that some features are causing overfitting. How would you use Lasso regularization to
simplify the model while maintaining performance?

3. Unsupervised Learning: A retail store wants to identify customer segments based on
shopping patterns. Explain how you would use the k-means clustering algorithm for this
task, and how you would determine the optimal number of clusters.

4. Ensemble Methods: You are working on a model for classifying images of plants.
Describe how you would use ensemble methods such as Random Forest or Gradient
Boosting to improve model performance.

5. Data Preprocessing: You have a text dataset containing product reviews in multiple
languages. How would you preprocess this data to train a sentiment analysis model?
Answer 1
When building a machine learning model to classify defective parts in a manufacturing process,
you need to prioritize metrics that address the high cost of missing defective parts (False
Negatives). The key metric to focus on is Recall, along with other supporting metrics.

Key Metrics to Use

1. Recall (Sensitivity or True Positive Rate)
Recall measures the proportion of defective parts correctly identified:

Recall = True Positives / (True Positives + False Negatives)

Why prioritize Recall?

o High Recall ensures most defective parts are detected.
o Reduces False Negatives, which is critical when missing defective parts has significant consequences.

2. Precision
Precision measures the proportion of predicted defective parts that are actually defective:

Precision = True Positives / (True Positives + False Positives)

Why not focus only on Precision?

o While Precision is important, the cost of False Negatives is higher in this scenario. A balance with Recall is necessary.

3. F1-Score
F1-Score is the harmonic mean of Precision and Recall, offering a balance between the two:

F1-Score = 2 * (Precision * Recall) / (Precision + Recall)

Why use F1-Score?

o It helps balance Precision and Recall, ensuring neither metric is neglected.

4. Confusion Matrix
The confusion matrix provides a detailed breakdown of predictions:

o True Positives (TP): Correctly identified defective parts.
o True Negatives (TN): Correctly identified non-defective parts.
o False Positives (FP): Non-defective parts marked as defective.
o False Negatives (FN): Defective parts missed by the model.

Practical Implementation
Here's how you can calculate these metrics in Python:

from sklearn.metrics import recall_score, precision_score, f1_score, confusion_matrix

# Ground truth and predictions
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # Actual labels
y_pred = [1, 0, 1, 1, 1, 1, 0, 0, 0, 0]  # Model predictions

# Calculate metrics
recall = recall_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
conf_matrix = confusion_matrix(y_true, y_pred)

print(f"Recall: {recall}")
print(f"Precision: {precision}")
print(f"F1-Score: {f1}")
print(f"Confusion Matrix:\n{conf_matrix}")

Strategies for Prioritizing Recall

1. Adjust Classification Threshold
Lower the threshold to classify more parts as defective (a precision-recall curve sketch for choosing the threshold follows this list):

y_pred_prob = model.predict_proba(X_test)[:, 1]
threshold = 0.3  # Adjust threshold
y_pred = (y_pred_prob >= threshold).astype(int)

2. Class Weights
Assign higher weights to the defective class to penalize False Negatives more (the dictionary keys must match the actual class labels):

model = RandomForestClassifier(class_weight={'defective': 10, 'non-defective': 1})

3. Resampling Techniques
Handle class imbalance by oversampling the minority class:

from imblearn.over_sampling import SMOTE

X_resampled, y_resampled = SMOTE().fit_resample(X_train, y_train)
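To pick the threshold in a more principled way than the fixed 0.3 above, the precision-recall curve can be scanned for the highest threshold that still meets a target recall. A minimal sketch, assuming a fitted classifier model with predict_proba and a held-out test set X_test, y_test (placeholder names carried over from the earlier snippets):

from sklearn.metrics import precision_recall_curve
import numpy as np

# Probabilities for the defective (positive) class
y_scores = model.predict_proba(X_test)[:, 1]

# Precision and recall for every candidate threshold
precision, recall, thresholds = precision_recall_curve(y_test, y_scores)

# Highest threshold that still achieves the target recall (e.g., 95%)
target_recall = 0.95
eligible = np.where(recall[:-1] >= target_recall)[0]
chosen_threshold = thresholds[eligible[-1]] if len(eligible) > 0 else 0.5
print("Chosen threshold:", chosen_threshold)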

Why Prioritize Recall?

 False Negatives (Missed Defective Parts):


Missing defective parts can lead to:
o Product failures.
o Customer dissatisfaction.
o Safety risks.
 False Positives (Non-Defective Parts Flagged as Defective):
While inconvenient, False Positives have a lower cost since parts can be rechecked
manually.
Conclusion

For defective part classification:

1. Prioritize Recall to minimize False Negatives.


2. Use F1-Score and the Confusion Matrix for a balanced evaluation.
3. Adjust thresholds and use class weights to align the model with business objectives.

Answer 2
Problem: Overfitting occurs when the model performs well on the training data but poorly on
the test data, often due to irrelevant or redundant features.

To address this, Lasso Regularization (L1 Regularization) can be used. Lasso not only reduces
overfitting by penalizing large coefficients but also performs feature selection by driving some
coefficients to exactly zero, effectively removing less important features from the model.

Steps to Apply Lasso Regularization

1. Understand Lasso Regularization

Lasso adds a penalty term to the loss function of the model:

Loss = (Prediction Error) + λ * Σ|βj|

Where:

 βj are the coefficients of the features.


 λ is the regularization parameter:
o Larger λ values increase the penalty, leading to more coefficients shrinking to
zero.
o Smaller λ values reduce the penalty, retaining more features.

2. Preprocess the Data

Lasso is sensitive to feature scaling, so it's essential to normalize the features:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

3. Train the Model with Lasso Regularization

Lasso can be applied using a linear regression model (Lasso) or logistic regression for
classification (LogisticRegression with L1 penalty).

For regression:

from sklearn.linear_model import Lasso


from sklearn.model_selection import GridSearchCV

# Define the Lasso model


lasso = Lasso()

# Use Grid Search to find the optimal λ (alpha in sklearn)


param_grid = {'alpha': [0.01, 0.1, 1, 10, 100]} # Regularization strength
grid_search = GridSearchCV(lasso, param_grid, cv=5,
scoring='neg_mean_squared_error')
grid_search.fit(X_train_scaled, y_train)

# Best model
best_lasso = grid_search.best_estimator_
print("Best alpha:", grid_search.best_params_['alpha'])

# Train the model


best_lasso.fit(X_train_scaled, y_train)

For classification:

from sklearn.linear_model import LogisticRegression

lasso_logistic = LogisticRegression(penalty='l1', solver='liblinear', C=1.0)  # C is 1/λ
lasso_logistic.fit(X_train_scaled, y_train)

4. Evaluate the Model

Evaluate the model on the test set to ensure performance is maintained. Accuracy, precision, and recall apply to the L1-penalized logistic regression; for the Lasso regressor, use regression metrics such as mean squared error instead:

from sklearn.metrics import accuracy_score, precision_score, recall_score

# Predict on the test set with the L1 logistic regression model
y_pred = lasso_logistic.predict(X_test_scaled)

# Metrics
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))
5. Inspect Feature Importance

Lasso shrinks less important feature coefficients to zero. You can inspect which features are
retained:

import pandas as pd

# Feature importance (for lasso_logistic, use lasso_logistic.coef_[0] instead)
coefficients = best_lasso.coef_
feature_importance = pd.DataFrame({
    'Feature': feature_names,  # feature_names: list of the training feature column names
    'Coefficient': coefficients
})

# Filter non-zero coefficients
important_features = feature_importance[feature_importance['Coefficient'] != 0]
print(important_features)

Tuning and Interpretation

1. Choosing λ (alpha):
o Use cross-validation to find the optimal value (see the LassoCV sketch after this list).
o Larger λ simplifies the model but risks underfitting.
o Smaller λ retains more features but risks overfitting.
2. Impact on Features:
o Features with coefficients reduced to zero are effectively removed.
o Helps simplify the model by retaining only the most relevant predictors for
customer churn.
3. Model Monitoring:
o After applying Lasso, monitor the model’s performance to ensure it generalizes
well on unseen data.
o Regularly update the model if new patterns or features emerge in the data.
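As a complement to the grid search shown earlier, scikit-learn's LassoCV performs the cross-validated search for alpha in a single step. A minimal sketch, assuming the scaled training data X_train_scaled and target y_train from the steps above:

from sklearn.linear_model import LassoCV

# 5-fold cross-validation over a small grid of regularization strengths
lasso_cv = LassoCV(alphas=[0.001, 0.01, 0.1, 1, 10], cv=5, random_state=42)
lasso_cv.fit(X_train_scaled, y_train)

print("Selected alpha:", lasso_cv.alpha_)
print("Non-zero coefficients:", (lasso_cv.coef_ != 0).sum())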

Why Use Lasso for Customer Churn?

 Feature Selection: Automatically eliminates less relevant features, simplifying the model.
 Overfitting Prevention: Penalizes complex models, ensuring better generalization to
unseen data.
 Interpretability: Retains only the most critical predictors of churn, making the model
easier to interpret.
Answer 3
The retail store can use K-Means Clustering, an unsupervised learning algorithm, to segment
customers based on shopping patterns. This technique groups customers with similar behaviors
into clusters, allowing the store to target each group more effectively.

1. Understanding K-Means Clustering

 Objective: Partition the data into k clusters, where each customer belongs to the cluster
with the nearest centroid.
 How it Works:
1. Initialize k centroids randomly.
2. Assign each data point to the nearest centroid.
3. Update the centroids based on the mean of assigned points.
4. Repeat steps 2–3 until the centroids stabilize or a stopping criterion is met.

2. Steps to Use K-Means for Customer Segmentation

Step 1: Collect and Preprocess Data

 Input Data: Examples of customer data could include:


o Average transaction amount.
o Frequency of visits.
o Product categories purchased.
o Time spent shopping.
 Data Preprocessing:
o Handle missing values by imputing or removing incomplete records (a short imputation sketch follows this step).
o Normalize the data to ensure all features are on the same scale:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
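For the missing-value handling mentioned above, a minimal sketch using scikit-learn's SimpleImputer, assuming data is a pandas DataFrame of numeric shopping-pattern features:

import pandas as pd
from sklearn.impute import SimpleImputer

# Replace missing numeric values with the column median
imputer = SimpleImputer(strategy='median')
X = pd.DataFrame(imputer.fit_transform(data), columns=data.columns)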

Step 2: Apply K-Means Clustering

 Fit the K-Means algorithm to the data:

from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3, random_state=42)  # Start with 3 clusters as an example
kmeans.fit(X_scaled)
labels = kmeans.labels_  # Cluster assignments for each customer

 Add cluster labels to the dataset for interpretation:

import pandas as pd

data['Cluster'] = labels

Step 3: Evaluate and Interpret Results

 Analyze the characteristics of each cluster:

cluster_summary = data.groupby('Cluster').mean()
print(cluster_summary)

 Visualize clusters (if data is 2D or 3D):

import matplotlib.pyplot as plt

plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=labels, cmap='viridis')
plt.title('Customer Segments')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.show()

3. Determining the Optimal Number of Clusters (k)

Finding the right number of clusters is crucial. Here are common methods:

A. Elbow Method

 Plot the Sum of Squared Distances (SSD) for different values of k.
 Choose the k where the SSD curve bends (elbow point):

ssd = []
for k in range(1, 11):
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(X_scaled)
    ssd.append(kmeans.inertia_)  # Sum of squared distances to the nearest centroid

plt.plot(range(1, 11), ssd, marker='o')
plt.title('Elbow Method')
plt.xlabel('Number of Clusters')
plt.ylabel('SSD')
plt.show()

B. Silhouette Score

 Measures how well-separated the clusters are. Higher scores indicate better-defined
clusters.
 Calculate for different values of k:

from sklearn.metrics import silhouette_score

for k in range(2, 11):
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(X_scaled)
    score = silhouette_score(X_scaled, kmeans.labels_)
    print(f'For k={k}, Silhouette Score = {score}')

C. Gap Statistic

 Compares the total within-cluster variation to that of a null reference distribution.
 This method is more robust but requires additional computation (a simplified sketch follows).
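A simplified gap-statistic sketch, under the assumption that the reference datasets are drawn uniformly from the bounding box of X_scaled; production use typically adds more reference samples and the standard-error rule for picking k:

import numpy as np
from sklearn.cluster import KMeans

def gap_statistic(X, k, n_refs=10, random_state=42):
    rng = np.random.default_rng(random_state)
    # Within-cluster dispersion on the real data
    log_wk = np.log(KMeans(n_clusters=k, random_state=random_state).fit(X).inertia_)
    # Within-cluster dispersion on uniform reference data drawn from the bounding box of X
    mins, maxs = X.min(axis=0), X.max(axis=0)
    ref_log_wk = []
    for _ in range(n_refs):
        X_ref = rng.uniform(mins, maxs, size=X.shape)
        ref_log_wk.append(np.log(KMeans(n_clusters=k, random_state=random_state).fit(X_ref).inertia_))
    return np.mean(ref_log_wk) - log_wk  # Larger gap suggests better clustering structure

for k in range(2, 7):
    print(f"k={k}, gap={gap_statistic(X_scaled, k):.3f}")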

4. Advantages of K-Means for Customer Segmentation

1. Simplicity: Easy to implement and computationally efficient.


2. Interpretability: Clusters provide actionable insights for marketing strategies.
3. Scalability: Suitable for large datasets.

5. Limitations and Considerations

1. Sensitivity to Initialization:
o Use the k-means++ initialization to improve performance. Example:

kmeans = KMeans(n_clusters=3, init='k-means++', random_state=42)

2. Requires Numeric Data:
o Encode categorical variables using techniques like one-hot encoding.
3. Assumes Spherical Clusters:
o If the clusters are not spherical, consider alternatives like Gaussian Mixture Models or DBSCAN (a short sketch of both points follows this list).
4. Scaling:
o Always standardize or normalize features to prevent dominance by larger-scale variables.
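A minimal sketch of the alternatives mentioned in points 2 and 3, assuming data holds a categorical column (hypothetically named 'favorite_category' here) and X_scaled is the scaled feature matrix from the earlier steps:

import pandas as pd
from sklearn.mixture import GaussianMixture
from sklearn.cluster import DBSCAN

# Point 2: one-hot encode a categorical column before clustering
data_encoded = pd.get_dummies(data, columns=['favorite_category'])  # hypothetical column name

# Point 3a: Gaussian Mixture Models allow elliptical clusters
gmm = GaussianMixture(n_components=3, random_state=42)
gmm_labels = gmm.fit_predict(X_scaled)

# Point 3b: DBSCAN finds arbitrarily shaped clusters and labels outliers as -1
dbscan = DBSCAN(eps=0.5, min_samples=5)
dbscan_labels = dbscan.fit_predict(X_scaled)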

6. Example Output

After applying K-Means, you might identify clusters such as:

 Cluster 1: High-spending customers with frequent purchases.


 Cluster 2: Budget-conscious customers with infrequent purchases.
 Cluster 3: Loyal customers who shop moderately but regularly.

These insights can guide targeted marketing, personalized offers, and inventory management.
Conclusion

To segment customers based on shopping patterns:

1. Preprocess and scale the data.


2. Use K-Means to group customers into meaningful clusters.
3. Determine the optimal number of clusters using methods like the Elbow Method or
Silhouette Score.
4. Interpret and act on the results to enhance customer experience and business strategies.

Answer 4

Using Ensemble Methods for Image Classification: Random Forest and Gradient
Boosting

Ensemble methods like Random Forest and Gradient Boosting are powerful techniques to
improve model performance by combining multiple weak learners into a strong learner. In the
context of classifying images of plants, ensemble methods can be used to enhance accuracy,
robustness, and generalization.

1. Why Use Ensemble Methods for Image Classification?

 Diversity: Combine multiple weak models to reduce bias and variance.


 Robustness: Handle noise and overfitting better than individual models.
 Improved Accuracy: Leverage the strengths of multiple learners.

2. Steps to Use Ensemble Methods


Step 1: Preprocess Image Data

 Convert raw images into numerical features using one of the following:
o Manual Feature Extraction: Use texture, color histograms, or shape descriptors (a color-histogram sketch follows this step).
o Deep Features: Use a pre-trained convolutional neural network (e.g., ResNet, VGG) to extract features.

from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
import numpy as np

# Load a pre-trained model for feature extraction
model = ResNet50(weights='imagenet', include_top=False, pooling='avg')

# Preprocess the images (image_array: batch of images with shape (n_images, 224, 224, 3))
img = preprocess_input(image_array)
features = model.predict(img)  # Extract deep features, one vector per image

 Normalize Features: Scale the extracted features for better performance in ensemble models:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(features)
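For the manual feature-extraction route mentioned above, a minimal color-histogram sketch, assuming image_array is a NumPy batch of RGB images with pixel values in the 0-255 range:

import numpy as np

def color_histogram_features(images, bins=16):
    # One normalized histogram per RGB channel, concatenated into a feature vector
    feats = []
    for img in images:
        channel_hists = [
            np.histogram(img[..., c], bins=bins, range=(0, 255), density=True)[0]
            for c in range(3)
        ]
        feats.append(np.concatenate(channel_hists))
    return np.array(feats)

hist_features = color_histogram_features(image_array)  # shape: (n_images, 3 * bins)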

Step 2: Random Forest for Classification

Random Forest builds multiple decision trees and combines their outputs (majority voting for
classification).

Advantages:

 Handles high-dimensional features well.


 Robust to overfitting due to bagging (bootstrap aggregation).

Implementation:

from sklearn.ensemble import RandomForestClassifier


from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Split data
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y,
test_size=0.2, random_state=42)

# Train Random Forest


rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

# Evaluate model
y_pred = rf_model.predict(X_test)
print("Random Forest Accuracy:", accuracy_score(y_test, y_pred))

Tuning Parameters:

 n_estimators: Number of trees.
 max_depth: Maximum depth of each tree.
 max_features: Number of features considered for each split.
 Use Grid Search for hyperparameter tuning:

from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [10, 20, None],
    'max_features': ['sqrt', 'log2']
}

grid_search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
grid_search.fit(X_train, y_train)
print("Best Parameters:", grid_search.best_params_)

Step 3: Gradient Boosting for Classification

Gradient Boosting improves model performance by sequentially adding weak learners (trees) and
correcting errors made by previous models.

Advantages:

 Focuses on minimizing prediction errors iteratively.


 Provides superior accuracy in many scenarios.

Implementation with Gradient Boosting:

from sklearn.ensemble import GradientBoostingClassifier

# Train Gradient Boosting


gb_model = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
max_depth=3, random_state=42)
gb_model.fit(X_train, y_train)

# Evaluate model
y_pred = gb_model.predict(X_test)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, y_pred))

Tuning Parameters:

 n_estimators: Number of trees.


 learning_rate: Step size for each iteration.
 max_depth: Depth of each tree.

Using XGBoost for Faster Performance: XGBoost is an optimized implementation of Gradient Boosting:

from xgboost import XGBClassifier


xgb_model = XGBClassifier(n_estimators=100, learning_rate=0.1, max_depth=3,
random_state=42)
xgb_model.fit(X_train, y_train)

y_pred = xgb_model.predict(X_test)
print("XGBoost Accuracy:", accuracy_score(y_test, y_pred))

Step 4: Model Evaluation

 Use metrics like accuracy, precision, recall, and F1-score to evaluate performance.

from sklearn.metrics import classification_report, confusion_matrix

print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))


print("Classification Report:\n", classification_report(y_test, y_pred))

Step 5: Combining Ensemble Methods

You can combine multiple ensemble methods to leverage their strengths (stacking or blending):

from sklearn.ensemble import StackingClassifier

estimators = [
('rf', RandomForestClassifier(n_estimators=100, random_state=42)),
('gb', GradientBoostingClassifier(n_estimators=100, random_state=42))
]

stack_model = StackingClassifier(estimators=estimators,
final_estimator=XGBClassifier())
stack_model.fit(X_train, y_train)

# Evaluate stacked model


y_pred = stack_model.predict(X_test)
print("Stacking Accuracy:", accuracy_score(y_test, y_pred))

3. When to Use Random Forest or Gradient Boosting

 Random Forest:
o Handles high-dimensional data well.
o Robust to overfitting and noise.
o Less sensitive to hyperparameter tuning.

 Gradient Boosting:
o Delivers higher accuracy in many cases.
o Requires careful tuning of hyperparameters.
o Better suited for complex patterns in data.
4. Conclusion

To classify images of plants:

1. Extract and preprocess features from images.


2. Train models using Random Forest or Gradient Boosting.
3. Use hyperparameter tuning and evaluation metrics to optimize performance.
4. Combine models using stacking for further performance gains.

By leveraging ensemble methods, you can build a robust and accurate classifier for plant image
recognition.

Answer 5
When dealing with a multilingual text dataset for sentiment analysis, preprocessing is essential to
ensure the text is clean, consistent, and suitable for the model. Below is a detailed step-by-step
guide:

1. Understand the Dataset

 Inspect the data: Examine the structure, languages, and labels.


 Common challenges:
o Multiple languages.
o Variations in text quality (e.g., typos, special characters).
o Imbalanced classes in sentiment labels (e.g., positive, negative, neutral).

2. Steps for Preprocessing

Step 1: Handle Missing Data

 Check for missing or empty reviews and remove them:

import pandas as pd

# Remove rows with missing reviews
data = data.dropna(subset=['review'])

Step 2: Language Detection

 Use a library like langdetect or langid to identify the language of each review:

from langdetect import detect

data['language'] = data['review'].apply(lambda x: detect(x))

 If your model will only support specific languages, filter out unsupported ones:

supported_languages = ['en', 'es', 'fr']
data = data[data['language'].isin(supported_languages)]

Step 3: Text Normalization

 Lowercasing: Convert text to lowercase to reduce variability.

data['review'] = data['review'].str.lower()

 Remove special characters, punctuation, and numbers (note that this pattern also strips accented characters, so adjust it for non-English languages):

import re

data['review'] = data['review'].apply(lambda x: re.sub(r'[^a-zA-Z\s]', '', x))

 Tokenization: Split text into individual words.

from nltk.tokenize import word_tokenize

# Requires the NLTK 'punkt' tokenizer data: nltk.download('punkt')
data['tokens'] = data['review'].apply(lambda x: word_tokenize(x))

Step 4: Stopword Removal

 Remove common words that do not contribute to sentiment (e.g., "and", "the").
 Use language-specific stopword lists:

from nltk.corpus import stopwords

# Requires the NLTK stopword lists: nltk.download('stopwords')
stop_words = set(stopwords.words('english'))  # Adjust for each language
data['tokens'] = data['tokens'].apply(lambda x: [word for word in x if word not in stop_words])

Step 5: Lemmatization

 Reduce words to their base form to normalize variations.
 Use language-specific lemmatizers (the WordNetLemmatizer below covers English only; use tools such as spaCy for other languages):

from nltk.stem import WordNetLemmatizer

# Requires the NLTK WordNet data: nltk.download('wordnet')
lemmatizer = WordNetLemmatizer()
data['tokens'] = data['tokens'].apply(lambda x: [lemmatizer.lemmatize(word) for word in x])

Step 6: Convert Tokens Back to Text

 Reassemble processed tokens into cleaned text for model training:

data['processed_review'] = data['tokens'].apply(lambda x: ' '.join(x))

3. Handling Multilingual Data


A. Translate to a Single Language

 If supporting one language, translate all reviews using an API like Google Translate or Hugging Face Transformers:

from transformers import MarianMTModel, MarianTokenizer

model_name = 'Helsinki-NLP/opus-mt-es-en'  # Spanish to English
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def translate(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    outputs = model.generate(**inputs)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

data['translated_review'] = data['review'].apply(translate)

B. Use Multilingual Embeddings

 Instead of translating, use models that support multiple languages natively (e.g., mBERT, XLM-R); a short sketch follows.
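A minimal sketch of the multilingual-embedding route using the XLM-RoBERTa base model from Hugging Face Transformers; the mean-pooling and one-review-at-a-time processing are simplifications:

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base')
model = AutoModel.from_pretrained('xlm-roberta-base')

def embed(text):
    # Mean-pooled token embeddings as a fixed-size sentence vector
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

data['embedding'] = data['review'].apply(embed)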

4. Vectorize the Text

To feed the text into a machine learning model, convert it into numerical format.

A. Bag-of-Words (BoW)

 Represent text as a vector of word frequencies.

from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(data['processed_review'])

B. Term Frequency-Inverse Document Frequency (TF-IDF)

 Weigh words by importance across documents.

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(data['processed_review'])

C. Pretrained Word Embeddings

 Use pretrained embeddings like GloVe, FastText, or Word2Vec.
 Example with GloVe:

import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-50")
data['embedding'] = data['tokens'].apply(lambda x: [glove[word] for word in x if word in glove])

D. Deep Learning Embeddings

 Use embeddings from transformer models like BERT or RoBERTa:

from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased')
model = BertModel.from_pretrained('bert-base-multilingual-cased')

def get_bert_embeddings(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1)

data['bert_embedding'] = data['processed_review'].apply(get_bert_embeddings)

5. Train and Evaluate the Model

 Split the data into training and test sets:

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

 Train a model (e.g., Logistic Regression, SVM, or a deep learning model):

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train, y_train)

 Evaluate the model:

from sklearn.metrics import classification_report

y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

6. Summary of Preprocessing Steps

1. Handle missing data.


2. Detect and filter languages if necessary.
3. Normalize text (lowercase, remove punctuation, tokenize).
4. Remove stopwords and lemmatize words.
5. Translate text or use multilingual models for consistency.
6. Convert text into numerical features using methods like TF-IDF or embeddings.
