
Algorithm to Deduce Parameter from Data

Optuna Hyperparameter Optimization Report

1. Introduction

● Objective: The goal of this project was to optimize the hyperparameters for multiple
outlier detection models using Optuna.

2. Dataset

● Description: The dataset used contains two scaled features (OC, IC) for clustering and
outlier detection.
● Data Preparation: Data was scaled and preprocessed as required for each model
type.
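The scaling step above could be sketched as follows. This is an illustrative example, not the project's actual preprocessing code; the column values are synthetic placeholders for the (OC, IC) features.

```python
# Hypothetical sketch: standardizing two columns (OC, IC) before model fitting.
# The data here is randomly generated for illustration.
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=[50.0, 10.0], scale=[5.0, 2.0], size=(200, 2))  # (OC, IC)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # zero mean, unit variance per column
```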

3. Models and Hyperparameters

● DBSCAN:
○ eps: Maximum distance between two samples for one to be
considered as in the neighborhood of the other.
○ min_samples: Minimum number of samples in a neighborhood for a
point to be considered a core point.
● KMeans:
○ n_clusters: Number of clusters to form.
○ init: Method for initialization.
○ max_iter: Maximum number of iterations of the k-means algorithm
for a single run.
● Isolation Forest:
○ n_estimators: Number of base estimators in the ensemble.
○ max_samples: Number of samples to draw to train each base
estimator.
● ABOD:
○ n_neighbors: Number of neighbors to use for the angle-based
calculation.
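For reference, the hyperparameters listed above map onto the model constructors as sketched below. The values shown are arbitrary defaults for illustration, not the tuned values; ABOD (typically from the pyod library) is omitted to keep the sketch scikit-learn only.

```python
# Illustrative mapping of the listed hyperparameters to model constructors.
from sklearn.cluster import DBSCAN, KMeans
from sklearn.ensemble import IsolationForest

dbscan = DBSCAN(eps=0.5, min_samples=5)  # neighborhood radius / core-point size
kmeans = KMeans(n_clusters=3, init="k-means++", max_iter=300, n_init=10)
iforest = IsolationForest(n_estimators=100, max_samples=256, random_state=0)
```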

4. Optimization Methodology

● Optuna Framework: Optuna was used to automate the hyperparameter tuning process.
The study was configured to maximize the outlier detection performance, evaluated
through metrics such as the silhouette score, the Davies-Bouldin index, or other relevant outlier metrics.
● Search Space:
○ DBSCAN: eps and min_samples.
○ KMeans: n_clusters, init, and max_iter.
○ Isolation Forest: n_estimators and max_samples.
○ ABOD: n_neighbors.
● Optimization Algorithm: Optuna's Tree-structured Parzen Estimator (TPE)
was used for efficient exploration of the hyperparameter space.

5. Results

● Best Hyperparameters:
○ DBSCAN:
■ eps: [Optimal Value]
■ min_samples: [Optimal Value]
○ KMeans:
■ n_clusters: [Optimal Value]
■ init: [Optimal Method]
■ max_iter: [Optimal Value]
○ Isolation Forest:
■ n_estimators: [Optimal Value]
■ max_samples: [Optimal Value]
○ ABOD:
■ n_neighbors: [Optimal Value]
● Performance Metrics:
○ [Include relevant performance metrics for each model before and
after optimization.]

6. Conclusion

● Summary: This approach works well for all models except the clustering-based
algorithms, because clustering-based algorithms additionally require a threshold
through which the anomalies are identified, and the tuner cannot choose that threshold on its own.
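The thresholding step that clustering-based detectors need can be sketched as follows: KMeans only assigns clusters, so a distance-to-centroid score must be cut at a manually chosen percentile to flag anomalies. The 95th-percentile cutoff and the synthetic data are illustrative assumptions.

```python
# Hypothetical sketch of the manual threshold a clustering-based detector needs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))  # placeholder data

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
dist = np.min(km.transform(X), axis=1)   # distance to the nearest centroid
threshold = np.percentile(dist, 95)      # cutoff the optimizer cannot pick for us
is_outlier = dist > threshold            # roughly 5% of points flagged
```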
Hyperopt Hyperparameter Optimization Report

1. Introduction

● Objective: The purpose of this project was to optimize the hyperparameters for several
outlier detection models using Hyperopt.

2. Dataset

● Description: The dataset comprises two scaled features (OC, IC) used for clustering and
outlier detection.
● Data Preparation: Data was preprocessed and scaled appropriately for each model.

3. Models and Hyperparameters

● DBSCAN:
○ eps: The maximum distance between two samples for one to be
considered as in the neighborhood of the other.
○ min_samples: The minimum number of samples in a neighborhood
for a point to be considered a core point.
● KMeans:
○ n_clusters: Number of clusters to form.
○ init: Method for initialization.
○ max_iter: Maximum number of iterations of the k-means algorithm.
● Isolation Forest:
○ n_estimators: Number of base estimators in the ensemble.
○ max_samples: Number of samples to draw to train each base
estimator.
● ABOD:
○ n_neighbors: Number of neighbors to use for the angle-based
calculation.

4. Optimization Methodology

● Hyperopt Framework: Hyperopt was used to automate the hyperparameter tuning
process. The objective was to maximize outlier detection performance, assessed
through metrics such as the silhouette score, the Davies-Bouldin index, or other relevant
metrics.
● Search Space:
○ DBSCAN: eps and min_samples.
○ KMeans: n_clusters, init, and max_iter.
○ Isolation Forest: n_estimators and max_samples.
○ ABOD: n_neighbors.
● Optimization Algorithm:
○ Search Algorithm: Hyperopt's Tree-structured Parzen Estimator (TPE) was
employed to efficiently explore the hyperparameter space.
○ Trials: The number of trials conducted to explore different combinations of
hyperparameters.

5. Results

● Best Hyperparameters:
○ DBSCAN:
■ eps: [Optimal Value]
■ min_samples: [Optimal Value]
○ KMeans:
■ n_clusters: [Optimal Value]
■ init: [Optimal Method]
■ max_iter: [Optimal Value]
○ Isolation Forest:
■ n_estimators: [Optimal Value]
■ max_samples: [Optimal Value]
○ ABOD:
■ n_neighbors: [Optimal Value]
● Performance Metrics:
○ [Include relevant performance metrics for each model before and
after optimization.]

6. Conclusion

● Summary: This approach works well for all models except the clustering-based
algorithms, because clustering-based algorithms additionally require a threshold
through which the anomalies are identified, and the tuner cannot choose that threshold on its own.