
Performance Optimization of Cloud-Based Applications Using Machine Learning

Presenter Names:
C.Pramod – VU21CSEN0100200
T.Krishna – VU21CSEN0101244
Ch.Harthik – VU21CSEN0101576
M.Deepak – VU21CSEN0101522
G.S.K.Bharath – VU21CSEN0101423

Guide Name: Dr. B. Srinivasa Rao

CSE, GST, Visakhapatnam


Contents
1. Abstract
2. Introduction
3. Literature Review
4. Requirement Analysis
5. Methodology / Tools / Methods to be used
6. Implementation
7. Screenshots of the project
8. Results & Discussions
9. Conclusion and Future Scope
10. References



Abstract

Efficient task prioritization in cloud computing is crucial for optimizing resource allocation and system
performance. This study evaluates multiple machine learning algorithms to classify task priority based on key
system metrics, including CPU usage, memory consumption, and power usage. Among the algorithms
examined, Random Forest Classifier, Support Vector Machine (SVM), and Gradient Boosting Machines (GBM)
emerged as the most effective models. Random Forest provides robustness and scalability while offering
feature importance insights. SVM excels in handling non-linear relationships, ensuring precise classification in
structured environments. GBM enhances accuracy through sequential error correction, effectively addressing
misclassified instances. Together, these models demonstrate high accuracy, scalability, and the ability to
capture complex patterns in cloud computing environments. By refining these models, task scheduling
efficiency and overall resource management in cloud systems can be significantly improved.
Introduction
• In today’s cloud computing environments, the efficient management of system resources is crucial to ensuring
optimal performance, reducing operational costs, and minimizing energy consumption.
• As the demand for cloud services continues to grow, so does the complexity of managing tasks across numerous
virtual machines, which in turn impacts task scheduling, energy consumption, and execution time.
• One of the most significant challenges faced by cloud providers is the dynamic allocation of resources, especially
when it comes to prioritizing tasks based on various performance metrics.
• Task prioritization plays a critical role in cloud computing environments, determining which tasks should be
executed first to optimize system performance and minimize resource wastage.
• Traditionally, task priority has been managed using predefined heuristics or simple algorithms, but with the advent
of machine learning (ML), there is an opportunity to improve this process by leveraging large volumes of data
collected from the cloud infrastructure.
Literature Review

Title: Design of an Improved Method for Task Scheduling Using Proximal Policy Optimization and Graph Neural Networks (2023)
Author: Gokulnath B.V.
Dataset used: Simulated cloud computing environment
Model and Accuracy: Reported a 20-25% enhancement in task prioritization
About: Proposes an integrated scheduling framework combining Proximal Policy Optimization and Graph Neural Networks to improve task scheduling in cloud computing.

Title: Simulated Annealing Approach to Cost-Based Multi-Quality of Service Job Scheduling in Cloud Computing Environment (2021)
Author: Monir Abdullah, Mohamed Othman
Dataset used: Not specified
Model and Accuracy: Not specified
About: Introduces a Simulated Annealing algorithm for job scheduling in cloud environments, aiming to optimize cost and quality of service parameters.
Literature Review

Title: Using Logistic Regression to Improve Virtual Machines Management in Cloud Computing Systems (2022)
Author: Issa, M. B., Daraghmeh, M., Jararweh, Y., Al-Ayyoub, M., Alsmirat, M., & Benkhelifa
Dataset used: Synthetic workload data generated using CloudSim
Model and Accuracy: Achieved superior performance compared to baseline methods
About: The study does not use a real-world dataset but instead relies on a randomly generated workload with uniform distribution. The host utilization history is used as input for the Logistic Regression-based overloading prediction algorithm, and the upper utilization threshold is determined using the Median Absolute Deviation (MAD) method.

Title: Resource Optimization and Task Scheduling Using Logistic Regression for Cloud Computing (2021)
Author: Kalaiselvi, Dr. A.Chandrabose
Dataset used: No dataset
Model and Accuracy: Concentrates on the encryption process
About: This project develops a resource-optimized task scheduling algorithm using a logistic regression-based deep recurrent network in green cloud computing, integrating RSA encryption for secure data transmission.
Requirement Analysis
Hardware Requirements:

RAM:
Minimum: 4 GB (enough for basic development, but it may be slow).
Recommended: 8 GB or more (for smoother performance when testing and running multiple tools).

Processor (CPU):
Minimum: Intel Core i3 (7th gen) or AMD equivalent.
Recommended: Intel Core i5/i7 (10th gen) or AMD Ryzen 5/7 (for better multitasking).

Storage:
Minimum: 128 GB (SSD preferred for speed).
Recommended: 256 GB SSD or more (especially for larger projects or datasets).

Software Requirements:
Operating System:
Minimum: Windows 10, macOS Catalina (10.15), or any modern Linux (Ubuntu 20.04+).
Recommended: Windows 11 or macOS Monterey (12.0+).

Browser:
Works best with the latest versions of Google Chrome, Microsoft Edge, Mozilla Firefox, or Safari (for Mac).
Tools

1. Model Development & Experimentation


• Use VS Code for data preprocessing, feature selection, and training machine learning models.
• Use libraries such as Scikit-learn, XGBoost, and TensorFlow for model training and evaluation.
• Use Matplotlib and Seaborn for data visualization to understand feature importance and model performance (a feature-importance sketch follows below).
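
As an illustration of the visualization step, here is a minimal sketch that trains a Random Forest on a small synthetic set of system metrics and plots its feature importances with Seaborn; the column names and data are placeholders, not the project's dataset.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.ensemble import RandomForestClassifier
# Toy system-metric data standing in for the real dataset (illustrative only).
rng = np.random.default_rng(42)
X = pd.DataFrame({
    "cpu_usage": rng.random(200),
    "memory_usage": rng.random(200),
    "power_usage": rng.random(200),
})
y = (X["cpu_usage"] + X["power_usage"] > 1.0).astype(int)  # toy priority label
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)
# Sort importances so the most influential metric is easy to spot in the bar chart.
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values()
sns.barplot(x=importances.values, y=importances.index)
plt.title("Feature importance (Random Forest)")
plt.xlabel("Importance")
plt.tight_layout()
plt.show()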

2. Flask for API


• Once the model is trained, save it using joblib or pickle.
• Create a Flask API to serve the model and make predictions in real time.
• Use Flask-RESTful to structure the API efficiently (a plain-Flask sketch follows below).
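
A minimal sketch of such a prediction endpoint, using plain Flask rather than Flask-RESTful; the file name model.joblib, the /predict route, and the feature order are illustrative assumptions.

import joblib
from flask import Flask, jsonify, request
app = Flask(__name__)
model = joblib.load("model.joblib")                       # model saved earlier with joblib.dump
FEATURES = ["cpu_usage", "memory_usage", "power_usage"]   # assumed input feature order
@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"cpu_usage": 0.7, "memory_usage": 0.4, "power_usage": 0.6}
    payload = request.get_json(force=True)
    row = [[payload[name] for name in FEATURES]]          # 2D array for scikit-learn
    prediction = model.predict(row)[0]
    return jsonify({"prediction": float(prediction)})
if __name__ == "__main__":
    app.run(debug=True)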
Methodology

Cloud Computing Task Management Using Machine Learning


This module implements machine learning models for cloud computing task management, focusing on:
1. Task Allocation (Classification): determines whether a task should be allocated to a cloud resource.
2. Task Prioritization (Regression): predicts task priority levels for efficient scheduling.
The workflow consists of data preprocessing, model implementation, and evaluation.

Data Processing
• The data processing phase begins with importing essential Python libraries, including Matplotlib and Seaborn for data
visualization, Scikit-learn and XGBoost for machine learning, and Pandas for data handling. The dataset, which contains
cloud computing task details, is loaded and explored through exploratory data analysis (EDA) to identify missing values
and preprocess the data accordingly.

• During preprocessing, categorical encoding is applied to the Task_Allocation feature using LabelEncoder, while
numerical features are standardized using StandardScaler to ensure consistent scaling. The dataset is then split into
training (80%) and testing (20%) sets to facilitate model training and evaluation.
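
A minimal sketch of this preprocessing flow, assuming the data lives in dataset.csv and that the metric column names shown are placeholders:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
df = pd.read_csv("dataset.csv")
print(df.isnull().sum())                          # basic EDA: missing values per column
# Encode the categorical target and keep the numeric system metrics as features.
df["Task_Allocation"] = LabelEncoder().fit_transform(df["Task_Allocation"])
feature_cols = ["cpu_usage", "memory_usage", "power_usage"]   # assumed column names
X = df[feature_cols]
y = df["Task_Allocation"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)           # fit the scaler on training data only
X_test = scaler.transform(X_test)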
Methodology
2. Machine Learning Model Implementation
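
Below is a minimal sketch of the two model families described in this work: SVM, Random Forest, and XGBoost classifiers for task allocation, and SVR, Random Forest Regressor, and XGBoost Regressor for task prioritization. Variable names such as X_train, y_train, and y_priority_train follow the preprocessing sketch above and are assumptions.

from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.svm import SVC, SVR
from xgboost import XGBClassifier, XGBRegressor
# Task allocation (classification).
classifiers = {
    "SVM": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss", random_state=42),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
# Task prioritization (regression); y_priority_train is the assumed priority target.
regressors = {
    "SVR": SVR(kernel="rbf"),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "XGBoost": XGBRegressor(n_estimators=100, random_state=42),
}
for name, reg in regressors.items():
    reg.fit(X_train, y_priority_train)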
Methodology
3. Model Evaluation

Classification Evaluation Metrics


The classification models are assessed using several performance metrics:
• Accuracy: Measures overall correctness.
• Precision: Evaluates the proportion of correct positive predictions.
• Recall: Assesses the ability to identify actual positives.
• F1 Score: Balances precision and recall for a comprehensive performance measure.
• Confusion Matrix: Visualizes correct versus incorrect classifications.
These metrics help determine which model provides the most reliable predictions for task allocation.
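
A minimal sketch of computing these classification metrics with scikit-learn, assuming clf is one of the fitted classifiers and X_test/y_test come from the earlier split:

from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
y_pred = clf.predict(X_test)
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="weighted"))
print("Recall   :", recall_score(y_test, y_pred, average="weighted"))
print("F1 score :", f1_score(y_test, y_pred, average="weighted"))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))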

Regression Evaluation Metrics


The regression models are evaluated using:
• Mean Absolute Error (MAE): Measures average prediction error.
• Mean Squared Error (MSE): Penalizes larger errors to emphasize significant mispredictions.
• R² Score: Indicates how well the model explains the variance in task priority.
By analyzing these metrics, the best-performing regression model for task prioritization is identified.
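
A minimal sketch of the corresponding regression metrics, assuming reg is one of the fitted regressors and y_priority_test holds the true priority levels:

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
y_pred = reg.predict(X_test)
print("MAE:", mean_absolute_error(y_priority_test, y_pred))
print("MSE:", mean_squared_error(y_priority_test, y_pred))
print("R2 :", r2_score(y_priority_test, y_pred))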
Methodology

4. Results and Conclusion

For task allocation, SVM provides strong performance on structured data but struggles with large datasets, while Random
Forest offers high accuracy and interpretability. However, XGBoost outperforms both due to its iterative boosting
mechanism, which enhances accuracy and efficiency.

For task prioritization, SVR efficiently handles non-linearity, but Random Forest Regressor provides better
robustness and feature importance insights. Ultimately, XGBoost Regressor delivers the best accuracy due to its
advanced boosting strategy, making it the most suitable model for cloud-based task prioritization.
Future improvements could include hyperparameter tuning, feature selection, and real-time deployment using Flask to
enhance cloud computing task scheduling and allocation.
Implementation
Libraries Used

1. Matplotlib & Seaborn – For data visualization and analysis.
2. Pandas – For data manipulation and preprocessing.
3. Scikit-learn (sklearn):
   1. train_test_split – Splitting the dataset into training & testing sets.
   2. StandardScaler – Normalizing data for better model performance.
   3. SVR, RandomForestRegressor – Machine learning models used for prediction.
   4. mean_absolute_error, mean_squared_error, r2_score – Model evaluation metrics.
4. XGBoost (XGBRegressor) – Gradient boosting model for better predictive accuracy.


Implementation
Preprocessing

Handling Missing Values: Checked for missing values using isnull().sum() and handled them accordingly.

Data Cleaning: Removed or imputed missing data if necessary.


Implementation
Model Training & Evaluation

Loaded dataset (dataset.csv) and selected features (X) and target variable (y).

Split the dataset into training (80%) and testing (20%) using train_test_split().

The model was trained on X_train, y_train.


Implementation
Predictions & Evaluation:

The trained model made predictions on X_test.

Performance was measured using:


• Mean Absolute Error (MAE) – Measures average absolute difference.
• Mean Squared Error (MSE) – Penalizes large errors more than MAE.
• R² Score – Measures how well the model explains variance in data.
Implementation

Comparison of Different Models

To evaluate the best-performing model, we compared multiple machine learning models based on three key metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), and R² Score.
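
A minimal sketch of such a comparison, looping over the fitted regressors from the earlier sketch and collecting the three metrics into a table; all variable names are assumptions:

import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
rows = []
for name, reg in regressors.items():              # fitted regressors from the earlier sketch
    y_pred = reg.predict(X_test)
    rows.append({
        "Model": name,
        "MAE": mean_absolute_error(y_priority_test, y_pred),
        "MSE": mean_squared_error(y_priority_test, y_pred),
        "R2": r2_score(y_priority_test, y_pred),
    })
comparison = pd.DataFrame(rows).sort_values("R2", ascending=False)
print(comparison.to_string(index=False))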


Implementation
Screenshots

Results & Discussion
Conclusion
• In conclusion, after evaluating various machine learning algorithms for task prioritization in cloud computing environments,
Random Forest Classifier, Support Vector Machine (SVM), and Gradient Boosting Machines (GBM) emerge as the best-suited
algorithms for this task.
• Each of these models excels in handling complex, high-dimensional datasets like the one at hand, where system performance
metrics such as CPU usage, memory usage, and power consumption are used to predict task priority.
• Random Forest offers robustness and scalability, making it ideal for large datasets while providing feature importance insights.
SVM, with its ability to handle non-linear relationships, provides accurate classification in environments where clear decision
boundaries exist.
• Gradient Boosting, with its sequential error-correction mechanism, significantly boosts accuracy by focusing on misclassified
instances. Together, these algorithms offer high accuracy, scalability, and the ability to capture intricate patterns in cloud
system performance, making them the optimal choice for task priority classification.
• By refining these models further, we can enhance task scheduling efficiency and resource management in cloud computing
systems.
Future Scope
The study on machine learning-based task prioritization in cloud computing opens several avenues for future research
and development:
1. Model Optimization – Further hyperparameter tuning and feature selection techniques can enhance the performance and efficiency of the selected models (Random Forest, SVM, and GBM).
2. Deep Learning Integration – Exploring deep learning approaches such as neural networks and reinforcement learning can improve accuracy and adaptability in dynamic cloud environments.
3. Real-time Task Scheduling – Implementing these models in real-time cloud environments can enhance task scheduling, reducing latency and optimizing resource utilization.
4. Scalability and Adaptability – Investigating how these models perform in large-scale cloud infrastructures with varying workloads and resource constraints will help in their practical deployment.
References
1. Buyya, R., Calheiros, R. N., & Dastjerdi, A. V. (2016). Fog and Edge Computing: Principles and Paradigms. Wiley.

2. Dean, J., & Ghemawat, S. (2008). MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, 51(1),
107-113.

3. Li, K., Tang, M., Qiu, M., & Liu, K. (2019). Adaptive Resource Allocation in Cloud Computing Using Machine Learning Approaches.
Future Generation Computer Systems, 91, 87-94.

4. Xiao, Z., Song, W., & Chen, Q. (2013). Dynamic Resource Allocation Using Virtual Machines for Cloud Computing Environment. IEEE
Transactions on Parallel and Distributed Systems, 24(6), 1107-1117.

5. Kumar, R., & Thakur, N. (2020). Machine Learning in Cloud Computing: A Systematic Review. Journal of Cloud Computing, 9(1), 1-20.

6. Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., ... & Stoica, I. (2012). Resilient Distributed Datasets: A Fault-
Tolerant Abstraction for In-Memory Cluster Computing. In Proceedings of the USENIX Symposium on Networked Systems Design and
Implementation (NSDI), 2012.

7. Gholami, A., Keutzer, K., & Rastegari, M. (2021). A Survey of Quantization Methods for Efficient Neural Network Inference. Journal of
Machine Learning Research, 22(1), 1-48.

8. Calheiros, R. N., Ranjan, R., Beloglazov, A., De Rose, C. A., & Buyya, R. (2011). CloudSim: A Toolkit for Modeling and Simulation of
Cloud Computing Environments and Evaluation of Resource Provisioning Algorithms. Software: Practice and Experience, 41(1), 23-50.

9. Meng, X., Liu, S., Guo, Y., & Sun, X. (2019). A Learning-Based Algorithm for Dynamic Resource Allocation in Cloud Data Centers.
IEEE Transactions on Cloud Computing, 8(2), 506-518.
Thank You

CSE, GST, Visakhapatnam
