Final PPT PDF
Presenter Names:
Guide Name: DR.B.SRINIVASA RAO
C.Pramod – VU21CSEN0100200
T.Krishna – VU21CSEN0101244
Ch.Harthik – VU21CSEN0101576
M.Deepak – VU21CSEN0101522
G.S.K.Bharath – VU21CSEN0101423
Abstract
Efficient task prioritization in cloud computing is crucial for optimizing resource allocation and system
performance. This study evaluates multiple machine learning algorithms to classify task priority based on key
system metrics, including CPU usage, memory consumption, and power usage. Among the algorithms
examined, Random Forest Classifier, Support Vector Machine (SVM), and Gradient Boosting Machines (GBM)
emerged as the most effective models. Random Forest provides robustness and scalability while offering
feature importance insights. SVM excels in handling non-linear relationships, ensuring precise classification in
structured environments. GBM enhances accuracy through sequential error correction, effectively addressing
misclassified instances. Together, these models demonstrate high accuracy, scalability, and the ability to
capture complex patterns in cloud computing environments. By refining these models, task scheduling
efficiency and overall resource management in cloud systems can be significantly improved.
Introduction
• In today’s cloud computing environments, the efficient management of system resources is crucial to ensuring
optimal performance, reducing operational costs, and minimizing energy consumption.
• As the demand for cloud services continues to grow, so does the complexity of managing tasks across numerous
virtual machines, which in turn impacts task scheduling, energy consumption, and execution time.
• One of the most significant challenges faced by cloud providers is the dynamic allocation of resources, especially
when it comes to prioritizing tasks based on various performance metrics.
• Task prioritization plays a critical role in cloud computing environments, determining which tasks should be
executed first to optimize system performance and minimize resource wastage.
• Traditionally, task priority has been managed using predefined heuristics or simple algorithms, but with the advent
of machine learning (ML), there is an opportunity to improve this process by leveraging large volumes of data
collected from the cloud infrastructure.
Literature Review
Title | Author | Dataset used | Model and Accuracy | About
Hardware Requirements:
RAM:
Minimum: 4 GB (enough for basic development, but it may be slow).
Recommended: 8 GB or more (for smoother performance when testing and running multiple tools).
Processor (CPU):
Minimum: Intel Core i3 (7th gen) or AMD equivalent.
Recommended: Intel Core i5/i7 (10th gen) or AMD Ryzen 5/7 (for better multitasking).
Storage:
Minimum: 128 GB (SSD preferred for speed).
Recommended: 256 GB SSD or more (especially for larger projects or datasets).
Software Requirements:
Operating System:
Minimum: Windows 10, macOS Catalina (10.15), or any modern Linux (Ubuntu 20.04+).
Recommended: Windows 11 or macOS Monterey (12.0+).
Browser:
Works best with the latest versions of Google Chrome, Microsoft Edge, Mozilla Firefox, or Safari (for Mac).
Tools
Data Processing
• The data processing phase begins with importing essential Python libraries, including Matplotlib and Seaborn for data
visualization, Scikit-learn and XGBoost for machine learning, and Pandas for data handling. The dataset, which contains
cloud computing task details, is loaded and explored through exploratory data analysis (EDA) to identify missing values
and preprocess the data accordingly.
• During preprocessing, categorical encoding is applied to the Task_Allocation feature using LabelEncoder, while
numerical features are standardized using StandardScaler to ensure consistent scaling. The dataset is then split into
training (80%) and testing (20%) sets to facilitate model training and evaluation.
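As a rough illustration of the preprocessing described above, a minimal sketch is shown below. The file name dataset.csv matches the one mentioned in the Implementation section, but the target column name Task_Priority is a hypothetical placeholder, since the slides do not name the label column explicitly.

import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split

# Load the cloud task dataset (column names beyond Task_Allocation are assumptions)
df = pd.read_csv("dataset.csv")

# Encode the categorical Task_Allocation feature with LabelEncoder
encoder = LabelEncoder()
df["Task_Allocation"] = encoder.fit_transform(df["Task_Allocation"])

# Separate features and target; "Task_Priority" is a hypothetical label column
X = df.drop(columns=["Task_Priority"])
y = df["Task_Priority"]

# Standardize numerical features so they share a consistent scale
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# 80% training / 20% testing split
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)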
Methodology
2. Machine Learning Model Implementation
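As a sketch of this step, the three classifiers discussed in the evaluation (Random Forest, SVM, and XGBoost) could be instantiated and trained on the split produced during preprocessing; the hyperparameter values below are illustrative rather than the settings used in the project.

from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

# Candidate models for task-priority classification; settings are illustrative
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(kernel="rbf", random_state=42),
    "XGBoost": XGBClassifier(random_state=42),
}

# Fit each model on the 80% training split
for name, model in models.items():
    model.fit(X_train, y_train)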
Methodology
3. Model Evaluation
For task allocation, SVM provides strong performance on structured data but struggles with large datasets, while Random
Forest offers high accuracy and interpretability. However, XGBoost outperforms both due to its iterative boosting
mechanism, which enhances accuracy and efficiency.
For task prioritization, SVR efficiently handles non-linearity, but Random Forest Regressor provides better
robustness and feature importance insights. Ultimately, XGBoost Regressor delivers the best accuracy due to its
advanced boosting strategy, making it the most suitable model for cloud-based task prioritization.
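A minimal sketch of how such a comparison could be run for the classification (task allocation) side, assuming the models dictionary and test split from the previous steps; the regression models would be scored analogously with error metrics such as MAE or R².

from sklearn.metrics import accuracy_score, classification_report

# Score each trained classifier on the held-out 20% test split
for name, model in models.items():
    y_pred = model.predict(X_test)
    print(f"{name}: accuracy = {accuracy_score(y_test, y_pred):.3f}")
    print(classification_report(y_test, y_pred))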
Future improvements could include hyperparameter tuning, feature selection, and real-time deployment using Flask to
enhance cloud computing task scheduling and allocation.
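As one example of the hyperparameter tuning mentioned above, a grid search over an XGBoost classifier might look like the sketch below; the search space is an assumption for illustration, not the project's actual configuration.

from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Illustrative search space; useful ranges depend on the actual dataset
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1],
}

search = GridSearchCV(
    XGBClassifier(random_state=42),
    param_grid,
    scoring="accuracy",
    cv=5,
)
search.fit(X_train, y_train)
print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)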
Implementation
Libraries Used: Pandas for data handling, Scikit-learn and XGBoost for machine learning, and Matplotlib and Seaborn for visualization.
Handling Missing Values: Checked for missing values using isnull().sum() and handled them accordingly (see the sketch after this list).
Loaded dataset (dataset.csv) and selected features (X) and target variable (y).
Split the dataset into training (80%) and testing (20%) using train_test_split().
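The missing-value check mentioned in the first item might look like the short sketch below; dropping incomplete rows is an assumption, since the slides only state that missing values were handled accordingly.

# Count missing values per column, then drop incomplete rows (imputation is an alternative)
print(df.isnull().sum())
df = df.dropna()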
To evaluate the best-performing model, we compared multiple machine learning models based on three key metrics:
References
2. Dean, J., & Ghemawat, S. (2008). MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, 51(1), 107-113.
3. Li, K., Tang, M., Qiu, M., & Liu, K. (2019). Adaptive Resource Allocation in Cloud Computing Using Machine Learning Approaches.
Future Generation Computer Systems, 91, 87-94.
4. Xiao, Z., Song, W., & Chen, Q. (2013). Dynamic Resource Allocation Using Virtual Machines for Cloud Computing Environment. IEEE
Transactions on Parallel and Distributed Systems, 24(6), 1107-1117.
5. Kumar, R., & Thakur, N. (2020). Machine Learning in Cloud Computing: A Systematic Review. Journal of Cloud Computing, 9(1), 1-20.
6. Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., ... & Stoica, I. (2012). Resilient Distributed Datasets: A Fault-
Tolerant Abstraction for In-Memory Cluster Computing. In Proceedings of the USENIX Symposium on Networked Systems Design and
Implementation (NSDI), 2012.
7. Gholami, A., Keutzer, K., & Rastegari, M. (2021). A Survey of Quantization Methods for Efficient Neural Network Inference. Journal of
Machine Learning Research, 22(1), 1-48.
8. Calheiros, R. N., Ranjan, R., Beloglazov, A., De Rose, C. A., & Buyya, R. (2011). CloudSim: A Toolkit for Modeling and Simulation of
Cloud Computing Environments and Evaluation of Resource Provisioning Algorithms. Software: Practice and Experience, 41(1), 23-50.
9. Meng, X., Liu, S., Guo, Y., & Sun, X. (2019). A Learning-Based Algorithm for Dynamic Resource Allocation in Cloud Data Centers.
IEEE Transactions on Cloud Computing, 8(2), 506-518.
Thank You