CS559 Final Report

Team member - Mitul Vardhan (12141070)

GitHub Link: https://github.com/ramanujan123/CS559_PROJECT

Project Title: Implementation and Evaluation of ML-based Caching Algorithm

Introduction

Cache replacement algorithms have undergone significant evolution over time, with
each iteration aiming to mitigate certain deficiencies found in its predecessors.
Nonetheless, despite considerable progress, contemporary caching algorithms still
present opportunities for enhancement. Recent advances in machine learning open up
new and attractive approaches for solving classic problems in computing systems. For
storage systems, cache replacement is one such problem because of its enormous
impact on performance.

There are some learning-based algorithms for caching, such as LeCaR, which has access to
two simple policies, LRU and LFU. LeCaR uses regret minimization, a machine learning
technique that allows the dynamic selection of one of these policies upon a cache miss.
However, LeCaR has been shown to underperform state-of-the-art algorithms such as ARC
and LIRS on many production workloads.
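
To make the idea concrete, here is a minimal sketch of a LeCaR-style weight update, assuming a multiplicative-weights form of regret minimization. The function names, the fixed reward of 1.0, and the learning rate used in the note below are illustrative assumptions and are not taken from the LeCaR implementation.

```python
import math
import random


def update_weights(w_lru, w_lfu, regretted_policy, learning_rate, reward=1.0):
    """Shift weight away from the policy whose earlier eviction caused this miss.

    regretted_policy is "LRU" or "LFU" when the missed item was previously
    evicted by that policy, or None when neither policy is to blame.
    The fixed reward of 1.0 is an assumption made for this sketch.
    """
    if regretted_policy == "LRU":
        # LRU made the bad eviction, so boost LFU.
        w_lfu *= math.exp(learning_rate * reward)
    elif regretted_policy == "LFU":
        # LFU made the bad eviction, so boost LRU.
        w_lru *= math.exp(learning_rate * reward)
    total = w_lru + w_lfu
    return w_lru / total, w_lfu / total  # normalized so the weights sum to 1


def choose_policy(w_lru, rng=random):
    """Sample the policy that performs the next eviction; w_lru is P(LRU)."""
    return "LRU" if rng.random() < w_lru else "LFU"
```

Starting from equal weights, `update_weights(0.5, 0.5, "LRU", 0.45)` shifts weight towards LFU, so LFU becomes more likely to be chosen for the next eviction.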

In this project, an attempt has been made to design an adaptive learning cache
replacement algorithm that performs better than the state of the art. Simulation
experiments have been conducted on industry-level workloads, and the findings are
presented in this report. In addition, a mirror of wikipedia.org was made by running a
caching Nginx proxy in front of it. This helps reduce latency and improves the
availability of wiki content, even during DDoS attacks.

Design of Learning Based Caching Algorithm

● Initialization:
○ Initialize the cache with a size of N.
○ Initialize the learning rate randomly between 10^-3 and 1.
○ Define the window size for monitoring performance (N requests).
● Gradient-based Stochastic Hill Climbing with Random Restart:
○ At the end of every window of N requests, calculate the gradient of
performance (average hit rate) with respect to the learning rate over the
previous two windows.
○ Use this gradient in a stochastic hill-climbing approach with random
restart to adjust the learning rate.
● Direction of Learning Rate Adjustment:
○ If the last change to the learning rate improved the hit rate, keep moving
the learning rate in the same direction. If it reduced the hit rate, reverse
the direction.
○ Equivalently, a positive gradient means the learning rate is increased and
a negative gradient means it is decreased.
● Magnitude of Learning Rate Adjustment:
○ The amount by which the learning rate changed in the previous window
determines the magnitude of the change for the next window.
○ The step size is the current learning rate multiplied by the previous
window's change in learning rate. If increasing the learning rate improved
performance, the learning rate is increased again by this step, and vice
versa.
● Monitoring Performance:
○ If the learning rate does not change between consecutive windows and the
performance degrades or is zero, record this.
○ If the performance degrades for 10 consecutive windows, reset the
learning rate to its initial value.
○ The objective is to restart learning when performance drops for a
prolonged period.
● Algorithm Loop:
○ Repeat the steps from gradient calculation through performance monitoring
for each window of N requests until the termination condition is met.
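
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the project's repository: the class and method names, the clamping of the learning rate to [10^-3, 1], treating a flat hit rate as non-improving, and starting the previous learning rate at 0 so that the first gradient is defined are all assumptions.

```python
import random

LR_MIN, LR_MAX = 1e-3, 1.0   # bounds on the learning rate (upper clamp is assumed)
RESTART_AFTER = 10           # consecutive bad windows before a restart


class LearningRateTuner:
    """Adjusts the learning rate once per window of requests."""

    def __init__(self, initial_lr=None, rng=random):
        if initial_lr is None:
            initial_lr = rng.uniform(LR_MIN, LR_MAX)  # random initialization
        self.initial_lr = initial_lr
        self.lr = initial_lr
        self.prev_lr = 0.0        # assumption: makes the first change non-zero
        self.prev_hit_rate = 0.0
        self.bad_windows = 0

    def end_of_window(self, hit_rate):
        """Call with the average hit rate of the window that just ended."""
        delta_hr = hit_rate - self.prev_hit_rate
        delta_lr = self.lr - self.prev_lr
        new_lr = self.lr
        if delta_lr != 0:
            # Direction: the sign of the gradient d(hit rate) / d(learning rate).
            sign = 1 if delta_hr / delta_lr > 0 else -1
            # Magnitude: current learning rate times its previous change.
            new_lr = self.lr + sign * abs(self.lr * delta_lr)
            new_lr = min(max(new_lr, LR_MIN), LR_MAX)
            self.bad_windows = 0
        else:
            # Learning rate is stuck: count windows with zero or falling hit rate.
            if hit_rate == 0 or delta_hr <= 0:
                self.bad_windows += 1
            if self.bad_windows >= RESTART_AFTER:
                new_lr = self.initial_lr  # restart from the initial value
                self.bad_windows = 0
        self.prev_lr, self.lr = self.lr, new_lr
        self.prev_hit_rate = hit_rate
        return self.lr
```

A cache simulator would call `end_of_window` once every N requests and use the returned value as the learning rate for the next window.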

Please refer to the GitHub repo for the full implementation of the algorithm.

Computer System Design

In this project, a learning-based caching algorithm has been designed and is used to
cache wikipedia.org using a virtual private server (VPS).
The high-level design of the system is given below:

[Figure: high-level design of the system]

Wikipedia's infrastructure provides regular database dumps and static HTML archives to
the public, and has permissive licensing that allows for rehosting with modification. In
this project, these publicly available dumps have been used to host Wikipedia mirrors.

An Nginx caching proxy was set up in front of wikipedia.org to serve clients. If the
response to a user's request is present in the cache, it is served directly by the
proxy. Otherwise, the request is forwarded to the original Wikipedia server. This
reduces latency and improves the user experience. It also has the benefit that if a
DDoS attack hits the original Wikipedia servers, a user's request can still be served
from the proxy server, provided the content is present in the cache.
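
The hit and miss behavior described above can be sketched in Python. This only illustrates the control flow and is not the Nginx configuration used in the project: `CachingProxy`, `OriginUnavailable`, and the `fetch_from_origin` callback are hypothetical names, and plain LRU eviction stands in for whatever policy the real cache applies.

```python
from collections import OrderedDict


class OriginUnavailable(Exception):
    """Raised by the (hypothetical) fetch callback when the origin is unreachable."""


class CachingProxy:
    """Serves cached pages directly and forwards misses to the origin."""

    def __init__(self, fetch_from_origin, capacity=2):
        self.fetch_from_origin = fetch_from_origin  # callable: path -> page body
        self.capacity = capacity
        self.cache = OrderedDict()                  # path -> body, oldest first

    def get(self, path):
        if path in self.cache:
            # Cache hit: the origin is not contacted at all.
            self.cache.move_to_end(path)
            return self.cache[path], "HIT"
        # Cache miss: forward to the origin (may raise OriginUnavailable).
        body = self.fetch_from_origin(path)
        self.cache[path] = body
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)          # evict least recently used
        return body, "MISS"
```

Because a hit never touches the origin, pages already in the cache stay available while the origin is down, whereas uncached pages fail until it recovers.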

Note: Because no cloud platform was available, the proxy server in this project was
set up on a virtual machine (VM) running on a local machine.

Simulation Experiment and Comparison with SOTA Algorithms:

The algorithm was simulated on the following workloads:

1. MSR Cambridge
2. FIU SRC_Map
3. CloudVPS

The results are shown below:

[Figure: learning rate initialized at 0.05, cache sizes 7 and 10]

[Figure: learning rate initialized at 0.05, cache sizes 15 and 20]

[Figure: learning rate initialized at 0.1, cache sizes 7 and 10]

[Figure: learning rate initialized at 0.1, cache sizes 15 and 20]

[Figure: learning rate initialized at 0.3, cache sizes 7 and 10]

[Figure: learning rate initialized at 0.3, cache sizes 15 and 20]

These results are for the MSR Cambridge workload. The learning-based algorithm was
compared with several state-of-the-art algorithms at initial learning rates of 0.05,
0.1, and 0.3 and cache sizes of 7, 10, 15, and 20. In these runs, the learning-based
caching algorithm performed better than many of the standard algorithms.
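
A simulation of this kind comes down to replaying a trace of block requests through a cache and counting hits. The sketch below does this for a plain LRU baseline over the cache sizes used above. The function names and the toy trace in the note that follows are assumptions, and the sketch reproduces none of the reported results.

```python
from collections import OrderedDict


def lru_hit_rate(trace, cache_size):
    """Replay a list of block ids through an LRU cache and return the hit rate."""
    cache = OrderedDict()  # block id -> True, least recently used first
    hits = 0
    for block in trace:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits / len(trace) if trace else 0.0


def sweep(trace, cache_sizes=(7, 10, 15, 20)):
    """Hit rate of the LRU baseline at each cache size."""
    return {size: lru_hit_rate(trace, size) for size in cache_sizes}
```

For instance, `sweep([1, 2, 3, 1, 2, 3])` replays a six-request toy trace at each of the four cache sizes. A real run would read the request sequence from a workload file and repeat the sweep for every algorithm being compared.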

Note: All these results are also available in the GitHub repo.

Operational Issues Faced

1. Reward Design and Function:

Designing an effective reward function was challenging, because a poorly crafted
reward can lead to suboptimal behavior.
Solution: Experimented with different reward shaping techniques to provide
more informative and stable feedback to the RL agent during training.

2. Training Stability and Convergence:

RL algorithms are sensitive to hyperparameters, training setup, and
environmental dynamics, leading to instability or slow convergence during
training.
Solution: Regularized the learning model with techniques like gradient clipping
to prevent divergence or instability.

3. Dependency on Local Virtual Machines due to Lack of Cloud Computing Platform Access:

Cloud platforms require credit card credentials for provisioning Virtual Private
Servers (VPS) and could not be accessed, which necessitated the use of local virtual
machines (VMs) to host the server environment. This introduced additional
complexities related to resource management, networking configuration and
security.
Solution: Optimized resource allocation and management for both the host and the
guest to ensure smooth operation without resource contention.

REFERENCES

[1] Y. Im, P. Prahladan, T. H. Kim, Y. G. Hong and S. Ha, "SNN-cache: A practical machine
learning-based caching system utilizing the inter-relationships of requests," 2018 52nd Annual
Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, 2018, pp. 1-6,
doi: 10.1109/CISS.2018.8362281.

[2] N. Zaidenberg, L. Gavish and Y. Meir, "New caching algorithms performance evaluation,"
2015 International Symposium on Performance Evaluation of Computer and Telecommunication
Systems (SPECTS), Chicago, IL, USA, 2015, pp. 1-7, doi: 10.1109/SPECTS.2015.7285291.

[3] L. V. Rodriguez, F. B. Yusuf, S. Lyons, E. Paz, R. Rangaswami, J. Liu, M. Zhao and
G. Narasimhan, "Learning cache replacement with CACHEUS," in FAST, 2021, pp. 341–354.

[4] J. Chen, N. Sharma, T. Khan, S. Liu, B. Chang, A. Akella, S. Shakkottai and
R. K. Sitaraman, "Darwin: Flexible learning-based CDN caching," in Proceedings of the ACM
SIGCOMM 2023 Conference (ACM SIGCOMM '23), Association for Computing Machinery, New York,
NY, USA, 2023, pp. 981–999, https://doi.org/10.1145/3603269.3604863

[5] https://docs.sweeting.me/s/self-host-a-wikipedia-mirror and
https://github.com/pirate/wikipedia-mirror (used for creating the Nginx server to host wikipedia.org)

[6] https://learn.microsoft.com/en-us/azure/cdn/media/cdn-overview/cdn-overview.png
