Architecting Scalable AI RAG Systems
Experiences of tomorrow.
Engineered together.
We transform how people experience the
business. All through next generation technology.
What we do:
● Product Engineering
● Intelligent Automation
● Data & Analytics
2002 founded
4000+ professionals
20+ offices
300+ clients
Our Global Delivery Centres
Global Reach, Local Insight - Ciklum bridges the best in tech from the three key IT regions
● Europe: Bulgaria, Czech Republic, Poland, Romania, Slovakia, Spain, Ukraine, United Kingdom
● Asia: India, Pakistan
● LATAM: Argentina, Uruguay
Our speakers
● AI Tech Lead with over 11 years of hands-on experience in Telecom, Fintech, and Aerospace. He specializes in AI, data integrity, fraud detection, system performance, and architecting frameworks and solutions for real-time systems. Develops an AI upskill program for 300 engineers at Ciklum.
● Certified Professional Machine Learning Expert with 7 years of commercial experience in developing Machine Learning projects from the ground up to delivery into the Cloud (AWS, 5+ years of experience). Has worked and delivered primarily for customers from the S&P 500.
● Tech enthusiast specializing in Node.js, SQL/NoSQL and Cloud technologies with 5+ years of experience. Hands-on experience in projects across outsourcing and product companies, contributing to the development of in-house products, smart chatbots, and voicebots by leveraging different AI technologies.
Playing in all parts of the AI stack
User Experiences & Engagement Emerging Stack Trends Partners
Agenda
01 What is RAG
05 Build with JavaScript
Session’s Tech map
FAISS
Infrastructure
What is RAG
LLM Wrapper
● Build with Java
● Deploy locally
● Integrate a 3rd party client
Why do we need RAG?
AWS
● Build with Python
● Build Docker images
● Semantic search with FAISS
● Deploy on AWS
Data Chunking and LLMs
LLMs also have a limited capacity for context.
Just as humans cannot digest unlimited context, these models have a specific size limit for the content they
can process.
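Because of this size limit, documents are usually split into chunks before they are embedded and passed to the model. The following is a minimal sketch of fixed-size chunking with overlap. It assumes whitespace-separated words are an acceptable stand-in for tokens (real tokenizers count differently), and `chunk_text`, `max_words`, and `overlap` are illustrative names rather than part of any library used in this session.

```python
def chunk_text(text: str, max_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into overlapping chunks of at most max_words words.

    Words are only a rough proxy for tokens (an assumption of this sketch).
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, at the cost of storing some words twice.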
JavaScript
● Build with TypeScript
● Semantic Search with pgvector
Embeddings. Similarity
● Embeddings
Numerical representations of concepts, in a high-dimensional space,
capturing semantic meaning.
● Similarity:
○ Lexical: entities are alike in appearance
○ Semantic: entities are alike in meaning
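The difference between the two kinds of similarity can be sketched in a few lines. The example below uses cosine similarity over embedding vectors for the semantic case and Jaccard overlap of word sets for the lexical case. Both measures are common choices picked here for illustration, not necessarily the ones used in the session's demos, and the three-dimensional `toy_embeddings` are hand-made values, not output from a real embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Semantic similarity: cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(s1: str, s2: str) -> float:
    """Lexical similarity: shared words divided by all distinct words."""
    t1, t2 = set(s1.lower().split()), set(s2.lower().split())
    if not t1 and not t2:
        return 1.0
    return len(t1 & t2) / len(t1 | t2)


# Hypothetical, hand-made vectors standing in for real model embeddings.
toy_embeddings = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}

semantic = cosine_similarity(toy_embeddings["car"], toy_embeddings["automobile"])
lexical = jaccard_similarity("car", "automobile")
```

With these toy values, "car" and "automobile" share no words, so their lexical score is zero, while their vectors point in nearly the same direction, so their semantic score is close to one.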
Azure
● Deploy on Azure
● Semantic Search with Qdrant
● Conversation history
RAG Architecture
Benefits of RAG
QA & Testing
● SW characteristics
● Top 5 risks
● Methods and tools
● Balanced success factors
Software Characteristics
ISO 25010 Product Quality Model
● Functional Suitability
● Performance Efficiency
● Compatibility
● Usability
● Reliability
● Security
● Maintainability
● Portability
AI-specific Characteristics
Top 5 current shortcomings and risks
● Ethical & Bias Concerns
● Loss of Control
● Lack of Transparency
● Dynamic Learning
● Hallucinations
Testing / QA
Methods & tools
that help us mitigate risks and ensure proper testing of AI
● Experience-based testing
● Stress testing
● Pairwise testing
● Transfer learning testing
● Black-box testing
● Exploratory testing
● Robustness testing
● Combinatorial testing
● Metamorphic testing
Balanced success factors
Some essential elements that should be considered when verifying AI systems:
● Know the algorithm
● Test the algorithm
Context optimization
What is a good prompt
Components: Instruction, Context, Role, Formatting, Tone, Examples

Act as an experienced Learning specialist. I need to improve my upselling skills. Prepare an educational program for me to improve those skills. The program should run for 2 months with 4 hours of effort per week.
Please provide the answer in the following format:
Topic: Name
● blocks
● …
Books:
Example:
Topic: Negotiation basics
● Win-win strategy
● Active listening strategy
Books: "Getting to Yes" by Roger Fisher and William Ury
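The components named on this slide (role, instruction, context, formatting, tone, examples) can also be assembled in code, so that each part is filled in deliberately rather than written ad hoc. The sketch below is one possible way to do that. `PromptParts` and `build_prompt` are hypothetical names, and the section labels it emits ("Context:", "Tone:", and so on) are an assumption, not a format any model requires.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """The building blocks of a prompt; only role and instruction are required."""
    role: str
    instruction: str
    context: str = ""
    formatting: str = ""
    tone: str = ""
    examples: list[str] = field(default_factory=list)


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty parts into one prompt, separated by blank lines."""
    sections = [f"Act as {parts.role}.", parts.instruction]
    if parts.context:
        sections.append(f"Context: {parts.context}")
    if parts.tone:
        sections.append(f"Tone: {parts.tone}")
    if parts.formatting:
        sections.append(f"Please format the answer as follows:\n{parts.formatting}")
    for example in parts.examples:
        sections.append(f"Example:\n{example}")
    return "\n\n".join(sections)


prompt = build_prompt(PromptParts(
    role="an experienced Learning specialist",
    instruction="Prepare an educational program to improve my upselling skills.",
    context="2 months, 4 hours of effort per week.",
    formatting="Topic: Name\n- blocks\nBooks:",
    examples=["Topic: Negotiation basics\n- Win-win strategy\n- Active listening strategy"],
))
```

Keeping the parts separate makes it easy to vary one of them, for example the tone or the examples, while the rest of the prompt stays fixed.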
Prompt tactics
● * Shot Prompting (zero-, one-, few-shot)
Zero:
Add 2+2:
One:
Add 3+3: 6
Add 2+2:
Few:
Add 3+3: 6
Add 5+5: 10
Add 2+2:
● Model-guided prompting
Before answering, I want you to first ask for any extra information that helps you produce a better answer.
If you have no questions, please provide an answer instead.
● Self-evaluating prompting
Can this program be improved?
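The zero-, one-, and few-shot prompts on this slide differ only in how many solved examples come before the final question. A minimal sketch of that idea follows. `build_shot_prompt` is a hypothetical helper, and the one-line-per-example layout simply mirrors the slide.

```python
def build_shot_prompt(examples: list[tuple[str, str]], query: str) -> str:
    """Prefix the query with solved examples, one per line.

    An empty list gives a zero-shot prompt, one pair a one-shot prompt,
    and several pairs a few-shot prompt.
    """
    lines = [f"{question} {answer}" for question, answer in examples]
    lines.append(query)
    return "\n".join(lines)


zero_shot = build_shot_prompt([], "Add 2+2:")
one_shot = build_shot_prompt([("Add 3+3:", "6")], "Add 2+2:")
few_shot = build_shot_prompt(
    [("Add 3+3:", "6"), ("Add 5+5:", "10")],
    "Add 2+2:",
)
```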
Chain of thought
Virma has three bags, each of which
fits five shirts. How many shirts can
Virma fit in her bags?
Let's think step-by-step.
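In code, this tactic amounts to appending a reasoning cue to the question before it is sent to the model. The sketch below shows only that step. `with_chain_of_thought` and `COT_CUE` are hypothetical names, and no model call is made.

```python
COT_CUE = "Let's think step-by-step."


def with_chain_of_thought(question: str, cue: str = COT_CUE) -> str:
    """Append a reasoning cue so the model writes out intermediate steps."""
    return f"{question.strip()}\n{cue}"


cot_prompt = with_chain_of_thought(
    "Virma has three bags, each of which fits five shirts. "
    "How many shirts can Virma fit in her bags?"
)
```

For this question the intermediate step the cue is meant to elicit is 3 bags × 5 shirts = 15 shirts.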
Thread-of-Thought
Share your
feedback!
Join our team
Product Engineering
From custom platform and product development to scaled agile delivery, we join forces to build advanced technology solutions.