Lecture 09 RAG
(Retrieval-Augmented Generation)
Young-Sik Choi,
Department of Artificial Intelligence
Korea Aerospace University
What is RAG?
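• RAG is a system in which an LLM generates responses by consulting a trusted external knowledge base.
• Because responses are grounded in trusted knowledge, ‘hallucination’ can be reduced.
• Responses can draw on knowledge more recent than the data the model was trained on.
• The sources behind a generated response can be cited.
• An LLM combined with a search service is a kind of RAG system.
• Perplexity, SearchGPT, etc.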
Basic Pipeline of RAG System
[Figure: basic RAG pipeline in three stages. Indexing: Documents → Chunks → Embedding → Index. Retrieval: Index → Top k. Generation.]
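The indexing, retrieval, and generation stages of the pipeline can be sketched end to end in a few lines. This is only a toy illustration: `embed` is a bag-of-words stand-in for a learned embedding model, the index is a plain list, and `build_prompt` stops at assembling the LLM input. All function names and the sample chunks are assumptions made for this sketch.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks):
    """Indexing: embed every chunk and store (embedding, chunk) pairs."""
    return [(embed(c), c) for c in chunks]


def retrieve(index, query, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [chunk for _, chunk in ranked[:k]]


def build_prompt(query, retrieved):
    """Generation (input side): place the retrieved chunks before the question."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved))
    return (
        "Answer using only the context below and cite the sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


chunks = [
    "RAG retrieves chunks from an external knowledge base.",
    "Cosine similarity compares two embedding vectors.",
    "The weather in Seoul is mild in spring.",
]
index = build_index(chunks)
question = "How does RAG use a knowledge base?"
top = retrieve(index, question, k=2)
prompt = build_prompt(question, top)  # this string would be sent to the LLM
```

In a real system the toy `embed` would be replaced by an embedding model and the list by a vector index, but the three stages keep the same shape.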
• Agentic Chunking: With this method, chunks are processed by a large
language model so that each chunk stands alone with complete meaning,
which improves coherence and preserves context.
• semchunk: A fast and lightweight Python library designed to split text into semantically
meaningful chunks. It supports various tokenizers and allows customization of chunk sizes.
• semantic-text-splitter: This Python library divides text into semantic chunks up to a desired size,
supporting length calculations by characters and tokens. It's callable from both Rust and Python,
making it versatile for different applications.
• semantic-chunker: A versatile library that divides text into semantically meaningful chunks by
employing a "Bring Your Own Embedder" approach. Users can provide their own embedding
functions to map text into vector spaces, facilitating flexible and context-aware chunking.
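To make the idea of size-bounded, meaning-preserving chunks concrete, here is a minimal sketch that packs whole sentences into chunks up to a character limit. It is not the API of semchunk, semantic-text-splitter, or semantic-chunker; the function name `chunk_text`, the regex-based sentence split, and the `max_chars` parameter are assumptions for illustration only.

```python
import re


def chunk_text(text, max_chars=200):
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars becomes its own oversized chunk.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)  # close the current chunk at a sentence boundary
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = (
    "RAG systems split documents into chunks. "
    "Each chunk is embedded and stored in an index. "
    "At query time the most similar chunks are retrieved."
)
chunks = chunk_text(text, max_chars=90)
for c in chunks:
    print(len(c), c)
```

Splitting only at sentence boundaries is the simplest way to keep each chunk readable on its own; the libraries above go further by using tokenizers or embeddings to choose the boundaries.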
Text Embedding
Text and Code Embeddings by Contrastive Pre-Training (OpenAI, 2022)
An encoder maps a chunk to an embedding
Cosine Similarity Between Vectors
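Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures direction and ignores magnitude. A small sketch in plain Python (the function name and the sample vectors are chosen here for illustration):

```python
import math


def cosine_similarity(u, v):
    """cos(u, v) = (u . v) / (||u|| * ||v||)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # unrelated directions
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])       # opposite directions
```

Values range from -1 (opposite) through 0 (orthogonal) to 1 (same direction). When embeddings are normalized to unit length, the dot product alone gives the same ranking.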
Contrastive Learning
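Contrastive training pulls the embeddings of matching pairs together and pushes non-matching pairs apart. The sketch below shows a generic InfoNCE-style loss with in-batch negatives; it is an illustration of the general idea, not necessarily the exact objective of the paper cited above. The function names, the temperature of 0.1, and the toy vectors are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def in_batch_contrastive_loss(queries, positives, temperature=0.1):
    """Mean cross-entropy over the batch.

    positives[i] is the correct match for queries[i]; every other
    positives[j] in the same batch serves as a negative example.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [cosine(q, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]  # equals -log softmax(logits)[i]
    return total / len(queries)


queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]     # each positive points the same way as its query
misaligned = [[0.1, 0.9], [0.9, 0.1]]  # positives swapped between the two queries
loss_aligned = in_batch_contrastive_loss(queries, aligned)
loss_misaligned = in_batch_contrastive_loss(queries, misaligned)
```

The loss is small when each query is closest to its own positive and large when it is closer to another item in the batch, which is the signal that trains the encoder.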
Open-Source Libraries for Text Embedding
• Sentence Transformers: A Python framework that provides state-of-the-art pre-
trained models for generating sentence and text embeddings. It supports tasks like
semantic search, clustering, and paraphrase mining.
• FastEmbed: A lightweight and fast Python library designed for embedding generation.
It supports popular text models and is optimized for speed and efficiency, making it
suitable for serverless runtimes.
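As a usage illustration, the sketch below embeds text with Sentence Transformers and ranks chunks against a query. It assumes the `sentence-transformers` package is installed and that the `all-MiniLM-L6-v2` model can be downloaded; the model choice and the helper names `embed_with_sentence_transformers` and `top_k` are assumptions made for this example. The import is deferred into the function so that the ranking helper can be tried without the library.

```python
def embed_with_sentence_transformers(texts, model_name="all-MiniLM-L6-v2"):
    """Encode texts with a pre-trained Sentence Transformers model.

    Requires `pip install sentence-transformers`; the model is downloaded
    on first use. Embeddings are normalized to unit length.
    """
    from sentence_transformers import SentenceTransformer  # deferred import

    model = SentenceTransformer(model_name)
    return model.encode(texts, normalize_embeddings=True)


def top_k(query_vec, chunk_vecs, k=3):
    """Indices of the k chunk vectors with the highest dot product with the query.

    For unit-length vectors the dot product equals cosine similarity.
    """
    scores = [
        sum(float(a) * float(b) for a, b in zip(query_vec, vec))
        for vec in chunk_vecs
    ]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Example usage (needs the library and a model download, so it is left commented out):
# chunks = ["RAG retrieves chunks from a knowledge base.", "Seoul is mild in spring."]
# vectors = embed_with_sentence_transformers(chunks + ["What does RAG retrieve?"])
# best = top_k(vectors[-1], vectors[:-1], k=1)

# The ranking helper on its own, with hand-made vectors:
ranking = top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [0.6, 0.8]], k=2)
```

For larger collections the linear scan in `top_k` would normally be replaced by a vector index, but the ranking logic stays the same.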