📍 Build a Local RAG Agent with Llama 3.2 and Qdrant
Learn to run a fully offline RAG agent on your machine with complete control over your data. Shubham Saboo and Gargi Gupta break it down in this step-by-step guide.
What's in it:
☑ Fully local RAG setup – no external APIs, no internet dependencies
☑ Llama 3.2 via Ollama and Qdrant for vector search
☑ Interactive playground for real-time exploration
☑ Less than 20 lines of Python to get started
🔗 Start building here: https://lnkd.in/dsEvnibn
📹 Watch the demo on YouTube: https://lnkd.in/dagDjJc8
Qdrant
Software development
Berlin, Berlin · 28,054 followers
Massive-Scale Vector Database
About
Powering the next generation of AI applications with advanced, high-performance vector similarity search technology. The Qdrant engine is an open-source vector search database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more. Make the most of your unstructured data!
- Website: https://qdrant.tech
- Industry: Software development
- Company size: 51–200 employees
- Headquarters: Berlin, Berlin
- Type: Privately held
- Founded: 2021
- Specialties: Deep Tech, Search Engine, Open-Source, Vector Search, Rust, Vector Search Engine, Vector Similarity, Artificial Intelligence and Machine Learning
Locations
- Primary: Berlin, Berlin 10115, DE
Employees at Qdrant
- Predrag Knežević – Staff Platform Engineer at Qdrant
- Wei Lien Dang – General Partner at Unusual Ventures | Investing in AI, data, security, dev tools, OSS
- Fabrizio Schmidt – VP of Engineering & Product @ Qdrant
- Todd Didier – Powering the Future of AI-Driven Search & Data Analysis
Updates
Want to create a fully functional RAG-based app from the ground up? 🛫
This series by Jupiter will take you through the journey of building a RAG app for therapeutic consultation and mental wellness using LangChain, Qdrant, FastAPI, Vercel, MongoDB, and Hugging Face.
Some of the things you'll learn:
☑ Backend/frontend setup with FastAPI and Next.js
☑ Auth with Auth0 by Okta
☑ Dataset preparation and retrieval with Qdrant
☑ Real-time conversational features
You'll get real code examples and practical steps to implement RAG in a real-world project.
🚀 Ready to build this weekend? Start with Part 0: https://lnkd.in/dmsFqB5v
Qdrant reposted this
Find your Twin Celebrity in Vector Space! 👯
Lesson 1: Embedding Generation
-----
Have you ever wondered which celebrity you look like? 🤔 Chances are you've tried one of those apps that claim to find your celebrity twin. But have you ever stopped to think about how these apps actually work?
In this new project, I'll show you how to build a Twin Celebrity App using vector similarity in Qdrant. Yes, you read that right: by leveraging the power of embeddings, we can create a fully functional app. It's simpler than you think! 🙏 But let's not get ahead of ourselves... let's start with the first lesson 👇
-----
💠 First Lesson: Embedding Generation
We're starting with the foundation: creating and storing celebrity embeddings in a vector database. Here's the step-by-step breakdown:
🔶 Dataset Loading / Preprocessing
We need a dataset containing celebrity images. This can be easily accessed via Hugging Face using the datasets library. For this project, sampling a fraction of the dataset is enough. Curious about the dataset I've used? Check the repo!
🔶 Generate Embeddings
Using a PyTorch implementation of the FaceNet paper, we can create embeddings well suited for face recognition. The library (facenet-pytorch) also includes an implementation of MTCNN for efficient face detection. This model will be very valuable when processing the user's selfies (future lessons 🤣).
🔶 Storing Embeddings in Qdrant
Now that we have the embeddings ready, it's time to store them in Qdrant. The free cluster on Qdrant Cloud is perfect for this project: enough to store all necessary vectors without incurring any costs.
🔶 Pipelines with ZenML
To make our workflow maintainable and production-ready, I've orchestrated all these steps with ZenML. This first lesson ends here, with all the embeddings stored in Qdrant.
💠 What's Next?
In the upcoming lessons, we'll cover:
🔸 Building a backend for our app using FastAPI and deploying it to Cloud Run.
🔸 Creating a frontend UI for our app with Streamlit.
-----
That's everything for today! 💪 What do you think? Are you ready to find your celebrity twin? 🚀
P.S. I'm already working on the YouTube video for this first lesson! 📺
-----
💡 Follow me for relevant content on production ML, MLOps and Generative AI
#mlops #machinelearning #datascience
🔥 Practical Guide to RAG-based Pipelines Evaluation and Improvement
Atita Arora's (Solution Architect, Qdrant) and Deanna Emery's (Founding AI Researcher, Quotient AI) talk from the AI Engineer World's Fair is finally available! In the video, they address core challenges in #RAG-based pipelines and show how to improve desirable metrics like faithfulness, chunk relevance, and context relevance step by step. You'll get practical tips for evaluating and improving your #RAG-based systems, with real examples that led to an 8% performance boost.
👉 Worth checking out: https://lnkd.in/dcrrUbxJ
We’re thrilled to welcome Todd Didier as a Senior Account Executive to the Qdrant team! Based in Los Angeles, California, Todd brings extensive experience working with AI-native and digital-native companies, as well as enterprises across the West Coast. Welcome aboard! 🎉
Aryn AI's Sycamore is an LLM-powered data preparation, processing, and analytics system for complex, unstructured documents like PDFs, HTML, presentations, and more. You can prepare data for GenAI and RAG applications, power high-quality document processing workflows, and run analytics on large document collections with natural language. With the v0.1.22 release of Sycamore, you can now read from and ingest documents into Qdrant collections as part of Sycamore pipelines. - The usage documentation is at https://buff.ly/4e4IkfT - You can find an end-to-end example at https://buff.ly/4f70xdm
👥 Video series on how to build an AI-powered social media app from scratch
On many social media platforms, information quickly gets lost, and search engines are usually not calibrated to search well, especially when it comes to semantic search. Richard Osborne builds a social media platform called "Tribu" from scratch in his YouTube series. Thanks to the LLM and vector search combination under the hood, it has excellent potential to provide better semantic recommendations for friends and communities, and to remove doomscrolling from the user experience.
The stack of "Tribu" is:
✅ Qdrant as a vector store for similarity searches
✅ n8n for server-side workflows
✅ Llama 3.2 as a conversational LLM
✅ OpenNoodl for the frontend
✅ Zep AI (YC W24) for conversation memory and building up key facts about each user
✅ PocketBase as an authentication and database backend
👉 Check it out: https://lnkd.in/daZNXcVq
Qdrant reposted this
Build robust GenAI pipelines with LlamaIndex, Qdrant, and MLflow for advanced RAG!
Learn how to:
🔍 Streamline RAG workflows for better scalability
📊 Ensure performance consistency across model versions
🚀 Optimize indexing systems for efficiency
This step-by-step guide covers:
• Setting up MLflow for experiment tracking
• Configuring Qdrant for vector storage
• Integrating Ollama for LLM and embedding models
• Creating a playground app to test the full pipeline
Check out the full post here: https://lnkd.in/gCnm9JgE
Qdrant reposted this
One of the most frequent questions we receive via the Qdrant community and support channels is about optimizing performance, memory resources, and precision. However, it is precisely like with Good, Fast, and Cheap: you can pick only two, and there will always be a trade-off on the third.
➡ High-speed search and low memory usage: yes, but you will need to give up high precision.
➡ High precision and low memory usage: yes, you can use on-disk storage, but your responses will not be the fastest.
➡ High precision and high-speed search: yes, but you will need to put everything into memory, and it will not be cheap.
More on this in the Optimization Guide: https://lnkd.in/dzA3SAiY
Also recommended: this article by Benito Martin on balancing accuracy and speed with Qdrant hyperparameters, hybrid search, and semantic caching: https://lnkd.in/dZdnd5Xy
📢 Join our upcoming webinar: ColPali + Qdrant in Action!
Struggling to retrieve complex, visually rich PDFs? ColPali has you covered with efficient document retrieval using Vision Language Models.
What's on the agenda:
✅ How ColPali enhances document retrieval
✅ Boosting ColPali with Qdrant's binary quantization
✅ Using ColPali as a reranker
✅ ColPali in RAG/Vision RAG applications
Led by Sabrina A., Evgeniya Sukhodolskaya, and Atita Arora.
📅 Date: Thursday, November 21, 2024
🕗 Time: 8:00 AM PT / 11:00 AM ET / 5:00 PM CET
👉 RSVP: https://lnkd.in/dW9jRHSk