Pipecat 0.0.48 release out today! Pipecat is the open source framework for voice and multimodal AI agents. Pipecat developers build voice and video AI agents, with flexibility for transport. (Pipecat is not tightly coupled to one transport platform.) Lots of good stuff, including: ✅ Server-side (in the Pipecat pipeline) support for Krisp's excellent audio processing models. These are commercial models, but if you need state of the art background noise reduction and speaker isolation, they're well worth paying for. Krisp's client-side models are available for free in our Daily Web, iOS, and Android SDKs. But for telephony use cases, or if you want to minimize CPU use of your client apps, you can now run Krisp models server-side. ✅ A new Tavus video avatars pipeline element. Clone yourself. Talk to yourself. Clone your friends. Talk to your friends. Introduce your clone to the clones of your friends and see what happens. ✅ Output audio mixers. Add music or background noise to your voice AI agents. See `examples/foundational/23-bot-background-sound.py` ✅ New frame processor input queues make it easier to sequence things like TTS speech fragments. 💛 Aleix Conchillo Flaqué #pipecat #voiceai #conversationalai
Daily
Software Development
San Francisco, CA 3,100 followers
Build voice, video, and real-time AI into any app
About us
Daily is a real-time voice, video, and AI platform for developers. Build and scale on our secure and compliant global infrastructure. Loved by developers at startups to the Fortune 500, Daily offers AI-ready capabilities, customizable and flexible APIs, advanced recording, real-time insights, and more — all delivered with transparent and affordable pricing.
- Website
-
https://daily.co
External link for Daily
- Industry
- Software Development
- Company size
- 51-200 employees
- Headquarters
- San Francisco, CA
- Type
- Privately Held
- Founded
- 2016
- Specialties
- WebRTC, AI, Video, and Voice
Products
Locations
-
Primary
San Francisco, CA, US
Employees at Daily
Updates
-
First day at AI Tinkerers Hackathon at Weights & Biases! Busy day building, plus notes from Shrestha Basu Mallick on Google Gemini and Tereza Tizkova from E2B
-
Excited to be here at the AI Tinkerers Hackathon! Come find our teammate Varun S. and say hi! ☀ Theme: Humans-in-the-loop agents 🗓 When: November 2nd – November 3rd 📍 Where: Weights & Biases HQ, San Francisco ☁ Sponsored by: Google Cloud 🏢 Hosted by: Weights & Biases 🤝 Community Sponsors: Daily, CopilotKit🪁, E2B, Browserbase, HumanLayer (YC F24), Payman
We're hacking at the AI Tinkerers, Human-in-the-loop Agents Hackathon. Come say hi 👋🏽
-
We're gathered announcing the SF winners of our hackathon! Join us via the livestream https://lnkd.in/gtVbm7Ym #hackathon
-
Daily reposted this
This is a nice example of what production Voice AI applications often look like. The core voice AI loop is [speech-to-text] + [LLM] + [text-to-speech]. But in real-world use cases you also need: - event-driven UI updates - use-case specific context management - context storage/hydration - observability/metrics - text processing/reformatting - content guardrails - function calling abstractions and orchestration helpers This is Pipecat code. Pipecat is the largest framework for real-time voice and multi-modal AI — Open Source and vendor-neutral tooling for building voice AI agents, interactive video avatars, voice co-pilots, and more. https://lnkd.in/gRQfHTY2
-
Voice AI + Winne the Pooh 🤩 This is a wonderful example of real-time voice AI augmenting experiences. Neat post from David Beros and Brian Foody about what they're building at Sprout!
🌱 Over the past few weeks, my good friend Brian Foody and I have spent some evenings working on something close to home—creating screen-free, audio-based language and learning games for our kids. Today, we’re excited to introduce Sprout! Inspired by the joy our daughters experience listening to beautiful audiobook classics like Beatrix Potter’s The Tales of Peter Rabbit, and A.A Milne’s The House at Pooh Corner, we started thinking: what if these were interactive? How might we create rich, engaging, voice-driven experiences that educate and entertain without more screen time? The results exceeded our expectations, so we shipped it. One of our first releases is the game Guess the Animal, featuring the lovable (and now public-domain!) character of Winnie the Pooh. It's been super fun to work hands-on with new AI tooling including Daily, Cartesia, and Anthropic's Claude LLM to design and build the experience, plus the power of Framer for a rapid website. We’re excited to see Sprout in the hands of families everywhere. Give it a try with your kids and let us know what they think. 👇 #AI #EdTech #Parenting #LanguageLearning
-
Big Pipecat release today! Lots of low-level improvements to performance and ergonomics. ✳️ TTS service additions and improvements (Google, AWS, Azure). The event framework that makes it easy to build complex client apps and workflows is rounding out nicely. ✳️ The team spent a lot of time on this release working on function calling implementations and tests, across multiple models. (Stay tuned for more thoughts on this!) 🔥 There will be another release tomorrow, to add support for the OpenAI Realtime API! 🔥 Follow along in the repo/discord if you're interested in Python+WebRTC tooling for gpt-4o-realtime-preview. 🙌 via Aleix Conchillo Flaqué, our fearless maintainer: "Congrats everyone on this release because, whether you contribute or just use it, you all just make Pipecat better." #pipecat #python #webrtc #voiceai #multimodal #ai #genai #conversationalai #openai #realtimeapi #llm #llms
Release v0.0.42 · pipecat-ai/pipecat
github.com
-
Developers love building conversational voice AI agents with Cartesia. They bring the latest in research to enable state-of-the-art, realistic voices. We're excited to partner, and make it easy for developers to build with Cartesia right in Daily Bots! #conversationalAI #voiceAI #genAI
🎉Daily launches Daily Bots with Cartesia as Primary Voice Provider! 🚀🤖 Daily launched Daily Bots last month, attracting hundreds of developers instantly. We're honored to be the default provider for both Daily Bots and Pipecat, one of the fastest-growing open-source frameworks for voice AI. Since 2016, Daily has been pioneering real-time multimodal AI. It's a privilege to collaborate with Kwindla Hultman Kramer and the exceptional Daily team. Why Daily Bots with Cartesia is a Game-Changer: ⚡️ Ultra-low latency: voice-to-voice responses as fast as 500ms 🔄 Interruption support: Natural conversation flow, just like human dialogue 🧼 Clean abstractions: Simplified development for faster deployment 📚 Rich example library: A treasure trove of resources for developers Read the full story here 👇
Daily launches Daily Bots with Cartesia as Primary Voice Provider
cartesia.ai
-
What are the best tools for builders in 2024? Product Hunt is sharing expert insights across key categories. Daily cofounder Kwindla Hultman Kramer discusses the best #AI infrastructure tools. #GenAI #LLMs
Anonymous product reviews are low-value. Imagine instead you could ask the smartest founders what the best products in their niche are. How much more valuable would that be??? Today we launched Product Landscapes: - Immad Akhund broke down online banking software - Rajiv Ayyangar broke down video conferencing - Ben Lang broke down Notion templates .... and many more. Check out our launch here. We hope you learn a thing or two about these landscapes!! https://lnkd.in/e9MuQR7M
Product Landscapes - Category overviews written by top founders | Product Hunt
producthunt.com
-
Daily reposted this
This a really nice example of how to build an AI voice and video conversational agent that uses function calling. Function calling is an increasingly important component of production real-time AI applications. The latest LLMs are now quite good at calling functions reliably, which has greatly expanded the use cases that conversational AI is well suited to. For example, you can use function calling: ➡ as part of a dynamic RAG system that allows flexible access to a knowledge base, ➡ to help an AI follow a script, ➡ for saving information gathered from a user during a conversation, ➡ to implement lightweight lookup of dynamic information That last one is the basis for everyone's favorite docs example of function calling, the `get_weather()` tool! Function calling is still a fairly new capability and still a little bit challenging to implement from scratch for a new app. The first challenge is that you have to implement a control flow that glues together the LLM request/responses with code that makes the function calls and formats the function return values. The second challenge is that function calls add to latency, and low latency — fast response times — is critical for voice AI applications. The example in the tweet below uses state-of-the-art, low-latency AI tools from Tavus and Cartesia, and then goes the extra mile by hosting a Mistral AI model specialized for function calling on Cerebrium's excellent serverless GPU infrastructure. Go try out the demo. It's very, very responsive. It's also super-impressive that this is a pretty small amount of code — ~400 lines of Python, including all the function calling and scripting of the AI agent.
At Cerebrium, we have built a few demos showcasing voice AI capabilities but we wanted to push the boundary and see if we could create realistic, human-like situations in order to train and onboard teams to perform better - recreating real life scenarios! An example of this is simulations in handling an angry customer on a sales call, praciting to sell a new product line or preparing for the notoriously stressful YC interview👀 Excited to see what companies create internally! We built this with great partners Tavus Cartesia Mistral AI Check the comments for links to the demo, blog and code:) #startups #ai #aiavatars #genai #llm #voiceAI