A multimodal chat interface with many tools.
Overview of Japanese LLMs (日本語LLMまとめ)
It is a case study of an intelligent agent for Ocean.
Auto-updating paper list
Use PEFT or Full-parameter to finetune 350+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL2, MiniCPM-V-2.6, GLM4v, Xcomposer2.5, Yi-VL, DeepSeek-VL, Phi3.5-Vision, ...)
Visualize streams of multimodal data. Fast, easy to use, and simple to integrate. Built in Rust using egui.
Generative AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. It features AI personas, AGI functions, multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, and much more. Deploy on-prem or in the cloud.
This repo contains the code and data for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks".
LunarDB is a cache key-value store database made in C++
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high performance and flexibility.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Fast Multimodal LLM on Mobile Devices
RAI is a multi-vendor agent framework for robotics, utilizing Langchain and ROS 2 tools to perform complex actions, defined scenarios, free interface execution, log summaries, voice interaction and more.
The easiest way to serve AI apps and models - Build reliable Inference APIs, LLM apps, Multi-model chains, RAG service, and much more!
Sample app for service ripe.net
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Semantic alignment of astronomical data with natural language using multi-modal models. (Jax) Code associated with https://arxiv.org/abs/2403.08851 (COLM 2023).
Rerun viewer with Gradio
Codebase for Aria - an Open Multimodal Native MoE