An unofficial Python package that returns responses from Google Bard
A high-performance ML model serving framework that offers dynamic batching
Probabilistic reasoning and statistical analysis in TensorFlow
Build your chatbot within minutes on your favorite device
20+ high-performance LLMs with recipes to pretrain and finetune at scale
Easiest and laziest way to build multi-agent LLM applications
Tensor search for humans
Powering Amazon custom machine learning chips
LLMFlows - Simple, Explicit and Transparent LLM Apps
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Run 100B+ language models at home, BitTorrent-style
Framework for Accelerating LLM Generation with Multiple Decoding Heads
Implementation of "Tree of Thoughts"
Implementation of model parallel autoregressive transformers on GPUs
A computer vision framework to create and deploy apps in minutes
CPU/GPU inference server for Hugging Face transformer models