Towards Ultimate Expert Specialization in Mixture-of-Experts Language
The ChatGPT Retrieval Plugin lets you easily find personal documents
Diffusion Transformer with Fine-Grained Chinese Understanding
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Repo for external large-scale work
Official PyTorch Implementation of "Scalable Diffusion Models"
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
Open-source, high-performance Mixture-of-Experts large language model
Open-Source Financial Large Language Models!
LLaMA: Open and Efficient Foundation Language Models
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
An Open Bilingual Chat LLM
Implementation of model parallel autoregressive transformers on GPUs
Open Multilingual Multimodal Chat LMs
A minimal PyTorch re-implementation of the OpenAI GPT
Open-source pre-training implementation of Google's LaMDA in PyTorch
GLIDE: a diffusion-based text-conditional image synthesis model
An implementation of model parallel GPT-2 and GPT-3-style models
JetBrains' 4B parameter code model for completions
Tencent's 36-language state-of-the-art translation model
OpenAI's compact 20B open model for fast, agentic, and local use