MindSpore ONE

This repository contains SoTA algorithms, models, and interesting projects in the area of multimodal understanding and content generation.

ONE is short for "ONE for all".

News

[2025.12.24] We release v0.5.0, which adds compatibility with 🤗 Transformers v4.57.1 (70+ new models) and 🤗 Diffusers v0.35.2, plus previews of v0.36 pipelines such as Flux2, QwenImageEditPlus, Lucy and Kandinsky5. It also introduces initial ComfyUI integration. Happy exploring!
[2025.11.02] v0.4.0 is released, with 280+ transformers models and 70+ diffusers pipelines supported. See here
[2025.04.10] We release v0.3.0. More than 15 SoTA generative models are added, including Flux, CogView4, OpenSora2.0, Movie Gen 30B, CogVideoX 5B~30B. Have fun!
[2025.02.21] We support DeepSeek Janus-Pro, a SoTA multimodal understanding and generation model. See here
[2024.11.06] v0.2.0 is released

Quick tour

To install v0.5.0, first install MindSpore 2.6.0 - 2.7.1, then run pip install mindone

Alternatively, to install the latest version from the master branch, please run:

git clone https://github.com/mindspore-lab/mindone.git
cd mindone
pip install -e .
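
Since v0.5.0 targets MindSpore 2.6.0 - 2.7.1, it can help to confirm the installed version before going further. The following is a minimal sketch using only the Python standard library; the helper names (`parse_version`, `is_supported`, `installed_mindspore_version`) are hypothetical and not part of mindone, and it assumes the installed distribution is named `mindspore`.

```python
from importlib import metadata

# Supported range taken from the install note above.
SUPPORTED_MIN = (2, 6, 0)
SUPPORTED_MAX = (2, 7, 1)


def parse_version(text):
    """Turn a string such as '2.7.0' or '2.7.0rc1' into a tuple of three ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_supported(version_text):
    """Return True if the version lies within the supported range (inclusive)."""
    return SUPPORTED_MIN <= parse_version(version_text) <= SUPPORTED_MAX


def installed_mindspore_version():
    """Return the installed version string, or None if the package is missing.

    Assumption: the distribution is installed under the name 'mindspore'.
    """
    try:
        return metadata.version("mindspore")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_mindspore_version()
    if version is None:
        print("MindSpore is not installed")
    elif is_supported(version):
        print(f"MindSpore {version} is within the supported range")
    else:
        print(f"MindSpore {version} is outside 2.6.0 - 2.7.1")
```

Pre-release suffixes such as `rc1` are ignored by this sketch, so treat its result as a rough guide only.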

We support state-of-the-art diffusion models for generating images, audio, and video. Let's get started using Stable Diffusion 3 as an example.

Hello MindSpore from Stable Diffusion 3!

import mindspore
from mindone.diffusers import StableDiffusion3Pipeline

# Load the SD3 medium weights in half precision.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    mindspore_dtype=mindspore.float16,
)
prompt = "A cat holding a sign that says 'Hello MindSpore'"
# The first [0] selects the generated images; the second [0] picks the first image.
image = pipe(prompt)[0][0]
image.save("sd3.png")

run hf diffusers on mindspore

mindone diffusers is under active development; most tasks were tested with MindSpore 2.6.0-2.7.1 on Ascend Atlas 800T A2 machines
compatible with 🤗 diffusers v0.35.2, with preview support for SoTA v0.36 pipelines, see support list
18+ training examples, including controlnet, dreambooth, lora and more

run hf transformers on mindspore

mindone transformers is under active development; most tasks were tested with MindSpore 2.6.0-2.7.1 on Ascend Atlas 800T A2 machines
compatible with 🤗 transformers v4.57.1
provides 350+ state-of-the-art machine learning models for inference across text, computer vision, audio, video, and multimodal tasks, see support list

supported models under mindone/examples

task	model	inference	finetune	pretrain	institute
Text/Image-to-Video	wan2.1 🔥	✅	✖️	✖️	Alibaba
Text/Image-to-Video	wan2.2 🔥🔥	✅	✅	✖️	Alibaba
Audio/Image-Text-to-Text	qwen2_5_omni 🔥🔥	✅	✅	✖️	Alibaba
Image/Video-Text-to-Text	qwen2_5_vl 🔥🔥	✅	✅	✖️	Alibaba
Any-to-Any	qwen3_omni_moe 🔥🔥🔥	✅	✖️	✖️	Alibaba
Image-Text-to-Text	qwen3_vl/qwen3_vl_moe 🔥🔥🔥	✅	✖️	✖️	Alibaba
Text-to-Image	qwen_image 🔥🔥🔥	✅	✅	✖️	Alibaba
Text-to-Text	minicpm 🔥🔥	✅	✖️	✖️	OpenBMB
Any-to-Any	janus	✅	✅	✅	DeepSeek
Any-to-Any	emu3	✅	✅	✅	BAAI
Class-to-Image	var	✅	✅	✅	ByteDance
Text-to-Image	omnigen2 🔥	✅	✅	✖️	VectorSpaceLab
Text/Image-to-Video	hpcai open sora 1.2/2.0	✅	✅	✅	HPC-AI Tech
Text/Image-to-Video	cogvideox 1.5 5B~30B	✅	✅	✅	Zhipu
Image/Text-to-Text	glm4v 🔥	✅	✖️	✖️	Zhipu
Text-to-Video	open sora plan 1.3	✅	✅	✅	PKU
Text-to-Video	hunyuanvideo	✅	✅	✅	Tencent
Image-to-Video	hunyuanvideo-i2v 🔥	✅	✖️	✖️	Tencent
Text-to-Video	movie gen 30B	✅	✅	✅	Meta
Segmentation	lang_sam 🔥	✅	✖️	✖️	Meta
Segmentation	sam2	✅	✖️	✖️	Meta
Text-to-Video	step_video_t2v	✅	✖️	✖️	StepFun
Text-to-Speech	sparktts	✅	✖️	✖️	Spark Audio
Text-to-Image	flux	✅	✅	✖️	Black Forest Labs
Text-to-Image	stable diffusion 3	✅	✅	✖️	Stability AI

supported captioner

task	model	inference	finetune	pretrain	features
Image-Text-to-Text	pllava	✅	✖️	✖️	support video and image captioning

training-free acceleration

Introduces DiT inference acceleration with DiTCache, PromptGate, and FBCache with Taylorseer, tested on sd3 and flux.1.
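
To give a feel for how training-free caching can save compute, here is a toy sketch of the first-block-cache idea on plain Python lists. It is not mindone's implementation: the class name `FirstBlockCache`, the `threshold` parameter, and the relative-change metric are assumptions made for illustration, and Taylorseer forecasting, DiTCache and PromptGate are not covered. The first block always runs; if its residual barely changed since the last full pass, the cached contribution of the remaining blocks is reused instead of recomputing them.

```python
def relative_change(new, old):
    """Summed absolute difference, relative to the summed magnitude of `old`."""
    diff = sum(abs(a - b) for a, b in zip(new, old))
    scale = sum(abs(b) for b in old)
    return diff / scale if scale > 0 else float("inf")


class FirstBlockCache:
    """Toy wrapper around a stack of blocks (hypothetical, for illustration only)."""

    def __init__(self, blocks, threshold):
        self.first_block = blocks[0]
        self.remaining_blocks = blocks[1:]
        self.threshold = threshold
        self.prev_first_residual = None   # first-block residual at the last full pass
        self.cached_tail_residual = None  # what the remaining blocks added at that pass
        self.skipped_steps = 0

    def __call__(self, hidden):
        # The first block is always evaluated.
        first_out = self.first_block(hidden)
        first_residual = [o - h for o, h in zip(first_out, hidden)]

        can_reuse = (
            self.prev_first_residual is not None
            and relative_change(first_residual, self.prev_first_residual) < self.threshold
        )
        if can_reuse:
            # Skip the remaining blocks and add back their cached contribution.
            self.skipped_steps += 1
            return [o + r for o, r in zip(first_out, self.cached_tail_residual)]

        # Full pass: run every remaining block and refresh the cache.
        self.prev_first_residual = first_residual
        out = first_out
        for block in self.remaining_blocks:
            out = block(out)
        self.cached_tail_residual = [o - f for o, f in zip(out, first_out)]
        return out


if __name__ == "__main__":
    # Three stand-in "blocks" acting on a list of floats.
    blocks = [
        lambda xs: [x * 1.5 for x in xs],
        lambda xs: [x + 1.0 for x in xs],
        lambda xs: [x * 2.0 for x in xs],
    ]
    model = FirstBlockCache(blocks, threshold=0.1)
    for step_input in ([1.0, 2.0], [1.01, 2.02], [2.0, 4.0]):
        print(model(step_input), "skipped so far:", model.skipped_steps)
```

A larger threshold skips more steps at the cost of a coarser approximation, so in practice the value has to be tuned per model.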
