add AudioDiffusionPipeline and LatentAudioDiffusionPipeline #1334 #1426

Merged (38 commits) on Dec 5, 2022

Conversation

teticio
Contributor

@teticio teticio commented Nov 25, 2022

I have added AudioDiffusionPipeline and LatentAudioDiffusionPipeline, which I intend to migrate from https://github.com/teticio/audio-diffusion. I have added them to the main src rather than to the community pipelines for two reasons: LatentAudioDiffusionPipeline inherits from AudioDiffusionPipeline, which cannot be done in a single pipeline file, and the Mel class is needed to convert from audio to images and vice versa. It might make sense to move the Mel class somewhere more central, as it could be used by other pipelines.
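
As a rough illustration of what "convert from audio to images and vice versa" involves, here is a minimal sketch of the quantisation step only: mapping log-magnitude spectrogram values (in dB) to 8-bit grey levels and back. It is not the Mel class from this PR. The function names and the 80 dB dynamic range are assumptions made for illustration, and the spectrogram computation itself is left out.

```python
from typing import List

TOP_DB = 80.0  # assumed dynamic range in dB (illustrative, not taken from the PR)


def db_to_pixel(db: float, top_db: float = TOP_DB) -> int:
    """Map a value in [-top_db, 0] dB to a grey level in [0, 255]."""
    scaled = (db + top_db) * 255.0 / top_db
    clipped = min(max(scaled, 0.0), 255.0)  # values outside the range saturate
    return int(clipped + 0.5)  # round to the nearest integer


def pixel_to_db(pixel: int, top_db: float = TOP_DB) -> float:
    """Inverse mapping: grey level back to dB (lossy because of rounding)."""
    return pixel * top_db / 255.0 - top_db


def spectrogram_to_image(spec_db: List[List[float]]) -> List[List[int]]:
    """Quantise every spectrogram bin to a pixel."""
    return [[db_to_pixel(v) for v in row] for row in spec_db]


def image_to_spectrogram(image: List[List[int]]) -> List[List[float]]:
    """Recover approximate dB values from the pixels."""
    return [[pixel_to_db(p) for p in row] for row in image]
```

The round trip is lossy: each recovered value can differ from the original by up to half a quantisation step (`top_db / 255 / 2`), and anything quieter than `-top_db` is clipped.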

teticio and others added 20 commits November 21, 2022 14:41
author teticio <[email protected]> 1668765652 +0000
committer teticio <[email protected]> 1669041721 +0000

parent 499ff34
author teticio <[email protected]> 1668765652 +0000
committer teticio <[email protected]> 1669041704 +0000

add colab notebook

[Flax] Fix loading scheduler from subfolder (#1319)

[FLAX] Fix loading scheduler from subfolder

Fix/Enable all schedulers for in-painting (#1331)

* inpaint fix k lms

* onnox as well

* up

Correct path to schedlure (#1322)

* [Examples] Correct path

* uP

Avoid nested fix-copies (#1332)

* Avoid nested `# Copied from` statements during `make fix-copies`

* style

Fix img2img speed with LMS-Discrete Scheduler (#896)

Casting `self.sigmas` into a different dtype (the one of original_samples) is not advisable. In my img2img pipeline this leads to a long running time in the  `integrate.quad` call later on- by long I mean more than 10x slower.

Co-authored-by: Anton Lozhkov <[email protected]>

Fix the order of casts for onnx inpainting (#1338)

Legacy Inpainting Pipeline for Onnx Models (#1237)

* Add legacy inpainting pipeline compatibility for onnx

* remove commented out line

* Add onnx legacy inpainting test

* Fix slow decorators

* pep8 styling

* isort styling

* dummy object

* ordering consistency

* style

* docstring styles

* Refactor common prompt encoding pattern

* Update tests to permanent repository home

* support all available schedulers until ONNX IO binding is available

Co-authored-by: Anton Lozhkov <[email protected]>

* updated styling from PR suggested feedback

Co-authored-by: Anton Lozhkov <[email protected]>

Jax infer support negative prompt (#1337)

* support negative prompts in sd jax pipeline

* pass batched neg_prompt

* only encode when negative prompt is None

Co-authored-by: Juan Acevedo <[email protected]>

Update README.md: Minor change to Imagic code snippet, missing dir error (#1347)

Minor change to Imagic Readme

Missing dir causes an error when running the example code.

make style

change the sample model (#1352)

* Update alt_diffusion.mdx

* Update alt_diffusion.mdx

Add bit diffusion [WIP] (#971)

* Create bit_diffusion.py

Bit diffusion based on the paper, arXiv:2208.04202, Chen2022AnalogBG

* adding bit diffusion to new branch

ran tests

* tests

* tests

* tests

* tests

* removed test folders + added to README

* Update README.md

Co-authored-by: Patrick von Platen <[email protected]>
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Nov 25, 2022

The documentation is not available anymore as the PR was closed or merged.

@teticio
Contributor Author

teticio commented Nov 25, 2022

@patrickvonplaten See previous PR for additional comments (#1334)

@teticio
Contributor Author

teticio commented Nov 30, 2022

@patrickvonplaten I guess you must be super busy, but it would be great if you could let me know whether the basic approach of moving Mel into models, so that it can be used as a composable component in the pipeline (and therefore replaced by a neural alternative), works for you. Then I can migrate my saved models and existing repo to this format ahead of the release to diffusers. Bear in mind that I had to make Mel a LOADABLE_CLASS for this. Thanks and sorry for the bother.
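
To make the "composable component" idea concrete, here is a minimal sketch in which the pipeline only depends on an interface, so a signal-processing converter could later be swapped for a neural one. All names (`AudioImageConverter`, `ToyMel`, `ToyAudioPipeline`) are hypothetical and do not come from the PR or from diffusers.

```python
from typing import List, Protocol


class AudioImageConverter(Protocol):
    """Hypothetical interface that a Mel-like component would satisfy."""

    def audio_to_image(self, audio: List[float]) -> List[int]: ...

    def image_to_audio(self, image: List[int]) -> List[float]: ...


class ToyMel:
    """Stand-in converter: scales samples in [-1, 1] to 8-bit values."""

    def audio_to_image(self, audio: List[float]) -> List[int]:
        return [int(round((s + 1.0) * 127.5)) for s in audio]

    def image_to_audio(self, image: List[int]) -> List[float]:
        return [p / 127.5 - 1.0 for p in image]


class ToyAudioPipeline:
    """Hypothetical pipeline that receives the converter as a component."""

    def __init__(self, mel: AudioImageConverter):
        self.mel = mel

    def roundtrip(self, audio: List[float]) -> List[float]:
        # A real pipeline would run the diffusion model between these two steps.
        return self.mel.image_to_audio(self.mel.audio_to_image(audio))
```

Any object with the same two methods can be passed as `mel`, which is the property that would let a learned converter replace the signal-processing one.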

@teticio
Contributor Author

teticio commented Dec 2, 2022

@patrickvonplaten I had to add ConfigMixin to the LOADABLE_CLASSES so that Mel could be instantiated with from_pretrained. (Previously I had added Mel here, but I agree that is too specific.) Can you think of a better solution? Maybe there should be a LoadableClassMixin instead?

I jumped the gun and assumed that we are close to being able to merge, so I have updated my existing repo and model artefacts to be compatible with this PR. In other words, the slow tests will now work also.

@patrickvonplaten
Contributor

Hey @teticio, I think you could still use the ModelMixin for the Mel class so that we don't need to update pipeline_utils.py :-)
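
The suggestion can be pictured with a simplified registry: if loadability is decided by which registered base class a component inherits from, then inheriting from an already registered mixin avoids touching the registry. The sketch below is an assumption-laden toy, not the code in pipeline_utils.py; the stand-in classes, the flat dictionary and `find_load_method` are all made up for illustration.

```python
class ModelMixin:
    """Stand-in for a registered base class (not the real diffusers class)."""

    @classmethod
    def from_pretrained(cls, path: str):
        return cls()


class SchedulerMixin:
    """Stand-in for a second registered base class."""

    @classmethod
    def from_config(cls, path: str):
        return cls()


# Simplified registry: base class -> name of its loading method.
LOADABLE_CLASSES = {
    ModelMixin: "from_pretrained",
    SchedulerMixin: "from_config",
}


def find_load_method(component_cls: type) -> str:
    """Return the loader name of the first registered base the class inherits from."""
    for base, load_method in LOADABLE_CLASSES.items():
        if issubclass(component_cls, base):
            return load_method
    raise ValueError(f"{component_cls.__name__} is not loadable")


class Mel(ModelMixin):
    """Loadable purely by inheriting from a registered base."""


class StandaloneMel:
    """No registered base, so the registry itself would need a new entry."""
```

With this layout, `find_load_method(Mel)` resolves without any registry change, whereas `StandaloneMel` raises until someone adds an entry for it.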

.gitignore Outdated
*.mp4

hf-internal-testing
Contributor

Will fix this directly on the test level

DanceDiffusionPipeline,
DDIMPipeline,
DDPMPipeline,
KarrasVePipeline,
LDMPipeline,
LDMSuperResolutionPipeline,
Mel,
Contributor

Suggested change
Mel,

@patrickvonplaten
Contributor

This PR looks good for merge to me!
Note that we should change the model_index.json as done in this PR because Mel should not be in the public API: https://huggingface.co/teticio/latent-audio-diffusion-ddim-256/commit/ac08e817d31ac82498abf4eee6fd3954db41fe27

We do the same for other models :-) See:
https://huggingface.co/BAAI/AltDiffusion-m9/blob/main/model_index.json#L17
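
A hedged sketch of the kind of change being asked for: each component entry in model_index.json is a `[module, class]` pair, and the assumption here is that pointing `mel` at the pipeline's own module, rather than at the top-level `diffusers` package, keeps Mel out of the public API. The entry values and the helper `point_to_pipeline_module` are illustrative and are not copied from the linked commit.

```python
import json

# Illustrative "before" state; the values are assumptions, not the real file.
model_index = {
    "_class_name": "AudioDiffusionPipeline",
    "mel": ["diffusers", "Mel"],
    "scheduler": ["diffusers", "DDIMScheduler"],
    "unet": ["diffusers", "UNet2DModel"],
}


def point_to_pipeline_module(index: dict, component: str, module: str) -> dict:
    """Return a copy of the index with one component's module replaced."""
    updated = dict(index)
    _, class_name = index[component]
    updated[component] = [module, class_name]
    return updated


updated_index = point_to_pipeline_module(model_index, "mel", "audio_diffusion")
print(json.dumps(updated_index, indent=2))
```

Only the `mel` entry changes; the other components keep referring to the top-level package.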

@patrickvonplaten patrickvonplaten merged commit 48d0123 into huggingface:main Dec 5, 2022
@teticio
Contributor Author

teticio commented Dec 5, 2022

@patrickvonplaten Hey, thanks for all the great suggestions and support along the way. Just a couple of things before we put this one to bed:

  1. Mel is still importable from diffusers and pipelines - I think you may want to remove it from there. I've updated all my model repos following your example.
  2. I see that the slow tests are failing - do the Dockerfiles only get built nightly? If so, then the missing libsndfile should hopefully get installed with the changes I made.
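
On the second point, one general way to keep an audio dependency optional (the commit list mentions making librosa optional) is to attempt the import and fall back gracefully. The sketch below is an illustration under assumptions, not the mechanism used in diffusers; `try_import` is a hypothetical helper. It also catches `OSError`, on the assumption that some Python bindings raise it when the underlying shared library is not installed.

```python
import importlib
from types import ModuleType
from typing import Optional


def try_import(module_name: str) -> Optional[ModuleType]:
    """Return the imported module, or None if it cannot be loaded."""
    try:
        return importlib.import_module(module_name)
    except (ImportError, OSError):
        # ImportError: the Python package is missing.
        # OSError: assumed to cover bindings whose native library is absent.
        return None


def require(module_name: str, hint: str) -> ModuleType:
    """Import a module or raise an error that says how to fix the problem."""
    module = try_import(module_name)
    if module is None:
        raise ImportError(f"{module_name} could not be loaded. {hint}")
    return module
```

A test could then be skipped, rather than fail, when `try_import` returns `None` for the audio library it needs.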

tcapelle pushed a commit to tcapelle/diffusers that referenced this pull request Dec 12, 2022
…ce#1334 (huggingface#1426)

* add AudioDiffusionPipeline and LatentAudioDiffusionPipeline

* add docs to toc

* fix tests

* fix tests

* fix tests

* fix tests

* fix tests

* Update pr_tests.yml

Fix tests

* parent 499ff34
author teticio <[email protected]> 1668765652 +0000
committer teticio <[email protected]> 1669041721 +0000

parent 499ff34
author teticio <[email protected]> 1668765652 +0000
committer teticio <[email protected]> 1669041704 +0000

add colab notebook

[Flax] Fix loading scheduler from subfolder (huggingface#1319)

[FLAX] Fix loading scheduler from subfolder

Fix/Enable all schedulers for in-painting (huggingface#1331)

* inpaint fix k lms

* onnox as well

* up

Correct path to schedlure (huggingface#1322)

* [Examples] Correct path

* uP

Avoid nested fix-copies (huggingface#1332)

* Avoid nested `# Copied from` statements during `make fix-copies`

* style

Fix img2img speed with LMS-Discrete Scheduler (huggingface#896)

Casting `self.sigmas` into a different dtype (the one of original_samples) is not advisable. In my img2img pipeline this leads to a long running time in the  `integrate.quad` call later on- by long I mean more than 10x slower.

Co-authored-by: Anton Lozhkov <[email protected]>

Fix the order of casts for onnx inpainting (huggingface#1338)

Legacy Inpainting Pipeline for Onnx Models (huggingface#1237)

* Add legacy inpainting pipeline compatibility for onnx

* remove commented out line

* Add onnx legacy inpainting test

* Fix slow decorators

* pep8 styling

* isort styling

* dummy object

* ordering consistency

* style

* docstring styles

* Refactor common prompt encoding pattern

* Update tests to permanent repository home

* support all available schedulers until ONNX IO binding is available

Co-authored-by: Anton Lozhkov <[email protected]>

* updated styling from PR suggested feedback

Co-authored-by: Anton Lozhkov <[email protected]>

Jax infer support negative prompt (huggingface#1337)

* support negative prompts in sd jax pipeline

* pass batched neg_prompt

* only encode when negative prompt is None

Co-authored-by: Juan Acevedo <[email protected]>

Update README.md: Minor change to Imagic code snippet, missing dir error (huggingface#1347)

Minor change to Imagic Readme

Missing dir causes an error when running the example code.

make style

change the sample model (huggingface#1352)

* Update alt_diffusion.mdx

* Update alt_diffusion.mdx

Add bit diffusion [WIP] (huggingface#971)

* Create bit_diffusion.py

Bit diffusion based on the paper, arXiv:2208.04202, Chen2022AnalogBG

* adding bit diffusion to new branch

ran tests

* tests

* tests

* tests

* tests

* removed test folders + added to README

* Update README.md

Co-authored-by: Patrick von Platen <[email protected]>

* move Mel to module in pipeline construction, make librosa optional

* fix imports

* fix copy & paste error in comment

* fix style

* add missing register_to_config

* fix class docstrings

* fix class docstrings

* tweak docstrings

* tweak docstrings

* update slow test

* put trailing commas back

* respect alphabetical order

* remove LatentAudioDiffusion, make vqvae optional

* move Mel from models back to pipelines :-)

* allow loading of pretrained audiodiffusion models

* fix tests

* fix dummies

* remove reference to latent_audio_diffusion in docs

* unused import

* inherit from SchedulerMixin to make loadable

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>
sliard pushed a commit to sliard/diffusers that referenced this pull request Dec 21, 2022
…ce#1334 (huggingface#1426)

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
…ce#1334 (huggingface#1426)

3 participants