Python 159 24 Updated Oct 31, 2024

Multilingual Voice Understanding Model

Python 3,252 300 Updated Oct 18, 2024

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook 2,515 205 Updated Aug 1, 2024

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key

Python 6,101 609 Updated Oct 25, 2024

Open source real-time translation app for Android that runs locally

C++ 6,734 507 Updated Sep 27, 2024

Foundational model for human-like, expressive TTS

Python 3,848 658 Updated Jul 30, 2024

A generative speech model for daily dialogue.

Python 31,947 3,480 Updated Oct 21, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,641 1,090 Updated May 23, 2024

Inference and training library for high-quality TTS models.

Python 4,527 457 Updated Oct 30, 2024

[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"

Python 306 21 Updated Sep 3, 2024

Official repo for WavCraft, an AI agent for audio creation and editing

Python 652 96 Updated Sep 13, 2024

Awesome speech/audio LLMs, representation learning, and codec models

669 31 Updated Oct 31, 2024

Generate high-definition short videos with one click using large AI models.

Python 16,726 2,664 Updated Jul 26, 2024

AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation

Python 4,610 576 Updated Jul 2, 2024

A lightweight library for Frechet Audio Distance calculation.

Python 233 24 Updated Sep 4, 2024

trying to reproduce suno v3

24 1 Updated Mar 24, 2024

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 7,617 744 Updated Jun 24, 2024

Brand new TTS solution

Python 13,684 1,025 Updated Oct 30, 2024

My ComfyUI workflows collection

5,086 476 Updated Oct 30, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 22,104 2,153 Updated Aug 9, 2024

AI powered speech denoising and enhancement

Python 1,386 138 Updated Jun 21, 2024

VoicePAT is a modular and efficient toolkit for voice privacy research, with main focus on speaker anonymization.

Shell 46 4 Updated May 14, 2024

Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)

Python 57,194 7,099 Updated Oct 30, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to this project.

Python 11,494 1,023 Updated Oct 31, 2024

High-level API for tar-based dataset

Python 10 Updated Feb 3, 2024

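Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.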

Python 34,181 3,577 Updated Sep 23, 2024

Think DSP: Digital Signal Processing in Python, by Allen B. Downey.

Jupyter Notebook 3,961 3,222 Updated May 10, 2024

Audio Codec Speech processing Universal PERformance Benchmark

Python 209 22 Updated Sep 28, 2024

[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 619 43 Updated Oct 27, 2024

🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support

Python 7,880 961 Updated Oct 24, 2024