


Jonathan Le Roux
2020 – today
- 2025
- [i75]Yoshiki Masuyama, Gordon Wichern, François G. Germain, Christopher Ick, Jonathan Le Roux:
Retrieval-Augmented Neural Field for HRTF Upsampling and Personalization. CoRR abs/2501.13017 (2025) - 2024
- [j17]Christoph Böddeker, Aswin Shanmugam Subramanian, Gordon Wichern, Reinhold Haeb-Umbach, Jonathan Le Roux:
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1185-1197 (2024) - [j16]Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. Trans. Int. Soc. Music. Inf. Retr. 7(1): 44-62 (2024) - [c126]Zeyuan Yang, Jiageng Lin, Peihao Chen, Anoop Cherian, Tim K. Marks, Jonathan Le Roux, Chuang Gan:
RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation. CVPR 2024: 16251-16261 - [c125]Zexu Pan, Gordon Wichern, François G. Germain, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Late Audio-Visual Fusion for in-the-Wild Speaker Diarization. ICASSP Workshops 2024: 174-178 - [c124]Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-Weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe:
Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation. ICASSP 2024: 316-320 - [c123]Chang-Bin Jeon, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Why Does Music Source Separation Benefit from Cacophony? ICASSP Workshops 2024: 873-877 - [c122]Teysir Baoueb, Haocheng Liu, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis. ICASSP 2024: 986-990 - [c121]Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization. ICASSP 2024: 1016-1020 - [c120]Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
Generation or Replication: Auscultating Audio Latent Diffusion Models. ICASSP 2024: 1156-1160 - [c119]Zexu Pan, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux:
NeuroHeed+: Improving Neuro-Steered Speaker Extraction with Joint Auditory Attention Detection. ICASSP 2024: 11456-11460 - [c118]Haocheng Liu, Teysir Baoueb, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
GLA-GRAD: A Griffin-Lim Extended Waveform Generation Diffusion Model. ICASSP 2024: 11611-11615 - [c117]Chiori Hori, Pu Wang, Mahbub Rahman, Cristian J. Vaca-Rubio, Sameer Khurana, Anoop Cherian, Jonathan Le Roux:
WI-FI based Indoor Monitoring Enhanced by Multimodal Fusion. ICASSP 2024: 13296-13300 - [c116]Jie Yin, Andrew Luo, Yilun Du, Anoop Cherian, Tim K. Marks, Jonathan Le Roux, Chuang Gan:
Disentangled Acoustic Fields For Multimodal Physical Scene Understanding. IROS 2024: 557-564 - [c115]Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux:
TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement. IWAENC 2024: 205-209 - [i74]Teysir Baoueb, Haocheng Liu, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis. CoRR abs/2402.01753 (2024) - [i73]Haocheng Liu, Teysir Baoueb, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
GLA-Grad: A Griffin-Lim Extended Waveform Generation Diffusion Model. CoRR abs/2402.15516 (2024) - [i72]Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization. CoRR abs/2402.17907 (2024) - [i71]Junghyun Koo, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux:
SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers. CoRR abs/2404.02252 (2024) - [i70]Janek Ebbers, François G. Germain, Gordon Wichern, Jonathan Le Roux:
Sound Event Bounding Boxes. CoRR abs/2406.04212 (2024) - [i69]Louis Bahrman, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
Speech dereverberation constrained on room impulse response characteristics. CoRR abs/2407.08657 (2024) - [i68]Jie Yin, Andrew Luo, Yilun Du, Anoop Cherian, Tim K. Marks, Jonathan Le Roux, Chuang Gan:
Disentangled Acoustic Fields For Multimodal Physical Scene Understanding. CoRR abs/2407.11333 (2024) - [i67]Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux:
Enhanced Reverberation as Supervision for Unsupervised Speech Separation. CoRR abs/2408.03438 (2024) - [i66]Kohei Saijo, Gordon Wichern, François G. Germain, Zexu Pan, Jonathan Le Roux:
TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement. CoRR abs/2408.03440 (2024) - [i65]Kohei Saijo, Janek Ebbers, François G. Germain, Sameer Khurana, Gordon Wichern, Jonathan Le Roux:
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction. CoRR abs/2409.13152 (2024) - [i64]Kohei Saijo, Janek Ebbers, François G. Germain, Gordon Wichern, Jonathan Le Roux:
Task-Aware Unified Source Separation. CoRR abs/2410.23987 (2024) - 2023
- [j15]Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux:
STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency. IEEE ACM Trans. Audio Speech Lang. Process. 31: 397-410 (2023) - [j14]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Zhong-Qiu Wang, Jonathan Le Roux:
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks. IEEE ACM Trans. Audio Speech Lang. Process. 31: 2592-2605 (2023) - [c114]Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction. ASRU 2023: 1-8 - [c113]Rohith Aralikatti, Christoph Böddeker, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Reverberation as Supervision For Speech Separation. ICASSP 2023: 1-5 - [c112]Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Latent Iterative Refinement for Modular Source Separation. ICASSP 2023: 1-5 - [c111]Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Paᗧ-HuBERT: Self-Supervised Music Source Separation Via Primitive Auditory Clustering And Hidden-Unit Bert. ICASSP Workshops 2023: 1-5 - [c110]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Hyperbolic Audio Source Separation. ICASSP 2023: 1-5 - [c109]Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Optimal Condition Training for Target Source Separation. ICASSP 2023: 1-5 - [c108]Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux:
Cold Diffusion for Speech Enhancement. ICASSP 2023: 1-5 - [c107]Chiori Hori, Puyuan Peng, David Harwath, Xinyu Liu, Kei Ota, Siddarth Jain, Radu Corcodel, Devesh K. Jha, Diego Romeres, Jonathan Le Roux:
Style-transfer based Speech and Audio-visual Scene understanding for Robot Action Sequence Acquisition from Videos. INTERSPEECH 2023: 4663-4667 - [c106]François G. Germain, Gordon Wichern, Jonathan Le Roux:
Hyperbolic Unsupervised Anomalous Sound Detection. WASPAA 2023: 1-5 - [c105]Ricardo Falcón Pérez, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Location as Supervision for Weakly Supervised Multi-Channel Source Separation of Machine Sounds. WASPAA 2023: 1-5 - [i63]Christoph Böddeker, Aswin Shanmugam Subramanian, Gordon Wichern, Reinhold Haeb-Umbach, Jonathan Le Roux:
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings. CoRR abs/2303.03849 (2023) - [i62]Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux:
Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT. CoRR abs/2304.02160 (2023) - [i61]Chiori Hori, Puyuan Peng, David Harwath, Xinyu Liu, Kei Ota, Siddarth Jain, Radu Corcodel, Devesh K. Jha, Diego Romeres, Jonathan Le Roux:
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos. CoRR abs/2306.15644 (2023) - [i60]Stefan Uhlich, Giorgio Fabbro, Masato Hirano, Shusuke Takahashi, Gordon Wichern, Jonathan Le Roux, Dipam Chakraborty, Sharada P. Mohanty, Kai Li, Yi Luo, Jianwei Yu, Rongzhi Gu, Roman A. Solovyev, Alexander L. Stempkovskiy, Tatiana Habruseva, Mikhail Sukhovei, Yuki Mitsufuji:
The Sound Demixing Challenge 2023 - Cinematic Demixing Track. CoRR abs/2308.06981 (2023) - [i59]Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe:
Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation. CoRR abs/2309.17352 (2023) - [i58]Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
Generation or Replication: Auscultating Audio Latent Diffusion Models. CoRR abs/2310.10604 (2023) - [i57]Zexu Pan, Gordon Wichern, Yoshiki Masuyama, François G. Germain, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction. CoRR abs/2310.19644 (2023) - [i56]Zexu Pan, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux:
NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection. CoRR abs/2312.07513 (2023) - 2022
- [j13]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels. IEEE J. Sel. Top. Signal Process. 16(6): 1424-1438 (2022) - [c104]Anoop Cherian, Chiori Hori, Tim K. Marks, Jonathan Le Roux:
(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering. AAAI 2022: 444-453 - [c103]Satvik Venkatesh, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Improved Domain Generalization via Disentangled Multi-Task Learning in Unsupervised Anomalous Sound Detection. DCASE 2022 - [c102]Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. ICASSP 2022: 526-530 - [c101]Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
Locate This, Not that: Class-Conditioned Sound Event DOA Estimation. ICASSP 2022: 711-715 - [c100]Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Sequence Transduction with Graph-Based Supervision. ICASSP 2022: 7212-7216 - [c99]Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR. ICASSP 2022: 7322-7326 - [c98]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy. ICASSP 2022: 7672-7676 - [c97]Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori:
Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning. ICASSP 2022: 7732-7736 - [c96]Efthymios Tzinis, Gordon Wichern, Aswin Shanmugam Subramanian, Paris Smaragdis, Jonathan Le Roux:
Heterogeneous Target Speech Separation. INTERSPEECH 2022: 1796-1800 - [c95]Chiori Hori, Takaaki Hori, Jonathan Le Roux:
Low-Latency Online Streaming VideoQA Using Audio-Visual Transformers. INTERSPEECH 2022: 4511-4515 - [i55]Anoop Cherian, Chiori Hori, Tim K. Marks, Jonathan Le Roux:
(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering. CoRR abs/2202.09277 (2022) - [i54]Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR. CoRR abs/2203.00232 (2022) - [i53]Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
Locate This, Not That: Class-Conditioned Sound Event DOA Estimation. CoRR abs/2203.04197 (2022) - [i52]Efthymios Tzinis, Gordon Wichern, Aswin Shanmugam Subramanian, Paris Smaragdis, Jonathan Le Roux:
Heterogeneous Target Speech Separation. CoRR abs/2204.03594 (2022) - [i51]Zhong-Qiu Wang, Gordon Wichern, Shinji Watanabe, Jonathan Le Roux:
STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency. CoRR abs/2204.09911 (2022) - [i50]Zexu Pan, Gordon Wichern, François G. Germain, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Towards End-to-end Speaker Diarization in the Wild. CoRR abs/2211.01299 (2022) - [i49]Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux:
Cold Diffusion for Speech Enhancement. CoRR abs/2211.02527 (2022) - [i48]Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Optimal Condition Training for Target Source Separation. CoRR abs/2211.05927 (2022) - [i47]Rohith Aralikatti, Christoph Böddeker, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Reverberation as Supervision for Speech Separation. CoRR abs/2211.08303 (2022) - [i46]Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Latent Iterative Refinement for Modular Source Separation. CoRR abs/2211.11917 (2022) - [i45]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Hyperbolic Audio Source Separation. CoRR abs/2212.05008 (2022) - [i44]Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Zhong-Qiu Wang, Jonathan Le Roux:
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks. CoRR abs/2212.07327 (2022) - 2021
- [j12]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
On the Compensation Between Magnitude and Phase in Speech Separation. IEEE Signal Process. Lett. 28: 2018-2022 (2021) - [j11]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3476-3490 (2021) - [c94]Shijie Geng, Peng Gao, Moitreya Chatterjee, Chiori Hori, Jonathan Le Roux, Yongfeng Zhang, Hongsheng Li, Anoop Cherian:
Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers. AAAI 2021: 1415-1423 - [c93]Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux:
Transcription Is All You Need: Learning To Separate Musical Mixtures With Score As Supervision. ICASSP 2021: 46-50 - [c92]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Capturing Multi-Resolution Context by Dilated Self-Attention. ICASSP 2021: 5869-5873 - [c91]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification. ICASSP 2021: 6548-6552 - [c90]Sameer Khurana, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training. ICASSP 2021: 6553-6557 - [c89]Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian:
Visual Scene Graphs for Audio Source Separation. ICCV 2021: 1184-1193 - [c88]Chiori Hori, Takaaki Hori, Jonathan Le Roux:
Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers. Interspeech 2021: 586-590 - [c87]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition. Interspeech 2021: 726-730 - [c86]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition. Interspeech 2021: 1822-1826 - [c85]Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux:
Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers. Interspeech 2021: 2097-2101 - [c84]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Reverberant Speech Separation. WASPAA 2021: 56-60 - [c83]Gordon Wichern, Ankush Chakrabarty, Zhong-Qiu Wang, Jonathan Le Roux:
Anomalous Sound Detection Using Attentive Neural Processes. WASPAA 2021: 186-190 - [i43]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Capturing Multi-Resolution Context by Dilated Self-Attention. CoRR abs/2104.02858 (2021) - [i42]Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux:
Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers. CoRR abs/2104.09426 (2021) - [i41]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition. CoRR abs/2106.08922 (2021) - [i40]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition. CoRR abs/2107.01269 (2021) - [i39]Chiori Hori, Takaaki Hori, Jonathan Le Roux:
Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers. CoRR abs/2108.02147 (2021) - [i38]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
On The Compensation Between Magnitude and Phase in Speech Separation. CoRR abs/2108.05470 (2021) - [i37]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Reverberant Speech Separation. CoRR abs/2108.07194 (2021) - [i36]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation. CoRR abs/2108.07376 (2021) - [i35]Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian:
Visual Scene Graphs for Audio Source Separation. CoRR abs/2109.11955 (2021) - [i34]Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux:
Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement. CoRR abs/2110.00570 (2021) - [i33]Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy. CoRR abs/2110.04948 (2021) - [i32]Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori:
Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning. CoRR abs/2110.06894 (2021) - [i31]Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. CoRR abs/2110.09958 (2021) - [i30]Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Sequence Transduction with Graph-based Supervision. CoRR abs/2111.01272 (2021) - 2020
- [j10]Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Finding Strength in Weakness: Learning to Separate Sounds With Weak Supervision. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2386-2399 (2020) - [c82]Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Learning to Separate Sounds from Weakly Labeled Scenes. ICASSP 2020: 91-95 - [c81]Matthew Maciejewski, Gordon Wichern, Emmett McQuinn, Jonathan Le Roux:
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation. ICASSP 2020: 696-700 - [c80]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Streaming Automatic Speech Recognition with the Transformer Model. ICASSP 2020: 6074-6078 - [c79]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
End-To-End Multi-Speaker Speech Recognition With Transformer. ICASSP 2020: 6134-6138 - [c78]Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR. ICASSP 2020: 7384-7388 - [c77]Niko Moritz, Gordon Wichern, Takaaki Hori, Jonathan Le Roux:
All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection. INTERSPEECH 2020: 3112-3116 - [c76]Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin:
Detecting Audio Attacks on ASR Systems with Dropout Uncertainty. INTERSPEECH 2020: 4671-4675 - [c75]Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux:
Transformer-Based Long-Context End-to-End Speech Recognition. INTERSPEECH 2020: 5011-5015 - [c74]Ethan Manilow, Gordon Wichern, Jonathan Le Roux:
Hierarchical Musical Instrument Separation. ISMIR 2020: 376-383 - [c73]Prem Seetharaman, Gordon Wichern, Bryan Pardo, Jonathan Le Roux:
Autoclip: Adaptive Gradient Clipping for Source Separation Networks. MLSP 2020: 1-6 - [i29]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Streaming automatic speech recognition with the transformer model. CoRR abs/2001.02674 (2020) - [i28]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
End-to-End Multi-speaker Speech Recognition with Transformer. CoRR abs/2002.03921 (2020) - [i27]Leda Sari, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR. CoRR abs/2002.06165 (2020) - [i26]Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin:
Detecting Audio Attacks on ASR Systems with Dropout Uncertainty. CoRR abs/2006.01906 (2020) - [i25]Shijie Geng, Peng Gao, Chiori Hori, Jonathan Le Roux, Anoop Cherian:
Spatio-Temporal Scene Graphs for Video Dialog. CoRR abs/2007.03848 (2020) - [i24]Prem Seetharaman, Gordon Wichern, Bryan Pardo, Jonathan Le Roux:
AutoClip: Adaptive Gradient Clipping for Source Separation Networks. CoRR abs/2007.14469 (2020) - [i23]Peng Gao, Chiori Hori, Shijie Geng, Takaaki Hori, Jonathan Le Roux:
Multi-Pass Transformer for Machine Translation. CoRR abs/2009.11382 (2020) - [i22]Yun-Ning Hung, Gordon Wichern, Jonathan Le Roux:
Transcription Is All You Need: Learning to Separate Musical Mixtures with Score as Supervision. CoRR abs/2010.11904 (2020) - [i21]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Semi-Supervised Speech Recognition via Graph-based Temporal Classification. CoRR abs/2010.15653 (2020) - [i20]Sameer Khurana, Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training. CoRR abs/2011.13439 (2020)
2010 – 2019
- 2019
- [j9]Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
Phasebook and Friends: Leveraging Discrete Representations for Source Separation. IEEE J. Sel. Top. Signal Process. 13(2): 370-382 (2019) - [c72]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition. ASRU 2019: 237-244 - [c71]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models. ASRU 2019: 936-943 - [c70]Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
The Phasebook: Building Complex Masks via Discrete Representations for Source Separation. ICASSP 2019: 66-70 - [c69]Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux:
Class-conditional Embeddings for Music Source Separation. ICASSP 2019: 301-305 - [c68]Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo:
Bootstrapping Single-channel Source Separation via Unsupervised Spatial Clustering on Stereo Mixtures. ICASSP 2019: 356-360 - [c67]Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey:
SDR - Half-baked or Well Done? ICASSP 2019: 626-630 - [c66]Ryo Aihara, Toshiyuki Hanazawa, Yohei Okato, Gordon Wichern, Jonathan Le Roux:
Teacher-student Deep Clustering for Low-delay Single Channel Speech Separation. ICASSP 2019: 690-694 - [c65]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Triggered Attention for End-to-end Speech Recognition. ICASSP 2019: 5666-5670 - [c64]Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency Training for End-to-end Speech Recognition. ICASSP 2019: 6271-6275 - [c63]Niko Moritz, Takaaki Hori, Jonathan Le Roux:
Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition. INTERSPEECH 2019: 76-80 - [c62]Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux:
WHAM!: Extending Speech Separation to Noisy Environments. INTERSPEECH 2019: 1368-1372 - [c61]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
End-to-End Multilingual Multi-Speaker Speech Recognition. INTERSPEECH 2019: 3755-3759 - [c60]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Niko Moritz, Jonathan Le Roux:
Vectorized Beam Search for CTC-Attention-Based Speech Recognition. INTERSPEECH 2019: 3825-3829 - [c59]Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux:
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity. WASPAA 2019: 45-49 - [c58]Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. WASPAA 2019: 175-179 - [i19]Ilya Kavalerov, Scott Wisdom, Hakan Erdogan, Brian Patton, Kevin W. Wilson, Jonathan Le Roux, John R. Hershey:
Universal Sound Separation. CoRR abs/1905.03330 (2019) - [i18]Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux:
WHAM!: Extending Speech Separation to Noisy Environments. CoRR abs/1907.01160 (2019) - [i17]Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux:
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity. CoRR abs/1909.08494 (2019) - [i16]Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe:
MIMO-SPEECH: End-to-End Multi-Channel Multi-Speaker Speech Recognition. CoRR abs/1910.06522 (2019) - [i15]Matthew Maciejewski, Gordon Wichern, Emmett McQuinn, Jonathan Le Roux:
WHAMR!: Noisy and Reverberant Single-Channel Speech Separation. CoRR abs/1910.10279 (2019) - [i14]Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo:
Bootstrapping deep music separation from primitive auditory grouping principles. CoRR abs/1910.11133 (2019) - [i13]Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision. CoRR abs/1911.02182 (2019) - 2018
- [c57]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A Purely End-to-End System for Multi-speaker Speech Recognition. ACL (1) 2018: 2620-2630 - [c56]Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation. ICASSP 2018: 1-5 - [c55]Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Alternative Objective Functions for Deep Clustering. ICASSP 2018: 686-690 - [c54]Shane Settle, Jonathan Le Roux, Takaaki Hori, Shinji Watanabe, John R. Hershey:
End-to-End Multi-Speaker Speech Recognition. ICASSP 2018: 4819-4823 - [c53]Hiroshi Seki, Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, John R. Hershey:
An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech. ICASSP 2018: 4919-4923 - [c52]Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey:
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. INTERSPEECH 2018: 2708-2712 - [c51]Gordon Wichern, Jonathan Le Roux:
Phase Reconstruction with Learned Time-Frequency Representations for Single-Channel Speech Separation. IWAENC 2018: 396-400 - [i12]Zhong-Qiu Wang, Jonathan Le Roux, DeLiang Wang, John R. Hershey:
End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction. CoRR abs/1804.10204 (2018) - [i11]Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A Purely End-to-end System for Multi-speaker Speech Recognition. CoRR abs/1805.05826 (2018) - [i10]Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey:
Phasebook and Friends: Leveraging Discrete Representations for Source Separation. CoRR abs/1810.01395 (2018) - [i9]Takaaki Hori, Ramón Fernandez Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux:
Cycle-consistency training for end-to-end speech recognition. CoRR abs/1811.01690 (2018) - [i8]Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo:
Bootstrapping single-channel source separation via unsupervised spatial clustering on stereo mixtures. CoRR abs/1811.02130 (2018) - [i7]Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey:
SDR - half-baked or well done? CoRR abs/1811.02508 (2018) - [i6]Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux:
Class-conditional embeddings for music source separation. CoRR abs/1811.03076 (2018) - 2017
- [j8]Takaaki Hori, Zhuo Chen, Hakan Erdogan, John R. Hershey, Jonathan Le Roux, Vikramjit Mitra, Shinji Watanabe:
Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend. Comput. Speech Lang. 46: 401-418 (2017) - [j7]Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Prior-based Binary Masking and Discriminative Methods for Reverberant and Noisy Speech Recognition Using Distant Stereo Microphones. J. Inf. Process. 25: 407-416 (2017) - [j6]Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda:
Duration-Controlled LSTM for Polyphonic Sound Event Detection. IEEE ACM Trans. Audio Speech Lang. Process. 25(11): 2059-2070 (2017) - [c50]Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani:
Deep clustering and conventional networks for music separation: Stronger together. ICASSP 2017: 61-65 - [c49]Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda:
BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection. ICASSP 2017: 766-770 - [c48]Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, John R. Hershey:
Student-teacher network learning with enhanced features. ICASSP 2017: 5275-5279 - [c47]Yuuki Tachioka, Tomohiro Narita, Iori Miura, Takanobu Uramoto, Natsuki Monta, Shingo Uenohara, Ken'ichi Furuya, Shinji Watanabe, Jonathan Le Roux:
Coupled Initialization of Multi-Channel Non-Negative Matrix Factorization Based on Spatial and Spectral Information. INTERSPEECH 2017: 2461-2465 - [c46]Paul Magron, Jonathan Le Roux, Tuomas Virtanen:
Consistent anisotropic Wiener filtering for audio source separation. WASPAA 2017: 269-273 - [p4]John R. Hershey, Jonathan Le Roux, Shinji Watanabe, Scott Wisdom, Zhuo Chen, Yusuf Ziya Isik:
Novel Deep Architectures in Speech Processing. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 135-164 - [p3]Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux:
Deep Recurrent Networks for Separation and Recognition of Single-Channel Speech in Nonstationary Background Audio. New Era for Robust Speech Recognition, Exploiting Deep Learning 2017: 165-186 - 2016
- [c45]Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Takaaki Hori, Jonathan Le Roux, Kazuya Takeda:
Bidirectional LSTM-HMM Hybrid System for Polyphonic Sound Event Detection. DCASE 2016: 35-39 - [c44]John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe:
Deep clustering: Discriminative embeddings for segmentation and separation. ICASSP 2016: 31-35 - [c43]Scott Wisdom, John R. Hershey, Jonathan Le Roux, Shinji Watanabe:
Deep unfolding for multichannel source separation. ICASSP 2016: 121-125 - [c42]Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey:
Single-Channel Multi-Speaker Separation Using Deep Clustering. INTERSPEECH 2016: 545-549 - [c41]Hakan Erdogan, John R. Hershey, Shinji Watanabe, Michael I. Mandel, Jonathan Le Roux:
Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks. INTERSPEECH 2016: 1981-1985 - [c40]Scott Wisdom, Thomas Powers, John R. Hershey, Jonathan Le Roux, Les E. Atlas:
Full-Capacity Unitary Recurrent Neural Networks. NIPS 2016: 4880-4888 - [c39]Takaaki Hori, Hai Wang, Chiori Hori, Shinji Watanabe, Bret Harsham, Jonathan Le Roux, John R. Hershey, Yusuke Koji, Yi Jing, Zhaocheng Zhu, Takeyuki Aikawa:
Dialog state tracking with attention-based sequence-to-sequence learning. SLT 2016: 552-558 - [i5]Yusuf Ziya Isik, Jonathan Le Roux, Zhuo Chen, Shinji Watanabe, John R. Hershey:
Single-Channel Multi-Speaker Separation using Deep Clustering. CoRR abs/1607.02173 (2016) - [i4]Scott Wisdom, Thomas Powers, John R. Hershey, Jonathan Le Roux, Les E. Atlas:
Full-Capacity Unitary Recurrent Neural Networks. CoRR abs/1611.00035 (2016) - [i3]Yi Luo, Zhuo Chen, John R. Hershey, Jonathan Le Roux, Nima Mesgarani:
Deep Clustering and Conventional Networks for Music Separation: Stronger Together. CoRR abs/1611.06265 (2016) - 2015
- [j5]Timo Gerkmann, Martin Krawczyk-Becker, Jonathan Le Roux:
Phase Processing for Single-Channel Speech Enhancement: History and recent advances. IEEE Signal Process. Mag. 32(2): 55-66 (2015) - [c38]Takaaki Hori, Zhuo Chen, Hakan Erdogan, John R. Hershey, Jonathan Le Roux, Vikramjit Mitra, Shinji Watanabe:
The MERL/SRI system for the 3RD CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition. ASRU 2015: 475-481 - [c37]Felix Weninger, Hakan Erdogan, Shinji Watanabe, Emmanuel Vincent, Jonathan Le Roux, John R. Hershey, Björn W. Schuller:
Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR. LVA/ICA 2015: 91-99 - [c36]Jonathan Le Roux, John R. Hershey, Felix Weninger:
Deep NMF for speech separation. ICASSP 2015: 66-70 - [c35]Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux:
Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks. ICASSP 2015: 708-712 - [c34]Jonathan Le Roux, Emmanuel Vincent, John R. Hershey, Daniel P. W. Ellis:
Micbots: Collecting large realistic datasets for speech and audio research using mobile robots. ICASSP 2015: 5635-5639 - [i2]John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe:
Deep clustering: Discriminative embeddings for segmentation and separation. CoRR abs/1508.04306 (2015) - 2014
- [c33]Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Sequence discriminative training for low-rank deep neural networks. GlobalSIP 2014: 572-576 - [c32]Felix Weninger, John R. Hershey, Jonathan Le Roux, Björn W. Schuller:
Discriminatively trained recurrent neural networks for single-channel speech separation. GlobalSIP 2014: 577-581 - [c31]Yuuki Tachioka, Tomohiro Narita, Shinji Watanabe, Jonathan Le Roux:
Ensemble integration of calibrated speaker localization and statistical speech detection in domestic environments. HSCMA 2014: 162-166 - [c30]Shinji Watanabe, Jonathan Le Roux:
Black box optimization for automatic speech recognition. ICASSP 2014: 3256-3260 - [c29]Umut Simsekli, Jonathan Le Roux, John R. Hershey:
Non-negative source-filter dynamical system for speech enhancement. ICASSP 2014: 6206-6210 - [c28]Felix Weninger, Jonathan Le Roux, John R. Hershey, Shinji Watanabe:
Discriminative NMF and its application to single-channel source separation. INTERSPEECH 2014: 865-869 - [c27]Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Sequential maximum mutual information linear discriminant analysis for speech recognition. INTERSPEECH 2014: 2415-2419 - [i1]John R. Hershey, Jonathan Le Roux, Felix Weninger:
Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures. CoRR abs/1409.2574 (2014) - 2013
- [j4]Jonathan Le Roux, Emmanuel Vincent:
Consistent Wiener Filtering for Audio Source Separation. IEEE Signal Process. Lett. 20(3): 217-220 (2013) - [c26]Yuuki Tachioka, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
A generalized discriminative training framework for system combination. ASRU 2013: 43-48 - [c25]Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, Marco Matassoni:
The second 'CHiME' speech separation and recognition challenge: An overview of challenge systems and outcomes. ASRU 2013: 162-167 - [c24]Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, Marco Matassoni:
The second 'chime' speech separation and recognition challenge: Datasets, tasks and baselines. ICASSP 2013: 126-130 - [c23]Cédric Févotte, Jonathan Le Roux, John R. Hershey:
Non-negative dynamical system with application to speech and audio. ICASSP 2013: 3158-3162 - [c22]Jonathan Le Roux, Petros T. Boufounos, Kang Kang, John R. Hershey:
Source localization in reverberant environments using sparse optimization. ICASSP 2013: 4310-4314 - [c21]Koichiro Yoshino, Shinji Watanabe, Jonathan Le Roux, John R. Hershey:
Statistical Dialogue Management using Intention Dependency Graph. IJCNLP 2013: 962-966 - [c20]Jonathan Le Roux, Shinji Watanabe, John R. Hershey:
Ensemble learning for speech enhancement. WASPAA 2013: 1-4 - [c19]Umut Simsekli, Jonathan Le Roux, John R. Hershey:
Hierarchical and coupled non-negative dynamical systems with application to audio modeling. WASPAA 2013: 1-4 - [c18]Vamsi K. Potluru, Sergey M. Plis, Jonathan Le Roux, Barak A. Pearlmutter, Vince D. Calhoun, Thomas P. Hayes:
Block Coordinate Descent for Sparse NMF. ICLR (Poster) 2013 - 2012
- [c17]Jonathan Le Roux, John R. Hershey:
Indirect model-based speech enhancement. ICASSP 2012: 4045-4048 - [p2]John R. Hershey, Steven J. Rennie, Jonathan Le Roux:
Factorial Models for Noise Robust Speech Recognition. Techniques for Noise Robustness in Automatic Speech Recognition 2012: 311-345 - 2011
- [j3]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Computational auditory induction as a missing-data model-fitting problem with Bregman divergence. Speech Commun. 53(5): 658-676 (2011) - [c16]Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Infinite-state spectrum model for music signal analysis. ICASSP 2011: 1972-1975 - [c15]Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Tomohiko Nakamura, Nobutaka Ono, Shigeki Sagayama:
Bayesian nonparametric spectrogram modeling based on infinite factorial infinite hidden Markov model. WASPAA 2011: 325-328 - 2010
- [c14]Jonathan Le Roux, Emmanuel Vincent, Yuu Mizuno, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
Consistent Wiener Filtering: Generalized Time-Frequency Masking Respecting Spectrogram Consistency. LVA/ICA 2010: 89-96 - [c13]Masahiro Nakano, Jonathan Le Roux, Hirokazu Kameoka, Yu Kitano, Nobutaka Ono, Shigeki Sagayama:
Nonnegative Matrix Factorization with Markov-Chained Bases for Modeling Time-Varying Patterns in Music Spectrograms. LVA/ICA 2010: 149-156 - [c12]Hirokazu Kameoka, Takuya Yoshioka, Mariko Hamamura, Jonathan Le Roux, Kunio Kashino:
Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation. LVA/ICA 2010: 245-253 - [c11]Hirokazu Kameoka, Jonathan Le Roux, Yasunori Ohishi:
A statistical model of speech F0 contours. SAPA@INTERSPEECH 2010: 43-48 - [p1]Nobutaka Ono, Kenichi Miyamoto, Hirokazu Kameoka, Jonathan Le Roux, Yuuki Uchiyama, Emiru Tsunoo, Takuya Nishimoto, Shigeki Sagayama:
Harmonic and Percussive Sound Separation and Its Application to MIR-Related Tasks. Advances in Music Information Retrieval 2010: 213-236
2000 – 2009
- 2008
- [c10]Nobutaka Ono, Kenichi Miyamoto, Jonathan Le Roux, Hirokazu Kameoka, Shigeki Sagayama:
Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram. EUSIPCO 2008: 1-4 - [c9]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama, Alain de Cheveigné:
Modulation analysis of speech through orthogonal FIR filterbank optimization. ICASSP 2008: 4189-4192 - [c8]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Computational auditory induction by missing-data non-negative matrix factorization. SAPA@INTERSPEECH 2008: 1-6 - [c7]Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama:
Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction. SAPA@INTERSPEECH 2008: 23-28 - [c6]Jonathan Le Roux, Alain de Cheveigné, Lucas C. Parra:
Adaptive Template Matching with Shift-Invariant Semi-NMF. NIPS 2008: 921-928 - 2007
- [j2]Erik McDermott, Timothy J. Hazen, Jonathan Le Roux, Atsushi Nakamura, Shigeru Katagiri:
Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error. IEEE Trans. Speech Audio Process. 15(1): 203-223 (2007) - [j1]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Single and Multiple F0 Contour Estimation Through Parametric Spectrogram Modeling of Speech in Noisy Environments. IEEE Trans. Speech Audio Process. 15(4): 1135-1145 (2007) - [c5]Alain de Cheveigné, Jonathan Le Roux, Jonathan Z. Simon:
MEG Signal Denoising Based on Time-Shift PCA. ICASSP (1) 2007: 317-320 - [c4]Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama:
Harmonic-Temporal Clustering of Speech for Single and Multiple F0 Contour Estimation in Noisy Environments. ICASSP (4) 2007: 1053-1056 - 2006
- [c3]Hirokazu Kameoka, Jonathan Le Roux, Nobutaka Ono, Shigeki Sagayama:
Speech analyzer using a joint estimation model of spectral envelope and fine structure. INTERSPEECH 2006 - 2005
- [c2]Jonathan Le Roux, Erik McDermott:
Optimization methods for discriminative training. INTERSPEECH 2005: 3341-3344 - 2002
- [c1]Hongping Yan, Jean François Barczi, Philippe de Reffye, Bao-Gang Hu, Marc Jaeger, Jonathan Le Roux:
Fast Algorithms of Plant Computation Based on Substructure Instances. WSCG (Short Papers) 2002: 145-152
last updated on 2025-03-08 00:50 CET by the dblp team
all metadata released as open data under CC0 1.0 license