Joon Son Chung

2020 – today

2024
- [j9] Youkyum Kim, Jaemin Jung, Jihwan Park, Byeong-Yeol Kim, Joon Son Chung: Bridging the Gap Between Audio and Text Using Parallel-Attention for User-Defined Keyword Spotting. IEEE Signal Process. Lett. 31: 2100-2104 (2024)
- [j8] Mehmet Hamza Erol, Arda Senocak, Jiu Feng, Joon Son Chung: Audio Mamba: Bidirectional State Space Model for Audio Representation Learning. IEEE Signal Process. Lett. 31: 2975-2979 (2024)
- [j7] Jaesung Huh, Joon Son Chung, Arsha Nagrani, Andrew Brown, Jee-weon Jung, Daniel Garcia-Romero, Andrew Zisserman: The VoxCeleb Speaker Recognition Challenge: A Retrospective. IEEE ACM Trans. Audio Speech Lang. Process. 32: 3850-3866 (2024)
- [c70] Ji-Hoon Kim, Jaehun Kim, Joon Son Chung: Let There Be Sound: Reconstructing High Quality Speech from Silent Videos. AAAI 2024: 2759-2767
- [c69] Dawit Mureja Argaw, Mattia Soldan, Alejandro Pardo, Chen Zhao, Fabian Caba Heilbron, Joon Son Chung, Bernard Ghanem: Towards Automated Movie Trailer Generation. CVPR 2024: 7445-7454
- [c68] Dawit Mureja Argaw, Seunghyun Yoon, Fabian Caba Heilbron, Hanieh Deilamsalehy, Trung Bui, Zhaowen Wang, Franck Dernoncourt, Joon Son Chung: Scaling Up Video Summarization Pretraining with Large Language Models. CVPR 2024: 8332-8341
- [c67] Youngjoon Jang, Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Hongsun Yang, Yooncheol Ju, Ilhwan Kim, Byeong-Yeol Kim, Joon Son Chung: Faces that Speak: Jointly Synthesising Talking Face and Speech from Text. CVPR 2024: 8818-8828
- [c66] Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak: From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers. ICASSP 2024: 1416-1420
- [c65] Junseok Ahn, Youngjoon Jang, Joon Son Chung: Slowfast Network for Continuous Sign Language Recognition. ICASSP 2024: 3920-3924
- [c64] Jongbhin Woo, Hyeonggon Ryu, Arda Senocak, Joon Son Chung: Speech Guided Masked Image Modeling for Visually Grounded Speech. ICASSP 2024: 8361-8365
- [c63] Chaeyoung Jung, Suyeon Lee, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung: TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning. ICASSP 2024: 8391-8395
- [c62] Tan Dat Nguyen, Ji-Hoon Kim, Youngjoon Jang, Jaehun Kim, Joon Son Chung: Fregrad: Lightweight and Fast Frequency-Aware Diffusion Vocoder. ICASSP 2024: 10736-10740
- [c61] Hee-Soo Heo, Kihyun Nam, Bong-Jin Lee, Youngki Kwon, Minjae Lee, You Jin Kim, Joon Son Chung: Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification. ICASSP 2024: 12321-12325
- [c60] Doyeop Kwak, Jaemin Jung, Kihyun Nam, Youngjoon Jang, Jee-Weon Jung, Shinji Watanabe, Joon Son Chung: VoxMM: Rich Transcription of Conversations in the Wild. ICASSP 2024: 12551-12555
- [c59] Yeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung: VoiceLDM: Text-to-Speech with Environmental Context. ICASSP 2024: 12566-12571
- [c58] Suyeon Lee, Chaeyoung Jung, Youngjoon Jang, Jaehun Kim, Joon Son Chung: Seeing Through The Conversation: Audio-Visual Speech Separation Based on Diffusion Model. ICASSP 2024: 12632-12636
- [c57] Jongsuk Kim, Hyeongkeun Lee, Kyeongha Rho, Junmo Kim, Joon Son Chung: EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning. ICML 2024
- [c56] Jongbhin Woo, Hyeonggon Ryu, Youngjoon Jang, Jae-Won Cho, Joon Son Chung: Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding. ACM Multimedia 2024: 8199-8208
- [c55] Joon Son Chung: Multimodal Learning of Speech and Speaker Representations. Odyssey 2024
- [c54] Sooyoung Park, Arda Senocak, Joon Son Chung: Can CLIP Help Sound Source Localization? WACV 2024: 5699-5708
- [i81] Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak: From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers. CoRR abs/2401.08415 (2024)
- [i80] Tan Dat Nguyen, Ji-Hoon Kim, Youngjoon Jang, Jaehun Kim, Joon Son Chung: FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder. CoRR abs/2401.10032 (2024)
- [i79] Jongsuk Kim, Hyeongkeun Lee, Kyeongha Rho, Junmo Kim, Joon Son Chung: EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning. CoRR abs/2403.09502 (2024)
- [i78] Dawit Mureja Argaw, Seunghyun Yoon, Fabian Caba Heilbron, Hanieh Deilamsalehy, Trung Bui, Zhaowen Wang, Franck Dernoncourt, Joon Son Chung: Scaling Up Video Summarization Pretraining with Large Language Models. CoRR abs/2404.03398 (2024)
- [i77] Dawit Mureja Argaw, Mattia Soldan, Alejandro Pardo, Chen Zhao, Fabian Caba Heilbron, Joon Son Chung, Bernard Ghanem: Towards Automated Movie Trailer Generation. CoRR abs/2404.03477 (2024)
- [i76] Youngjoon Jang, Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Hongsun Yang, Yooncheol Ju, Ilhwan Kim, Byeong-Yeol Kim, Joon Son Chung: Faces that Speak: Jointly Synthesising Talking Face and Speech from Text. CoRR abs/2405.10272 (2024)
- [i75] Mehmet Hamza Erol, Arda Senocak, Jiu Feng, Joon Son Chung: Audio Mamba: Bidirectional State Space Model for Audio Representation Learning. CoRR abs/2406.03344 (2024)
- [i74] Jee-weon Jung, Xin Wang, Nicholas W. D. Evans, Shinji Watanabe, Hye-jin Shim, Hemlata Tak, Sidhhant Arora, Junichi Yamagishi, Joon Son Chung: To what extent can ASV systems naturally defend against spoofing attacks? CoRR abs/2406.05339 (2024)
- [i73] Chaeyoung Jung, Suyeon Lee, Ji-Hoon Kim, Joon Son Chung: FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching. CoRR abs/2406.09286 (2024)
- [i72] Jaesong Lee, Soyoon Kim, Hanbyul Kim, Joon Son Chung: Lightweight Audio Segmentation for Long-form Speech Translation. CoRR abs/2406.10549 (2024)
- [i71] Kihyun Nam, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung: Disentangled Representation Learning for Environment-agnostic Speaker Recognition. CoRR abs/2406.14559 (2024)
- [i70] Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak: ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions. CoRR abs/2407.08691 (2024)
- [i69] Arda Senocak, Hyeonggon Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung: Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment. CoRR abs/2407.13676 (2024)
- [i68] Jaesung Huh, Joon Son Chung, Arsha Nagrani, Andrew Brown, Jee-weon Jung, Daniel Garcia-Romero, Andrew Zisserman: The VoxCeleb Speaker Recognition Challenge: A Retrospective. CoRR abs/2408.14886 (2024)
- [i67] Jee-weon Jung, Wangyou Zhang, Soumi Maiti, Yihan Wu, Xin Wang, Ji-Hoon Kim, Yuta Matsunaga, Seyun Um, Jinchuan Tian, Hye-jin Shim, Nicholas W. D. Evans, Joon Son Chung, Shinnosuke Takamichi, Shinji Watanabe: Text-To-Speech Synthesis In The Wild. CoRR abs/2409.08711 (2024)
- [i66] Jee-weon Jung, Yihan Wu, Xin Wang, Ji-Hoon Kim, Soumi Maiti, Yuta Matsunaga, Hye-jin Shim, Jinchuan Tian, Nicholas W. D. Evans, Joon Son Chung, Wangyou Zhang, Seyun Um, Shinnosuke Takamichi, Shinji Watanabe: SpoofCeleb: Speech Deepfake Detection and SASV In The Wild. CoRR abs/2409.17285 (2024)

2023
- [c53] Youngjoon Jang, Youngtaek Oh, Jae-Won Cho, Myungchul Kim, Dong-Jin Kim, In So Kweon, Joon Son Chung: Self-Sufficient Framework for Continuous Sign Language Recognition. ICASSP 2023: 1-5
- [c52] Jee-Weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung: In Search of Strong Embedding Extractors for Speaker Diarisation. ICASSP 2023: 1-5
- [c51] Jaemin Jung, Youkyum Kim, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Youngjoon Jang, Joon Son Chung: Metric Learning for User-Defined Keyword Spotting. ICASSP 2023: 1-5
- [c50] You Jin Kim, Hee-Soo Heo, Jee-Weon Jung, Youngki Kwon, Bong-Jin Lee, Joon Son Chung: Advancing the Dimensionality Reduction of Speaker Embeddings for Speaker Diarisation: Disentangling Noise and Informing Speech Activity. ICASSP 2023: 1-5
- [c49] Jiyoung Lee, Joon Son Chung, Soo-Whan Chung: Imaginary Voice: Face-Styled Diffusion Model for Text-to-Speech. ICASSP 2023: 1-5
- [c48] Sooyoung Park, Arda Senocak, Joon Son Chung: MarginNCE: Robust Sound Localization with a Negative Margin. ICASSP 2023: 1-5
- [c47] Hyeonggon Ryu, Arda Senocak, In So Kweon, Joon Son Chung: Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples. ICASSP 2023: 1-5
- [c46] Arda Senocak, Hyeonggon Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung: Sound Source Localization is All about Cross-Modal Alignment. ICCV 2023: 7743-7753
- [c45] Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak: FlexiAST: Flexibility is What AST Needs. INTERSPEECH 2023: 2828-2832
- [c44] Hee-Soo Heo, Jee-weon Jung, Jingu Kang, Youngki Kwon, Bong-Jin Lee, You Jin Kim, Joon Son Chung: Curriculum Learning for Self-supervised Speaker Verification. INTERSPEECH 2023: 4693-4697
- [c43] Kihyun Nam, Youkyum Kim, Jaesung Huh, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung: Disentangled Representation Learning for Multilingual Speaker Recognition. INTERSPEECH 2023: 5316-5320
- [c42] Youngjoon Jang, Kyeongha Rho, Jong-Bin Woo, Hyeongkeun Lee, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Joon Son Chung: That's What I Said: Fully-Controllable Talking Face Generation. ACM Multimedia 2023: 3827-3836
- [i65] Jaesung Huh, Andrew Brown, Jee-weon Jung, Joon Son Chung, Arsha Nagrani, Daniel Garcia-Romero, Andrew Zisserman: VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge. CoRR abs/2302.10248 (2023)
- [i64] Jiyoung Lee, Joon Son Chung, Soo-Whan Chung: Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech. CoRR abs/2302.13700 (2023)
- [i63] Youngjoon Jang, Youngtaek Oh, Jae-Won Cho, Myungchul Kim, Dong-Jin Kim, In So Kweon, Joon Son Chung: Self-Sufficient Framework for Continuous Sign Language Recognition. CoRR abs/2303.11771 (2023)
- [i62] Hyeonggon Ryu, Arda Senocak, In So Kweon, Joon Son Chung: Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples. CoRR abs/2303.17517 (2023)
- [i61] Youngjoon Jang, Kyeongha Rho, Jong-Bin Woo, Hyeongkeun Lee, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Joon Son Chung: That's What I Said: Fully-Controllable Talking Face Generation. CoRR abs/2304.03275 (2023)
- [i60] Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak: FlexiAST: Flexibility is What AST Needs. CoRR abs/2307.09286 (2023)
- [i59] Ji-Hoon Kim, Jaehun Kim, Joon Son Chung: Let There Be Sound: Reconstructing High Quality Speech from Silent Videos. CoRR abs/2308.15256 (2023)
- [i58] Arda Senocak, Hyeonggon Ryu, Junsik Kim, Tae-Hyun Oh, Hanspeter Pfister, Joon Son Chung: Sound Source Localization is All about Cross-Modal Alignment. CoRR abs/2309.10724 (2023)
- [i57] Junseok Ahn, Youngjoon Jang, Joon Son Chung: SlowFast Network for Continuous Sign Language Recognition. CoRR abs/2309.12304 (2023)
- [i56] Chaeyoung Jung, Suyeon Lee, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung: TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning. CoRR abs/2309.12306 (2023)
- [i55] Yeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung: VoiceLDM: Text-to-Speech with Environmental Context. CoRR abs/2309.13664 (2023)
- [i54] Hee-Soo Heo, Kihyun Nam, Bong-Jin Lee, Youngki Kwon, Minjae Lee, You Jin Kim, Joon Son Chung: Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification. CoRR abs/2309.14741 (2023)
- [i53] Suyeon Lee, Chaeyoung Jung, Youngjoon Jang, Jaehun Kim, Joon Son Chung: Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model. CoRR abs/2310.19581 (2023)
- [i52] Sooyoung Park, Arda Senocak, Joon Son Chung: Can CLIP Help Sound Source Localization? CoRR abs/2311.04066 (2023)

2022
- [j6] Jingu Kang, Jaesung Huh, Hee Soo Heo, Joon Son Chung: Augmentation Adversarial Training for Self-Supervised Speaker Representation Learning. IEEE J. Sel. Top. Signal Process. 16(6): 1253-1262 (2022)
- [j5] Triantafyllos Afouras, Joon Son Chung, Andrew W. Senior, Oriol Vinyals, Andrew Zisserman: Deep Audio-Visual Speech Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 44(12): 8717-8727 (2022)
- [c41] Youngjoon Jang, Youngtaek Oh, Jae-Won Cho, Dong-Jin Kim, Joon Son Chung, In So Kweon: Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition. BMVC 2022: 322
- [c40] Jee-weon Jung, Hee-Soo Heo, Hemlata Tak, Hye-jin Shim, Joon Son Chung, Bong-Jin Lee, Ha-Jin Yu, Nicholas W. D. Evans: AASIST: Audio Anti-Spoofing Using Integrated Spectro-Temporal Graph Attention Networks. ICASSP 2022: 6367-6371
- [c39] Namkyu Jung, Geonmin Kim, Joon Son Chung: Spell My Name: Keyword Boosted Speech Recognition. ICASSP 2022: 6642-6646
- [c38] Youngki Kwon, Hee-Soo Heo, Jee-Weon Jung, You Jin Kim, Bong-Jin Lee, Joon Son Chung: Multi-Scale Speaker Embedding-Based Graph Attention Networks For Speaker Diarisation. ICASSP 2022: 8367-8371
- [c37] Jee-weon Jung, You Jin Kim, Hee-Soo Heo, Bong-Jin Lee, Youngki Kwon, Joon Son Chung: Pushing the limits of raw waveform speaker recognition. INTERSPEECH 2022: 2228-2232
- [c36] Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md. Sahidullah, Tomi Kinnunen, Nicholas W. D. Evans: Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion. Odyssey 2022: 330-337
- [i51] Andrew Brown, Jaesung Huh, Joon Son Chung, Arsha Nagrani, Andrew Zisserman: VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge. CoRR abs/2201.04583 (2022)
- [i50] Jee-weon Jung, You Jin Kim, Hee-Soo Heo, Bong-Jin Lee, Youngki Kwon, Joon Son Chung: Pushing the limits of raw waveform speaker recognition. CoRR abs/2203.08488 (2022)
- [i49] Hye-jin Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-Jin Yu, Bong-Jin Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md. Sahidullah, Tomi Kinnunen, Nicholas W. D. Evans: Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion. CoRR abs/2204.09976 (2022)
- [i48] Jee-weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesong Lee, Hye-jin Shim, Youngki Kwon, Joon Son Chung, Shinji Watanabe: Large-scale learning of generalised representations for speaker recognition. CoRR abs/2210.10985 (2022)
- [i47] Jee-weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung: In search of strong embedding extractors for speaker diarisation. CoRR abs/2210.14682 (2022)
- [i46] Kihyun Nam, Youkyum Kim, Hee Soo Heo, Jee-weon Jung, Joon Son Chung: Disentangled representation learning for multilingual speaker recognition. CoRR abs/2211.00437 (2022)
- [i45] Jaemin Jung, Youkyum Kim, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Youngjoon Jang, Joon Son Chung: Metric Learning for User-defined Keyword Spotting. CoRR abs/2211.00439 (2022)
- [i44] Youngjoon Jang, Youngtaek Oh, Jae-Won Cho, Dong-Jin Kim, Joon Son Chung, In So Kweon: Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition. CoRR abs/2211.00448 (2022)
- [i43] Sooyoung Park, Arda Senocak, Joon Son Chung: MarginNCE: Robust Sound Localization with a Negative Margin. CoRR abs/2211.01966 (2022)

2021
- [c35] Yoohwan Kwon, Hee-Soo Heo, Bong-Jin Lee, Joon Son Chung: The ins and outs of speaker recognition: lessons from VoxSRC 2020. ICASSP 2021: 5809-5813
- [c34] Jee-weon Jung, Hee-Soo Heo, Ha-Jin Yu, Joon Son Chung: Graph Attention Networks for Speaker Verification. ICASSP 2021: 6149-6153
- [c33] Andrew Brown, Jaesung Huh, Arsha Nagrani, Joon Son Chung, Andrew Zisserman: Playing a Part: Speaker Verification at the movies. ICASSP 2021: 6174-6178
- [c32] Jee-weon Jung, Hee-Soo Heo, Youngki Kwon, Joon Son Chung, Bong-Jin Lee: Three-Class Overlapped Speech Detection Using a Convolutional Recurrent Neural Network. Interspeech 2021: 3086-3090
- [c31] Youngki Kwon, Jee-weon Jung, Hee-Soo Heo, You Jin Kim, Bong-Jin Lee, Joon Son Chung: Adapting Speaker Embeddings for Speaker Diarisation. Interspeech 2021: 3101-3105
- [c30] You Jin Kim, Hee-Soo Heo, Soyeon Choe, Soo-Whan Chung, Yoohwan Kwon, Bong-Jin Lee, Youngki Kwon, Joon Son Chung: Look Who's Talking: Active Speaker Detection in the Wild. Interspeech 2021: 3675-3679
- [c29] Jaesung Huh, Minjae Lee, Heesoo Heo, Seongkyu Mun, Joon Son Chung: Metric Learning for Keyword Spotting. SLT 2021: 133-140
- [c28] Seong Min Kye, Joon Son Chung, Hoirin Kim: Supervised Attention for Speaker Recognition. SLT 2021: 286-293
- [c27] Seong Min Kye, Yoohwan Kwon, Joon Son Chung: Cross Attentive Pooling for Speaker Verification. SLT 2021: 294-300
- [c26] Youngki Kwon, Hee Soo Heo, Jaesung Huh, Bong-Jin Lee, Joon Son Chung: Look Who's Not Talking. SLT 2021: 567-573
- [i42] Jee-weon Jung, Hee-Soo Heo, Youngki Kwon, Joon Son Chung, Bong-Jin Lee: Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network. CoRR abs/2104.02878 (2021)
- [i41] Youngki Kwon, Jee-weon Jung, Hee-Soo Heo, You Jin Kim, Bong-Jin Lee, Joon Son Chung: Adapting Speaker Embeddings for Speaker Diarisation. CoRR abs/2104.02879 (2021)
- [i40] You Jin Kim, Hee-Soo Heo, Soyeon Choe, Soo-Whan Chung, Yoohwan Kwon, Bong-Jin Lee, Youngki Kwon, Joon Son Chung: Look Who's Talking: Active Speaker Detection in the Wild. CoRR abs/2108.07640 (2021)
- [i39] Jee-weon Jung, Hee-Soo Heo, Hemlata Tak, Hye-jin Shim, Joon Son Chung, Bong-Jin Lee, Ha-Jin Yu, Nicholas W. D. Evans: AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks. CoRR abs/2110.01200 (2021)
- [i38] Namkyu Jung, Geonmin Kim, Joon Son Chung: Spell my name: keyword boosted speech recognition. CoRR abs/2110.02791 (2021)
- [i37] Youngki Kwon, Hee-Soo Heo, Jee-weon Jung, You Jin Kim, Bong-Jin Lee, Joon Son Chung: Multi-scale speaker embedding-based graph attention networks for speaker diarisation. CoRR abs/2110.03361 (2021)
- [i36] You Jin Kim, Hee-Soo Heo, Jee-weon Jung, Youngki Kwon, Bong-Jin Lee, Joon Son Chung: Disentangled dimensionality reduction for noise-robust speaker diarisation. CoRR abs/2110.03380 (2021)

2020
- [j4] Arsha Nagrani, Joon Son Chung, Weidi Xie, Andrew Zisserman: Voxceleb: Large-scale speaker verification in the wild. Comput. Speech Lang. 60 (2020)
- [j3] Soo-Whan Chung, Joon Son Chung, Hong-Goo Kang: Perfect Match: Self-Supervised Embeddings for Cross-Modal Retrieval. IEEE J. Sel. Top. Signal Process. 14(3): 568-576 (2020)
- [c25] Samuel Albanie, Gül Varol, Liliane Momeni, Triantafyllos Afouras, Joon Son Chung, Neil Fox, Andrew Zisserman: BSL-1K: Scaling Up Co-articulated Sign Language Recognition Using Mouthing Cues. ECCV (11) 2020: 35-53
- [c24] Triantafyllos Afouras, Andrew Owens, Joon Son Chung, Andrew Zisserman: Self-supervised Learning of Audio-Visual Objects from Video. ECCV (18) 2020: 208-224
- [c23] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: ASR is All You Need: Cross-Modal Distillation for Lip Reading. ICASSP 2020: 2143-2147
- [c22] Arsha Nagrani, Joon Son Chung, Samuel Albanie, Andrew Zisserman: Disentangled Speech Embeddings Using Cross-Modal Self-Supervision. ICASSP 2020: 6829-6833
- [c21] Seongkyu Mun, Soyeon Choe, Jaesung Huh, Joon Son Chung: The Sound of My Voice: Speaker Representation Loss for Target Voice Separation. ICASSP 2020: 7289-7293
- [c20] Joon Son Chung, Jaesung Huh, Arsha Nagrani, Triantafyllos Afouras, Andrew Zisserman: Spot the Conversation: Speaker Diarisation in the Wild. INTERSPEECH 2020: 299-303
- [c19] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: Now You're Speaking My Language: Visual Language Identification. INTERSPEECH 2020: 2402-2406
- [c18] Joon Son Chung, Jaesung Huh, Seongkyu Mun, Minjae Lee, Hee-Soo Heo, Soyeon Choe, Chiheon Ham, Sunghwan Jung, Bong-Jin Lee, Icksang Han: In Defence of Metric Learning for Speaker Recognition. INTERSPEECH 2020: 2977-2981
- [c17] Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang: FaceFilter: Audio-Visual Speech Separation Using Still Images. INTERSPEECH 2020: 3481-3485
- [c16] Soo-Whan Chung, Hong-Goo Kang, Joon Son Chung: Seeing Voices and Hearing Voices: Learning Discriminative Embeddings Using Cross-Modal Self-Supervision. INTERSPEECH 2020: 3486-3490
- [c15] Joon Son Chung, Jaesung Huh, Seongkyu Mun: Delving into VoxCeleb: Environment Invariant Speaker Recognition. Odyssey 2020: 349-356
- [i35] Arsha Nagrani, Joon Son Chung, Samuel Albanie, Andrew Zisserman: Disentangled Speech Embeddings using Cross-modal Self-supervision. CoRR abs/2002.08742 (2020)
- [i34] Joon Son Chung, Jaesung Huh, Seongkyu Mun, Minjae Lee, Hee Soo Heo, Soyeon Choe, Chiheon Ham, Sunghwan Jung, Bong-Jin Lee, Icksang Han: In defence of metric learning for speaker recognition. CoRR abs/2003.11982 (2020)
- [i33] Soo-Whan Chung, Hong-Goo Kang, Joon Son Chung: Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision. CoRR abs/2004.14326 (2020)
- [i32] Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang: FaceFilter: Audio-visual speech separation using still images. CoRR abs/2005.07074 (2020)
- [i31] Jaesung Huh, Minjae Lee, Heesoo Heo, Seongkyu Mun, Joon Son Chung: Metric Learning for Keyword Spotting. CoRR abs/2005.08776 (2020)
- [i30] Joon Son Chung, Jaesung Huh, Arsha Nagrani, Triantafyllos Afouras, Andrew Zisserman: Spot the conversation: speaker diarisation in the wild. CoRR abs/2007.01216 (2020)
- [i29] Jaesung Huh, Hee Soo Heo, Jingu Kang, Shinji Watanabe, Joon Son Chung: Augmentation adversarial training for unsupervised speaker recognition. CoRR abs/2007.12085 (2020)
- [i28] Samuel Albanie, Gül Varol, Liliane Momeni, Triantafyllos Afouras, Joon Son Chung, Neil Fox, Andrew Zisserman: BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues. CoRR abs/2007.12131 (2020)
- [i27] Triantafyllos Afouras, Andrew Owens, Joon Son Chung, Andrew Zisserman: Self-Supervised Learning of Audio-Visual Objects from Video. CoRR abs/2008.04237 (2020)
- [i26] Seong Min Kye, Yoohwan Kwon, Joon Son Chung: Cross attentive pooling for speaker verification. CoRR abs/2008.05983 (2020)
- [i25] Hee Soo Heo, Bong-Jin Lee, Jaesung Huh, Joon Son Chung: Clova Baseline System for the VoxCeleb Speaker Recognition Challenge 2020. CoRR abs/2009.14153 (2020)
- [i24] Jee-weon Jung, Hee-Soo Heo, Ha-Jin Yu, Joon Son Chung: Graph Attention Networks for Speaker Verification. CoRR abs/2010.11543 (2020)
- [i23] Andrew Brown, Jaesung Huh, Arsha Nagrani, Joon Son Chung, Andrew Zisserman: Playing a Part: Speaker Verification at the Movies. CoRR abs/2010.15716 (2020)
- [i22] Yoohwan Kwon, Hee-Soo Heo, Bong-Jin Lee, Joon Son Chung: The ins and outs of speaker recognition: lessons from VoxSRC 2020. CoRR abs/2010.15809 (2020)
- [i21] Seong Min Kye, Joon Son Chung, Hoirin Kim: Supervised attention for speaker recognition. CoRR abs/2011.05189 (2020)
- [i20] Youngki Kwon, Hee Soo Heo, Jaesung Huh, Bong-Jin Lee, Joon Son Chung: Look who's not talking. CoRR abs/2011.14885 (2020)
- [i19] Arsha Nagrani, Joon Son Chung, Jaesung Huh, Andrew Brown, Ernesto Coto, Weidi Xie, Mitchell McLaren, Douglas A. Reynolds, Andrew Zisserman: VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge. CoRR abs/2012.06867 (2020)

2010 – 2019

2019
- [j2] Amir Jamaludin, Joon Son Chung, Andrew Zisserman: You Said That?: Synthesising Talking Faces from Audio. Int. J. Comput. Vis. 127(11-12): 1767-1779 (2019)
- [c14] Soo-Whan Chung, Joon Son Chung, Hong-Goo Kang: Perfect Match: Improved Cross-modal Embeddings for Audio-visual Synchronisation. ICASSP 2019: 3965-3969
- [c13] Weidi Xie, Arsha Nagrani, Joon Son Chung, Andrew Zisserman: Utterance-level Aggregation for Speaker Recognition in the Wild. ICASSP 2019: 5791-5795
- [c12] Joon Son Chung, Bong-Jin Lee, Icksang Han: Who Said That?: Audio-Visual Speaker Diarisation of Real-World Meetings. INTERSPEECH 2019: 371-375
- [c11] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: My Lips Are Concealed: Audio-Visual Speech Enhancement Through Obstructions. INTERSPEECH 2019: 4295-4299
- [i18] Weidi Xie, Arsha Nagrani, Joon Son Chung, Andrew Zisserman: Utterance-level Aggregation For Speaker Recognition In The Wild. CoRR abs/1902.10107 (2019)
- [i17] Joon Son Chung, Bong-Jin Lee, Icksang Han: Who said that?: Audio-visual speaker diarisation of real-world meetings. CoRR abs/1906.10042 (2019)
- [i16] Joon Son Chung: Naver at ActivityNet Challenge 2019 - Task B Active Speaker Detection (AVA). CoRR abs/1906.10555 (2019)
- [i15] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: My lips are concealed: Audio-visual speech enhancement through obstructions. CoRR abs/1907.04975 (2019)
- [i14] Joon Son Chung, Jaesung Huh, Seongkyu Mun: Delving into VoxCeleb: environment invariant speaker recognition. CoRR abs/1910.11238 (2019)
- [i13] Seongkyu Mun, Soyeon Choe, Jaesung Huh, Joon Son Chung: The sound of my voice: speaker representation loss for target voice separation. CoRR abs/1911.02411 (2019)
- [i12] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: ASR is all you need: cross-modal distillation for lip reading. CoRR abs/1911.12747 (2019)
- [i11] Joon Son Chung, Arsha Nagrani, Ernesto Coto, Weidi Xie, Mitchell McLaren, Douglas A. Reynolds, Andrew Zisserman: VoxSRC 2019: The first VoxCeleb Speaker Recognition Challenge. CoRR abs/1912.02522 (2019)

2018
- [j1] Joon Son Chung, Andrew Zisserman: Learning to lip read words by watching videos. Comput. Vis. Image Underst. 173: 76-85 (2018)
- [c10] Joon Son Chung, Arsha Nagrani, Andrew Zisserman: VoxCeleb2: Deep Speaker Recognition. INTERSPEECH 2018: 1086-1090
- [c9] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: The Conversation: Deep Audio-Visual Speech Enhancement. INTERSPEECH 2018: 3244-3248
- [c8] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: Deep Lip Reading: A Comparison of Models and an Online Application. INTERSPEECH 2018: 3514-3518
- [i10] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: The Conversation: Deep Audio-Visual Speech Enhancement. CoRR abs/1804.04121 (2018)
- [i9] Joon Son Chung, Arsha Nagrani, Andrew Zisserman: VoxCeleb2: Deep Speaker Recognition. CoRR abs/1806.05622 (2018)
- [i8] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: Deep Lip Reading: a comparison of models and an online application. CoRR abs/1806.06053 (2018)
- [i7] Triantafyllos Afouras, Joon Son Chung, Andrew Zisserman: LRS3-TED: a large-scale dataset for visual speech recognition. CoRR abs/1809.00496 (2018)
- [i6] Triantafyllos Afouras, Joon Son Chung, Andrew W. Senior, Oriol Vinyals, Andrew Zisserman: Deep Audio-Visual Speech Recognition. CoRR abs/1809.02108 (2018)
- [i5] Soo-Whan Chung, Joon Son Chung, Hong-Goo Kang: Perfect match: Improved cross-modal embeddings for audio-visual synchronisation. CoRR abs/1809.08001 (2018)

2017
- [b1] Joon Son Chung: Visual recognition of human communication. University of Oxford, UK, 2017
- [c7] Joon Son Chung, Amir Jamaludin, Andrew Zisserman: You said that? BMVC 2017
- [c6] Joon Son Chung, Andrew Zisserman: Lip Reading in Profile. BMVC 2017
- [c5] Joon Son Chung, Andrew W. Senior, Oriol Vinyals, Andrew Zisserman: Lip Reading Sentences in the Wild. CVPR 2017: 3444-3453
- [c4] Arsha Nagrani, Joon Son Chung, Andrew Zisserman: VoxCeleb: A Large-Scale Speaker Identification Dataset. INTERSPEECH 2017: 2616-2620
- [i4] Joon Son Chung, Amir Jamaludin, Andrew Zisserman: You said that? CoRR abs/1705.02966 (2017)
- [i3] Arsha Nagrani, Joon Son Chung, Andrew Zisserman: VoxCeleb: a large-scale speaker identification dataset. CoRR abs/1706.08612 (2017)

2016
- [c3] Joon Son Chung, Andrew Zisserman: Lip Reading in the Wild. ACCV (2) 2016: 87-103
- [c2] Joon Son Chung, Andrew Zisserman: Out of Time: Automated Lip Sync in the Wild. ACCV Workshops (2) 2016: 251-263
- [i2] Joon Son Chung, Andrew Zisserman: Signs in time: Encoding human motion as a temporal image. CoRR abs/1608.02059 (2016)
- [i1] Joon Son Chung, Andrew W. Senior, Oriol Vinyals, Andrew Zisserman: Lip Reading Sentences in the Wild. CoRR abs/1611.05358 (2016)

2014
- [c1] Joon Son Chung, Relja Arandjelovic, Giles Bergel, Alexandra Franklin, Andrew Zisserman: Re-presentations of Art Collections. ECCV Workshops (1) 2014: 85-100