Guangzhi Sun

2020 – today

2025
[j5]
Guangzhi Sun, Chao Zhang, Ivan Vulic, Pawel Budzianowski, Philip C. Woodland:
Knowledge-aware audio-grounded generative slot filling for limited annotated data. Comput. Speech Lang. 89: 101707 (2025)
2024
[j4]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Graph Neural Networks for Contextual ASR With the Tree-Constrained Pointer Generator. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2407-2417 (2024)
[j3]
Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun:
Cross-Utterance Conditioned VAE for Speech Generation. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4263-4276 (2024)
[c22]
Guangzhi Sun, Shutong Feng, Dongcheng Jiang, Chao Zhang, Milica Gasic, Philip C. Woodland:
Speech-based Slot Filling using Large Language Models. ACL (Findings) 2024: 6351-6362
[c21]
Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang:
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset. ACL (1) 2024: 9041-9060
[c20]
Guangzhi Sun, Xiao Zhan, Jose Such:
Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents. CUI 2024: 35
[c19]
Nineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland:
Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation. ICASSP 2024: 10986-10990
[c18]
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Extending Large Language Models for Speech and Audio Captioning. ICASSP 2024: 11236-11240
[c17]
Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
Enhancing Quantised End-to-End ASR Models Via Personalisation. ICASSP 2024: 12426-12430
[c16]
Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Connecting Speech Encoder and Large Language Model for ASR. ICASSP 2024: 12637-12641
[c15]
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
SALMONN: Towards Generic Hearing Abilities for Large Language Models. ICLR 2024
[c14]
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models. ICML 2024
[c13]
Shutong Feng, Guangzhi Sun, Nurul Lubis, Wen Wu, Chao Zhang, Milica Gasic:
Affect Recognition in Conversations Using Large Language Models. SIGDIAL 2024: 259-273
[i41]
Nineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland:
Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation. CoRR abs/2402.11747 (2024)
[i40]
Xiaoliang Luo, Akilles Rechardt, Guangzhi Sun, Kevin K. Nejad, Felipe Yáñez, Bati Yilmaz, Kangjoo Lee, Alexandra O. Cohen, Valentina Borghesani, Anton Pashkov, Daniele Marinazzo, Jonathan Nicholas, Alessandro Salatiello, Ilia Sucholutsky, Pasquale Minervini, Sepehr Razavi, Roberta Rocca, Elkhan Yusifov, Tereza Okalova, Nianlong Gu, Martin Ferianc, Mikail Khona, Kaustubh R. Patil, Pui-Shee Lee, Rui Mata, Nicholas E. Myers, Jennifer K. Bizley, Sebastian Musslick, Isil Poyraz Bilgin, Guiomar Niso, Justin M. Ales, Michael Gaebler, N. Apurva Ratan Murty, Leyla Loued-Khenissi, Anna Behler, Chloe M. Hall, Jessica Dafflon, Sherry Dongqi Bao, Bradley C. Love:
Large language models surpass human experts in predicting neuroscience results. CoRR abs/2403.03230 (2024)
[i39]
Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang:
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset. CoRR abs/2403.14168 (2024)
[i38]
Xiaoliang Luo, Guangzhi Sun, Bradley C. Love:
Matching domain experts by training from scratch on domain knowledge. CoRR abs/2405.09395 (2024)
[i37]
Guangzhi Sun, Potsawee Manakul, Adian Liusie, Kunat Pipatanakul, Chao Zhang, Philip C. Woodland, Mark J. F. Gales:
CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models. CoRR abs/2405.13684 (2024)
[i36]
Keqi Deng, Guangzhi Sun, Philip C. Woodland:
Wav2Prompt: End-to-End Speech Prompt Generation and Tuning For LLM in Zero and Few-shot Learning. CoRR abs/2406.00522 (2024)
[i35]
Ziyun Cui, Ziyang Zhang, Wen Wu, Guangzhi Sun, Chao Zhang:
Bayesian WeakS-to-Strong from Text Classification to Generation. CoRR abs/2406.03199 (2024)
[i34]
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Jun Zhang, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
Can Large Language Models Understand Spatial Audio? CoRR abs/2406.07914 (2024)
[i33]
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Yuxuan Wang, Chao Zhang:
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models. CoRR abs/2406.15704 (2024)
[i32]
Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
SAML: Speaker Adaptive Mixture of LoRA Experts for End-to-End ASR. CoRR abs/2406.19706 (2024)
[i31]
Guangzhi Sun, Xiao Zhan, Jose Such:
Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents. CoRR abs/2407.11977 (2024)
[i30]
Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
Speaker Adaptation for Quantised End-to-End ASR Models. CoRR abs/2408.03979 (2024)
[i29]
Yiyang Zhao, Shuai Wang, Guangzhi Sun, Zehua Chen, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models. CoRR abs/2408.15585 (2024)
[i28]
Yudong Yang, Zhan Liu, Wenyi Yu, Guangzhi Sun, Qiuqiang Kong, Chao Zhang:
Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement. CoRR abs/2409.09642 (2024)
[i27]
Potsawee Manakul, Guangzhi Sun, Warit Sirichotedumrong, Kasima Tharnpipitchai, Kunat Pipatanakul:
Enhancing Low-Resource Language and Instruction Following Capabilities of Audio Language Models. CoRR abs/2409.10999 (2024)
[i26]
Siyin Wang, Wenyi Yu, Yudong Yang, Changli Tang, Yixuan Li, Jimin Zhuang, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Guangzhi Sun, Lu Lu, Chao Zhang:
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation. CoRR abs/2409.16644 (2024)
2023
[j2]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Minimising Biasing Word Errors for Contextual ASR With the Tree-Constrained Pointer Generator. IEEE ACM Trans. Audio Speech Lang. Process. 31: 345-354 (2023)
[c12]
Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao:
TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for Pytorch. ASRU 2023: 1-9
[c11]
Evonne P. C. Lee, Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Spectral Clustering-Aware Learning of Embeddings for Speaker Diarisation. ICASSP 2023: 1-5
[c10]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
End-to-End Spoken Language Understanding with Tree-Constrained Pointer Generator. ICASSP 2023: 1-5
[c9]
Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland:
Can Contextual Biasing Remain Effective with Whisper and GPT-2? INTERSPEECH 2023: 1289-1293
[i25]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator. CoRR abs/2305.18824 (2023)
[i24]
Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland:
Can Contextual Biasing Remain Effective with Whisper and GPT-2? CoRR abs/2306.01942 (2023)
[i23]
Guangzhi Sun, Chao Zhang, Ivan Vulic, Pawel Budzianowski, Philip C. Woodland:
Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data. CoRR abs/2307.01764 (2023)
[i22]
Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun:
Cross-Utterance Conditioned VAE for Speech Generation. CoRR abs/2309.04156 (2023)
[i21]
Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
Enhancing Quantised End-to-End ASR Models via Personalisation. CoRR abs/2309.09136 (2023)
[i20]
Shutong Feng, Guangzhi Sun, Nurul Lubis, Chao Zhang, Milica Gasic:
Affect Recognition in Conversations Using Large Language Models. CoRR abs/2309.12881 (2023)
[i19]
Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Connecting Speech Encoder and Large Language Model for ASR. CoRR abs/2309.13963 (2023)
[i18]
Theodor Nguyen, Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland:
Conditional Diffusion Model for Target Speaker Extraction. CoRR abs/2310.04791 (2023)
[i17]
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models. CoRR abs/2310.05863 (2023)
[i16]
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
SALMONN: Towards Generic Hearing Abilities for Large Language Models. CoRR abs/2310.13289 (2023)
[i15]
Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis:
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch. CoRR abs/2310.17864 (2023)
[i14]
Guangzhi Sun, Shutong Feng, Dongcheng Jiang, Chao Zhang, Milica Gasic, Philip C. Woodland:
Speech-based Slot Filling using Large Language Models. CoRR abs/2311.07418 (2023)
2022
[c8]
Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang:
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech. ACL (1) 2022: 391-400
[c7]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition. INTERSPEECH 2022: 2043-2047
[i13]
Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang:
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech. CoRR abs/2205.04120 (2022)
[i12]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator. CoRR abs/2205.09058 (2022)
[i11]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition. CoRR abs/2207.00857 (2022)
[i10]
Evonne P. C. Lee, Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Spectral Clustering-aware Learning of Embeddings for Speaker Diarisation. CoRR abs/2210.13576 (2022)
[i9]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
End-to-end Spoken Language Understanding with Tree-constrained Pointer Generator. CoRR abs/2210.16554 (2022)
2021
[j1]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Combination of deep speaker embeddings for diarisation. Neural Networks 141: 372-384 (2021)
[c6]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-Constrained Pointer Generator for End-to-End Contextual Speech Recognition. ASRU 2021: 780-787
[c5]
Guangzhi Sun, D. Liu, Chao Zhang, Philip C. Woodland:
Content-Aware Speaker Embeddings for Speaker Diarisation. ICASSP 2021: 7168-7172
[c4]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Transformer Language Models with LSTM-Based Cross-Utterance Information Representation. ICASSP 2021: 7363-7367
[i8]
Guangzhi Sun, D. Liu, Chao Zhang, Philip C. Woodland:
Content-Aware Speaker Embeddings for Speaker Diarisation. CoRR abs/2102.06467 (2021)
[i7]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Transformer Language Models with LSTM-based Cross-utterance Information Representation. CoRR abs/2102.06474 (2021)
[i6]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Tree-constrained Pointer Generator for End-to-end Contextual Speech Recognition. CoRR abs/2109.00627 (2021)
2020
[c3]
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis. ICASSP 2020: 6264-6268
[c2]
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703
[i5]
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu:
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis. CoRR abs/2002.03785 (2020)
[i4]
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020)
[i3]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Cross-Utterance Language Models with Acoustic Error Sampling. CoRR abs/2009.01008 (2020)
[i2]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Combination of Deep Speaker Embeddings for Diarisation. CoRR abs/2010.12025 (2020)

2010 – 2019

2019
[c1]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Speaker Diarisation Using 2D Self-attentive Combination of Embeddings. ICASSP 2019: 5801-5805
[i1]
Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Speaker diarisation using 2D self-attentive combination of embeddings. CoRR abs/1902.03190 (2019)
