Google Scholar

User profiles for "Wayne Xiong"

Wayne Xiong

Microsoft

Verified email at microsoft.com

Cited by 2918

[PDF] arxiv.org

Achieving human parity in conversational speech recognition

W Xiong, J Droppo, X Huang, F Seide, M Seltzer… - arXiv preprint arXiv …, 2016 - arxiv.org

Conversational speech recognition has served as a flagship speech recognition task since
the release of the Switchboard corpus in the 1990s. In this paper, we measure the human …

Cited by 737

[PDF] arxiv.org

The Microsoft 2017 conversational speech recognition system

W Xiong, L Wu, F Alleva, J Droppo… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

We describe the latest version of Microsoft's conversational speech recognition system for
the Switchboard and CallHome domains. The system adds a CNN-BLSTM acoustic model to …

Cited by 996

Toward human parity in conversational speech recognition

W Xiong, J Droppo, X Huang, F Seide… - … on Audio, Speech …, 2017 - ieeexplore.ieee.org

Conversational speech recognition has served as a flagship speech recognition task since
the release of the Switchboard corpus in the 1990s. In this paper, we measure a human error …

Cited by 265

[PDF] arxiv.org

PyramidKV: Dynamic KV cache compression based on pyramidal information funneling

…, B Gao, Y Liu, Y Li, T Liu, K Lu, W Xiong… - arXiv preprint arXiv …, 2024 - arxiv.org

In this study, we investigate whether attention-based information flow inside large language
models (LLMs) is aggregated through noticeable patterns for long context processing. Our …

Cited by 85

[PDF] isca-archive.org

Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention.

D Yu, W Xiong, J Droppo, A Stolcke, G Ye, J Li… - Interspeech, 2016 - isca-archive.org

In this paper, we propose a deep convolutional neural network (CNN) with layer-wise context
expansion and location-based attention, for large vocabulary speech recognition. In our …

Cited by 120

[PDF] arxiv.org

Z-Code++: A pre-trained language model optimized for abstractive summarization

…, R Xu, HH Awadalla, Y Shi, C Zhu, W Xiong… - arXiv preprint arXiv …, 2022 - arxiv.org

This paper presents Z-Code++, a new pre-trained language model optimized for abstractive
text summarization. The model extends the state of the art encoder-decoder model using …

Cited by 63

[PDF] arxiv.org

Advances in online audio-visual meeting transcription

…, A Vinnikov, L Wu, X Xiao, W Xiong… - 2019 IEEE Automatic …, 2019 - ieeexplore.ieee.org

This paper describes a system that generates speaker-annotated transcripts of meetings by
using a microphone array and a 360-degree camera. The hallmark of the system is its ability …

Cited by 97

[PDF] arxiv.org

Progressive joint modeling in unsupervised single-channel overlapped speech recognition

Z Chen, J Droppo, J Li, W Xiong - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org

Unsupervised single-channel overlapped speech recognition is one of the hardest problems
in automatic speech recognition (ASR). Permutation invariant training (PIT) is a state of the …

Cited by 89

PyramidKV: Dynamic KV cache compression based on pyramidal information funneling

…, B Gao, Y Liu, Y Li, T Liu, K Lu, W Xiong… - arXiv e …, 2024 - ui.adsabs.harvard.edu

In this study, we investigate whether attention-based information flow inside large language
models (LLMs) is aggregated through noticeable patterns for long context processing. Our …

Cited by 33

[PDF] arxiv.org

Momentum calibration for text generation

…, Y Liu, X Wang, P He, Y Yu, SQ Chen, W Xiong… - arXiv preprint arXiv …, 2022 - arxiv.org

The input and output of most text generation tasks can be transformed to two sequences of
tokens and they can be modeled using sequence-to-sequence learning modeling tools such …

Cited by 31
