Brian Kingsbury
2020 – today
- 2024
- [c129]A F. M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen:
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization. ICASSP 2024: 10931-10935 - [c128]Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR with Label Context. ICASSP 2024: 11681-11685 - [i50]A F. M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen:
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization. CoRR abs/2401.06980 (2024) - [i49]Ankit Gupta, George Saon, Brian Kingsbury:
Exploring the limits of decoder-only models trained on public speech recognition corpora. CoRR abs/2402.00235 (2024)
- 2023
- [c127]Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. ICASSP 2023: 1-5 - [c126]Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, Eric Fosler-Lussier:
Fine-Grained Textual Knowledge Transfer to Improve RNN Transducers for Speech Recognition and Understanding. ICASSP 2023: 1-5 - [c125]Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Brian Kingsbury:
Multi-Speaker Data Augmentation for Improved end-to-end Automatic Speech Recognition. ICASSP 2023: 1-5 - [c124]Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury:
ConvKT: Conversation-Level Knowledge Transfer for Context Aware End-to-End Spoken Language Understanding. INTERSPEECH 2023: 1129-1133 - [c123]Xiaodong Cui, George Saon, Brian Kingsbury:
Improving RNN Transducer Acoustic Models for English Conversational Speech Recognition. INTERSPEECH 2023: 1299-1303 - [c122]Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. INTERSPEECH 2023: 2268-2272 - [c121]Kristjan H. Greenewald, Brian Kingsbury, Yuancheng Yu:
High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction. ISIT 2023: 2613-2618 - [i48]Kristjan H. Greenewald, Brian Kingsbury, Yuancheng Yu:
High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction. CoRR abs/2305.04712 (2023) - [i47]Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. CoRR abs/2305.12606 (2023) - [i46]Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR With Label Context. CoRR abs/2309.10926 (2023) - [i45]Xiaodong Cui, Ashish R. Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury:
Soft Random Sampling: A Theoretical and Empirical Analysis. CoRR abs/2311.12727 (2023)
- 2022
- [c120]Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CVPR 2022: 19988-19997 - [c119]Songtao Lu, Xiaodong Cui, Mark S. Squillante, Brian Kingsbury, Lior Horesh:
Decentralized Bilevel Optimization for Personalized Client Learning. ICASSP 2022: 5543-5547 - [c118]Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-end Models for Set Prediction in Spoken Language Understanding. ICASSP 2022: 7162-7166 - [c117]Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier:
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding. ICASSP 2022: 7497-7501 - [c116]Zvi Kons, Aharon Satt, Hong-Kwang Kuo, Samuel Thomas, Boaz Carmeli, Ron Hoory, Brian Kingsbury:
A New Data Augmentation Method for Intent Classification Enhancement and its Application on Spoken Conversation Datasets. ICASSP 2022: 7632-7636 - [c115]Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems. ICASSP 2022: 7932-7936 - [c114]Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs for Training and Adapting RNN Transducer ASR Models. ICASSP 2022: 8127-8131 - [c113]Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. INTERSPEECH 2022: 1656-1660 - [c112]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. INTERSPEECH 2022: 2038-2042 - [c111]Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata:
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. INTERSPEECH 2022: 2638-2642 - [c110]Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang Kuo, Brian Kingsbury:
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems. INTERSPEECH 2022: 2683-2687 - [c109]Takashi Fukuda, Samuel Thomas, Masayuki Suzuki, Gakuto Kurata, George Saon, Brian Kingsbury:
Global RNN Transducer Models For Multi-dialect Speech Recognition. INTERSPEECH 2022: 3138-3142 - [c108]Songtao Lu, Siliang Zeng, Xiaodong Cui, Mark S. Squillante, Lior Horesh, Brian Kingsbury, Jia Liu, Mingyi Hong:
A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization. NeurIPS 2022 - [i44]Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-End Models for Set Prediction in Spoken Language Understanding. CoRR abs/2201.12105 (2022) - [i43]Zvi Kons, Aharon Satt, Hong-Kwang Kuo, Samuel Thomas, Boaz Carmeli, Ron Hoory, Brian Kingsbury:
A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets. CoRR abs/2202.10137 (2022) - [i42]Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models. CoRR abs/2202.13155 (2022) - [i41]Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems. CoRR abs/2203.00006 (2022) - [i40]Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata:
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. CoRR abs/2203.15176 (2022) - [i39]Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier:
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding. CoRR abs/2204.05169 (2022) - [i38]Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury:
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems. CoRR abs/2204.05188 (2022) - [i37]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. CoRR abs/2206.07882 (2022) - [i36]Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. CoRR abs/2208.01818 (2022) - [i35]Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. CoRR abs/2210.03625 (2022)
- 2021
- [j16]Xiaodong Cui, Wei Zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung:
Asynchronous Decentralized Distributed Training of Acoustic Models. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3565-3576 (2021) - [c107]George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. ICASSP 2021: 5654-5658 - [c106]Xiaodong Cui, Songtao Lu, Brian Kingsbury:
Federated Acoustic Modeling for Automatic Speech Recognition. ICASSP 2021: 6748-6752 - [c105]Edmilson da Silva Morais, Hong-Kwang Jeff Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury:
End-to-End Spoken Language Understanding Using Transformer Networks and Self-Supervised Pre-Trained Features. ICASSP 2021: 7483-7487 - [c104]Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models for Spoken Language Understanding. ICASSP 2021: 7493-7497 - [c103]Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. ICCV 2021: 7992-8001 - [c102]Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. Interspeech 2021: 1254-1258 - [c101]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogério Schmidt Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021: 1584-1588 - [c100]Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. Interspeech 2021: 1802-1806 - [c99]Gakuto Kurata, George Saon, Brian Kingsbury, David Haws, Zoltán Tüske:
Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio. Interspeech 2021: 2027-2031 - [c98]Zoltán Tüske, George Saon, Brian Kingsbury:
On the Limit of English Conversational Speech Recognition. Interspeech 2021: 2062-2066 - [c97]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-Bit Quantization of LSTM-Based Speech Recognition Models. Interspeech 2021: 2586-2590 - [c96]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021: 3006-3010 - [i34]Xiaodong Cui, Songtao Lu, Brian Kingsbury:
Federated Acoustic Modeling For Automatic Speech Recognition. CoRR abs/2102.04429 (2021) - [i33]George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. CoRR abs/2103.09935 (2021) - [i32]Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models For Spoken Language Understanding. CoRR abs/2104.03842 (2021) - [i31]Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Schmidt Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. CoRR abs/2104.12671 (2021) - [i30]Zoltán Tüske, George Saon, Brian Kingsbury:
On the limit of English conversational speech recognition. CoRR abs/2105.00982 (2021) - [i29]Ashish R. Mittal, Samarth Bharadwaj, Shreya Khare, Saneem A. Chemmengath, Karthik Sankaranarayanan, Brian Kingsbury:
Representation based meta-learning for few-shot spoken intent recognition. CoRR abs/2106.15238 (2021) - [i28]Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. CoRR abs/2108.08405 (2021) - [i27]Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. CoRR abs/2108.10803 (2021) - [i26]Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-bit Quantization of LSTM-based Speech Recognition Models. CoRR abs/2108.12074 (2021) - [i25]Xiaodong Cui, Wei Zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung:
Asynchronous Decentralized Distributed Training of Acoustic Models. CoRR abs/2110.11199 (2021) - [i24]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. CoRR abs/2111.04823 (2021) - [i23]Wei Zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu:
Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent. CoRR abs/2112.01433 (2021) - [i22]Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CoRR abs/2112.04446 (2021)
- 2020
- [c95]Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David S. Kung, Michael Picheny:
Improving Efficiency in Large-Scale Decentralized Distributed Training. ICASSP 2020: 3022-3026 - [c94]Guojing Cong, Brian Kingsbury, Chih-Chieh Yang, Tianyi Liu:
Fast Training of Deep Neural Networks for Speech Recognition. ICASSP 2020: 6884-6888 - [c93]Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny:
Leveraging Unpaired Text Data for Training End-To-End Speech-to-Intent Systems. ICASSP 2020: 7984-7988 - [c92]Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single Headed Attention Based Sequence-to-Sequence Model for State-of-the-Art Results on Switchboard. INTERSPEECH 2020: 551-555 - [c91]Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras:
End-to-End Spoken Language Understanding Without Full Transcripts. INTERSPEECH 2020: 906-910 - [c90]Ashish R. Mittal, Samarth Bharadwaj, Shreya Khare, Saneem A. Chemmengath, Karthik Sankaranarayanan, Brian Kingsbury:
Representation Based Meta-Learning for Few-Shot Spoken Intent Recognition. INTERSPEECH 2020: 4283-4287 - [c89]Samuel Thomas, Kartik Audhkhasi, Brian Kingsbury:
Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings. INTERSPEECH 2020: 4736-4740 - [i21]Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300. CoRR abs/2001.07263 (2020) - [i20]Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David S. Kung, Michael Picheny:
Improving Efficiency in Large-Scale Decentralized Distributed Training. CoRR abs/2002.01119 (2020) - [i19]Andrew Rouditchenko, Angie W. Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogério Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. CoRR abs/2006.09199 (2020) - [i18]Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras:
End-to-End Spoken Language Understanding Without Full Transcripts. CoRR abs/2009.14386 (2020) - [i17]Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny:
Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems. CoRR abs/2010.04284 (2020) - [i16]Edmilson da Silva Morais, Hong-Kwang Jeff Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury:
End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features. CoRR abs/2011.08238 (2020)
2010 – 2019
- 2019
- [j15]Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha:
Kernel Approximation Methods for Speech Recognition. J. Mach. Learn. Res. 20: 59:1-59:36 (2019) - [c88]George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury, Michael Picheny, Samuel Thomas:
Simplified LSTMS for Speech Recognition. ASRU 2019: 547-553 - [c87]Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung, Michael Picheny:
Distributed Deep Learning Strategies for Automatic Speech Recognition. ICASSP 2019: 5706-5710 - [c86]George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury:
Sequence Noise Injected Training for End-to-end Speech Recognition. ICASSP 2019: 6261-6265 - [c85]Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. ICASSP 2019: 6455-6459 - [c84]Anna Choromanska, Benjamin Cowen, Sadhana Kumaravel, Ronny Luss, Mattia Rigotti, Irina Rish, Paolo Diachille, Viatcheslav Gurev, Brian Kingsbury, Ravi Tejwani, Djallel Bouneffouf:
Beyond Backprop: Online Alternating Minimization with Auxiliary Variables. ICML 2019: 1193-1202 - [c83]Ziv Goldfeld, Ewout van den Berg, Kristjan H. Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy:
Estimating Information Flow in Deep Neural Networks. ICML 2019: 2299-2308 - [c82]Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. INTERSPEECH 2019: 326-330 - [c81]Kartik Audhkhasi, George Saon, Zoltán Tüske, Brian Kingsbury, Michael Picheny:
Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition. INTERSPEECH 2019: 2618-2622 - [c80]Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny:
A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition. INTERSPEECH 2019: 2628-2632 - [i15]Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung, Michael Picheny:
Distributed Deep Learning Strategies For Automatic Speech Recognition. CoRR abs/1904.04956 (2019) - [i14]Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. CoRR abs/1904.13258 (2019) - [i13]Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny:
A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition. CoRR abs/1907.05701 (2019) - [i12]Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. CoRR abs/1908.03455 (2019)
- 2018
- [c79]Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition. ICASSP 2018: 4759-4763 - [i11]Anna Choromanska, Sadhana Kumaravel, Ronny Luss, Irina Rish, Brian Kingsbury, Ravi Tejwani, Djallel Bouneffouf:
Beyond Backprop: Alternating Minimization with co-Activation Memory. CoRR abs/1806.09077 (2018) - [i10]Ziv Goldfeld, Ewout van den Berg, Kristjan H. Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy:
Estimating Information Flow in Neural Networks. CoRR abs/1810.05728 (2018) - [i9]Vidya Muthukumar, Tejaswini Pedapati, Nalini K. Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R. Varshney:
Understanding Unequal Gender Classification Accuracy from Face Images. CoRR abs/1812.00099 (2018)
- 2017
- [j14]Bhuvana Ramabhadran, Nancy F. Chen, Mary P. Harper, Brian Kingsbury, Kate M. Knill:
Introduction to the Special Issue on End-to-End Speech and Language Processing. IEEE J. Sel. Top. Signal Process. 11(8): 1237-1239 (2017) - [j13]Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-Free Keyword Search From Speech. IEEE J. Sel. Top. Signal Process. 11(8): 1351-1359 (2017) - [j12]I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra V. Chaudhari, Brian Kingsbury:
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. IEEE Trans. Parallel Distributed Syst. 28(6): 1703-1714 (2017) - [c78]Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Tom Sercu, Kartik Audhkhasi, Abhinav Sethy, Markus Nußbaum-Thom, Andrew Rosenberg:
Knowledge distillation across ensembles of multilingual models for low-resource languages. ICASSP 2017: 4825-4829 - [c77]Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-end ASR-free keyword search from speech. ICASSP 2017: 4840-4844 - [c76]Tom Sercu, George Saon, Jia Cui, Xiaodong Cui, Bhuvana Ramabhadran, Brian Kingsbury, Abhinav Sethy:
Network architectures for multilingual speech representation learning. ICASSP 2017: 5295-5299 - [c75]Guojing Cong, Brian Kingsbury, Soumyadip Gosh, George Saon, Fan Zhou:
Accelerating deep neural network learning for speech recognition on a cluster of GPUs. MLHPC@SC 2017: 3:1-3:8 - [i8]Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel J. Hsu, Brian Kingsbury, Michael Picheny, Fei Sha:
Kernel Approximation Methods for Speech Recognition. CoRR abs/1701.03577 (2017) - [i7]Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-free Keyword Search from Speech. CoRR abs/1701.04313 (2017) - [i6]Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building competitive direct acoustics-to-word models for English conversational speech recognition. CoRR abs/1712.03133 (2017)
- 2016
- [c74]Avner May, Michael Collins, Daniel J. Hsu, Brian Kingsbury:
Compact kernel models for acoustic modeling via random feature selection. ICASSP 2016: 2424-2428 - [c73]Jie Chen, Lingfei Wu, Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran:
Efficient one-vs-one kernel ridge regression for speech recognition. ICASSP 2016: 2454-2458 - [c72]Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann LeCun:
Very deep multilingual convolutional neural networks for LVCSR. ICASSP 2016: 4955-4959 - [c71]Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha:
A comparison between deep neural nets and kernel acoustic models for speech recognition. ICASSP 2016: 5070-5074 - [c70]Gakuto Kurata, Brian Kingsbury:
Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling. INTERSPEECH 2016: 27-31 - [c69]Samuel Thomas, Kartik Audhkhasi, Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran:
Multilingual Data Selection for Low Resource Speech Recognition. INTERSPEECH 2016: 3853-3857 - [i5]Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha:
A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition. CoRR abs/1603.05800 (2016)
- 2015
- [j11]Tara N. Sainath, Brian Kingsbury, George Saon, Hagen Soltau, Abdel-rahman Mohamed, George E. Dahl, Bhuvana Ramabhadran:
Deep Convolutional Neural Networks for Large-scale Speech Tasks. Neural Networks 64: 39-48 (2015) - [j10]Xiaodong Cui, Vaibhava Goel, Brian Kingsbury:
Data Augmentation for Deep Neural Network Acoustic Modeling. IEEE ACM Trans. Audio Speech Lang. Process. 23(9): 1469-1477 (2015) - [c68]Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, Abhinav Sethy, Kartik Audhkhasi, Xiaodong Cui, Ellen Kislal, Lidia Mangu, Markus Nußbaum-Thom, Michael Picheny, Zoltán Tüske, Pavel Golik, Ralf Schlüter, Hermann Ney, Mark J. F. Gales, Kate M. Knill, Anton Ragni, Haipeng Wang, Philip C. Woodland:
Multilingual representations for low resource speech recognition and keyword search. ASRU 2015: 259-266 - [c67]Xiaodong Cui, Vaibhava Goel, Brian Kingsbury:
Data augmentation for deep convolutional neural network acoustic modeling. ICASSP 2015: 4545-4549 - [c66]Lidia Mangu, George Saon, Michael Picheny, Brian Kingsbury:
Order-free spoken term detection. ICASSP 2015: 5331-5335 - [c65]Jia Cui, George Saon, Bhuvana Ramabhadran, Brian Kingsbury:
A multi-region deep neural network model in speech recognition. INTERSPEECH 2015: 3244-3248 - [i4]Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann LeCun:
Very Deep Multilingual Convolutional Neural Networks for LVCSR. CoRR abs/1509.08967 (2015)
- 2014
- [c64]Xiaodong Cui, Vaibhava Goel, Brian Kingsbury:
Data Augmentation for deep neural network acoustic modeling. ICASSP 2014: 5582-5586 - [c63]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George Saon, Bhuvana Ramabhadran:
Improvements to filterbank and delta learning within a deep neural network framework. ICASSP 2014: 6839-6843 - [c62]Jia Cui, Jonathan Mamou, Brian Kingsbury, Bhuvana Ramabhadran:
Automatic keyword selection for keyword search development and tuning. ICASSP 2014: 7839-7843 - [c61]Lidia Mangu, Brian Kingsbury, Hagen Soltau, Hong-Kwang Kuo, Michael Picheny:
Efficient spoken term detection using confusion networks. ICASSP 2014: 7844-7848 - [c60]Jia Cui, Bhuvana Ramabhadran, Xiaodong Cui, Andrew Rosenberg, Brian Kingsbury, Abhinav Sethy:
Recent improvements in neural network acoustic modeling for LVCSR in low resource languages. INTERSPEECH 2014: 840-844 - [c59]Tara N. Sainath, Vijayaditya Peddinti, Brian Kingsbury, Petr Fousek, Bhuvana Ramabhadran, David Nahamoo:
Deep scattering spectra with deep neural networks for LVCSR tasks. INTERSPEECH 2014: 900-904 - [c58]Tara N. Sainath, I-Hsin Chung, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Brian Kingsbury, George Saon, Vernon Austel, Upendra V. Chaudhari:
Parallel deep neural network training for LVCSR tasks using blue gene/Q. INTERSPEECH 2014: 1048-1052 - [c57]Xiaodong Cui, Brian Kingsbury, Jia Cui, Bhuvana Ramabhadran, Andrew Rosenberg, Mohammad Sadegh Rasooli, Owen Rambow, Nizar Habash, Vaibhava Goel:
Improving deep neural network acoustic modeling for audio corpus indexing under the IARPA babel program. INTERSPEECH 2014: 2103-2107 - [c56]I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra V. Chaudhari, Brian Kingsbury:
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. SC 2014: 745-753 - [p1]Hagen Soltau, George Saon, Lidia Mangu, Hong-Kwang Kuo, Brian Kingsbury, Stephen M. Chu, Fadi Biadsy:
Automatic Speech Recognition. NLP of Semitic Languages 2014: 409-459 - [i3]Zhiyun Lu, Avner May, Kuan Liu, Alireza Bagheri Garakani, Dong Guo, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha:
How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets. CoRR abs/1411.4000 (2014)
- 2013
- [j9]Tara N. Sainath, Brian Kingsbury, Hagen Soltau, Bhuvana Ramabhadran:
Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks. IEEE Trans. Speech Audio Process. 21(11): 2267-2276 (2013) - [c55]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, Bhuvana Ramabhadran:
Learning filter banks within a deep neural network framework. ASRU 2013: 297-302 - [c54]Tara N. Sainath, Lior Horesh, Brian Kingsbury, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Accelerating Hessian-free optimization for Deep Neural Networks by implicit preconditioning and sampling. ASRU 2013: 303-308 - [c53]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomás Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improvements to Deep Convolutional Neural Networks for LVCSR. ASRU 2013: 315-320 - [c52]Murat Saraclar, Abhinav Sethy, Bhuvana Ramabhadran, Lidia Mangu, Jia Cui, Xiaodong Cui, Brian Kingsbury, Jonathan Mamou:
An empirical study of confusion modeling in keyword search for low resource languages. ASRU 2013: 464-469 - [c51]Tara N. Sainath, Brian Kingsbury, Vikas Sindhwani, Ebru Arisoy, Bhuvana Ramabhadran:
Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets. ICASSP 2013: 6655-6659 - [c50]Jia Cui, Xiaodong Cui, Bhuvana Ramabhadran, Janice Kim, Brian Kingsbury, Jonathan Mamou, Lidia Mangu, Michael Picheny, Tara N. Sainath, Abhinav Sethy:
Developing speech recognition systems for corpus indexing under the IARPA Babel program. ICASSP 2013: 6753-6757 - [c49]Jing Huang, Brian Kingsbury:
Audio-visual deep learning for noise robust speech recognition. ICASSP 2013: 7596-7599 - [c48]Jonathan Mamou, Jia Cui, Xiaodong Cui, Mark J. F. Gales, Brian Kingsbury, Kate M. Knill, Lidia Mangu, David Nolden, Michael Picheny, Bhuvana Ramabhadran, Ralf Schlüter, Abhinav Sethy, Philip C. Woodland:
System combination and score normalization for spoken term detection. ICASSP 2013: 8272-8276 - [c47]Brian Kingsbury, Jia Cui, Xiaodong Cui, Mark J. F. Gales, Kate M. Knill, Jonathan Mamou, Lidia Mangu, David Nolden, Michael Picheny, Bhuvana Ramabhadran, Ralf Schlüter, Abhinav Sethy, Philip C. Woodland:
A high-performance Cantonese keyword search system. ICASSP 2013: 8277-8281 - [c46]Lidia Mangu, Hagen Soltau, Hong-Kwang Kuo, Brian Kingsbury, George Saon:
Exploiting diversity for spoken term detection. ICASSP 2013: 8282-8286 - [c45]Li Deng, Geoffrey E. Hinton, Brian Kingsbury:
New types of deep neural network learning for speech recognition and related applications: an overview. ICASSP 2013: 8599-8603 - [c44]Tara N. Sainath, Abdel-rahman Mohamed, Brian Kingsbury, Bhuvana Ramabhadran:
Deep convolutional neural networks for LVCSR. ICASSP 2013: 8614-8618 - [c43]Xiaodong Cui, Vaibhava Goel, Brian Kingsbury:
Mixtures of Bayesian joint factor analyzers for noise robust automatic speech recognition. INTERSPEECH 2013: 3012-3016 - [c42]George Saon, Samuel Thomas, Hagen Soltau, Sriram Ganapathy, Brian Kingsbury:
The IBM speech activity detection system for the DARPA RATS program. INTERSPEECH 2013: 3497-3501 - [i2]Tara N. Sainath, Brian Kingsbury, Abdel-rahman Mohamed, George E. Dahl, George Saon, Hagen Soltau, Tomás Beran, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improvements to deep convolutional neural networks for LVCSR. CoRR abs/1309.1501 (2013) - [i1]Tara N. Sainath, Lior Horesh, Brian Kingsbury, Aleksandr Y. Aravkin, Bhuvana Ramabhadran:
Improving training time of Hessian-free optimization for deep neural networks using preconditioning and sampling. CoRR abs/1309.1508 (2013) - 2012
- [c41]Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran:
Auto-encoder bottleneck features using deep belief networks. ICASSP 2012: 4153-4156 - [c40]Brian Kingsbury, Tara N. Sainath, Hagen Soltau:
Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization. INTERSPEECH 2012: 10-13 - [c39]George Saon, Brian Kingsbury:
Discriminative feature-space transforms using deep neural networks. INTERSPEECH 2012: 14-17 - [c38]Ebru Arisoy, Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran:
Deep Neural Network Language Models. WLM@NAACL-HLT 2012: 20-28 - 2011
- [j8]Michael Picheny, David Nahamoo, Vaibhava Goel, Brian Kingsbury, Bhuvana Ramabhadran, Steven J. Rennie, George Saon:
Trends and advances in speech recognition. IBM J. Res. Dev. 55(5): 2 (2011) - [j7]James Fan, Michael Campbell, Brian Kingsbury:
Artificial intelligence research at IBM. IBM J. Res. Dev. 55(5): 16 (2011) - [c37]Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran, Petr Fousek, Petr Novák, Abdel-rahman Mohamed:
Making Deep Belief Networks effective for large vocabulary continuous speech recognition. ASRU 2011: 30-35 - [c36]Lidia Mangu, Hong-Kwang Kuo, Stephen M. Chu, Brian Kingsbury, George Saon, Hagen Soltau, Fadi Biadsy:
The IBM 2011 GALE Arabic speech transcription system. ASRU 2011: 272-277 - [c35]Brian Kingsbury, Hagen Soltau, George Saon, Stephen M. Chu, Hong-Kwang Kuo, Lidia Mangu, Suman V. Ravuri, Nelson Morgan, Adam Janin:
The IBM 2009 GALE Arabic speech transcription system. ICASSP 2011: 4672-4675 - [c34]Chih-Chieh Cheng, Brian Kingsbury:
Arccosine kernels: Acoustic modeling with infinite neural networks. ICASSP 2011: 5200-5203 - 2010
- [c33]George Saon, Hagen Soltau, Upendra V. Chaudhari, Stephen M. Chu, Brian Kingsbury, Hong-Kwang Kuo, Lidia Mangu, Daniel Povey:
The IBM 2008 GALE Arabic speech transcription system. ICASSP 2010: 4378-4381 - [c32]Hagen Soltau, George Saon, Brian Kingsbury:
The IBM Attila speech recognition toolkit. SLT 2010: 97-102 - [c31]Ea-Ee Jan, Brian Kingsbury:
Rapid and inexpensive development of speech action classifiers for natural language call routing systems. SLT 2010: 348-353
2000 – 2009
- 2009
- [j6]Hagen Soltau, George Saon, Brian Kingsbury, Hong-Kwang Jeff Kuo, Lidia Mangu, Daniel Povey, Ahmad Emami:
Advances in Arabic Speech Transcription at IBM Under the DARPA GALE Program. IEEE Trans. Speech Audio Process. 17(5): 884-894 (2009) - [c30]Brian Kingsbury:
Lattice-based optimization of sequence classification criteria for neural-network acoustic modeling. ICASSP 2009: 3761-3764 - [c29]Bhuvana Ramabhadran, Abhinav Sethy, Jonathan Mamou, Brian Kingsbury, Upendra V. Chaudhari:
Fast decoding for open vocabulary spoken term detection. HLT-NAACL (Short Papers) 2009: 277-280 - [c28]Ruhi Sarikaya, Mohamed Afify, Brian Kingsbury:
Tied-Mixture Language Modeling in Continuous Space. HLT-NAACL 2009: 459-467 - 2008
- [c27]Daniel Povey, Dimitri Kanevsky, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Karthik Visweswariah:
Boosted MMI for model and feature-space discriminative training. ICASSP 2008: 4057-4060 - [c26]Daniel Povey, Brian Kingsbury:
Monte Carlo model-space noise adaptation for speech recognition. INTERSPEECH 2008: 1281-1284 - [c25]Upendra V. Chaudhari, Hong-Kwang Jeff Kuo, Brian Kingsbury:
Discriminative graph training for ultra-fast low-footprint speech indexing. INTERSPEECH 2008: 2175-2178 - [c24]Ruhi Sarikaya, Yonggang Deng, Mohamed Afify, Brian Kingsbury, Yuqing Gao:
Machine translation in continuous space. INTERSPEECH 2008: 2350-2353 - 2007
- [c23]Hong-Kwang Jeff Kuo, Brian Kingsbury, Geoffrey Zweig:
Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition. ICASSP (4) 2007: 45-48 - [c22]Daniel Povey, Brian Kingsbury:
Evaluation of Proposed Modifications to MPE for Large Scale Discriminative Training. ICASSP (4) 2007: 321-324 - [c21]Hagen Soltau, George Saon, Brian Kingsbury, Hong-Kwang Jeff Kuo, Lidia Mangu, Daniel Povey, Geoffrey Zweig:
The IBM 2006 Gale Arabic ASR System. ICASSP (4) 2007: 349-352 - 2006
- [j5]Ran D. Zilca, Brian Kingsbury, Jirí Navrátil, Ganesh N. Ramaswamy:
Pseudo pitch synchronous analysis of speech with applications to speaker recognition. IEEE Trans. Speech Audio Process. 14(2): 467-478 (2006) - [j4]Stanley F. Chen, Brian Kingsbury, Lidia Mangu, Daniel Povey, George Saon, Hagen Soltau, Geoffrey Zweig:
Advances in speech transcription at IBM under the DARPA EARS program. IEEE Trans. Speech Audio Process. 14(5): 1596-1608 (2006) - [c20]Geoffrey Zweig, Olivier Siohan, George Saon, Bhuvana Ramabhadran, Daniel Povey, Lidia Mangu, Brian Kingsbury:
Automated Quality Monitoring in the Call Center with ASR and Maximum Entropy. ICASSP (1) 2006: 589-592 - [c19]Geoffrey Zweig, Olivier Siohan, George Saon, Bhuvana Ramabhadran, Daniel Povey, Lidia Mangu, Brian Kingsbury:
Automated Quality Monitoring for Call Centers using Speech and NLP Technologies. HLT-NAACL 2006 - 2005
- [c18]Olivier Siohan, Bhuvana Ramabhadran, Brian Kingsbury:
Constructing Ensembles of ASR Systems Using Randomized Decision Trees. ICASSP (1) 2005: 197-200 - [c17]Hagen Soltau, Brian Kingsbury, Lidia Mangu, Daniel Povey, George Saon, Geoffrey Zweig:
The IBM 2004 Conversational Telephony System for Rich Transcription. ICASSP (1) 2005: 205-208 - [c16]Daniel Povey, Brian Kingsbury, Lidia Mangu, George Saon, Hagen Soltau, Geoffrey Zweig:
fMPE: Discriminatively Trained Features for Speech Recognition. ICASSP (1) 2005: 961-964 - 2004
- [c15]Mohamed K. Omar, Brian Kingsbury:
An evaluation of a nonlinear feature transformation for conversational speech recognition. ICASSP (1) 2004: 785-788 - 2003
- [c14]Scott Axelrod, Vaibhava Goel, Brian Kingsbury, Karthik Visweswariah, Ramesh A. Gopinath:
Large vocabulary conversational speech recognition with a subspace constraint on inverse covariance matrices. INTERSPEECH 2003: 1613-1616 - [c13]Brian Kingsbury, Lidia Mangu, George Saon, Geoffrey Zweig, Scott Axelrod, Vaibhava Goel, Karthik Visweswariah, Michael Picheny:
Toward domain-independent conversational speech recognition. INTERSPEECH 2003: 1881-1884 - [c12]George Saon, Geoffrey Zweig, Brian Kingsbury, Lidia Mangu, Upendra V. Chaudhari:
An architecture for rapid decoding of large vocabulary conversational speech. INTERSPEECH 2003: 1977-1980 - 2002
- [j3]Mukund Padmanabhan, George Saon, Jing Huang, Brian Kingsbury, Lidia Mangu:
Automatic speech recognition performance on a voicemail transcription task. IEEE Trans. Speech Audio Process. 10(7): 433-442 (2002) - [c11]Brian Kingsbury, George Saon, Lidia Mangu, Mukund Padmanabhan, Ruhi Sarikaya:
Robust speech recognition in Noisy Environments: The 2001 IBM spine evaluation system. ICASSP 2002: 53-56 - [c10]Pratibha Jain, Hynek Hermansky, Brian Kingsbury:
Distributed speech recognition using noise-robust MFCC and traps-estimated manner features. INTERSPEECH 2002: 473-476 - [c9]Brian Kingsbury, Pratibha Jain, André Gustavo Adami:
A hybrid HMM/traps model for robust voice activity detection. INTERSPEECH 2002: 1073-1076 - [c8]Jing Huang, Vaibhava Goel, Ramesh Gopinath, Brian Kingsbury, Peder A. Olsen, Karthik Visweswariah:
Large vocabulary conversational speech recognition with the extended maximum likelihood linear transformation (EMLLT) model. INTERSPEECH 2002: 2597-2600 - 2000
- [c7]Jing Huang, Brian Kingsbury, Lidia Mangu, Mukund Padmanabhan, George Saon, Geoffrey Zweig:
Recent improvements in speech recognition performance on large vocabulary conversational speech (voicemail and switchboard). INTERSPEECH 2000: 338-341
1990 – 1999
- 1998
- [j2]Brian Kingsbury, Nelson Morgan, Steven Greenberg:
Robust speech recognition using the modulation spectrogram. Speech Commun. 25(1-3): 117-132 (1998) - [c6]Su-Lin Wu, Brian Kingsbury, Nelson Morgan, Steven Greenberg:
Incorporating information from syllable-length time scales into automatic speech recognition. ICASSP 1998: 721-724 - [c5]Su-Lin Wu, Brian Kingsbury, Nelson Morgan, Steven Greenberg:
Performance improvements through combining phone- and syllable-scale information in automatic speech recognition. ICSLP 1998 - 1997
- [c4]Brian Kingsbury, Nelson Morgan:
Recognizing reverberant speech with RASTA-PLP. ICASSP 1997: 1259-1262 - [c3]Steven Greenberg, Brian Kingsbury:
The modulation spectrogram: in pursuit of an invariant representation of speech. ICASSP 1997: 1647-1650 - 1996
- [j1]John Wawrzynek, Krste Asanovic, Brian Kingsbury, David Johnson, James Beck, Nelson Morgan:
Spert-II: A Vector Microprocessor System. Computer 29(3): 79-86 (1996) - 1995
- [c2]John Wawrzynek, Krste Asanovic, Brian Kingsbury, James Beck, David Johnson, Nelson Morgan:
SPERT-II: A Vector Microprocessor System and its Application to Large Problems in Backpropagation Training. NIPS 1995: 619-625 - 1992
- [c1]Krste Asanovic, James Beck, Brian Kingsbury, Phil Kohn, Nelson Morgan, John Wawrzynek:
SPERT: a VLIW/SIMD microprocessor for artificial neural network computations. ASAP 1992: 178-190
last updated on 2024-10-07 21:25 CEST by the dblp team
all metadata released as open data under CC0 1.0 license