Hasim Sak
Person information
- affiliation: Google, Inc., USA
- affiliation (PhD 2011): Bogazici University, Department of Computer Engineering, Istanbul, Turkey
2020 – today
- 2024
- [c43] Anshuman Tripathi, Soheil Khorram, Han Lu, Jaeyoung Kim, Qian Zhang, Hasim Sak:
Monte Carlo Self-Training for Speech Recognition. ICASSP 2024: 12802-12806
- [i15] Jaeyoung Kim, Han Lu, Soheil Khorram, Anshuman Tripathi, Qian Zhang, Hasim Sak:
Clustering and Mining Accented Speech for Inclusive and Fair Speech Recognition. CoRR abs/2408.02582 (2024)
- 2023
- [c42] Soheil Khorram, Anshuman Tripathi, Jaeyoung Kim, Han Lu, Qian Zhang, Rohit Prabhavalkar, Hasim Sak:
Cross-Training: A Semi-Supervised Training Scheme for Speech Recognition. ICASSP 2023: 1-5
- 2022
- [c41] Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak:
Contrastive Siamese Network for Semi-Supervised Speech Recognition. ICASSP 2022: 7207-7211
- [c40] Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio López-Moreno, Hasim Sak:
Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection. ICASSP 2022: 8077-8081
- [i14] Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak:
Contrastive Siamese Network for Semi-supervised Speech Recognition. CoRR abs/2205.14054 (2022)
- 2021
- [c39] Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak:
Reducing Streaming ASR Model Delay with Self Alignment. Interspeech 2021: 3440-3444
- [i13] Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak:
Reducing Streaming ASR Model Delay with Self Alignment. CoRR abs/2105.05005 (2021)
- [i12] Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Ignacio López-Moreno, Hasim Sak:
Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection. CoRR abs/2109.11641 (2021)
- 2020
- [c38] Anshuman Tripathi, Han Lu, Hasim Sak:
End-To-End Multi-Talker Overlapping Speech Recognition. ICASSP 2020: 6129-6133
- [c37] Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar:
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss. ICASSP 2020: 7829-7833
- [c36] Yun Zhu, Parisa Haghani, Anshuman Tripathi, Bhuvana Ramabhadran, Brian Farris, Hainan Xu, Han Lu, Hasim Sak, Isabel Leal, Neeraj Gaur, Pedro J. Moreno, Qian Zhang:
Multilingual Speech Recognition with Self-Attention Structured Parameterization. INTERSPEECH 2020: 4741-4745
- [i11] Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar:
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss. CoRR abs/2002.02562 (2020)
- [i10] Erik McDermott, Hasim Sak, Ehsan Variani:
A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition. CoRR abs/2002.11268 (2020)
- [i9] Anshuman Tripathi, Jaeyoung Kim, Qian Zhang, Han Lu, Hasim Sak:
Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition. CoRR abs/2010.03192 (2020)
2010 – 2019
- 2019
- [c35] Erik McDermott, Hasim Sak, Ehsan Variani:
A Density Ratio Approach to Language Model Fusion in End-to-End Automatic Speech Recognition. ASRU 2019: 434-441
- [c34] Anshuman Tripathi, Han Lu, Hasim Sak, Hagen Soltau:
Monotonic Recurrent Neural Network Transducer and Decoding Strategies. ASRU 2019: 944-948
- [c33] Brendan Shillingford, Yannis M. Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Misha Denil, Ben Coppin, Ben Laurie, Andrew W. Senior, Nando de Freitas:
Large-Scale Visual Speech Recognition. INTERSPEECH 2019: 4135-4139
- [i8] Ke Hu, Hasim Sak, Hank Liao:
Adversarial Training for Multilingual Acoustic Modeling. CoRR abs/1906.07093 (2019)
- 2018
- [c32] Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang:
Speech Recognition for Medical Conversations. INTERSPEECH 2018: 2972-2976
- [i7] Kanishka Rao, Hasim Sak, Rohit Prabhavalkar:
Exploring Architectures, Data and Units For Streaming End-to-End Speech Recognition with RNN-Transducer. CoRR abs/1801.00841 (2018)
- [i6] Brendan Shillingford, Yannis M. Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew W. Senior, Nando de Freitas:
Large-Scale Visual Speech Recognition. CoRR abs/1807.05162 (2018)
- 2017
- [c31] Hagen Soltau, Hank Liao, Hasim Sak:
Reducing the computational complexity for whole word models. ASRU 2017: 63-68
- [c30] Kanishka Rao, Hasim Sak, Rohit Prabhavalkar:
Exploring architectures, data and units for streaming end-to-end speech recognition with RNN-transducer. ASRU 2017: 193-199
- [c29] Kanishka Rao, Hasim Sak:
Multi-accent speech recognition with hierarchical grapheme based models. ICASSP 2017: 4815-4819
- [c28] Bo Li, Tara N. Sainath, Arun Narayanan, Joe Caroselli, Michiel Bacchiani, Ananya Misra, Izhak Shafran, Hasim Sak, Golan Pundak, Kean K. Chin, Khe Chai Sim, Ron J. Weiss, Kevin W. Wilson, Ehsan Variani, Chanwoo Kim, Olivier Siohan, Mitchel Weintraub, Erik McDermott, Richard Rose, Matt Shannon:
Acoustic Modeling for Google Home. INTERSPEECH 2017: 399-403
- [c27] Hasim Sak, Matt Shannon, Kanishka Rao, Françoise Beaufays:
Recurrent Neural Aligner: An Encoder-Decoder Neural Network Model for Sequence to Sequence Mapping. INTERSPEECH 2017: 1298-1302
- [c26] Hagen Soltau, Hank Liao, Hasim Sak:
Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition. INTERSPEECH 2017: 3707-3711
- [i5] Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang:
Speech recognition for medical conversations. CoRR abs/1711.07274 (2017)
- 2016
- [c25] Kanishka Rao, Andrew W. Senior, Hasim Sak:
Flat start training of CD-CTC-SMBR LSTM RNN acoustic models. ICASSP 2016: 5405-5409
- [c24] Ian McGraw, Rohit Prabhavalkar, Raziel Alvarez, Montse Gonzalez Arenas, Kanishka Rao, David Rybach, Ouais Alsharif, Hasim Sak, Alexander Gruenstein, Françoise Beaufays, Carolina Parada:
Personalized speech recognition on mobile devices. ICASSP 2016: 5955-5959
- [i4] Ian McGraw, Rohit Prabhavalkar, Raziel Alvarez, Montse Gonzalez Arenas, Kanishka Rao, David Rybach, Ouais Alsharif, Hasim Sak, Alexander Gruenstein, Françoise Beaufays, Carolina Parada:
Personalized Speech recognition on mobile devices. CoRR abs/1603.03185 (2016)
- [i3] Hagen Soltau, Hank Liao, Hasim Sak:
Neural Speech Recognizer: Acoustic-to-Word LSTM Model for Large Vocabulary Speech Recognition. CoRR abs/1610.09975 (2016)
- 2015
- [c23] Andrew W. Senior, Hasim Sak, Felix de Chaumont Quitry, Tara N. Sainath, Kanishka Rao:
Acoustic modelling with CD-CTC-SMBR LSTM RNNS. ASRU 2015: 604-609
- [c22] Kanishka Rao, Fuchun Peng, Hasim Sak, Françoise Beaufays:
Grapheme-to-phoneme conversion using Long Short-Term Memory recurrent neural networks. ICASSP 2015: 4225-4229
- [c21] Hasim Sak, Andrew W. Senior, Kanishka Rao, Ozan Irsoy, Alex Graves, Françoise Beaufays, Johan Schalkwyk:
Learning acoustic frame labeling for speech recognition with recurrent neural networks. ICASSP 2015: 4280-4284
- [c20] Heiga Zen, Hasim Sak:
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis. ICASSP 2015: 4470-4474
- [c19] Tara N. Sainath, Oriol Vinyals, Andrew W. Senior, Hasim Sak:
Convolutional, Long Short-Term Memory, fully connected Deep Neural Networks. ICASSP 2015: 4580-4584
- [c18] Andrew W. Senior, Hasim Sak, Izhak Shafran:
Context dependent phone models for LSTM RNN acoustic modelling. ICASSP 2015: 4585-4589
- [c17] Hasim Sak, Andrew W. Senior, Kanishka Rao, Françoise Beaufays:
Fast and accurate recurrent neural network acoustic models for speech recognition. INTERSPEECH 2015: 1468-1472
- [i2] Hasim Sak, Andrew W. Senior, Kanishka Rao, Françoise Beaufays:
Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition. CoRR abs/1507.06947 (2015)
- 2014
- [c16] Hasim Sak, Andrew W. Senior, Françoise Beaufays:
Long short-term memory recurrent neural network architectures for large scale acoustic modeling. INTERSPEECH 2014: 338-342
- [c15] Hasim Sak, Oriol Vinyals, Georg Heigold, Andrew W. Senior, Erik McDermott, Rajat Monga, Mark Z. Mao:
Sequence discriminative distributed training of long short-term memory recurrent neural networks. INTERSPEECH 2014: 1209-1213
- [c14] Javier Gonzalez-Dominguez, Ignacio López-Moreno, Hasim Sak, Joaquin Gonzalez-Rodriguez, Pedro J. Moreno:
Automatic language identification using long short-term memory recurrent neural networks. INTERSPEECH 2014: 2155-2159
- [i1] Hasim Sak, Andrew W. Senior, Françoise Beaufays:
Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition. CoRR abs/1402.1128 (2014)
- 2013
- [c13] Hasim Sak, Cyril Allauzen, Kaisuke Nakajima, Françoise Beaufays:
Mixture of mixture n-gram language models. ASRU 2013: 31-36
- [c12] Hasim Sak, Françoise Beaufays, Kaisuke Nakajima, Cyril Allauzen:
Language model verbalization for automatic speech recognition. ICASSP 2013: 8262-8266
- [c11] Hasim Sak, Yun-Hsuan Sung, Françoise Beaufays, Cyril Allauzen:
Written-domain language modeling for automatic speech recognition. INTERSPEECH 2013: 675-679
- 2012
- [j4] Hasim Sak, Murat Saraclar, Tunga Gungor:
Morpholexical and Discriminative Language Models for Turkish Automatic Speech Recognition. IEEE Trans. Speech Audio Process. 20(8): 2341-2351 (2012)
- [c10] Arda Çelebi, Hasim Sak, Erinç Dikici, Murat Saraclar, Maider Lehr, Emily Tucker Prud'hommeaux, Puyang Xu, Nathan Glenn, Damianos G. Karakos, Sanjeev Khudanpur, Brian Roark, Kenji Sagae, Izhak Shafran, Daniel M. Bikel, Chris Callison-Burch, Yuan Cao, Keith B. Hall, Eva Hasler, Philipp Koehn, Adam Lopez, Matt Post, Darcey Riley:
Semi-supervised discriminative language modeling for Turkish ASR. ICASSP 2012: 5025-5028
- 2011
- [b1] Hasim Sak:
Integrating morphology into automatic speech recognition: Morpholexical and discriminative language models for Turkish (Biçimbilimin otomatik konuşma tanımaya bütünleştirilmesi: Türkçe için biçimsözlüksel ve ayırıcı dil modelleri). Boğaziçi University, Turkey, 2011
- [j3] Marek Hrúz, Pavel Campr, Erinç Dikici, Ahmet Alp Kindiroglu, Zdenek Krnoul, Alexander L. Ronzhin, Hasim Sak, Daniel Schorno, Hülya Yalçin, Lale Akarun, Oya Aran, Alexey Karpov, Murat Saraçlar, Milos Zelezný:
Automatic fingersign-to-speech translation system. J. Multimodal User Interfaces 4(2): 61-79 (2011)
- [j2] Hasim Sak, Tunga Güngör, Murat Saraclar:
Resources for Turkish morphological processing. Lang. Resour. Evaluation 45(2): 249-261 (2011)
- [c9] Hasim Sak, Murat Saraclar, Tunga Gungor:
Discriminative reranking of ASR hypotheses with morpholexical and N-best-list features. ASRU 2011: 202-207
- 2010
- [c8] Hasim Sak, Murat Saraclar, Tunga Güngör:
Morphology-based and sub-word language modeling for Turkish speech recognition. ICASSP 2010: 5402-5405
- [c7] Hasim Sak, Murat Saraclar, Tunga Güngör:
On-the-fly lattice rescoring for real-time automatic speech recognition. INTERSPEECH 2010: 2450-2453
2000 – 2009
- 2009
- [j1] Ebru Arisoy, Dogan Can, Siddika Parlak, Hasim Sak, Murat Saraclar:
Turkish Broadcast News Transcription and Retrieval. IEEE Trans. Speech Audio Process. 17(5): 874-883 (2009)
- [c6] Hasim Sak, Tunga Güngör, Murat Saraclar:
A Stochastic Finite-State Morphological Parser for Turkish. ACL/IJCNLP (2) 2009: 273-276
- [c5] Hasim Sak, Murat Saraclar, Tunga Güngör:
Integrating morphology into automatic speech recognition. ASRU 2009: 354-358
- 2008
- [c4] Hasim Sak, Tunga Güngör, Murat Saraclar:
Turkish Language Resources: Morphological Parser, Morphological Disambiguator and Web Corpus. GoTAL 2008: 417-427
- 2007
- [c3] Hasim Sak, Tunga Güngör, Murat Saraclar:
Morphological Disambiguation of Turkish Text with Perceptron Algorithm. CICLing 2007: 107-118
- [c2] Ebru Arisoy, Hasim Sak, Murat Saraclar:
Language modeling for automatic Turkish broadcast news transcription. INTERSPEECH 2007: 2381-2384
- 2005
- [c1] Hasim Sak, Tunga Gungor, Yasar Safkan:
Generation of synthetic speech from Turkish text. EUSIPCO 2005: 1-4
last updated on 2024-10-07 21:20 CEST by the dblp team
all metadata released as open data under CC0 1.0 license