IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 30
Volume 30, 2022
- Qianying Liu, Wenyu Guan, Sujian Li, Fei Cheng, Daisuke Kawahara, Sadao Kurohashi: RODA: Reverse Operation Based Data Augmentation for Solving Math Word Problems. 1-11
- Kai Zhen, Jongmo Sung, Mi Suk Lee, Seungkwon Beack, Minje Kim: Scalable and Efficient Neural Speech Coding: A Hybrid Design. 12-25
- Sen Yang, Yang Liu, Dawei Feng, Dongsheng Li: Text Generation From Data With Dynamic Planning. 26-34
- Stefan Liebich, Peter Vary: Occlusion Effect Cancellation in Headphones and Hearing Devices - The Sister of Active Noise Cancellation. 35-48
- Zhuosheng Zhang, Haojie Yu, Hai Zhao, Masao Utiyama: Which Apple Keeps Which Doctor Away? Colorful Word Representations With Visual Oracles. 49-59
- Zhenyu Wang, John H. L. Hansen: Multi-Source Domain Adaptation for Text-Independent Forensic Speaker Recognition. 60-75
- Kengtao Zheng, Nankai Lin, Shengyi Jiang: Unsupervised Character Embedding Correction and Candidate Word Denoising. 76-86
- Bing Ma, Haifeng Sun, Jingyu Wang, Qi Qi, Jianxin Liao: Extractive Dialogue Summarization Without Annotation Based on Distantly Supervised Machine Reading Comprehension in Customer Service. 87-97
- Shengcai Liu, Ning Lu, Cheng Chen, Ke Tang: Efficient Combinatorial Optimization for Word-Level Adversarial Textual Attack. 98-111
- Alessandro Terenzi, Nicola Ortolani, Inês Nolasco, Emmanouil Benetos, Stefania Cecchi: Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity. 112-122
- Shuiyang Mao, P. C. Ching, Tan Lee: Enhancing Segment-Based Speech Emotion Recognition by Iterative Self-Learning. 123-134
- Abdolreza Sabzi Shahrebabaki, Giampiero Salvi, Torbjørn Svendsen, Sabato Marco Siniscalchi: Acoustic-to-Articulatory Mapping With Joint Optimization of Deep Speech Enhancement and Articulatory Inversion Models. 135-147
- Javier Jorge, Adrià Giménez, Joan Albert Silvestre-Cerdà, Jorge Civera, Alberto Sanchís, Alfons Juan: Live Streaming Speech Recognition Using Deep Bidirectional LSTM Acoustic Models and Interpolated Language Models. 148-161
- P. V. Muhammed Shifas, Catalin Zorila, Yannis Stylianou: End-to-End Neural Based Modification of Noisy Speech for Speech-in-Noise Intelligibility Improvement. 162-173
- Joon-Young Yang, Joon-Hyuk Chang: VACE-WPE: Virtual Acoustic Channel Expansion Based on Neural Networks for Weighted Prediction Error-Based Speech Dereverberation. 174-189
- Chenpeng Du, Kai Yu: Phone-Level Prosody Modelling With GMM-Based MDN for Diverse and Controllable Speech Synthesis. 190-201
- Haibin Wu, Xu Li, Andy T. Liu, Zhiyong Wu, Helen Meng, Hung-Yi Lee: Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning. 202-217
- Mixiao Hou, Zheng Zhang, Qi Cao, David Zhang, Guangming Lu: Multi-View Speech Emotion Recognition Via Collective Relation Construction. 218-229
- Da-Rong Liu, Po-Chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-yi Lee: Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network. 230-243
- Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara, Chenhui Chu: Word-Region Alignment-Guided Multimodal Neural Machine Translation. 244-259
- Zhuosheng Zhang, Yiqing Zhang, Hai Zhao: Syntax-Aware Multi-Spans Generation for Reading Comprehension. 260-268
- Pengfei Zhu, Zhuosheng Zhang, Hai Zhao, Xiaoguang Li: DUMA: Reading Comprehension With Transposition Thinking. 269-279
- Jiayuan Xie, Ningxin Peng, Yi Cai, Tao Wang, Qingbao Huang: Diverse Distractor Generation for Constructing High-Quality Multiple Choice Questions. 280-291
- Jie Zhang, Guanghui Zhang: A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing. 292-304
- Luca Turchet, Johan Pauwels: Music Emotion Recognition: Intention of Composers-Performers Versus Perception of Musicians, Non-Musicians, and Listening Machines. 305-316
- Wenxin Hou, Han Zhu, Yidong Wang, Jindong Wang, Tao Qin, Renjun Xu, Takahiro Shinozaki: Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition. 317-329
- Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita: Integrating Prior Translation Knowledge Into Neural Machine Translation. 330-339
- Keqi Deng, Gaofeng Cheng, Runyan Yang, Yonghong Yan: Alleviating ASR Long-Tailed Problem by Decoupling the Learning of Representation and Classification. 340-354
- Zuchao Li, Junru Zhou, Hai Zhao, Kevin Parnow: HPSG-Inspired Joint Neural Constituent and Dependency Parsing in O($n^3$) Time Complexity. 355-366
- Xuan Shi, Erica Cooper, Junichi Yamagishi: Use of Speaker Recognition Approaches for Learning and Evaluating Embedding Representations of Musical Instrument Sounds. 367-377
- Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang: Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-Order Latent Domain. 378-393
- Yanmin Qian, Zhikai Zhou: Optimizing Data Usage for Low-Resource Speech Recognition. 394-403
- Narla John Metilda Sagaya Mary, Srinivasan Umesh, Sandesh Varadaraju Katta: S-Vectors and TESA: Speaker Embeddings and a Speaker Authenticator Based on Transformer Encoder. 404-413
- Bengt J. Borgström: Bayesian Estimation of PLDA in the Presence of Noisy Training Labels, With Applications to Speaker Verification. 414-428
- Menglong Lu, Zhen Huang, Binyang Li, Yunxiang Zhao, Zheng Qin, Dong Sheng Li: SIFTER: A Framework for Robust Rumor Detection. 429-442
- Lantian Li, Dong Wang, Jiawen Kang, Renyu Wang, Jing Wu, Zhendong Gao, Xiao Chen: A Principle Solution for Enroll-Test Mismatch in Speaker Recognition. 443-455
- Feiran Yang: Analysis of Deficient-Length Partitioned-Block Frequency-Domain Adaptive Filters. 456-467
- Hui Jiang, Linfeng Song, Yubin Ge, Fandong Meng, Junfeng Yao, Jinsong Su: An AST Structure Enhanced Decoder for Code Generation. 468-476
- Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi: Optimizing Tandem Speaker Verification and Anti-Spoofing Systems. 477-488
- Xin Ni, Jia Ren: FC-U2-Net: A Novel Deep Neural Network for Singing Voice Separation. 489-494
- Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi: SoundStream: An End-to-End Neural Audio Codec. 495-507
- Wageesha Manamperi, Thushara D. Abhayapala, Jihui Zhang, Prasanga N. Samarasinghe: Drone Audition: Sound Source Localization Using On-Board Microphones. 508-519
- Qian Li, Hao Peng, Jianxin Li, Jia Wu, Yuanxing Ning, Lihong Wang, Philip S. Yu, Zheng Wang: Reinforcement Learning-Based Dialogue Guided Event Extraction to Exploit Argument Relations. 520-533
- Santiago Ruiz, Toon van Waterschoot, Marc Moonen: Distributed Combined Acoustic Echo Cancellation and Noise Reduction in Wireless Acoustic Sensor and Actuator Networks. 534-547
- Lukas Grinewitschus, Peter Jung: The Harmonic Shift Algorithm for Efficient Multi-Pitch Detection. 548-561
- Ziyao Lu, Xiang Li, Yang Liu, Chulun Zhou, Jianwei Cui, Bin Wang, Min Zhang, Jinsong Su: Exploring Multi-Stage Information Interactions for Multi-Source Neural Machine Translation. 562-570
- Jingxuan Yang, Si Li, Sheng Gao, Jun Guo: CorefDPR: A Joint Model for Coreference Resolution and Dropped Pronoun Recovery in Chinese Conversations. 571-581
- Timuçin Berk Atalay, Zühre Sü Gül, Enzo De Sena, Zoran Cvetkovic, Hüseyin Hacihabiboglu: Scattering Delay Network Simulator of Coupled Volume Acoustics. 582-593
- Yi Zhang, Lei Li, Yunfang Wu, Qi Su, Xu Sun: Alleviating the Knowledge-Language Inconsistency: A Study for Deep Commonsense Knowledge. 594-604
- Ke Tan, Zhong-Qiu Wang, DeLiang Wang: Neural Spectrospatial Filtering. 605-621
- Qianren Mao, Jianxin Li, Chenghua Lin, Congwen Chen, Hao Peng, Lihong Wang, Philip S. Yu: Adaptive Pre-Training and Collaborative Fine-Tuning: A Win-Win Strategy to Improve Review Analysis Tasks. 622-634
- Zifeng Cheng, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu: Learning to Classify Open Intent via Soft Labeling and Manifold Mixup. 635-645
- Xiaochun An, Frank K. Soong, Lei Xie: Disentangling Style and Speaker Attributes for TTS Style Transfer. 646-658
- Zhuang Chen, Tieyun Qian: Retrieve-and-Edit Domain Adaptation for End2End Aspect Based Sentiment Analysis. 659-672
- Jian Liu, Mengshi Yu, Yufeng Chen, Jinan Xu: Cross-Domain Slot Filling as Machine Reading Comprehension: A New Perspective. 673-685
- Yongkang Liu, Qingbao Huang, Jing Li, Linzhang Mo, Yi Cai, Qing Li: SSAP: Storylines and Sentiment Aware Pre-Trained Model for Story Ending Generation. 686-694
- Ying Zhou, Xuefeng Liang, Yu Gu, Yifei Yin, Longshan Yao: Multi-Classifier Interactive Learning for Ambiguous Speech Emotion Recognition. 695-705
- Poul Hoang, Jan Mark de Haan, Zheng-Hua Tan, Jesper Jensen: Multichannel Speech Enhancement With Own Voice-Based Interfering Speech Suppression for Hearing Assistive Devices. 706-720
- Weijie Yu, Chen Xu, Jun Xu, Liang Pang, Ji-Rong Wen: Distribution Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains. 721-733
- Heming Wang, DeLiang Wang: Neural Cascade Architecture With Triple-Domain Loss for Speech Enhancement. 734-743
- Riccardo R. De Lucia, Antonio Canclini, Fabio Antonacci, Augusto Sarti: Group Dictionary Equivalent Source Method for Sparse Nearfield Acoustic Holography. 744-757
- Tong Ma, Ying Wei, Xin Lou: Reconfigurable Nonuniform Filter Bank for Hearing Aid Systems. 758-771
- Victoria Mingote, Antonio Miguel, Dayana Ribas, Alfonso Ortega, Eduardo Lleida: aDCF Loss Function for Deep Metric Learning in End-to-End Text-Dependent Speaker Verification Systems. 772-784
- Quansheng Tu, Huawei Chen: Theoretical Lower Bounds on the Performance of the First-Order Differential Microphone Arrays With Sensor Imperfections. 785-801
- Taihui Wang, Feiran Yang, Jun Yang: Convolutive Transfer Function-Based Multichannel Nonnegative Matrix Factorization for Overdetermined Blind Source Separation. 802-815
- Yi Zhang, Guangyou Zhou, Zhiwen Xie, Jimmy Xiangji Huang: HGEN: Learning Hierarchical Heterogeneous Graph Encoding for Math Word Problem Solving. 816-828
- Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra: FSD50K: An Open Dataset of Human-Labeled Sound Events. 829-852
- Yi Lei, Shan Yang, Xinsheng Wang, Lei Xie: MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis. 853-864
- Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen: NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation. 865-878
- Simon Stone, Yingming Gao, Peter Birkholz: Articulatory Synthesis of Vocalized /r/ Allophones in German. 879-889
- Prashant Serai, Vishal Sunder, Eric Fosler-Lussier: Hallucination of Speech Recognition Errors With Sequence to Sequence Learning. 890-900
- Bin Wu, Sakriani Sakti, Jinsong Zhang, Satoshi Nakamura: Modeling Unsupervised Empirical Adaptation by DPGMM and DPGMM-RNN Hybrid Model to Extract Perceptual Features for Low-Resource ASR. 901-916
- Mi Zhang, Tieyun Qian, Bing Liu: Exploit Feature and Relation Hierarchy for Relation Extraction. 917-930
- Wenxiang Jiao, Xing Wang, Shilin He, Zhaopeng Tu, Irwin King, Michael R. Lyu: Exploiting Inactive Examples for Natural Language Generation With Data Rejuvenation. 931-943
- Youzhi Tu, Man-Wai Mak: Aggregating Frame-Level Information in the Spectral Domain With Self-Attention for Speaker Embedding. 944-957
- Zhixing Tan, Zeyuan Yang, Meng Zhang, Qun Liu, Maosong Sun, Yang Liu: Dynamic Multi-Branch Layers for On-Device Neural Machine Translation. 958-967
- Weiwei Lin, Man-Wai Mak: Mixture Representation Learning for Deep Speaker Embedding. 968-978
- Peng Zhu, Dawei Cheng, Fangzhou Yang, Yifeng Luo, Dingjiang Huang, Weining Qian, Aoying Zhou: Improving Chinese Named Entity Recognition by Large-Scale Syntactic Dependency Graph. 979-991
- Xiaobo Liang, Lijun Wu, Juntao Li, Tao Qin, Min Zhang, Tie-Yan Liu: Multi-Teacher Distillation With Single Model for Neural Machine Translation. 992-1002
- Xiaofeng Chen, Guohua Wang, Haopeng Ren, Yi Cai, Ho-fung Leung, Tao Wang: Task-Adaptive Feature Fusion for Generalized Few-Shot Relation Classification in an Open World Environment. 1003-1015
- Yu-Chen Lin, Cheng Yu, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo: SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points. 1016-1031
- Tomohiro Nakatani, Rintaro Ikeshita, Keisuke Kinoshita, Hiroshi Sawada, Naoyuki Kamo, Shoko Araki: Switching Independent Vector Analysis and its Extension to Blind and Spatially Guided Convolutional Beamforming Algorithms. 1032-1047
- Jianhua Geng, Sifan Wang, Qinglai Liu, Xin Lou: Multi-Level Time-Frequency Bins Selection for Direction of Arrival Estimation Using a Single Acoustic Vector Sensor. 1048-1060
- Qinzhuo Wu, Qi Zhang, Xuanjing Huang: Automatic Math Word Problem Generation With Topic-Expression Co-Attention Mechanism and Reinforcement Learning. 1061-1072
- Michael Nigro, Sridhar Krishnan: Multimodal System for Audio Scene Source Counting and Analysis. 1073-1082
- Yishu Peng, Sheng Zhang, Jiashu Zhang, Wei Xing Zheng: Combined-Sample Multiband-Structured Subband Filtering Algorithms. 1083-1092
- Shoukang Hu, Xurong Xie, Mingyu Cui, Jiajun Deng, Shansong Liu, Jianwei Yu, Mengzhe Geng, Xunying Liu, Helen Meng: Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks. 1093-1107
- Xudong Dang, Wen Ma, Emanuël A. P. Habets, Hongyan Zhu: TDOA-Based Robust Sound Source Localization With Sparse Regularization in Wireless Acoustic Sensor Networks. 1108-1123
- Shan Gao, Jing Lin, Xihong Wu, Tianshu Qu: Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process. 1124-1135
- Giovanni Pepe, Leonardo Gabrielli, Stefano Squartini, Carlo Tripodi, Nicolo Strozzi: Deep Optimization of Parametric IIR Filters for Audio Equalization. 1136-1149
- Moa Lee, Junmo Lee, Joon-Hyuk Chang: Non-Autoregressive Fully Parallel Deep Convolutional Neural Speech Synthesis. 1150-1159
- Liam Barrett, Junchao Hu, Peter Howell: Systematic Review of Machine Learning Approaches for Detecting Developmental Stuttering. 1160-1172
- Sang-Hoon Lee, Hyeong-Rae Noh, Woo-Jeoung Nam, Seong-Whan Lee: Duration Controllable Voice Conversion via Phoneme-Based Information Bottleneck. 1173-1183
- Zhihong Shao, Zhongqin Wu, Minlie Huang: AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text. 1184-1196
- Dhanunjaya Varma Devalraju, Padmanabhan Rajan: Multiview Embeddings for Soundscape Classification. 1197-1206
- Chengyu Wang, Suyang Dai, Yipeng Wang, Fei Yang, Minghui Qiu, Kehan Chen, Wei Zhou, Jun Huang: ARoBERT: An ASR Robust Pre-Trained Language Model for Spoken Language Understanding. 1207-1218
- Jonah Ong, Ba-Tuong Vo, Sven Nordholm, Ba-Ngu Vo, Diluka Moratuwage, Changbeom Shim: Audio-Visual Based Online Multi-Source Separation. 1219-1234
- Leyang Cui, Yafu Li, Yue Zhang: Label Attention Network for Structured Prediction. 1235-1248
- Sarinah Sutojo, Tobias May, Steven van de Par: Segmentation of Multitalker Mixtures Based on Local Feature Contrasts and Auditory Glimpses. 1249-1262
- Hao Gao, Xuelei Feng, Yong Shen: Weighted Loudspeaker Placement Method for Sound Field Reproduction. 1263-1276
- Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen: Kronecker Product Multichannel Linear Filtering for Adaptive Weighted Prediction Error-Based Speech Dereverberation. 1277-1289
- Takehiro Sugimoto: Loudness-Level-Chasing Algorithm for Multiformat Live Audio Production. 1290-1304
- Junshuang Wu, Richong Zhang, Yongyi Mao, Jinpeng Huai: Dealing With Hierarchical Types and Label Noise in Fine-Grained Entity Typing. 1305-1318
- Anton Ragni, Mark J. F. Gales, Oliver Rose, Katherine M. Knill, Alexandros Kastanos, Qiujia Li, Preben Ness: Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition. 1319-1329
- Zhongxin Bai, Jianyu Wang, Xiao-Lei Zhang, Jingdong Chen: End-to-End Speaker Verification via Curriculum Bipartite Ranking Weighted Binary Cross-Entropy. 1330-1344
- Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao: Improved Lite Audio-Visual Speech Enhancement. 1345-1359
- Gaofeng Cheng, Haoran Miao, Runyan Yang, Keqi Deng, Yonghong Yan: ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture. 1360-1373
- Ashutosh Pandey, DeLiang Wang: Self-Attending RNN for Speech Enhancement to Improve Cross-Corpus Generalization. 1374-1385
- Di Jin, Shuyang Gao, Seokhwan Kim, Yang Liu, Dilek Hakkani-Tür: Towards Textual Out-of-Domain Detection Without In-Domain Labels. 1386-1395
- K. Mrinalini, P. Vijayalakshmi, T. Nagarajan: SBSim: A Sentence-BERT Similarity-Based Evaluation Metric for Indian Language Neural Machine Translation Systems. 1396-1406
- Changhong Wang, Emmanouil Benetos, Vincent Lostanlen, Elaine Chew: Adaptive Scattering Transforms for Playing Technique Recognition. 1407-1421
- Danwei Cai, Weiqing Wang, Ming Li: Incorporating Visual Information in Audio Based Self-Supervised Speaker Recognition. 1422-1435
- Yu Luo, Lina Pu: EC-ANC: Edge Case-Enhanced Active Noise Cancellation for True Wireless Stereo Earbuds. 1436-1447
- Tao Li, Xinsheng Wang, Qicong Xie, Zhichao Wang, Lei Xie: Cross-Speaker Emotion Disentangling and Transfer for End-to-End Speech Synthesis. 1448-1460
- Yilin Zhao, Zhuosheng Zhang, Hai Zhao: Reference Knowledgeable Network for Machine Reading Comprehension. 1461-1473
- Fu-Hao Yu, Kuan-Yu Chen, Ke-Han Lu: Non-Autoregressive ASR Modeling Using Pre-Trained Language Models for Chinese Speech Recognition. 1474-1482
- Yiming Cui, Ting Liu, Wanxiang Che, Zhigang Chen, Shijin Wang: Teaching Machines to Read, Answer and Explain. 1483-1492
- Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Paola García: Encoder-Decoder Based Attractors for End-to-End Neural Diarization. 1493-1507
- Chenda Li, Zhuo Chen, Yanmin Qian: Dual-Path Modeling With Memory Embedding Model for Continuous Speech Separation. 1508-1520
- Yu Tong, Jingzhi Guo, Jizhe Zhou: Separation Inference: A Unified Framework for Word Segmentation in East Asian Languages. 1521-1530
- Hitoshi Suda, Daisuke Saito, Satoru Fukayama, Tomoyasu Nakano, Masataka Goto: Singer Diarization for Polyphonic Music With Unison Singing. 1531-1545
- Xinnian Liang, Jing Li, Shuangzhi Wu, Mu Li, Zhoujun Li: Improving Unsupervised Extractive Summarization by Jointly Modeling Facet and Redundancy. 1546-1557
- Sung-Feng Huang, Chyi-Jiunn Lin, Da-Rong Liu, Yi-Chen Chen, Hung-yi Lee: Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech. 1558-1571
- Ziyi Xu, Maximilian Strake, Tim Fingscheidt: Deep Noise Suppression Maximizing Non-Differentiable PESQ Mediated by a Non-Intrusive PESQNet. 1572-1585
- Lin Li, Fuchuan Tong, Qingyang Hong: When Speaker Recognition Meets Noisy Labels: Optimizations for Front-Ends and Back-Ends. 1586-1599
- Vinay Kothapally, John H. L. Hansen: SkipConvGAN: Monaural Speech Dereverberation Using Generative Adversarial Networks via Complex Time-Frequency Masking. 1600-1613
- Hiromu Yakura, Kento Watanabe, Masataka Goto: Self-Supervised Contrastive Learning for Singing Voices. 1614-1623
- Chen Gong, Zhenghua Li, Min Zhang: Neural Coupled Sequence Labeling for Heterogeneous Annotation Conversion. 1624-1636
- Fangfang Su, Yue Zhang, Fei Li, Donghong Ji: Balancing Precision and Recall for Neural Biomedical Event Extraction. 1637-1649
- Zexu Pan, Ruijie Tao, Chenglin Xu, Haizhou Li: Selective Listening by Synchronizing Speech With Lips. 1650-1664
- Qianren Mao, Jianxin Li, Hao Peng, Shizhu He, Lihong Wang, Philip S. Yu, Zheng Wang: Fact-Driven Abstractive Summarization by Utilizing Multi-Granular Multi-Relational Knowledge. 1665-1678
- Neeraj Kumar, Ankur Narang, Brejesh Lall: Zero-Shot Normalization Driven Multi-Speaker Text to Speech Synthesis. 1679-1693
- Carlos Tarjano, Valdecy Pereira: An Efficient Algorithm for Segmenting Quasi-Periodic Digital Signals Into Pseudo Cycles: Application in Lossy Audio Compression. 1694-1703
- Han Wang, Hongling Sun, Jianfeng Guo, Ming Wu, Jun Yang: Analysis of the Frequency Interference in the Narrowband Active Noise Control System. 1704-1717
- Maryam Hosseini, Luca Celotti, Eric Plourde: End-to-End Brain-Driven Speech Enhancement in Multi-Talker Conditions. 1718-1733
- Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii: Generalized Fast Multichannel Nonnegative Matrix Factorization Based on Gaussian Scale Mixtures for Blind Source Separation. 1734-1748
- Thi Ngoc Tho Nguyen, Karn N. Watcharasupat, Ngoc Khanh Nguyen, Douglas L. Jones, Woon-Seng Gan: SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection. 1749-1762
- Changfeng Gao, Gaofeng Cheng, Ta Li, Pengyuan Zhang, Yonghong Yan: Self-Supervised Pre-Training for Attention-Based Encoder-Decoder ASR Model. 1763-1774
- Jiacheng Ye, Xiang Zhou, Xiaoqing Zheng, Tao Gui, Qi Zhang: Uncertainty-Aware Sequence Labeling. 1775-1788
- Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li: Decoding Knowledge Transfer for Neural Text-to-Speech Training. 1789-1802
- Weiquan Fan, Xiangmin Xu, Bolun Cai, Xiaofen Xing: ISNet: Individual Standardization Network for Speech Emotion Recognition. 1803-1814
- Jianfeng Wu, Sijie Mai, Haifeng Hu: Interpretable Multimodal Capsule Fusion. 1815-1826
- Qian Wang, Jiajun Zhang, Chengqing Zong: Synchronous Inference for Multilingual Neural Machine Translation. 1827-1839
- Jilu Jin, Jacob Benesty, Gongping Huang, Jingdong Chen: On Differential Beamforming With Nonuniform Linear Microphone Arrays. 1840-1852
- Chiranjibi Sitaula, Jinyuan He, Archana Priyadarshi, Mark B. Tracy, Omid Kavehei, Murray Hinder, Anusha Withana, Alistair Lee McEwan, Faezeh Marzbanrad: Neonatal Bowel Sound Detection Using Convolutional Neural Network and Laplace Hidden Semi-Markov Model. 1853-1864
- Chao Pan, Jingdong Chen, Jacob Benesty: Microphone Array Beamforming With High Flexible Interference Attenuation and Noise Reduction. 1865-1876
- Han Li, Kean Chen, Bernhard U. Seeber: Gestalt Principles Emerge When Learning Universal Sound Source Separation. 1877-1891
- Juan Manuel Miramont, Marcelo Alejandro Colominas, Gastón Schlotthauer: Emulating Perceptual Evaluation of Voice Using Scattering Transform Based Features. 1892-1901
- Lucas Ondel, Bolaji Yusuf, Lukás Burget, Murat Saraçlar: Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery. 1902-1917
- Jialu Li, Mark Hasegawa-Johnson: Autosegmental Neural Nets 2.0: An Extensive Study of Training Synchronous and Asynchronous Phones and Tones for Under-Resourced Tonal Languages. 1918-1926
- Yu Lu, Jiajun Zhang, Jiali Zeng, Shuangzhi Wu, Chengqing Zong: Attention Analysis and Calibration for Transformer in Natural Language Generation. 1927-1938
- Minseung Kim, Jong Won Shin: Improved Speech Enhancement Considering Speech PSD Uncertainty. 1939-1951
- Avital Kleiman, Israel Cohen, Baruch Berdugo: Constant-Beamwidth Beamforming With Nonuniform Concentric Ring Arrays. 1952-1962
- Jacopo de Berardinis, Angelo Cangelosi, Eduardo Coutinho: Measuring the Structural Complexity of Music: From Structural Segmentations to the Automatic Evaluation of Models for Music Generation. 1963-1976
- Dino Oglic, Zoran Cvetkovic, Peter Sollich, Steve Renals, Bin Yu: Towards Robust Waveform-Based Acoustic Models. 1977-1992
- Bo Chen, Chenpeng Du, Kai Yu: Neural Fusion for Voice Cloning. 1993-2001
- Saurabhchand Bhati, Jesús Villalba, Piotr Zelasko, Laureano Moro-Velázquez, Najim Dehak: Unsupervised Speech Segmentation and Variable Rate Representation Learning Using Segmental Contrastive Predictive Coding. 2002-2014
- Bo Yang, Lijun Wu, Jinhua Zhu, Bo Shao, Xiaola Lin, Tie-Yan Liu: Multimodal Sentiment Analysis With Two-Phase Multi-Task Learning. 2015-2024
- Songbin Li, Jingang Wang, Peng Liu: General Frame-Wise Steganalysis of Compressed Speech Based on Dual-Domain Representation and Intra-Frame Correlation Leaching. 2025-2035
- Yang Ai, Zhen-Hua Ling, Wei-Lu Wu, Ang Li: Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis. 2036-2048
- Alastair H. Moore, Sina Hafezi, Rebecca R. Vos, Patrick A. Naylor, Mike Brookes: A Compact Noise Covariance Matrix Model for MVDR Beamforming. 2049-2061
- Leo McCormack, Archontis Politis, Raimundo Gonzalez, Tapio Lokki, Ville Pulkki: Parametric Ambisonic Encoding of Arbitrary Microphone Arrays. 2062-2075
- Adriana Fernandez-Lopez, Federico M. Sukno: End-to-End Lip-Reading Without Large-Scale Data. 2076-2090
- Jinwon An, Sungzoon Cho, Junseong Bang, Misuk Kim: Domain-Slot Relationship Modeling Using a Pre-Trained Language Encoder for Multi-Domain Dialogue State Tracking. 2091-2102
- Roberto San Millán-Castillo, Luca Martino, Eduardo Morgado, Fernando Llorente: An Exhaustive Variable Selection Study for Linear Models of Soundscape Emotions: Rankings and Gibbs Analysis. 2460-2474
- Kunkun SongGong, Wenwu Wang, Huawei Chen: Acoustic Source Localization in the Circular Harmonic Domain Using Deep Learning Architecture. 2475-2491
- Yanjue Song, Nilesh Madhu: Improved CEM for Speech Harmonic Enhancement in Single Channel Noise Suppression. 2492-2503
- Anderson Queiroz, Rosângela Coelho: Noisy Speech Based Temporal Decomposition to Improve Fundamental Frequency Estimation. 2504-2513
- Myeongjun Jang, Thomas Lukasiewicz: NoiER: An Approach for Training More Reliable Fine-Tuned Downstream Task Models. 2514-2525
- Jiayi Wang, Rongzhou Bao, Zhuosheng Zhang, Hai Zhao: Rethinking Textual Adversarial Defense for Pre-Trained Language Models. 2526-2540
- Sungho Lee, Hyeong-Seok Choi, Kyogu Lee: Differentiable Artificial Reverberation. 2541-2556
- Chuang Fan, Jiaming Li, Xuan Luo, Ruifeng Xu: Enhancing Structure Preservation in Coreference Resolution by Constrained Graph Encoding. 2557-2567
- Richong Zhang, Qianben Chen, Yaowei Zheng, Samuel Mensah, Yongyi Mao: Aspect-Level Sentiment Analysis via a Syntax-Based Neural Network. 2568-2583
- Moti Lugasi, Anjali Menon, Vladimir Tourbabin, Boaz Rafaely: Spatial Audio Signal Enhancement by a Two-Stage Source - System Estimation With Frequency Smoothing for Improved Perception. 2584-2596
- Mengzhe Geng, Xurong Xie, Zi Ye, Tianzi Wang, Guinan Li, Shujie Hu, Xunying Liu, Helen Meng: Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition. 2597-2611
- Elior Hadad, Simon Doclo, Sven Nordholm, Sharon Gannot: A Class of Pareto Optimal Binaural Beamformers. 2612-2628
- Guochen Yu, Andong Li, Hui Wang, Yutian Wang, Yuxuan Ke, Chengshi Zheng: DBT-Net: Dual-Branch Federative Magnitude and Phase Estimation With Attention-in-Attention Transformer for Monaural Speech Enhancement. 2629-2644
- Weiqing Wang, Qingjian Lin, Danwei Cai, Ming Li: Similarity Measurement of Segment-Level Speaker Embeddings in Speaker Diarization. 2645-2658
- Sunwoo Kim, Minje Kim: Boosted Locality Sensitive Hashing: Discriminative, Efficient, and Scalable Binary Codes for Source Separation. 2659-2672
- Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura: A Machine Speech Chain Approach for Dynamically Adaptive Lombard TTS in Static and Dynamic Noise Environments. 2673-2688
- Qiupu Chen, Guimin Huang, Yabing Wang: The Weighted Cross-Modal Attention Mechanism With Sentiment Prediction Auxiliary Task for Multimodal Sentiment Analysis. 2689-2695
- Leilei Gan, Zhiyang Teng, Yue Zhang, Linchao Zhu, Fei Wu, Yi Yang: SemGloVe: Semantic Co-Occurrences for GloVe From BERT. 2696-2704
- Suliang Bu, Yunxin Zhao, Tuo Zhao, Shaojun Wang, Mei Han: Modeling Speech Structure to Improve T-F Masks for Speech Enhancement and Recognition. 2705-2715
- Hendrik Schröter, Tobias Rosenkranz, Alberto N. Escalante-B., Andreas K. Maier: Low Latency Speech Enhancement for Hearing Aids Using Deep Filtering. 2716-2728
- Zhihao Zhang, Yuan Zuo, Junjie Wu: Aspect Sentiment Triplet Extraction: A Seq2Seq Approach With Span Copy Enhanced Dual Decoder. 2729-2742
- Leonardo Gabrielli, Stefano D'Angelo, Pier Paolo La Pastina, Stefano Squartini: Antiderivative Antialiasing for Arbitrary Waveform Generation. 2743-2753
- Huadong Wang, Xin Shen, Mei Tu, Yimeng Zhuang, Zhiyuan Liu: Improved Transformer With Multi-Head Dense Collaboration. 2754-2767
- Chuang Shi, Feiyu Du, Qianyang Wu: A Digital Twin Architecture for Wireless Networked Adaptive Active Noise Control. 2768-2777
- Wenmeng Xiong, Changchun Bao, Mao-shen Jia, José Picheral: Speech Enhancement With Robust Beamforming for Spatially Overlapped and Distributed Sources. 2778-2790
- Hassan Taherian, Ke Tan, DeLiang Wang: Multi-Channel Talker-Independent Speaker Separation Through Location-Based Training. 2791-2800
- Han Zhang, Bin Liang, Min Yang, Hui Wang, Ruifeng Xu: Prompt-Based Prototypical Framework for Continual Relation Extraction. 2801-2813
- Christof Weiß, Geoffroy Peeters: Comparing Deep Models and Evaluation Strategies for Multi-Pitch Estimation in Music Recordings. 2814-2827
- Laura-Maria Dogariu, Jacob Benesty, Constantin Paleologu, Silviu Ciochina: Identification of Room Acoustic Impulse Responses via Kronecker Product Decompositions. 2828-2841
- Yanmin Qian, Xun Gong, Houjun Huang: Layer-Wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition. 2842-2853
- Liumeng Xue, Frank K. Soong, Shaofei Zhang, Lei Xie: ParaTTS: Learning Linguistic and Prosodic Cross-Sentence Information in Paragraph-Based TTS. 2854-2864
- Soojoong Hwang, Minseung Kim, Jong Won Shin: Dual Microphone Speech Enhancement Based on Statistical Modeling of Interchannel Phase Difference. 2865-2874
- Chao Pan, Jingdong Chen: A Framework of Directional-Gain Beamforming and a White-Noise-Gain-Controlled Solution. 2875-2887
- Fei He, Xiaoyi Hu, Ce Zhu, Ying Li, Yipeng Liu: Multi-Scale Spatial and Temporal Speech Associations to Swallowing for Dysphagia Screening. 2888-2899
- Boyang Xue, Shoukang Hu, Junhao Xu, Mengzhe Geng, Xunying Liu, Helen Meng: Bayesian Neural Network Language Modeling for Speech Recognition. 2900-2917
- Kang Xu, Fei Li, Dongdong Xie, Donghong Ji: Revisiting Aspect-Sentiment-Opinion Triplet Extraction: Detailed Analyses Towards a Simple and Effective Span-Based Model. 2918-2927
- Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Hiroshi Saruwatari: Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation. 2928-2943
- Juliano G. C. Ribeiro, Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari: Region-to-Region Kernel Interpolation of Acoustic Transfer Functions Constrained by Physical Properties. 2944-2954
- Ruixin Hong, Hongming Zhang, Xintong Yu, Changshui Zhang: Learning Event Extraction From a Few Guideline Examples. 2955-2967
- Zhengjun Yue, Erfan Loweimi, Heidi Christensen, Jon Barker, Zoran Cvetkovic: Acoustic Modelling From Raw Source and Filter Components for Dysarthric Speech Recognition. 2968-2980
- Gaku Kotani, Daisuke Saito, Nobuaki Minematsu: Voice Conversion Based on Deep Neural Networks for Time-Variant Linear Transformations. 2981-2992
- Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin: Unsupervised Speech Enhancement Using Dynamical Variational Autoencoders. 2993-3007
- Yi Luo: A Time-Domain Real-Valued Generalized Wiener Filter for Multi-Channel Neural Separation Systems. 3008-3019
- Kaile Shi, Xiaoyan Cai, Libin Yang, Jintao Zhao, Shirui Pan: StarSum: A Star Architecture Based Model for Extractive Summarization. 3020-3031
- Zexu Pan, Meng Ge, Haizhou Li: USEV: Universal Speaker Extraction With Visual Cue. 3032-3045
- Yutao Xie, Qiyu Wu, Wei Chen, Tengjiao Wang: Stable Contrastive Learning for Self-Supervised Sentence Embeddings With Pseudo-Siamese Mutual Learning. 3046-3059
- Lei Luo, Wenzhao Zhu: An Optimized Zero-Attracting LMS Algorithm for the Identification of Sparse System. 3060-3073
- Gongping Huang, Jacob Benesty, Jingdong Chen: Fundamental Approaches to Robust Differential Beamforming With High Directivity Factors. 3074-3088
- Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil: Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems. 3089-3097
- Jiaxin Zhong, Tao Zhuang, Ray Kirby, Mahmoud Karimi, Xiaojun Qiu, Haishan Zou, Jing Lu: Low Frequency Audio Sound Field Generated by a Focusing Parametric Array Loudspeaker. 3098-3109
- Jens Ahrens, Hannes Helmholz, David Lou Alon, Sebastià V. Amengual Garí: Spherical Harmonic Decomposition of a Sound Field Using Microphones on a Circumferential Contour Around a Non-Spherical Baffle. 3110-3119
- Yonggang Hu, Prasanga N. Samarasinghe, Sharon Gannot, Thushara D. Abhayapala: Decoupled Multiple Speaker Direction-of-Arrival Estimator Under Reverberant Environments. 3120-3133
- Heming Wang, Xueliang Zhang, DeLiang Wang: Fusing Bone-Conduction and Air-Conduction Sensors for Complex-Domain Speech Enhancement. 3134-3143
- Joon-Young Yang, Joon-Hyuk Chang: Task-Specific Optimization of Virtual Channel Linear Prediction-Based Speech Dereverberation Front-End for Far-Field Speaker Verification. 3144-3159
- Jian Liu, Yufeng Chen, Jinan Xu: MRCAug: Data Augmentation via Machine Reading Comprehension for Document-Level Event Argument Extraction. 3160-3172
- Wangyou Zhang
, Xuankai Chang
, Christoph Böddeker, Tomohiro Nakatani
, Shinji Watanabe
, Yanmin Qian
:
End-to-End Dereverberation, Beamforming, and Speech Recognition in a Cocktail Party. 3173-3188 - Michele Ducceschi
, Stefan Bilbao
:
Non-Iterative Simulation Methods for Virtual Analog Modelling. 3189-3198 - Miguel Ferrer
, Maria de Diego
, Amin Hassani
, Marc Moonen
, Gema Piñero
, Alberto González
:
Multi-Tone Active Noise Equalizer With Spatially Distributed User-Selected Profiles. 3199-3213 - Gasper Begus
, Alan Zhou
:
Interpreting Intermediate Convolutional Layers of Generative CNNs Trained on Waveforms. 3214-3229 - Sixing Wu
, Ying Li
, Dawei Zhang, Zhonghai Wu:
Generating Rational Commonsense Knowledge-Aware Dialogue Responses With Channel-Aware Knowledge Fusing Network. 3230-3239 - Donghui Zhu
, Ning Chen
:
Multi-Source Domain Adaptation and Fusion for Speaker Verification. 2103-2116 - Daniel Yang
, Thaxter Shaw
, Timothy Tsai
:
A Study of Parallelizable Alternatives to Dynamic Time Warping for Aligning Long Sequences. 2117-2127 - Yi Yu
, Hongsen He
, Rodrigo C. de Lamare
, Badong Chen
:
General Robust Subband Adaptive Filtering: Algorithms and Applications. 2128-2140 - Mahdie Karbasi
, Steffen Zeiler, Dorothea Kolossa
:
Microscopic and Blind Prediction of Speech Intelligibility: Theory and Practice. 2141-2155 - Andong Li
, Chengshi Zheng
, Guochen Yu
, Juanjuan Cai
, Xiaodong Li
:
Filtering and Refining: A Collaborative-Style Framework for Single-Channel Speech Enhancement. 2156-2172 - Silin Gao
, Ryuichi Takanobu
, Antoine Bosselut
, Minlie Huang
:
End-to-End Task-Oriented Dialog Modeling With Semi-Structured Knowledge Management. 2173-2187 - Yingying Zhu, Haiquan Zhao
, Xiaoqiong He, Zeliang Shu, Badong Chen
:
Cascaded Random Fourier Filter for Robust Nonlinear Active Noise Control. 2188-2200 - Siyuan Wang
, Zhongkun Liu, Wanjun Zhong, Ming Zhou
, Zhongyu Wei
, Zhumin Chen, Nan Duan
:
From LSAT: The Progress and Challenges of Complex Reasoning. 2201-2216 - Cheng Lu
, Yuan Zong
, Wenming Zheng
, Yang Li
, Chuangao Tang
, Björn W. Schuller
:
Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition. 2217-2230 - Bo Zhang
, Jian Wang
, Hongfei Lin
, Hui Ma
, Bo Xu
:
Exploiting Pairwise Mutual Information for Knowledge-Grounded Dialogue. 2231-2240 - Tao Wang
, Jiangyan Yi
, Ruibo Fu
, Jianhua Tao
, Zhengqi Wen:
CampNet: Context-Aware Mask Prediction for End-to-End Text-Based Speech Editing. 2241-2254 - Ying-Ren Chien
, Chih-Hsiang Yu, Hen-Wai Tsao
:
Affine-Projection-Like Maximum Correntropy Criteria Algorithm for Robust Active Noise Control. 2255-2266 - Rui Wang
, Zhihua Wei
, Haoran Duan
, Shouling Ji
, Yang Long, Zhen Hong
:
EfficientTDNN: Efficient Architecture Search for Speaker Recognition. 2267-2279 - Xiaoxue Gao
, Chitralekha Gupta
, Haizhou Li
:
Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning. 2280-2294 - Jirí Málek
, Jakub Janský, Zbynek Koldovský
, Tomás Kounovský, Jaroslav Cmejla, Jindrich Zdánský:
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification. 2295-2309 - Jung-Woo Choi
, Franz Zotter
, Byeongho Jo
, Jae-Hyoun Yoo
:
Multiarray Eigenbeam-ESPRIT for 3D Sound Source Localization With Multiple Spherical Microphone Arrays. 2310-2325 - Hao Zhang
, DeLiang Wang
:
Neural Cascade Architecture for Multi-Channel Acoustic Echo Suppression. 2326-2336 - Alessandro Opinto
, Marco Martalò
, Alessandro Costalunga, Nicolo Strozzi
, Carlo Tripodi
, Riccardo Raheli
:
Experimental Analysis and Design Guidelines for Microphone Virtualization in Automotive Scenarios. 2337-2346 - Xing Tian
, Jie Huang, Xuelei Feng
, Yong Shen
:
An Intermittent FxLMS Algorithm for Active Noise Control Systems With Saturation Nonlinearity. 2347-2356 - Haichao Zhu
, Li Dong, Furu Wei, Bing Qin
, Ting Liu:
Transforming Wikipedia Into Augmented Data for Query-Focused Summarization. 2357-2367 - Kouhei Sekiguchi
, Yoshiaki Bando
, Aditya Arie Nugraha
, Mathieu Fontaine
, Kazuyoshi Yoshii
, Tatsuya Kawahara
:
Autoregressive Moving Average Jointly-Diagonalizable Spatial Covariance Analysis for Joint Source Separation and Dereverberation. 2368-2382 - Brij Mohan Lal Srivastava
, Mohamed Maouche, Md. Sahidullah
, Emmanuel Vincent
, Aurélien Bellet, Marc Tommasi, Natalia A. Tomashenko
, Xin Wang
, Junichi Yamagishi
:
Privacy and Utility of X-Vector Based Speaker Anonymization. 2383-2395 - Luciana Ferrer
, Diego Castán, Mitchell McLaren, Aaron Lawson:
A Discriminative Hierarchical PLDA-Based Model for Spoken Language Recognition. 2396-2410 - Xiaohuai Le
, Tong Lei, Kai Chen, Jing Lu
:
Inference Skipping for More Efficient Real-Time Speech Enhancement With Parallel RNNs. 2411-2421 - Chitralekha Gupta
, Haizhou Li
, Masataka Goto
:
Deep Learning Approaches in Topics of Singing Information Processing. 2422-2451 - Atharva Anand Joshi
, Harshavardhan Settibhaktini
, Ananthakrishna Chintanpalli
:
Modeling Concurrent Vowel Scores Using the Time Delay Neural Network and Multitask Learning. 2452-2459