Shanbo Cheng

ByteDance Seed
Verified email at bytedance.com
Cited by 682

Eliciting the translation ability of large language models via multilingual finetuning with translation instructions

J Li, H Zhou, S Huang, S Cheng… - Transactions of the …, 2024 - direct.mit.edu
Large-scale pretrained language models (LLMs), such as ChatGPT and GPT4, have shown
strong abilities in multilingual translation, without being explicitly trained on parallel corpora. …

Acquiring knowledge from pre-trained model to neural machine translation

R Weng, H Yu, S Huang, S Cheng, W Luo - Proceedings of the AAAI …, 2020 - ojs.aaai.org
Pre-training and fine-tuning have achieved great success in the natural language processing field.
The standard paradigm for exploiting them includes two steps: first, pre-training a model, e.g. …

[PDF][PDF] Sogou neural machine translation systems for WMT17

Y Wang, S Cheng, L Jiang, J Yang… - Proceedings of the …, 2017 - aclanthology.org
We describe the Sogou neural machine translation systems for the WMT 2017 Chinese↔English
news translation tasks. Our systems are based on a multilayer encoder-decoder …

Speech translation with large language models: An industrial practice

Z Huang, R Ye, T Ko, Q Dong, S Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Given the great success of large language models (LLMs) across various tasks, in this paper,
we introduce LLM-ST, a novel and effective speech translation model constructed upon a …

G-DIG: Towards gradient-based diverse and high-quality instruction data selection for machine translation

…, L Huang, L Kang, Z Liu, Y Lu, S Cheng - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable abilities in general scenarios.
Instruction finetuning empowers them to align with humans in various tasks. Nevertheless…

Retaining key information under high compression ratios: Query-guided compressor for LLMs

…, Q Cao, Y Lu, N Peng, L Huang, S Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org
The growing popularity of Large Language Models (LLMs) has sparked interest in context compression
for LLMs. However, the performance of previous methods …

Towards achieving human parity on end-to-end simultaneous speech translation via LLM agent

S Cheng, Z Huang, T Ko, H Li, N Peng, L Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present Cross Language Agent -- Simultaneous Interpretation, CLASI, a
high-quality and human-like Simultaneous Speech Translation (SiST) System. Inspired by …

Language tags matter for zero-shot neural machine translation

L Wu, S Cheng, M Wang, L Li - arXiv preprint arXiv:2106.07930, 2021 - arxiv.org
Multilingual Neural Machine Translation (MNMT) has attracted widespread interest due to its
efficiency. An appealing advantage of MNMT models is that they can also translate between …

Learning kernel-smoothed machine translation with retrieved examples

Q Jiang, M Wang, J Cao, S Cheng, S Huang… - arXiv preprint arXiv …, 2021 - arxiv.org
How can neural machine translation (NMT) models be effectively adapted to emerging
cases without retraining? Despite the great success of neural machine translation, updating …

Language-aware interlingua for multilingual neural machine translation

C Zhu, H Yu, S Cheng, W Luo - … of the 58th Annual Meeting of the …, 2020 - aclanthology.org
Multilingual neural machine translation (NMT) has led to impressive accuracy improvements
in low-resource scenarios by sharing common linguistic information across languages. …