1 Introduction

With the acceleration of economic globalization, cross-cultural communication and the exchange of multilingual information are increasingly valued [1]. Current machine translation methods still face limitations in precision and fluency when handling multilingual translation. These limitations not only hinder the precise transmission of cross-linguistic information, but also reduce the efficiency and quality of multilingual communication. How to improve the adaptability and coherence of multilingual translation while ensuring quality is therefore an urgent issue. The Text-To-Text Transfer Transformer (T5), a powerful text generation model, has shown good performance across various text generation tasks [2]. Model-Agnostic Meta-Learning (MAML) is a model-independent meta-learning method that can improve the adaptability of a model to new data [3]. Integrating the two to improve the quality of multilingual English translation has important practical value for advancing multilingual translation technology and facilitating cross-cultural communication and understanding.

To improve translation quality and promote cultural exchange and communication, this article combines T5 and MAML and studies their application in multilingual English translation. Through large-scale pre-training, the T5 model acquires powerful language representation capabilities and captures the structural and semantic features of multiple languages. Combined with the MAML mechanism, the model gains the ability to adapt quickly to new language pairs. During pre-training, cross-language alignment techniques are used to further enhance the understanding of different language structures, ensuring the accuracy and naturalness of the translation results. The experimental analysis covers three aspects: baseline comparison, comparison with advanced methods, and ablation experiments. In the baseline comparison, compared with the Open Neural Machine Translation (OpenNMT), Transformer, and Open Parallel Corpus-Machine Translation (Opus-MT) baseline models, the translation model based on T5-MAML has a mean Bilingual Evaluation Understudy (BLEU) score that is 6.05%, 2.59%, and 2.05% higher, respectively, and a mean Translation Error Rate (TER) that is 11.13%, 6.03%, and 6.09% lower across language pairs, respectively. Its METEOR results are also better. In the comparison with advanced methods, our model outperforms the mBART and XLM-R models in the BLEU, TER, and METEOR evaluations. In the ablation experiments, compared with the T5-Large model without meta-learning, the mean BLEU score of the model in this article is 3.43% higher, the mean TER is 8.52% lower, and the mean METEOR score is 0.08 higher. In practical applications, using T5 and MAML for multilingual English translation can help improve translation quality and promote cultural understanding and communication.

2 Related work

With the rapid development and application of machine translation, multilingual translation models have also achieved certain results [4, 5]. Fan Angela created a multilingual translation model through large-scale mining, utilizing a combination of dense scaling and specific language sparse parameters to effectively increase model capacity. In non-English translation practice, the proposed model could achieve benefits of over 10 BLEU [6]. To solve the machine translation problem in multilingual environments, Singh Salam Michael trained a multilingual translation system based on long short-term memory (LSTM), which integrated cross-language functionality. The experimental results showed that for Manipuri to translation tasks, the proposed multilingual model improved both multilingual and bilingual baselines more than vanilla, verifying the good translation quality of the model [7]. Lalrempuii Candy proposed a translation model based on statistics and modern neural networks. The translation performance predicted by the training model was evaluated using automatic and manual evaluation methods through different tagging methods, architectures, and configurations, and the model performance was compared with existing baselines. The results indicated that the proposed model had advantages in prediction error and prediction quality [8]. Escolano Carlos proposed a new architecture based on introducing interlingual loss as an additional training objective. By adding and enforcing this interlingual loss, multiple encoders and decoders were trained for each language, sharing a common intermediate representation between them. The results showed that the BLEU of the proposed architecture improved by 2.8 points [9]. Although existing models have certain support capabilities for improving the quality of multilingual translation, promoting multilingual communication and collaboration, most models perform poorly in low resource scenarios.

The development of T5 and MAML provides more possibilities for improving the generalization performance of models in multiple languages [10, 11]. To avoid the cost and time intensive data collection and annotation in low resource scenarios, Fuad Ahlam explored the effectiveness of cross language transfer learning in building an end-to-end Arabic task-oriented dialogue system using a multilingual T5 model. The results suggested that compared with Chinese literature using the same settings, the T5 model outperformed traditional cross-language pre-trained methods [12]. Regarding the data preprocessing method for Thai, Phakmongkol Puri generated more questions and answers using multilingual T5. The results showed that compared with other modern transformer models, the proposed enhanced model exhibited more ideal translation performance on the dataset [13]. To achieve accurate classification of low resource languages, Awal Md Rabiul proposed the MAML framework, which utilized self-supervised strategies to overcome the limitations of scarce data in low-resource scenarios and generate better fine-tuning language model initialization to quickly adapt to cross language transfer. The experimental results of the dataset showed that in cross-domain multilingual transfer settings, the performance of the MAML framework was more than 3% higher than the advanced baseline [14]. In the Event Detection (ED) task of low resource languages, Roy Aniruddha proposed a MAML model to address the issue of data scarcity by training instances for cross-language ED. The results demonstrated that this method could find good parameter initialization and quickly adapt to new low-resource languages [15]. By utilizing the powerful pre-training capability of T5 and the fast adaptability of MAML, the model can effectively adapt to new tasks and improve the overall performance of the translation system. 
However, current research mostly focuses on using T5 or MAML alone for rapid adaptation, lacking research on combining the two to further improve model generalization performance.

3 Multilingual English translation based on T5 and MAML

3.1 Model architecture design

3.1.1 T5 model

The T5 model is a unified-framework language model open-sourced by Google, whose core idea is to transform various natural language processing (NLP) problems into “text-to-text” tasks [16, 17]. As a universal language model, T5 can be widely applied to NLP tasks such as language translation [18]. Both the input and the output of the model are text strings, and task-specific prefixes are added to guide the model in performing different NLP tasks. This article uses autoregressive learning to fine-tune the pre-trained parameters of the T5 model and constructs a generative multilingual English translation model, as shown in Fig. 1.

Fig. 1 Generative multilingual English translation model

In Fig. 1, the model is trained using a combination of text input and text output. In the pre-training stage, the autoregressive learning method is used to predict each text segment from left to right based on contextual information, gradually forming a complete text sequence.

The T5 model mainly consists of an encoder and a decoder [19]. The encoder is a multi-level Transformer encoder used to encode and express input text. It contains information about the location of the text. By converting a series of input texts into vector representations with contextual associations and segmenting the text, the model can better understand the relationships between words. The decoder is also based on the Transformer architecture, which can convert the output of the encoder into the target text based on the context vector provided by the encoder.

During the training process, the T5 model adopts a text-to-text architecture, treating multilingual translation tasks as text-to-text, that is, obtaining corresponding translation outputs in text form. The input and output can use the same objective function in the pre-training and fine-tuning, and the same decoding process can also be used in the testing phase.

When fine-tuning the parameters, the source-language text string is used as the input text sequence and the target-language text string as the output text sequence. The complete output sequence is generated autoregressively: one token is generated at a time and used as input for the next time step. The steps are shown in Table 1:

Table 1 Model parameter fine-tuning

In the autoregressive learning, parameter learning is achieved by maximizing the conditional probability of the target sequence. This step is represented by the formula [20, 21]:

$$P\left(y|x\right)=\prod_{t=1}^{m}P\left({y}_{t}|{y}_{<t},x\right)$$
(1)

Among them, the definitions of the variables in Formula (1) are shown in Table 2:

Table 2 Definitions of variables in Formula (1)

In Table 2, \({y}_{<t}=\left({y}_{1},{y}_{2},\cdots ,{y}_{t-1}\right)\).

To optimize the model parameters, cross entropy is used as the loss function to measure the difference between the predicted and actual values of the model. The cross entropy loss function is defined as:

$$\mathcal{L}=-\sum_{t=1}^{m}\log P\left({y}_{t}|{y}_{<t},x\right)$$
(2)

By minimizing \(\mathcal{L}\) and optimizing the parameters of the model, the generation precision of the target language sequence in the translation process can be improved.

During the training phase, the gradient descent is used to update the parameters of the model. Assuming the parameters of the model are \(\theta\), the formula for parameter update is:

$$\theta \leftarrow \theta -\eta {\nabla }_{\theta }\mathcal{L}$$
(3)

Among them, \(\eta\) is the learning rate, and \({\nabla }_{\theta }\mathcal{L}\) is the gradient of the loss function with respect to parameter \(\theta\).
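As a small numerical illustration of Formulas (1) to (3), the following sketch treats the per-step logits of a toy model as the parameters \(\theta\). The vocabulary size, target token ids, and learning rate are hypothetical values chosen for illustration and are not taken from the article; a real T5 model would update its network weights instead.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sequence_loss(logits, targets):
    """Formula (2): negative sum of log P(y_t | y_<t, x) over the m steps."""
    probs = softmax(logits)
    step_probs = probs[np.arange(len(targets)), targets]
    return -np.log(step_probs).sum(), step_probs

def gradient_step(logits, targets, lr):
    """Formula (3), with the logits playing the role of theta (toy assumption)."""
    grad = softmax(logits)
    grad[np.arange(len(targets)), targets] -= 1.0  # d(loss)/d(logits) = softmax - one_hot
    return logits - lr * grad

rng = np.random.default_rng(0)
targets = np.array([2, 0, 3])        # hypothetical target token ids, m = 3
logits = rng.normal(size=(3, 5))     # hypothetical vocabulary of 5 tokens

loss_before, step_probs = sequence_loss(logits, targets)
seq_prob = step_probs.prod()         # Formula (1): product of per-step probabilities
loss_after, _ = sequence_loss(gradient_step(logits, targets, lr=0.5), targets)
```

In this sketch the loss equals the negative logarithm of the sequence probability from Formula (1), and one gradient step lowers it.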

3.1.2 MAML

Languages differ significantly in grammar, vocabulary, syntax, and other aspects, which to some extent makes multilingual NLP with T5 highly complex [22]. Low-resource languages in multilingual environments often lack restructured language sample data, and building a universal multilingual model requires massive amounts of data and computing resources. When new language pairs are encountered, the model often has to be retrained or fine-tuned, which increases computational costs and restricts its adaptability in real scenarios. In response, this article incorporates the MAML framework and trains the model on multiple tasks so that it can adapt rapidly to new task data, as shown in Fig. 2.

Fig. 2 MAML framework

In Fig. 2, the MAML framework considers each language pair as a separate task. By training and learning through MAML, the model can quickly adjust parameters in low-resource environments, learn new language pairs quickly, and achieve better translation results on new language pairs.

Assume that the \(r\)-th \(\left(1\le r\le R\right)\) batch of tasks is sampled and that this batch contains \(Q\) tasks. Given the model parameter \(\theta\), each task in the current batch is traversed to calculate the loss gradient \({\nabla }_{\theta }{\mathcal{L}}_{{{T}_{i}}_{support}}f(\theta )\) on the support set of the current task. The intermediate variable \({\theta }_{i}^{\prime}\) is then updated based on this loss gradient [23, 24]:

$${\theta }_{i}^{\prime}=\theta -\eta {\nabla }_{\theta }{\mathcal{L}}_{{{T}_{i}}_{support}}f(\theta )$$
(4)

The loss \({\mathcal{L}}_{{{T}_{i}}_{query}}f({\theta }_{i}^{\prime})\) on the query set of the current task is calculated with \({\theta }_{i}^{\prime}\), and the cumulative loss \(\sum {\mathcal{L}}_{{{T}_{i}}_{query}}f({\theta }_{i}^{\prime})\) is recorded.

The gradient \({\nabla }_{\theta }\sum {\mathcal{L}}_{{{T}_{i}}_{query}}f({\theta }_{i}^{\prime})\) of the cumulative loss with respect to the model parameter \(\theta\) is then computed, and the final model parameter \(\theta\) is updated [25, 26]:

$$\theta =\theta -\eta {\nabla }_{\theta }\sum {\mathcal{L}}_{{{T}_{i}}_{query}}f({\theta }_{i}^{\prime})$$
(5)
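The two-level update in Formulas (4) and (5) can be sketched on a toy problem. The example below is not the T5-MAML training code: it assumes each task is a small linear regression with a mean squared error loss, so that the gradient through the inner step can be written in closed form. The task data, the learning rate \(\eta = 0.05\), and all function names are hypothetical.

```python
import numpy as np

def loss(theta, X, y):
    r = X @ theta - y
    return 0.5 * np.mean(r ** 2)

def grad(theta, X, y):
    return X.T @ (X @ theta - y) / len(y)

def maml_step(theta, tasks, eta):
    """One meta-update over a batch of tasks, following Formulas (4) and (5)."""
    meta_grad = np.zeros_like(theta)
    for Xs, ys, Xq, yq in tasks:
        theta_i = theta - eta * grad(theta, Xs, ys)        # Formula (4), support set
        # Chain rule through the inner step: d(theta_i)/d(theta) = I - eta * H_support
        H = Xs.T @ Xs / len(ys)
        meta_grad += (np.eye(len(theta)) - eta * H) @ grad(theta_i, Xq, yq)
    return theta - eta * meta_grad                         # Formula (5), query sets

def meta_loss(theta, tasks, eta):
    # Cumulative query loss evaluated at the adapted parameters theta_i.
    return sum(loss(theta - eta * grad(theta, Xs, ys), Xq, yq)
               for Xs, ys, Xq, yq in tasks)

rng = np.random.default_rng(0)

def make_task(w):
    # Hypothetical task: 10 support and 10 query points of a noiseless linear map.
    Xs, Xq = rng.normal(size=(10, 2)), rng.normal(size=(10, 2))
    return Xs, Xs @ w, Xq, Xq @ w

tasks = [make_task(np.array(w)) for w in ([1.0, 2.0], [1.5, 1.5], [0.5, 2.5])]
theta = np.zeros(2)
before = meta_loss(theta, tasks, eta=0.05)
for _ in range(100):
    theta = maml_step(theta, tasks, eta=0.05)
after = meta_loss(theta, tasks, eta=0.05)
```

For a neural model such as T5 the Jacobian term is normally obtained by automatic differentiation, or dropped in first-order variants; the closed form here is available only because the toy loss is quadratic.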

3.2 Data collection and preprocessing

Starting from the needs of practical applications and the availability of data, this article takes English as the main target language and Chinese, Japanese, German, Spanish, and French as source languages. In the construction of multilingual parallel corpora, English monolingual data is first acquired from authoritative websites such as Wikipedia and Europarl to obtain sufficient English monolingual text data. Then, by cleaning and preprocessing the obtained English text data, English monolingual corpus is obtained. On this basis, using Google Translate technology, the English monolingual corpus is translated into five languages: Chinese, Japanese, German, Spanish, and French, forming a parallel corpus of multiple languages. The overall framework is shown in Fig. 3:

Fig. 3 Process of constructing multilingual parallel corpora

The data collection is carried out through web crawlers, as shown in Fig. 4. Before performing the collection work, the crawler object is first determined. On Wikipedia, web pages that contain English texts of historical events and scientific articles are selected to ensure that the pages have corresponding versions in multiple languages. On the website of Europarl, relevant meeting minutes and report text data are crawled in a centralized manner. The open-source scraping tool Scrapy is used to obtain real data. Firstly, the Scrapy project is configured, and its target pages and rules are defined. The selector function of Scrapy is used to accurately find the corresponding Hyper Text Markup Language (HTML) element. On Wikipedia, the language linking section is found, and the language text of the same content is retrieved through linking relationships. On Europarl, the Uniform Resource Locator (URL) structure is used to find the meeting minutes page. Normal browser behaviors are simulated by configuring request headers and proxy servers. The collection time and delay are set to reduce the load on the site server.

Fig. 4 Steps of web crawler

After the data collection work is completed, the collected text is preliminarily cleaned and processed. Regular expressions are utilized to extract valuable text content from web pages, removing irrelevant content such as advertisements and navigation bars. Some examples are shown in Table 3:

Table 3 Examples of page content extraction
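The regular-expression cleaning step described above could look roughly like the following sketch. The tag names and class keywords in the patterns are assumptions made for illustration, since the article does not list its actual expressions, and the non-greedy patterns do not handle nested blocks.

```python
import re
from html import unescape

# Hypothetical patterns: which tags and class names mark navigation or
# advertising blocks is an assumption, not taken from the article.
BLOCK_PATTERNS = [
    re.compile(r"<(script|style|nav|header|footer)\b.*?</\1>", re.S | re.I),
    re.compile(r'<div[^>]*class="[^"]*\b(?:ad|ads|navbar)\b[^"]*"[^>]*>.*?</div>',
               re.S | re.I),
]
TAG = re.compile(r"<[^>]+>")

def extract_text(html):
    """Drop navigation and advertising blocks, then strip the remaining tags."""
    for pattern in BLOCK_PATTERNS:
        html = pattern.sub(" ", html)
    text = unescape(TAG.sub(" ", html))
    return re.sub(r"\s+", " ", text).strip()

sample = ('<nav><a href="/">Home</a></nav>'
          '<div class="ad banner">Buy now</div>'
          '<p>The Earth revolves around the Sun.</p>')
cleaned = extract_text(sample)
```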

Based on the page content extracted as in Table 3, a translation model is used to match the texts in different languages to the English monolingual data, ensuring that the corpora are parallel to one another. During acquisition, language detection and consistency checks are performed on the collected text. To avoid bias or errors in the training corpus generated by Google Translate, a bidirectional translation method is adopted to verify and improve translation quality. First, a portion of the original English text is selected as the initial input; the sentence structures covered mainly include simple declarative sentences, complex subordinate clauses, and sentences containing professional terminology. Google Translate is then used to translate the input English text into Chinese, Japanese, German, Spanish, and French, and the translated texts are translated back into English. The original English text and the back-translated English text are compared in terms of word meaning, word order, and grammatical structure, covering differences in word order, deviations in word meaning, changes in grammatical structure, and misunderstandings of context. A terminology list is established for fixed phrases and proper nouns, and word order and grammatical structure are adjusted automatically through integrated conversion rules based on the detected differences. After all necessary corrections are completed, the corpus is updated, and the validated and corrected parallel corpora are added to the training set. The open-source Natural Language Toolkit (NLTK) is used to complete text processing, standardize the format and content of the text, and remove unnecessary punctuation and special characters. Finally, the preprocessed data is saved in standard JavaScript Object Notation (JSON) format.
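One simple way to flag sentence pairs for the bidirectional check is to score how far the back-translated English drifts from the original. The sketch below uses a word-level similarity from Python's standard `difflib` module; the similarity measure and the `threshold` value are assumptions for illustration, and the back-translated text is passed in directly rather than obtained from a translation service.

```python
from difflib import SequenceMatcher

def round_trip_similarity(original, back_translated):
    """Word-level similarity in [0, 1] between a sentence and its back-translation."""
    a = original.lower().split()
    b = back_translated.lower().split()
    return SequenceMatcher(None, a, b).ratio()

def needs_review(original, back_translated, threshold=0.8):
    # threshold is a hypothetical cut-off, not a value reported in the article.
    return round_trip_similarity(original, back_translated) < threshold

original = "The Earth revolves around the Sun."
same = round_trip_similarity(original, original)
swapped = round_trip_similarity(original, "The Sun revolves around the Earth.")
```

A pair whose word order changes, as in the second call, receives a lower score and would be routed to the correction step.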

To expand the training dataset and improve its applicability in different translation scenarios, this paper adopts a cross-language alignment method, which aligns data from different language pairs to enhance the performance of the model in multilingual conversion. First, features are extracted from the text of each language. Word2Vec word vectors are used to represent each word in the text as a vector in order to capture the semantic relationships between words. For each sentence, the average of all word vectors is calculated as the vector representation of the sentence:

$$v_{{sentence}} = \frac{1}{n}\sum\nolimits_{{i = 1}}^{n} {v_{i} }$$
(6)

Among them, \({v}_{i}\) is the vector representation of the \(i\)-th word, and \(n\) is the number of words in the sentence.
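Formula (6) amounts to a mean over word vectors. A minimal sketch follows, assuming the word vectors are available as a dictionary; the toy two-dimensional embeddings are hypothetical, and skipping words without a vector is a design choice of this sketch rather than something the article specifies.

```python
import numpy as np

def sentence_vector(tokens, embeddings):
    """Formula (6): average of the word vectors of the tokens in a sentence."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token has an embedding")
    return np.mean(vectors, axis=0)

# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
embeddings = {
    "earth": np.array([1.0, 0.0]),
    "revolves": np.array([1.0, 1.0]),
    "sun": np.array([0.0, 1.0]),
}
v = sentence_vector(["earth", "revolves", "sun"], embeddings)
```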

Dependency parsing is used to obtain the syntax tree of a sentence, which is then converted into a vector representation. For each word in the sentence, its subject-verb relationship is identified and its dependency relationships are recorded in the adjacency matrix \(A\). \({A}_{ij}=1\) indicates that word \({w}_{i}\) depends on word \({w}_{j}\); otherwise \({A}_{ij}=0\). Namely:

$${A}_{ij}=\left\{\begin{array}{ll}1,& \text{if }{w}_{i}\text{ depends on }{w}_{j}\\ 0,& \text{otherwise}\end{array}\right.$$
(7)

Among them, \({w}_{i}\) and \({w}_{j}\) are words in the sentence. Tree-Structured Neural Networks (TreeNN) are used to transform dependency graphs into vector representations. TreeNN recursively passes information along the dependency graph and ultimately aggregates it into the vector representation of the entire sentence:

$${h}_{i}=ReLU(W[{v}_{i}; \sum_{j\in children(i)}\,{h}_{j}]+b)$$
(8)

Among them, \(children(i)\) represents all child nodes of word \(i\), \(W\) and \(b\) are the learnable weight matrix and bias term, and \({h}_{i}\) is the hidden state vector of word \(i\).

Finally, the vector representation of the sentence is obtained by aggregating the hidden state vectors of all words:

$$\text{s}=\frac{1}{\left|\text{V}\right|}\sum_{\text{i}\in \text{V}}{h}_{i}$$
(9)
$$s=\underset{i\in V}{\mathit{max}}{h}_{i}$$
(10)

Among them, \(\left|\text{V}\right|\) is the number of words in the sentence, and \(s\) is the vector representation of the sentence.
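The recursion in Formula (8) and the two pooling variants in Formulas (9) and (10) can be traced on a three-word example. This is a sketch under stated assumptions: the dependency graph is a tree, the weights `W` and `b` are fixed toy values rather than learned parameters, and the function name is hypothetical.

```python
import numpy as np

def tree_hidden_states(A, V, W, b):
    """A[i, j] = 1 if word i depends on word j (Formula (7)); rows of V are word vectors."""
    n = V.shape[0]
    h = [None] * n

    def visit(i):
        if h[i] is None:
            children = np.flatnonzero(A[:, i])   # words that depend on word i
            child_sum = sum((visit(j) for j in children), np.zeros(W.shape[0]))
            # Formula (8): ReLU(W [v_i ; sum of child hidden states] + b)
            h[i] = np.maximum(0.0, W @ np.concatenate([V[i], child_sum]) + b)
        return h[i]

    for i in range(n):
        visit(i)
    return np.stack(h)

# Toy sentence with three words: word 1 is the root, words 0 and 2 depend on it.
A = np.zeros((3, 3), dtype=int)
A[0, 1] = 1
A[2, 1] = 1
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = 0.5 * np.ones((2, 4))   # hidden size 2, input size 2 + 2 (toy values)
b = np.zeros(2)

H = tree_hidden_states(A, V, W, b)
s_mean = H.mean(axis=0)     # Formula (9): mean pooling
s_max = H.max(axis=0)       # Formula (10): element-wise max pooling
```

Here the leaves get hidden states [0.5, 0.5] and [1, 1], and the root combines its own vector with their sum to give [2, 2].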

The Latent Dirichlet Allocation (LDA) topic model is used to extract the topic distribution of the text as high-level sentence features:

$$\uptheta =\text{LDA}(\text{D})$$
(11)

In cross-linguistic feature mapping, feature vectors from different languages are mapped to the same feature space for effective alignment. For each language pair, existing word vectors are used to represent the vocabulary. A linear transformation matrix is then used to align the word vector spaces of the two languages. The goal is to minimize the distance between the two sets of word vectors:

$${W}^{*}=arg\underset{W}{\text{min}}{\Vert WX-Y\Vert }_{F}^{2}$$
(12)

Among them, \(X\) and \(Y\) represent the word vector matrices of the source language and the target language, respectively, and \(W\) is the alignment matrix. Siamese networks are used to learn sentence similarity, that is, the mapping relationships between sentences in different languages, which is then used for the alignment task:

$$Similarity=f({v}_{i},{v}_{j})$$
(13)

By analyzing the structure and semantic features of texts in different languages, aligning sentences in different languages, a diverse and balanced multilingual parallel corpus can be constructed.
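The minimization in Formula (12) is an ordinary least-squares problem, so it can be illustrated with NumPy. The sketch below assumes that the columns of \(X\) and \(Y\) are paired word vectors (for example, from a bilingual dictionary); the synthetic data and the function name `fit_alignment` are hypothetical, and no orthogonality constraint is imposed on \(W\).

```python
import numpy as np

def fit_alignment(X, Y):
    """Least-squares solution of Formula (12); columns of X and Y are paired word vectors."""
    # W X = Y is equivalent to X^T W^T = Y^T, which lstsq solves column by column.
    W_T, *_ = np.linalg.lstsq(X.T, Y.T, rcond=None)
    return W_T.T

rng = np.random.default_rng(0)
W_true = rng.normal(size=(3, 3))    # hypothetical ground-truth mapping
X = rng.normal(size=(3, 20))        # 20 source-language vectors of dimension 3
Y = W_true @ X                      # matching target-language vectors
W = fit_alignment(X, Y)
residual = np.linalg.norm(W @ X - Y)
```

With noiseless synthetic pairs the fitted matrix recovers the mapping used to generate them; real word vectors would leave a non-zero residual.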

The final collected data is shown in Table 4:

Table 4 Data set information

According to Table 4, 7592 English monolingual samples have been collected, together with 3766, 3082, 3615, 3044, and 3128 language-pair samples for Chinese-English, Japanese-English, German-English, Spanish-English, and French-English, respectively.

3.3 Quality evaluation indicators

3.3.1 BLEU

The BLEU algorithm evaluates a candidate translation by comparing its contiguous n-grams with the n-grams in the standard reference translation [27]. On this basis, the number of matching n-grams in the candidate translation is counted. In addition, the BLEU algorithm includes a Brevity Penalty (BP), which penalizes candidate translations that are shorter than the standard reference translation. The BLEU calculation formula is [28]:

$$BLEU=BP\times \exp\left(\sum_{n=1}^{N}{a}_{n}\log {p}_{n}\right)$$
(14)

The calculation formula of \(BP\) is expressed as:

$$BP=\left\{\begin{array}{c}1, c>r\\ exp(1-r/c), c\le r\end{array}\right.$$
(15)

Among them, the definitions of formula variables are shown in Table 5:

Table 5 Definitions of variables in Formulas (14) and (15)

The value of BLEU lies in [0, 1]. In the BLEU algorithm, the N-gram mechanism is used to evaluate the quality of the translated text generated by the model, where an N-gram is a group of N adjacent words used for matching. Taking “The Earth revolves around the Sun.” as an example, its matching examples under the N-gram mechanism are shown in Table 6:

Table 6 Matching examples of the N-gram mechanism

In the evaluation of translation quality, N is generally taken as 4, which is BLEU-4. This article evaluates the translation quality of the model using the BLEU-4 standard.
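To make Formulas (14) and (15) concrete, the following sketch computes a sentence-level BLEU-4 score against a single reference. It assumes uniform weights \({a}_{n}=1/N\) and applies no smoothing, so any n-gram order with no match yields a score of 0; evaluation toolkits used in practice may differ in these details.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with one reference and uniform weights a_n = 1/N."""
    c, r = len(candidate), len(reference)
    if c == 0:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clipped counts: a candidate n-gram is credited at most as often as it occurs in the reference.
        matched = sum(min(count, ref[g]) for g, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0
        log_sum += (1.0 / max_n) * math.log(matched / total)
    bp = 1.0 if c > r else math.exp(1.0 - r / c)   # Formula (15)
    return bp * math.exp(log_sum)                  # Formula (14)

reference = "the earth revolves around the sun".split()
perfect = bleu(reference, reference)
truncated = bleu("the earth revolves around the".split(), reference)
```

The truncated candidate matches every one of its n-grams, so its score is determined entirely by the brevity penalty \(\exp(1-6/5)\).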

3.3.2 TER

TER is an indicator that measures the minimum number of editing operations required to transform a translation output into the reference translation. These editing operations include insertion, deletion, and substitution. Its mechanism is shown in Fig. 5:

Fig. 5 Mechanism of TER

The formula is [29]:

$$TER=\frac{\min {E}_{n}}{{W}_{t}}\times 100\%$$
(16)

Among them, \(\min {E}_{n}\) is the minimum number of edits required for the translated output to become the reference translation, and \({W}_{t}\) is the total number of words in the reference translation.
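Formula (16) can be illustrated with a word-level edit distance. The sketch below counts only the three operations named in this section (insertion, deletion, and substitution); the block shifts used by some TER implementations are deliberately left out, so this is a simplified assumption rather than a full TER scorer.

```python
def edit_distance(hyp, ref):
    """Minimum number of word insertions, deletions, and substitutions turning hyp into ref."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            curr.append(min(prev[j] + 1,               # delete h from the hypothesis
                            curr[j - 1] + 1,           # insert r into the hypothesis
                            prev[j - 1] + (h != r)))   # substitute h with r, or match
        prev = curr
    return prev[-1]

def ter(hyp, ref):
    """Formula (16): edits divided by reference length, as a percentage."""
    return 100.0 * edit_distance(hyp, ref) / len(ref)

reference = "the earth revolves around the sun".split()
hypothesis = "the earth goes around sun".split()
edits = edit_distance(hypothesis, reference)
score = ter(hypothesis, reference)
```

In this example one substitution ("goes" to "revolves") and one insertion ("the") are needed, giving 2 edits over 6 reference words.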

3.3.3 METEOR

METEOR not only focuses on vocabulary matching, but also considers synonyms and morphological changes. By introducing penalties for word order, METEOR can better reflect the naturalness of translation. The calculation formula is expressed as:

$$METEOR=\frac{P*R}{\alpha P+(1-\alpha )R}$$
(17)

Among them, \(P\) is the proportion of words in the automatic translation that match the reference translation, \(R\) is the proportion of words in the reference translation that match the automatic translation, and \(\alpha\) is the weight parameter, set to 0.9.
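The weighted combination in Formula (17) can be checked with a few lines of code. This sketch covers only the formula as written, with exact word matches; the synonym, morphological, and word-order components of METEOR are not modelled, and the function name is hypothetical.

```python
from collections import Counter

def meteor_fmean(candidate, reference, alpha=0.9):
    """Formula (17) with exact unigram matching only."""
    # Multiset intersection counts each word at most as often as it appears in both texts.
    matches = sum((Counter(candidate) & Counter(reference)).values())
    if matches == 0:
        return 0.0
    p = matches / len(candidate)   # precision P
    r = matches / len(reference)   # recall R
    return p * r / (alpha * p + (1 - alpha) * r)

reference = "the earth revolves around the sun".split()
full = meteor_fmean(reference, reference)
partial = meteor_fmean("the earth revolves".split(), reference)
```

With \(\alpha = 0.9\), the partial candidate has \(P = 1\) and \(R = 0.5\), so its score is \(0.5 / 0.95\).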

4 Experimental evaluation of multilingual English translation quality based on T5 and MAML

To evaluate the quality of multilingual English translation based on T5 and MAML, this article conducts experimental analysis at three levels: baseline comparison, comparison with advanced methods, and ablation experiments.

4.1 Experimental data

The collected English monolingual sample data is divided as shown in Table 7:

Table 7 English monolingual sample data

The English text data obtained in Table 7 is cleaned and preprocessed, and the Google Translate technology is used to form a multilingual parallel corpus. The bilingual sample data is shown in Table 8:

Table 8 Bilingual sample data

In Table 8, EC, EJ, EG, ES, and EF represent the bilingual sample data for English-Chinese, English-Japanese, English-German, English-Spanish, and English-French, respectively.

On this basis, the dataset samples in the experiment are divided in an 8:2 ratio, as shown in Table 9:

Table 9 Division of the dataset

According to Table 9, the dataset division yields two disjoint meta-datasets \({D}_{train}\) and \({D}_{test}\). In translation evaluation, a task consists of two parts: a support set and a query set. Support set: n-way k-shot training means that a task is trained with n categories of data, each containing k labeled samples, for a total of n × k labeled samples. Query set: each task contains q samples.
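The 8:2 division and the construction of a support set and a query set can be sketched as follows. The function names, the random seed, and the values of n, k, and q are hypothetical, and integer ids stand in for sentence pairs; the article's actual settings are those listed in Table 10.

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle and split samples into disjoint training and test parts (8:2 by default)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def sample_task(data_by_category, k, q, rng):
    """Build one task: k support samples per category (n x k in total) and q query samples."""
    support, remaining = [], []
    for items in data_by_category.values():
        picked = rng.sample(items, k)
        support.extend(picked)
        remaining.extend(x for x in items if x not in picked)
    query = rng.sample(remaining, q)   # drawn from samples not used for support
    return support, query

train, test = split_dataset(list(range(100)))
data_by_category = {"a": list(range(10)), "b": list(range(10, 20))}   # n = 2 categories
support, query = sample_task(data_by_category, k=3, q=4, rng=random.Random(0))
```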

The model and meta learning parameter settings are shown in Table 10:

Table 10 Model and meta learning parameter settings

4.2 Baseline comparison

In the experiment, three types of baseline models are compared with the T5-MAML model, and their performance in multilingual translation tasks is evaluated.

Baseline models:

OpenNMT: an open-source translation model that supports multiple neural network architectures.

Transformer: a typical multilingual transformer model trained on the same multilingual dataset.

Opus-MT: a multilingual translation system based on the OPUS model.

Experimental model:

T5-MAML: adopting MAML meta learning strategy based on the T5 model.

For each model, the same training dataset is used for training. The baseline model settings are consistent with the experimental model.

(1) Baseline comparison analysis of BLEU score.

BLEU score evaluates translation quality based on N-gram matching degree, with a particular emphasis on the precision and fluency of the translation. This article compares the BLEU scores of different models in the same translation task to compare their differences in practical use. The final results are shown in Fig. 6:

Fig. 6 Baseline comparison results of BLEU score

From Fig. 6, it can be seen that the BLEU score of the model in this article is generally high. The mean BLEU score of the model in this article reaches about 32.53% across language tasks, while the mean BLEU scores of the OpenNMT, Transformer, and Opus-MT models across language pairs are approximately 26.48%, 29.94%, and 30.47%, respectively. Compared with these three models, the mean BLEU score of the translation model based on T5-MAML is 6.05%, 2.59%, and 2.05% higher, respectively. This result indicates that the T5-MAML model can better adapt to multilingual translation tasks and achieve better translation results with fewer data samples. By pre-training on unlabeled text data, the T5 model learns rich language representations, which give the model better language understanding and generation capabilities, thereby improving translation quality and the BLEU score.

(2) Baseline comparison analysis of TER.

TER pays more attention to the differences in editing between translated and referenced translations. In the TER comparison experiment, this article compares the TER values of various models in the same text to test the translation quality, readability, and attention to detail processing of the models. The final results are shown in Fig. 7:

Fig. 7 Baseline comparison results of TER

From Fig. 7, it can be seen that the four types of models show certain differences in the baseline comparison results of TER. The lowest TER value of the model in this article reaches 50.30%, and its mean TER value across language tasks is about 52.59%. The lowest TER values of the OpenNMT, Transformer, and Opus-MT models are 59.14%, 52.47%, and 52.26%, respectively, with mean TER values of approximately 63.72%, 58.62%, and 58.68%, respectively. In terms of mean values, the TER of the model in this article is 11.13%, 6.03%, and 6.09% lower than that of the three baseline models across language pairs, respectively. This indicates that, in terms of edit distance, the T5-MAML model can better adapt to multilingual translation tasks and achieve better translation results with fewer data samples. The rich language representation provided by T5, combined with the fast learning ability of MAML, gives the model good basic language understanding ability and effectively improves fluency in specific translation tasks.

(3) Baseline comparison analysis of METEOR.

The METEOR scores for the four types of models are shown in Table 11:

Table 11 METEOR score

From Table 11, it can be seen that the T5-MAML model performs best in the METEOR evaluation, with an average score of 0.73. The Transformer model has an average score of 0.70, and the OpenNMT and Opus-MT models have average METEOR scores of 0.66 and 0.62, respectively. This result shows that, by combining advanced pre-training techniques with meta-learning strategies, the T5-MAML model improves translation quality more effectively in multilingual translation tasks.

4.3 Comparison of advanced methods

Building on the baseline comparison, and to evaluate the effectiveness of our model more comprehensively, we compare it with two advanced multilingual models.

mBART (Multilingual BART): mBART is a multilingual version of the BART (Bidirectional and Auto-Regressive Transformers) model proposed by Facebook AI. It achieves unsupervised machine translation across multiple languages by combining a bidirectional (BERT-like) encoder with an autoregressive decoder.

XLM-R (XLM-RoBERTa): XLM-R is another powerful multilingual pre-trained model and an improved version of XLM (Cross-lingual Language Model). By pre-training on large-scale multilingual text, XLM-R is able to capture relationships between different languages and perform well in tasks involving multiple languages.

The model presented in this article is compared with the mBART and XLM-R models, and its performance in BLEU, TER, and METEOR is discussed.

(1) Comparative analysis of BLEU with advanced methods.

The comparison results of three types of models in BLEU are shown in Table 12:

Table 12 BLEU comparison

According to Table 12, the mean BLEU evaluation results of mBART model and XLM-R model under various language sequences reached 31.54 and 31.02, respectively. Compared with these two types of models, the BLEU mean of T5-MAML is 0.99% and 1.51% higher, respectively. This result indicates that in terms of translation accuracy, our model is more ideal than the other two advanced multilingual translation models. Based on meta learning and training fine-tuning, the T5-MAML model can learn a wider range of language patterns and contextual information, and has better generalization ability and stronger adaptability when dealing with multilingual translation.

(2) Comparative analysis of TER with advanced methods

The comparison results of the three types of models in TER are shown in Table 13:

Table 13 TER comparison

According to Table 13, the mean TER results of the mBART model and the XLM-R model across language sequences are 54.24% and 54.21%, respectively, which are slightly higher than that of the T5-MAML model. This indicates that the translation results generated by the T5-MAML model are closer to manual translations and require less correction. In multilingual training, compared with the mBART and XLM-R models, the T5-MAML model focuses more on optimizing the edit distance. Through effective adaptive training strategies, it reduces the gap between the generated translation and the reference translation as far as possible.

(3) Comparative analysis of METEOR with advanced methods.

The METEOR comparison results of the three models are shown in Table 14:

Table 14 METEOR comparison

Table 14 shows that the T5-MAML model also holds an advantage on METEOR. The mean METEOR scores of the mBART and XLM-R models across the language pairs are 0.70 and 0.69, respectively, slightly lower than that of the model in this paper. This indicates that the translations generated by the T5-MAML model not only match the reference translation well at the lexical level, but also perform better in terms of word order, fragment length matching, and semantic coherence. Although the mBART and XLM-R models also learn cross-lingual features during pre-training, they lack a dedicated meta-learning mechanism to accelerate adaptation to new tasks.
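The sensitivity of METEOR to word order and fragment matching can be illustrated with the original METEOR formulation, in which a recall-weighted F-mean is discounted by a fragmentation penalty. The sketch below assumes that formulation and its default constants; this article does not state which METEOR variant or parameters were used, and the function name and its count-based interface are hypothetical.

```python
def meteor_from_counts(matches, hyp_len, ref_len, chunks):
    """Original-style METEOR score from alignment counts.

    matches: number of aligned unigrams
    hyp_len, ref_len: unigram counts of hypothesis and reference
    chunks: number of contiguous aligned fragments (fewer is better)
    Assumption: constants of the original formulation (9:1 recall weighting,
    penalty factor 0.5, exponent 3).
    """
    if matches == 0:
        return 0.0
    precision = matches / hyp_len
    recall = matches / ref_len
    f_mean = 10 * precision * recall / (recall + 9 * precision)
    penalty = 0.5 * (chunks / matches) ** 3
    return f_mean * (1 - penalty)


if __name__ == "__main__":
    # Same six matched words, in one fragment versus three fragments.
    print(meteor_from_counts(6, 6, 6, chunks=1))
    print(meteor_from_counts(6, 6, 6, chunks=3))
```

With identical lexical matches, the more fragmented alignment scores lower, which is the property the word-order discussion above relies on.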

4.4 Ablation experiment

In the ablation experiment, the impact of the MAML meta-learning strategy on translation quality is studied by removing it. The T5-MAML model is trained with the MAML meta-learning strategy, whereas the T5-Large model serves as the ablated model and is trained only with standard supervised learning. The hyperparameter settings of the two models are kept consistent.
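The structural difference between the two training regimes can be sketched on a toy problem. The example below is an assumption-laden illustration rather than the training code of this article: it uses a single scalar weight instead of T5, synthetic linear-regression "tasks" in place of language pairs, and the first-order approximation of MAML (the article does not state whether first-order or second-order MAML was used). All function names and hyperparameter values are hypothetical.

```python
import random


def mse_and_grad(w, slope, xs):
    """Mean squared error of y = w*x against y = slope*x, and d(loss)/dw."""
    n = len(xs)
    loss = sum(((w - slope) * x) ** 2 for x in xs) / n
    grad = sum(2.0 * (w - slope) * x * x for x in xs) / n
    return loss, grad


def sample_inputs(rng, k=8):
    """Draw k toy inputs for one batch."""
    return [rng.uniform(-1.0, 1.0) for _ in range(k)]


def train_fomaml(task_slopes, inner_lr=0.1, outer_lr=0.05, steps=200, seed=0):
    """First-order MAML: adapt on a support batch, then update the shared
    initialization with the query-batch gradient taken at the adapted weight."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for slope in task_slopes:
            _, g_support = mse_and_grad(w, slope, sample_inputs(rng))
            w_adapted = w - inner_lr * g_support      # inner loop: task-specific step
            _, g_query = mse_and_grad(w_adapted, slope, sample_inputs(rng))
            meta_grad += g_query                      # first-order approximation
        w -= outer_lr * meta_grad / len(task_slopes)  # outer loop: update initialization
    return w


def train_supervised(task_slopes, lr=0.05, steps=200, seed=0):
    """Ablated counterpart: plain gradient descent on pooled tasks, no inner loop."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        grad = 0.0
        for slope in task_slopes:
            _, g = mse_and_grad(w, slope, sample_inputs(rng))
            grad += g
        w -= lr * grad / len(task_slopes)
    return w


def adapt(w, slope, xs, inner_lr=0.1):
    """One adaptation step on a new task; returns (loss_before, loss_after)."""
    loss_before, g = mse_and_grad(w, slope, xs)
    loss_after, _ = mse_and_grad(w - inner_lr * g, slope, xs)
    return loss_before, loss_after


if __name__ == "__main__":
    slopes = [1.0, 2.0, 3.0]
    w_meta = train_fomaml(slopes)
    print(adapt(w_meta, 2.5, [0.5, -0.5, 1.0, -1.0]))
```

In a one-parameter toy like this, both procedures end up with similar initializations, so the sketch shows only where the inner adaptation loop enters training; it says nothing about the size of the effect reported in the ablation results.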

(1) Ablation comparison analysis of BLEU score.

The impact of the MAML strategy on translation quality is analyzed by comparing the BLEU scores of the T5-MAML model and the T5-Large model. The ablation comparison results are shown in Fig. 8:

Fig. 8 Ablation comparison results of BLEU score

Fig. 8 shows that, in the ablation experiment, the BLEU score of the model in this article is clearly better than that of the T5-Large model. Specifically, the highest BLEU score of the model in this article reaches 33.63%, and its mean across the language tasks is 31.67%. The highest BLEU score of the T5-Large model is 30.67%, with a mean of 28.24% across the language tasks. The mean BLEU score of the model in this article is therefore 3.43 percentage points higher than that of the T5-Large model. The model in this article integrates the MAML algorithm, which enables it to adapt quickly to new tasks from new data. In translation tasks, this means that the model can learn the mapping between new language pairs faster, thereby improving translation quality.

(2) Ablation comparison analysis of TER.

By comparing the TER values of the two models on the same translation tasks, the impact of the meta-learning strategy on translation quality and the handling of detail is analyzed. The final comparison results are shown in Fig. 9:

Fig. 9 Ablation comparison results of TER

Fig. 9 shows that the TER results of the model in this article are generally lower than those of the T5-Large model across the language pairs. Specifically, the lowest TER of the model in this article is 49.45%, with a mean of about 50.82%. The lowest TER of the T5-Large model is 58.27%, with a mean of approximately 59.34%. The mean TER of the T5-MAML model is therefore 8.52 percentage points lower than that of the T5-Large model without meta-learning. This result shows that the T5-MAML model with meta-learning achieves better English translation quality in multilingual settings. The MAML algorithm enables the model to learn quickly from data, so that it can adapt rapidly and generate high-quality translations in multilingual settings.

(3) Ablation comparison analysis of METEOR.

By comparing the METEOR scores of the two models on the same translation tasks, the impact of the meta-learning strategy on the naturalness of the translated text is analyzed. The final comparison results are shown in Table 15:

Table 15 METEOR score

Table 15 shows that, in the ablation experiment, the mean METEOR score of our model is 0.75, while that of the T5-Large model is 0.67, a difference of 0.08. This comparison indicates that the meta-learning strategy has a key impact on the naturalness of the translations the model generates. The meta-learning strategy uses prior knowledge to better capture the mapping between the source language and the target language in translation tasks, thereby producing more natural and fluent translations.
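The gaps quoted in the ablation analysis are differences between mean scores, not relative changes. The short check below recomputes them from the means stated in the text above; the dictionary layout and function names are illustrative choices, and the relative figures are derived here for comparison only and are not reported in this article.

```python
# Mean scores as stated in the ablation text: (T5-MAML, T5-Large).
reported_means = {
    "BLEU (%)": (31.67, 28.24),
    "TER (%)": (50.82, 59.34),
    "METEOR": (0.75, 0.67),
}


def absolute_gap(ours, ablated):
    """Difference between the two means, rounded to two decimals."""
    return round(ours - ablated, 2)


def relative_gap_percent(ours, ablated):
    """Difference expressed as a percentage of the ablated model's mean."""
    return round(100.0 * (ours - ablated) / ablated, 2)


if __name__ == "__main__":
    for metric, (ours, ablated) in reported_means.items():
        print(metric, absolute_gap(ours, ablated), relative_gap_percent(ours, ablated))
```

The absolute gaps reproduce the 3.43, 8.52, and 0.08 figures quoted in the text, which confirms that the percentages for BLEU and TER are best read as percentage points.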

5 Discussion

This article verifies the sustainable improvement in multilingual English translation quality achieved with T5 and MAML, and its application effect, through baseline comparison, comparison with advanced methods, and ablation experiments. In the baseline comparison, the T5-MAML model performs better than the three baseline models on both BLEU and TER. Through joint training on multiple tasks, the T5-MAML model can learn more semantic and structural information and therefore achieves higher translation quality. The ablation experiment shows that, compared with the T5-Large model without meta-learning, the T5-MAML model has higher BLEU scores and lower TER results. MAML gives the model strong multilingual English translation capability, reducing overfitting to specific languages and improving overall translation quality. Overall, the T5-MAML model adapts and generalizes better, handles multilingual translation tasks more effectively, and obtains high-quality translations from fewer samples.

As the base model, T5 has learned rich language representations through pre-training on a large amount of text data. Combined with the MAML meta-learning mechanism, it adapts quickly to new tasks during corpus training and learns the conversion rules between new language pairs more rapidly. Cross-lingual alignment techniques and the use of multilingual datasets further enhance the diversity and balance of the model. Although this article provides some guidance for improving the quality of multilingual translation, it still has limitations in cross-domain adaptability. T5, which is pre-trained on large-scale datasets, may be affected by data imbalance during corpus training, and the choice of hyperparameters in MAML has a significant impact on model performance. In future research, we will consider building multimodal translation scenarios to study how to maintain consistency and naturalness of translation in a multimodal environment, in order to promote the high-quality development of multilingual intelligent translation. In addition, to improve the multilingual pre-trained model, cross-lingual transfer learning can be used to enhance performance on low-resource languages, and the meta-learning algorithm can be further optimized to make it more efficient and stable across language pairs.

6 Conclusion

In the context of global integration, cross-cultural communication is becoming increasingly frequent, making multilingual translation an inevitable trend. Traditional translation models are limited in their training adaptability and fitting ability in multilingual environments. To improve the translation performance of the model, this article combines T5 and MAML and studies their sustainable quality improvement and application effect in multilingual English translation. Compared with the baseline models, the T5-MAML model achieves better translation quality, and MAML effectively enhances the model's rapid adaptation and generalization ability, improving the practicality of the translation model. Although this article can provide some guidance for multilingual English translation, it also has limitations. It focuses mainly on translation between English and other languages, and the translation quality between non-English languages still needs further verification. Future research can explore the application of T5 and the MAML meta-learning strategy to translation between other languages, to further promote cross-cultural communication and exchange.