
Continual Learning Proposal

Pavan Kalyan
November 2024

Objective
Analyse the backpropagation algorithm and understand which of its properties relate to continual learning. Two phenomena are prominent in the continual learning literature: catastrophic forgetting and plasticity (adaptation to new tasks). The inability of models to prevent the first while doing the latter simultaneously is called the stability-plasticity tradeoff. Most works analyse neural networks for these phenomena [12, 33, 36, 26] but do not directly attribute the results to the basic properties of the backpropagation algorithm. We aim to analyse the properties of backpropagation as one of the credit assignment algorithms in a continual learning setting. To achieve this, we design different types of novel synthetic continual learning settings inspired by the types of human memory and their plasticity.

Background
Continual learning has been tested popularly in two frameworks - continual super-
vised learning and continual reinforcement learning.

Continual Reinforcement Learning (CRL)


CRL is assumed to be the most natural setting to study continual learning because of the inherent non-stationarity in the learning framework [25]. Popular continual learning benchmarks include Jelly Bean World [40], LIBERO [29], Continual World [55], CRLMaze [30], CORA [41], and COOM [48]. These benchmarks often exhibit deficiencies such as a lack of comprehensive baselines, multiple RL backbones, testing only class/task-incremental learning, and a limited number of tasks. This is compounded by the lack of a precise definition of continual reinforcement learning [27, 1, 25] and the computational challenges of deep RL, such as intractable action and state spaces, the cost of training, and inflexible reward design [44, 9].

Continual Supervised Learning (CSL)


CSL is frequently referred to simply as continual learning in the vision modality. Wang et al. [51] discuss the types of continual learning, the available benchmarks, and methods for solving the continual learning problem. The most popular setting for testing continual learning is image classification, arranging the classes present in ImageNet/MNIST/CIFAR in different ways [38]. Task-, class- and domain-incremental learning are the most popular problem types studied by the vision community [28, 50]. While these are not the most natural forms of continual learning, they offer flexible and efficient ways to test continual learning [10]. CSL can also be tested on pretrained models, where pretraining has been shown to mitigate forgetting of previously seen classes [39, 21, 11]. Continual learning for pretrained models is more popular in the domain of language modelling. Three arguments justify using language modeling as a setting for analyzing continual learning:
• LLM inference has become a common cost for industry, and as compute capacity scales, more companies are taking up training as well. While many companies may not have the resources to pretrain from scratch, they may want to continually train on private data.
• Many works relate continual learning to model scale and the pre-training phase. Cossu et al. [11] and Mehta et al. [32] show that self-supervised pre-training helps mitigate forgetting. Hernandez et al. [17], Ramasesh et al. [43] and Mirzadeh et al. [34] show that this ability improves with model scale. Scialom et al. [46] relate this ability of language models to learn continually to the pre-training objective.
• The class-incremental setting is considered the hardest type of CL in vision classification tasks. Though this setting is very well defined, learning new tasks is not naturally posed as learning a new class of objects. Language modeling, on the other hand, offers a richer scope for defining tasks that require incremental learning.

Continual Learning in Language Modelling (CLM)


The literature divides CL in language modelling into three phases: Continual Pretraining (CPT), Domain-Adaptive Pretraining (DAP), and Continual Finetuning (CFT) [47]. This differentiation is based on how the pretraining dataset differs from the datasets for subsequent tasks: in size (smaller/equal/larger), language (code/natural language), and domain (specific/across domains). Wu et al. [56] classified CPT into fact, domain and language updates.
• Fact or knowledge updating is usually performed on temporal datasets like TemporalWiki [19, 20]. Each task models a set of Wikipedia pages whose content carries different time stamps.
• Domain updates can be of two types: domain-incremental and domain-specific. Yıldız et al. [7] created a sequence of 150 tasks where each task models a text corpus from a different domain. Most works [24, 15, 23] use available datasets on news, academic papers, reviews and specific domains to create a sequence of tasks. Domain-specific continual pretraining involves adapting the model to only one or two specific domains. For example, Yang et al. [59] adapt LLMs to small-scale academic plant-science data. Similarly, Ma et al. [31] adapt the models to the e-commerce domain and unstructured data.
• Language shifts can involve both code and natural languages. Most works [14, 6] use 4-7 natural languages in their CL settings, learning each language as a task. For CL on new code languages, Yadav et al. [58] build a benchmark based on four tasks, namely code generation, translation, summarization, and refinement, with different input and output programming languages. A more recent work [57] shows that LLMs are bad at version-controllable code generation. Their benchmark, VersiCode, includes data spanning over 300 Python libraries and more than 2,000 versions across 9 years.
Most works mentioned in Wu et al. [56] claim to do CPT, but the size of the training corpora for subsequent tasks is much smaller than that of the original pretraining corpus. Ibrahim et al. [18] is a recent work that emphasizes the relative sizes of the pretraining corpus and the datasets for subsequent tasks. The work analyses continual pretraining under a shift in language, a weak shift in datasets, and a shift in domains (different datasets).

Continual finetuning (CFT) can be broadly classified into continual instruction tuning (CIT), model alignment, and refinement [47], of which CIT is the most popular. Continual instruction tuning can be used to update the domain, task or tool access of the model. Wang et al. [52] released an evaluation benchmark called TRACE for testing CL in LLMs. The benchmark consists of 8 distinct datasets spanning domain-specific tasks, multilingual capabilities, code generation, and mathematical reasoning, with only 5,000 instances available per task. Task-incremental CIT [61, 60] involves finetuning an LLM on a sequence of NLP tasks and testing for both backward and forward transfer. These works used the Natural Instructions [35] and Super-Natural Instructions [53] datasets to construct the benchmarks. Dong et al. [13] finetune LLMs on a sequence of supervised finetuning data from different domains. Works like [16, 42] introduced instruction-tuning datasets for tool use by LLMs. These tools can be sequentially arranged to systematically test incremental tool learning in LLMs.

Learning and Memory in Humans


Human long-term memory (LTM) can be divided into two subtypes according to Atkinson and Shiffrin [3]: explicit memory and implicit memory. Explicit memory, also known as declarative memory, is mainly of two types: episodic and semantic memory. There are further types of explicit memory that combine these two, such as spatial and autobiographical memory. Explicit memory is the conscious and intentional recollection of facts, experiences and concepts. While there is supporting evidence for this theory of different memory types, as well as some criticisms, the underlying brain regions for storing the two types of information (explicit and implicit) have been found to be different. While the hippocampus and some parts of the cortex are responsible for explicit memory, implicit memory shows activity in the striatum and other parts of the basal ganglia and cortex [22]. Implicit memory is used and acquired unconsciously [45]. It can take many forms, such as priming (pattern completion), perceptual learning (differentiation between different senses), category learning (assigning novel objects to contrasting groups like movie genres, fruit types, etc.) and procedural learning (formation of skills and habits like riding a bicycle). Language learning also has some interesting connections to declarative and procedural memory [49]. While aging generally leads to a decline in declarative memory, procedural memory may remain intact or even improve under certain conditions [4, 37, 8]. Given this difference in humans, it is also important to test language models on different sets of tasks sequentially. Most vision benchmarks are based on image classification/object recognition tasks, which primarily use declarative knowledge [5, 54]. In language modelling, for CPT, the models are trained continually in an unsupervised way, and both procedural and declarative knowledge are required for the model to perform well. It is therefore important to first separate these tasks into two types and analyse the problems with backpropagation.

New Setting
Owing to the importance of the distinction between procedural and declarative memory in humans, we design two types of task sequences. Each type has two settings: continual pretraining and continual finetuning. Since most continual learning benchmarks and datasets only test declarative memory, our declarative task sequence will not be very different, but we design a new dataset for testing procedural continual learning.

Procedural Task Sequence


Continual Pre-training - We use polynomials of trigonometric functions as synthetic tasks for continual pretraining. These functions should be easily learnable by a small language model while offering the flexibility to define diverse functions on which to continually pretrain the model.
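As one concrete possibility, a minimal sketch of such a task generator, assuming each task is a random polynomial of sine/cosine terms and each training example is a serialized (x, f(x)) pair; the sampling ranges and text format are illustrative assumptions, not fixed design choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n_terms=3, max_freq=4):
    """One task = one random polynomial of trigonometric functions."""
    terms = []
    for _ in range(n_terms):
        coeff = rng.uniform(-2.0, 2.0)
        freq = int(rng.integers(1, max_freq + 1))
        fn = np.sin if rng.integers(0, 2) == 0 else np.cos
        terms.append((coeff, freq, fn))
    return lambda x: sum(c * f(k * x) for c, k, f in terms)

def task_corpus(f, n_examples=1000):
    """Serialize (x, f(x)) pairs as text lines for LM pretraining."""
    xs = rng.uniform(-np.pi, np.pi, size=n_examples)
    return [f"x={x:+.3f} y={f(x):+.3f}" for x in xs]

# The continual pretraining stream: one corpus per task, seen in order.
tasks = [sample_task() for _ in range(5)]
stream = [task_corpus(f) for f in tasks]
print(stream[0][:3])
```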

Continual Finetuning - Procedural tasks do not require explicit knowledge or facts to be stored; instead, they are about skills acquired unconsciously. To mimic this, we pretrain a transformer model on a small English corpus like TinyStories until the model has a good understanding of the English language. We then frame the model's ability to use a literary device as a skill, i.e. the model is still performing language modelling but can now introduce into the text the literary devices learnt during finetuning. Literary devices are techniques or tools used by writers to enhance their writing, convey meaning, evoke emotion, or engage the reader. There are very common literary devices like metaphor, simile, etc., but there are also very specific literary devices in poetry and prose. The most common literary devices are Metaphor, Simile, Analogy, Imagery, Symbolism, Personification, Hyperbole, Irony, Juxtaposition, Paradox, Allusion, Allegory, Ekphrasis, Onomatopoeia and Pun. Less common literary devices can be classified into the following types, each containing multiple devices: Prose, Poetry, Repetition, Dialogue, Word Play, Parallelism and Rhetoric. A sequence of these literary devices can act as a substitute for a sequence of procedural skills, and since the devices fall into different types, it is also possible to test forward and backward transfer of skills, as in the sketch below.
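A minimal sketch of how this literary-device sequence could be organized for continual finetuning; the groupings, task order, and the `finetune`/`evaluate` helpers are hypothetical placeholders for whatever trainer and probe are ultimately used:

```python
# Devices grouped by type, following the lists above; grouping lets
# transfer be measured both within and across device types.
DEVICE_TASKS = {
    "figurative": ["metaphor", "simile", "analogy", "personification"],
    "sound":      ["onomatopoeia", "pun"],
    "structure":  ["juxtaposition", "parallelism", "repetition"],
}

# Flatten the groups into one ordered sequence of procedural "skills".
SEQUENCE = [d for group in DEVICE_TASKS.values() for d in group]

def run_sequence(model, finetune, evaluate):
    """Finetune on each device in turn; after every phase, evaluate on
    all devices so forward/backward transfer and forgetting can be
    computed from the resulting accuracy matrix."""
    history = []
    for device in SEQUENCE:
        model = finetune(model, device)              # hypothetical trainer
        history.append({d: evaluate(model, d) for d in SEQUENCE})
    return history
```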

Declarative Task Sequence


A declarative task can be any knowledge-intensive NLP task. For instance, learning a series of synthetic biographies [2] and performing question answering can serve as a controlled setup to test continual pretraining and finetuning. Different family trees can be drawn to create relations between tasks in order to test forward and backward transfer, as sketched below.
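A minimal sketch of the biography generator, in the spirit of [2]; the attribute vocabulary, the one-family-per-task split, and the QA templates are illustrative assumptions:

```python
import random

random.seed(0)

FIRST = ["Asha", "Bela", "Chen", "Devi", "Eli"]
SURNAMES = ["Rao", "Berg", "Mori"]
CITIES = ["Pune", "Oslo", "Kyoto", "Lima", "Cairo"]
JOBS = ["chemist", "pilot", "baker", "judge", "farmer"]

def make_person(name):
    """One synthetic biography plus QA pairs probing its facts."""
    city, job = random.choice(CITIES), random.choice(JOBS)
    bio = f"{name} was born in {city}. {name} works as a {job}."
    qa = [(f"Where was {name} born?", city),
          (f"What is {name}'s occupation?", job)]
    return bio, qa

# One task per family; families are learned sequentially, and shared
# surnames give a handle for relating tasks through a family tree.
family_tasks = []
for surname in SURNAMES:
    members = random.sample(FIRST, 3)
    family_tasks.append([make_person(f"{n} {surname}") for n in members])
```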

References
[1] David Abel, André Barreto, Benjamin Van Roy, Doina Precup, H. V. Has-
selt, and Satinder Singh. A definition of continual reinforcement learning.
ArXiv, abs/2307.11046, 2023. URL https://api.semanticscholar.org/
CorpusID:259991454. 1

[2] Zeyuan Allen-Zhu and Yuanzhi Li. Physics of language models: Part 3.1,
knowledge storage and extraction. ArXiv, abs/2309.14316, 2023. URL https:
//api.semanticscholar.org/CorpusID:262825178. 5
[3] Richard C. Atkinson and Richard M. Shiffrin. Human memory: A proposed
system and its control processes. In The psychology of learning and motivation,
1968. URL https://api.semanticscholar.org/CorpusID:22958289. 3
[4] Rachel M. Brown, Edwin M. Robertson, and Daniel Z. Press. Sequence skill
acquisition and off-line learning in normal aging. PLoS ONE, 4, 2009. URL
https://api.semanticscholar.org/CorpusID:1891434. 4

[5] Valeria Paola Carlini. The object recognition task: A new proposal for the
memory performance study. 2011. URL https://api.semanticscholar.
org/CorpusID:11238733. 4
[6] Giuseppe Castellucci, Simone Filice, Danilo Croce, and Roberto Basili. Learning
to solve nlp tasks in an incremental number of languages. In Annual Meeting
of the Association for Computational Linguistics, 2021. URL https://api.
semanticscholar.org/CorpusID:236460265. 3
[7] Çağatay Yıldız, Nishaanth Kanna Ravichandran, Prishruit Punia, Matthias
Bethge, and Beyza Hilal Ermiş. Investigating continual pretraining in large
language models: Insights and implications. ArXiv, abs/2402.17400, 2024.
URL https://api.semanticscholar.org/CorpusID:268032887. 2

[8] Guillaume Chauvel, François Maquestiaux, André Didierjean, Sven Joubert,
Benedicte Dieudonne, and Marc Verny. [use of nondeclarative and automatic
memory processes in motor learning: how to mitigate the effects of aging].
Geriatrie et psychologie neuropsychiatrie du vieillissement, 9 4:455–63, 2011.
URL https://api.semanticscholar.org/CorpusID:43814900. 4

[9] Weiqin Chen. Open problems and modern solutions for deep reinforce-
ment learning. ArXiv, abs/2302.02298, 2023. URL https://api.
semanticscholar.org/CorpusID:256615621. 1
[10] Andrea Cossu, Gabriele Graffieti, Lorenzo Pellegrini, Davide Maltoni, Davide
Bacciu, Antonio Carta, and Vincenzo Lomonaco. Is class-incremental enough
for continual learning? Frontiers in Artificial Intelligence, 5, 2021. URL https:
//api.semanticscholar.org/CorpusID:244909234. 2
[11] Andrea Cossu, Tinne Tuytelaars, Antonio Carta, Lucia C. Passaro, Vin-
cenzo Lomonaco, and Davide Bacciu. Continual pre-training mitigates for-
getting in language and vision. Neural networks : the official journal of
the International Neural Network Society, 179:106492, 2022. URL https:
//api.semanticscholar.org/CorpusID:248887419. 2
[12] Shibhansh Dohare, J. Fernando Hernandez-Garcia, Qingfeng Lan, Parash Rah-
man, Ashique Rupam Mahmood, and Richard S. Sutton. Loss of plastic-
ity in deep continual learning. Nature, 632:768 – 774, 2024. URL https:
//api.semanticscholar.org/CorpusID:259251905. 1
[13] Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Day-
iheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, and Jingren Zhou. How abili-
ties in large language models are affected by supervised fine-tuning data compo-
sition. ArXiv, abs/2310.05492, 2023. URL https://api.semanticscholar.
org/CorpusID:263830318. 3
[14] Evangelia Gogoulou, Timothée Lesort, Magnus Boman, and Joakim Nivre.
Continual learning under language shift. In International Conference on Text,
Speech and Dialogue, 2023. URL https://api.semanticscholar.org/
CorpusID:264935145. 3

[15] Suchin Gururangan, Michael Lewis, Ari Holtzman, Noah A. Smith, and Luke
Zettlemoyer. Demix layers: Disentangling domains for modular language
modeling. In North American Chapter of the Association for Computational
Linguistics, 2021. URL https://api.semanticscholar.org/CorpusID:
236976189. 2

[16] Shibo Hao, Tianyang Liu, Zhen Wang, and Zhiting Hu. Toolkengpt: Aug-
menting frozen language models with massive tools via tool embeddings.
ArXiv, abs/2305.11554, 2023. URL https://api.semanticscholar.org/
CorpusID:258823133. 3

[17] Danny Hernandez, Jared Kaplan, Tom Henighan, and Sam McCandlish. Scal-
ing laws for transfer. ArXiv, abs/2102.01293, 2021. URL https://api.
semanticscholar.org/CorpusID:231749962. 2
[18] Adam Ibrahim, Benjamin Thérien, Kshitij Gupta, Mats L. Richter, Quentin
Anthony, Timothée Lesort, Eugene Belilovsky, and Irina Rish. Sim-
ple and scalable strategies to continually pre-train large language models.
ArXiv, abs/2403.08763, 2024. URL https://api.semanticscholar.org/
CorpusID:268379604. 3
[19] Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han,
Gyeonghun Kim, Stanley Jungkyu Choi, and Minjoon Seo. Towards continual
knowledge learning of language models. ArXiv, abs/2110.03215, 2021. URL
https://api.semanticscholar.org/CorpusID:238419458. 2
[20] Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon
Han, Gyeonghun Kim, and Minjoon Seo. Temporalwiki: A lifelong benchmark
for training and evaluating ever-evolving language models. In Conference on
Empirical Methods in Natural Language Processing, 2022. URL https://
api.semanticscholar.org/CorpusID:248476156. 2
[21] Paul Janson, Wenxuan Zhang, Rahaf Aljundi, and Mohamed Elhoseiny. A sim-
ple baseline that questions the use of pretrained-models in continual learning.
ArXiv, abs/2210.04428, 2022. URL https://api.semanticscholar.org/
CorpusID:252780641. 2
[22] Eric R. Kandel, Yadin Dudai, and Mark Mayford. The molecular and sys-
tems biology of memory. Cell, 157:163–186, 2014. URL https://api.
semanticscholar.org/CorpusID:827792. 4
[23] Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, and Bin Liu. Continual
training of language models for few-shot learning. ArXiv, abs/2210.05549,
2022. URL https://api.semanticscholar.org/CorpusID:252815848. 2
[24] Zixuan Ke, Yijia Shao, Haowei Lin, Tatsuya Konishi, Gyuhak Kim, and Bin
Liu. Continual pre-training of language models. In International Conference
on Learning Representations, 2023. URL https://api.semanticscholar.
org/CorpusID:258079422. 2
[25] Khimya Khetarpal, Matthew Riemer, Irina Rish, and Doina Precup. Towards
continual reinforcement learning: A review and perspectives. J. Artif. In-
tell. Res., 75:1401–1476, 2020. URL https://api.semanticscholar.org/
CorpusID:229679944. 1

[26] Jeremias Knoblauch, Hisham Husain, and Tom Diethe. Optimal continual
learning has perfect memory and is np-hard. In International Conference
on Machine Learning, 2020. URL https://api.semanticscholar.org/
CorpusID:219558404. 1

[27] Saurabh Kumar, Henrik Marklund, Anand Srinivasa Rao, Yifan Zhu, Hong Jun
Jeon, Yueyang Liu, and Benjamin Van Roy. Continual learning as computation-
ally constrained reinforcement learning. ArXiv, abs/2307.04345, 2023. URL
https://api.semanticscholar.org/CorpusID:259501376. 1
[28] Zhiqiu Lin, Jia Shi, Deepak Pathak, and Deva Ramanan. The clear benchmark:
Continual learning on real-world imagery. ArXiv, abs/2201.06289, 2022. URL
https://api.semanticscholar.org/CorpusID:244906195. 2
[29] Bo Liu, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qian Liu, Yuke Zhu, and Peter
Stone. Libero: Benchmarking knowledge transfer for lifelong robot learning.
ArXiv, abs/2306.03310, 2023. URL https://api.semanticscholar.org/
CorpusID:259089508. 1
[30] Vincenzo Lomonaco, Karan Desai, Eugenio Culurciello, and Davide Mal-
toni. Continual reinforcement learning in 3d non-stationary environments.
2020 IEEE/CVF Conference on Computer Vision and Pattern Recogni-
tion Workshops (CVPRW), pages 999–1008, 2019. URL https://api.
semanticscholar.org/CorpusID:165163924. 1
[31] Shirong Ma, Shen Huang, Shulin Huang, Xiaobin Wang, Yangning Li, Hai-
Tao Zheng, Pengjun Xie, Fei Huang, and Yong Jiang. Ecomgpt-ct: Continual
pre-training of e-commerce large language models with semi-structured data.
ArXiv, abs/2312.15696, 2023. URL https://api.semanticscholar.org/
CorpusID:266551564. 3
[32] Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, and Emma Strubell. An
empirical investigation of the role of pre-training in lifelong learning. J. Mach.
Learn. Res., 24:214:1–214:50, 2021. URL https://api.semanticscholar.
org/CorpusID:245329773. 2
[33] Seyed Iman Mirzadeh, Mehrdad Farajtabar, Razvan Pascanu, and Hassan
Ghasemzadeh. Understanding the role of training regimes in continual learning.
ArXiv, abs/2006.06958, 2020. URL https://api.semanticscholar.org/
CorpusID:219636010. 1
[34] Seyed Iman Mirzadeh, Arslan Chaudhry, Huiyi Hu, Razvan Pascanu, Dilan
Gorur, and Mehrdad Farajtabar. Wide neural networks forget less catas-
trophically. In International Conference on Machine Learning, 2021. URL
https://api.semanticscholar.org/CorpusID:239616391. 2
[35] Swaroop Mishra, Daniel Khashabi, Chitta Baral, and Hannaneh Hajishirzi.
Natural instructions: Benchmarking generalization to new tasks from nat-
ural language instructions. ArXiv, abs/2104.08773, 2021. URL https:
//api.semanticscholar.org/CorpusID:233296635. 3
[36] Cuong V Nguyen, Alessandro Achille, Michael Lam, Tal Hassner, Vijay Ma-
hadevan, and Stefano Soatto. Toward understanding catastrophic forget-
ting in continual learning. ArXiv, abs/1908.01091, 2019. URL https:
//api.semanticscholar.org/CorpusID:199442601. 1

[37] Lars-Göran Nilsson. Memory function in normal aging. Acta Neurologica Scan-
dinavica, 107, 2003. URL https://api.semanticscholar.org/CorpusID:
25804407. 4
[38] Lorenzo Pellegrini, Chenchen Zhu, Fanyi Xiao, Zhicheng Yan, Anto-
nio Carta, Matthias De Lange, Vincenzo Lomonaco, Roshan Sumbaly,
Pau Rodrı́guez López, and David Vázquez. 3rd continual learning work-
shop challenge on egocentric category and instance level object understanding.
ArXiv, abs/2212.06833, 2022. URL https://api.semanticscholar.org/
CorpusID:254636216. 2
[39] Francesco Pelosin. Simpler is better: off-the-shelf continual learning through
pretrained backbones. ArXiv, abs/2205.01586, 2022. URL https://api.
semanticscholar.org/CorpusID:248506204. 2
[40] Emmanouil Antonios Platanios, Abulhair Saparov, and Tom M. Mitchell. Jelly
bean world: A testbed for never-ending learning. ArXiv, abs/2002.06306, 2020.
URL https://api.semanticscholar.org/CorpusID:211132493. 1
[41] Sam Powers, Eliot Xing, Eric Kolve, Roozbeh Mottaghi, and Abhinav Kumar
Gupta. Cora: Benchmarks, baselines, and metrics as a platform for continual
reinforcement learning agents. ArXiv, abs/2110.10067, 2021. URL https:
//api.semanticscholar.org/CorpusID:239024725. 1
[42] Yujia Qin, Shi Liang, Yining Ye, Kunlun Zhu, Lan Yan, Ya-Ting Lu, Yankai
Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Runchu Tian, Ruobing
Xie, Jie Zhou, Marc H. Gerstein, Dahai Li, Zhiyuan Liu, and Maosong Sun.
Toolllm: Facilitating large language models to master 16000+ real-world apis.
ArXiv, abs/2307.16789, 2023. URL https://api.semanticscholar.org/
CorpusID:260334759. 3
[43] Vinay Venkatesh Ramasesh, Aitor Lewkowycz, and Ethan Dyer. Effect of scale
on catastrophic forgetting in neural networks. In International Conference on
Learning Representations, 2022. URL https://api.semanticscholar.org/
CorpusID:251648120. 2
[44] Timothy Rupprecht and Yanzhi Wang. A survey for deep reinforcement learning
in markovian cyber-physical systems: Common problems and solutions. Neural
networks : the official journal of the International Neural Network Society,
153:13–36, 2022. URL https://api.semanticscholar.org/CorpusID:
248975465. 1
[45] Daniel L. Schacter. Implicit memory: History and current status. Journal
of Experimental Psychology: Learning, Memory and Cognition, 13:501–518,
1987. URL https://api.semanticscholar.org/CorpusID:3728984. 4
[46] Thomas Scialom, Tuhin Chakrabarty, and Smaranda Muresan. Fine-tuned
language models are continual learners. In Conference on Empirical Methods in
Natural Language Processing, 2022. URL https://api.semanticscholar.
org/CorpusID:252815378. 2

[47] Haizhou Shi, Zihao Xu, Hengyi Wang, Weiyi Qin, Wenyuan Wang, Yibin
Wang, and Hao Wang. Continual learning of large language models: A
comprehensive survey. ArXiv, abs/2404.16789, 2024. URL https://api.
semanticscholar.org/CorpusID:269362836. 2, 3
[48] Tristan Tomilin, Meng Fang, Yudi Zhang, and Mykola Pechenizkiy. Coom:
A game benchmark for continual reinforcement learning. In Neural Informa-
tion Processing Systems, 2023. URL https://api.semanticscholar.org/
CorpusID:268064920. 1
[49] Michael T. Ullman. Contributions of memory circuits to language: The declar-
ative/procedural model. Cognition, 92:231–270, 2004. ISSN 0010-0277. doi:
10.1016/j.cognition.2003.10.008. 4
[50] Eli Verwimp, Kuo Yang, Sarah Parisot, Hong Lanqing, Steven G. McDonagh,
Eduardo P’erez-Pellitero, Matthias De Lange, and Tinne Tuytelaars. Clad: A
realistic continual learning benchmark for autonomous driving. Neural networks
: the official journal of the International Neural Network Society, 161:659–669,
2022. URL https://api.semanticscholar.org/CorpusID:252762199. 2
[51] Liyuan Wang, Xingxing Zhang, Hang Su, and Jun Zhu. A comprehensive survey
of continual learning: Theory, method and application. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 46:5362–5383, 2023. URL https:
//api.semanticscholar.org/CorpusID:256459333. 1
[52] Xiao Wang, Yuan Zhang, Tianze Chen, Songyang Gao, Senjie Jin, Xian-
jun Yang, Zhiheng Xi, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, and
Xuanjing Huang. Trace: A comprehensive benchmark for continual learn-
ing in large language models. ArXiv, abs/2310.06762, 2023. URL https:
//api.semanticscholar.org/CorpusID:263830425. 3
[53] Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Ko-
rdi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan
Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Kara-
manolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson,
Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, M. Morad-
shahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza,
Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat,
Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit,
Xudong Shen, Chitta Baral, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi,
and Daniel Khashabi. Super-naturalinstructions: Generalization via declara-
tive instructions on 1600+ nlp tasks. In Conference on Empirical Methods in
Natural Language Processing, 2022. URL https://api.semanticscholar.
org/CorpusID:253098274. 3
[54] Boyer D. Winters, Lisa M. Saksida, and Timothy J. Bussey. Object recognition
memory: Neurobiological mechanisms of encoding, consolidation and retrieval.
Neuroscience & Biobehavioral Reviews, 32:1055–1070, 2008. URL https:
//api.semanticscholar.org/CorpusID:207088001. 4

[55] Maciej Wołczyk, Michał Zając, Razvan Pascanu, Łukasz Kuciński, and Piotr
Miłoś. Continual world: A robotic benchmark for continual reinforcement
learning. In Neural Information Processing Systems, 2021. URL https://
api.semanticscholar.org/CorpusID:235166235. 1
[56] Tongtong Wu, Linhao Luo, Yuan-Fang Li, Shirui Pan, Thuy-Trang Vu, and
Gholamreza Haffari. Continual learning for large language models: A survey.
ArXiv, abs/2402.01364, 2024. URL https://api.semanticscholar.org/
CorpusID:267406164. 2, 3
[57] Tongtong Wu, Weigang Wu, Xingyu Wang, Kang Xu, Suyu Ma, Bo Jiang, Ping
Yang, Zhenchang Xing, Yuan-Fang Li, and Gholamreza Haffari. Versicode:
Towards version-controllable code generation. ArXiv, abs/2406.07411, 2024.
URL https://api.semanticscholar.org/CorpusID:270380057. 3
[58] Prateek Yadav, Qing Sun, Hantian Ding, Xiaopeng Li, Dejiao Zhang, Ming
Tan, Xiaofei Ma, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ra-
manathan, Mohit Bansal, and Bing Xiang. Exploring continual learning for code
generation models. In Annual Meeting of the Association for Computational
Linguistics, 2023. URL https://api.semanticscholar.org/CorpusID:
259341641. 3
[59] Xianjun Yang, Junfeng Gao, Wenxin Xue, and Erik Alexandersson. Pllama: An
open-source large language model for plant science. ArXiv, abs/2401.01600,
2024. URL https://api.semanticscholar.org/CorpusID:266741610. 3
[60] Wenpeng Yin, Jia Li, and Caiming Xiong. Contintin: Continual learn-
ing from task instructions. ArXiv, abs/2203.08512, 2022. URL https:
//api.semanticscholar.org/CorpusID:247476090. 3
[61] Zihan Zhang, Meng Fang, Ling Chen, and Mohammad-Reza Namazi-Rad.
Citb: A benchmark for continual instruction tuning. In Conference on Em-
pirical Methods in Natural Language Processing, 2023. URL https://api.
semanticscholar.org/CorpusID:264426357. 3
