Multi Task Learning (MTL)
for
Deep Learning
Dr G Manikandan
• Multi-Task Learning (MTL) is a type of machine learning technique where a
model is trained to perform multiple tasks simultaneously.
• By sharing some of the network’s parameters, the model can learn a more
efficient and compact representation of the data, which can be beneficial when
the tasks are related.
• A typical MTL network consists of a shared feature extractor and several task-specific
heads.
• The shared feature extractor is a part of the network that is shared across tasks
and is used to extract features from the input data.
• The task-specific heads are used to make predictions for each task, relying on the
shared features to exploit the tasks’ commonalities.
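As a concrete toy illustration, here is a minimal NumPy sketch of this layout: a shared feature extractor feeding two hypothetical task heads. All layer sizes and the random weights are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch: 4 inputs with 8 features each
x = rng.normal(size=(4, 8))

# Shared feature extractor: one hidden layer used by every task
W_shared = rng.normal(size=(8, 16))
h = np.maximum(x @ W_shared, 0.0)   # ReLU features shared across tasks

# Task-specific heads: separate output layers per task
W_cls = rng.normal(size=(16, 3))    # e.g. a 3-way classification head
W_reg = rng.normal(size=(16, 1))    # e.g. a scalar regression head

y_cls = h @ W_cls                   # logits for task 1
y_reg = h @ W_reg                   # prediction for task 2

print(y_cls.shape, y_reg.shape)     # (4, 3) (4, 1)
```

In a trained model the shared weights would receive gradients from both task losses, which is what lets each task act as a regularizer for the other.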
• It is also useful when the data is limited: MTL can help to improve the
generalization performance by leveraging information shared across tasks.
• Biologically, humans learn in the same way. We learn better if we learn multiple
related tasks instead of focusing on one specific task for a long time.
• Now, let’s discuss the major and prevalent techniques to use MTL.
• Hard Parameter Sharing – A common hidden layer is used for all tasks, but several
task-specific layers are kept intact towards the end of the model.
• Soft Parameter Sharing – Each task has its own model with its own parameters, and
the distance between the parameters of the different models is regularized so
that the parameters become similar and can represent all the tasks.
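Soft parameter sharing can be sketched as an explicit penalty λ·‖w₁ − w₂‖² added to the task losses. The toy NumPy example below (made-up numbers, penalty term only) shows how gradient steps on that penalty pull the two tasks’ parameters together:

```python
import numpy as np

# Two task-specific parameter vectors (hypothetical toy models)
w1 = np.array([1.0, 2.0])
w2 = np.array([3.0, -1.0])
lam = 0.5    # strength of the soft-sharing penalty
lr = 0.1     # learning rate

# Penalty: lam * ||w1 - w2||^2; its gradient pulls the parameters together
for _ in range(10):
    g = 2 * lam * (w1 - w2)         # d/dw1 of the penalty (and -g for w2)
    w1, w2 = w1 - lr * g, w2 + lr * g

# The distance shrinks geometrically toward 0
print(np.linalg.norm(w1 - w2))
```

In a full training loop each task’s own loss gradient would be added to `g`, so the parameters stay similar without being forced to be identical.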
• Assumptions and Considerations – Using MTL to share knowledge among
tasks is useful only when the tasks are similar; when this assumption is
violated, the performance will decline significantly.
• Applications – MTL techniques have found various uses; some of the major
applications and practical considerations are:
• Task relatedness: MTL is most effective when the tasks are related or have some
commonalities, as in natural language processing, computer vision, and healthcare.
• Data limitation: MTL can be useful when the data is limited, as it allows the model to
leverage the information shared across tasks to improve the generalization
performance.
• Task-specific heads: task-specific heads are used to make predictions for each task.
• Shared decision-making layer: another approach is to use a shared decision-making layer,
where the decision-making layer is shared across tasks and the task-specific layers are
connected to it.
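The shared decision-making layer inverts the usual sharing pattern: each task keeps its own encoder and only the final decision layer is common. Again a toy NumPy sketch with hypothetical sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))          # hypothetical batch of 4 inputs

# Task-specific encoders: one separate hidden layer per task ...
W_task1 = rng.normal(size=(8, 16))
W_task2 = rng.normal(size=(8, 16))
h1 = np.maximum(x @ W_task1, 0.0)
h2 = np.maximum(x @ W_task2, 0.0)

# ... both feeding the same shared decision-making layer
W_decision = rng.normal(size=(16, 2))
y1 = h1 @ W_decision
y2 = h2 @ W_decision

print(y1.shape, y2.shape)            # (4, 2) (4, 2)
```

Because both encoders must produce features the common decision layer can use, the sharing happens at the output side rather than the input side of the network.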
• Overfitting: MTL models can be prone to overfitting if the model is not regularized
properly.
• Avoiding negative transfer: when the tasks are very different or independent, MTL can
lead to suboptimal performance compared to training a single-task model. Therefore, it is
important to make sure that the shared features are useful for all tasks to avoid negative
transfer.