a.k.a. Joint Learning, Learning with auxiliary tasks.

Using two or more loss functions ⇒ MTL!

<aside> 💡 “MTL improves generalization by leveraging the domain-specific information contained in the training signals of related tasks.”

</aside>

Although acceptable performance can be achieved by focusing entirely on a single task, doing so can ignore information that would help us do even better on the metric we care about. Specifically, this information comes from the training signals of related tasks. By sharing representations among these related tasks, the model generalizes better on the task we actually care about. This approach is known as multi-task learning (MTL).

Sharing representations → better generalization!
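As a minimal sketch of this idea, the PyTorch snippet below shows hard parameter sharing: a single shared encoder feeds two task-specific heads, and the two loss functions are summed into one training objective. The dimensions, the 0.5 loss weight, and the choice of a classification plus a regression task are illustrative assumptions, not details from the original note.

```python
import torch
import torch.nn as nn

# Hypothetical hard-parameter-sharing MTL model:
# one shared encoder, one head per task.
class MultiTaskModel(nn.Module):
    def __init__(self, in_dim=32, hidden_dim=64, num_classes=3):
        super().__init__()
        # Shared representation trained by the signals of both tasks
        self.shared = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
        )
        self.cls_head = nn.Linear(hidden_dim, num_classes)  # task A: classification
        self.reg_head = nn.Linear(hidden_dim, 1)            # task B: regression

    def forward(self, x):
        h = self.shared(x)
        return self.cls_head(h), self.reg_head(h)

model = MultiTaskModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
cls_loss_fn = nn.CrossEntropyLoss()
reg_loss_fn = nn.MSELoss()

# Dummy batch: 8 samples with labels for both tasks (illustrative only)
x = torch.randn(8, 32)
y_cls = torch.randint(0, 3, (8,))
y_reg = torch.randn(8, 1)

cls_out, reg_out = model(x)
# Two loss functions combined into a single objective => MTL
loss = cls_loss_fn(cls_out, y_cls) + 0.5 * reg_loss_fn(reg_out, y_reg)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because both losses backpropagate through the same shared encoder, each task acts as a regularizer for the other, which is where the generalization benefit comes from.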
