A unified architecture for natural language processing: Deep neural networks with multitask learning

R Collobert, J Weston - Proceedings of the 25th international conference …, 2008 - dl.acm.org
Proceedings of the 25th international conference on Machine learning, 2008
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense (grammatically and semantically) using a language model. The entire network is trained jointly on all these tasks using weight-sharing, an instance of multitask learning. All the tasks use labeled data except the language model, which is learnt from unlabeled text and represents a novel form of semi-supervised learning for the shared tasks. We show how both multitask learning and semi-supervised learning improve the generalization of the shared tasks, resulting in state-of-the-art performance.
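The weight-sharing idea in the abstract can be illustrated with a heavily reduced sketch. It is not the paper's architecture: there is no convolution and no sentence context, only a word-embedding table shared by all tasks plus one linear softmax layer per task. The class name `SharedEmbeddingMultitask`, the task names, the toy data, and the choice to pick a task at random on each step are all assumptions made for illustration.

```python
import math
import random


class SharedEmbeddingMultitask:
    """Toy multitask model: one shared embedding table, one softmax head per task."""

    def __init__(self, vocab_size, embed_dim, task_num_labels, seed=0):
        rng = random.Random(seed)
        # Shared across all tasks: this is the weight-sharing part.
        self.embeddings = [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                           for _ in range(vocab_size)]
        # Task-specific output layers, keyed by (hypothetical) task name.
        self.heads = {
            task: [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                   for _ in range(n_labels)]
            for task, n_labels in task_num_labels.items()
        }

    def predict_proba(self, task, word_id):
        e = self.embeddings[word_id]
        logits = [sum(w * x for w, x in zip(row, e)) for row in self.heads[task]]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [v / total for v in exps]

    def sgd_step(self, task, word_id, label, lr=0.1):
        """One cross-entropy SGD step; returns the loss before the update."""
        e = self.embeddings[word_id]
        head = self.heads[task]
        probs = self.predict_proba(task, word_id)
        # Gradient w.r.t. the logits: softmax output minus one-hot target.
        dlogits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
        # Gradient w.r.t. the shared embedding, computed before the head changes.
        de = [sum(dlogits[k] * head[k][j] for k in range(len(head)))
              for j in range(len(e))]
        for k, row in enumerate(head):      # update only this task's head
            for j in range(len(row)):
                row[j] -= lr * dlogits[k] * e[j]
        for j in range(len(e)):             # update the shared embedding
            e[j] -= lr * de[j]
        return -math.log(probs[label])


# Toy labeled data: (word_id, label) pairs per task, invented for this sketch.
toy_data = {"pos": [(0, 1), (1, 2)], "chunk": [(0, 0), (1, 1)]}
model = SharedEmbeddingMultitask(vocab_size=5, embed_dim=4,
                                 task_num_labels={"pos": 3, "chunk": 2})
picker = random.Random(1)
for _ in range(200):
    task = picker.choice(sorted(toy_data))      # pick a task at random
    word_id, label = picker.choice(toy_data[task])
    model.sgd_step(task, word_id, label)
```

Each step touches only the selected task's output layer, but every step moves the shared embeddings, so what is learned from one task's labels is visible to the others. In the same spirit, a language-model objective trained on unlabeled text could be treated as one more task that updates the shared table.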