DOI: 10.5555/2382029.2382100
Research article · Free access

Vine pruning for efficient multi-pass dependency parsing

Published: 03 June 2012

Abstract

Coarse-to-fine inference has been shown to be a robust approximate method for improving the efficiency of structured prediction models while preserving their accuracy. We propose a multi-pass coarse-to-fine architecture for dependency parsing using linear-time vine pruning and structured prediction cascades. Our first-, second-, and third-order models achieve accuracies comparable to those of their unpruned counterparts, while exploring only a fraction of the search space. We observe speed-ups of up to two orders of magnitude compared to exhaustive search. Our pruned third-order model is twice as fast as an unpruned first-order model and also compares favorably to a state-of-the-art transition-based parser for multiple languages.
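The key idea behind linear-time vine pruning is that most dependency arcs are short: in a vine parse, non-root arcs are restricted to a fixed maximum length, while attachments to the artificial root may span any distance. The paper learns its pruning decisions with structured prediction cascades; the hard length threshold below is only a simplified illustration of why the surviving search space is linear in sentence length (the function and parameter names are assumptions, not from the paper):

```python
def vine_prune(n_words, max_len):
    """Return the (head, modifier) arcs surviving a vine-style length filter.

    Words are indexed 1..n_words; index 0 is the artificial root.
    Non-root arcs longer than `max_len` are pruned; root attachments
    of any length are kept.
    """
    arcs = set()
    for head in range(0, n_words + 1):
        for mod in range(1, n_words + 1):
            if head == mod:
                continue
            # Root attachments always survive; other arcs must be short.
            if head == 0 or abs(head - mod) <= max_len:
                arcs.add((head, mod))
    return arcs

# For a 6-word sentence with bandwidth 2, only 24 of the 36 possible
# arcs survive; for fixed bandwidth the count grows linearly in length,
# versus quadratically for exhaustive first-order search.
pruned = vine_prune(6, 2)
```

A real coarse-to-fine parser would then run its (first-, second-, or third-order) model only over the surviving arcs, with the cascade's learned scores deciding which long arcs to reinstate.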


Cited By

  • (2012) Parse, price and cut. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 732–743. DOI: 10.5555/2390948.2391028
  • (2012) Generalized higher-order dependency parsing with cube pruning. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 320–331. DOI: 10.5555/2390948.2390988


Published In

NAACL HLT '12: Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
June 2012
840 pages
ISBN:9781937284206

Publisher

Association for Computational Linguistics, United States


Acceptance Rates

Overall Acceptance Rate 240 of 768 submissions, 31%



