Learning and Inference for Hierarchically Split PCFGs
Google Tech Talks
February 28, 2008
ABSTRACT
Treebank parsing can be seen as the search for an optimally refined grammar consistent with a coarse training treebank. We describe a method in which a minimal grammar is hierarchically refined using expectation-maximization (EM) to give accurate, compact grammars. The resulting grammars are far smaller than those of other high-performance parsers, yet the parser gives the best published accuracies on several languages, as well as the best generative parsing numbers in English. In addition, we give an associated coarse-to-fine inference scheme that vastly reduces inference time with no loss in test set accuracy.
Slides: http://www.eecs.berkeley.edu/~petrov/data/google_talk.ppt
Speaker: Slav Petrov
Slav Petrov is a Ph.D. candidate in the Department of Computer Science at the University of California, Berkeley, where he is also a research assistant working with Dan Klein and Jitendra Malik on inducing latent structure for perception problems in vision and language.