The Association for Computational Linguistics recently named Eugene Charniak among its founding group of Fellows for his significant contributions to natural language parsing. The ACL Fellows program recognizes ACL members whose contributions to the field have been most extraordinary.
He was also awarded the 2011 Calvin & Rose G Hoffman Prize for a Distinguished Publication on Christopher Marlowe. The prize recognized the essay "Statistical Stylometrics and the Marlowe-Shakespeare Authorship" (with coauthors Neal Fox and Omran Ehmoda), which asked the question, "When we don't know who the author of a document is, but we have a set of candidates, how can we make confident predictions about who its author is?" Eugene and his coauthors theorized that the statistics that might represent an author's "fingerprint" include how often the author uses certain function words, general part-of-speech usage frequencies, and the likelihoods of transitions between them. The team developed models that use these statistics to build an "average fingerprint" for each of a set of candidate authors, then tested an unlabeled document against the candidates to determine which author's fingerprint is closest to the text of unknown authorship. These methods were applied to the Shakespeare authorship debate, specifically to Christopher Marlowe's candidacy as the true author of the works attributed to Shakespeare.
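To make the fingerprint idea concrete, here is a minimal sketch in Python. It is not the model from the essay: it uses only function-word frequencies (no part-of-speech or transition statistics), compares fingerprints by plain Euclidean distance, and the word list, function names, and toy texts are all assumptions invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list, chosen only for illustration.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "that", "with", "not"]

def fingerprint(text):
    """Relative frequency of each function word in one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]

def average_fingerprint(texts):
    """Component-wise mean of the fingerprints of an author's known texts."""
    prints = [fingerprint(t) for t in texts]
    return [sum(column) / len(prints) for column in zip(*prints)]

def distance(a, b):
    """Euclidean distance between two fingerprints (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_author(unknown_text, corpus):
    """Return the candidate whose average fingerprint is nearest the unknown text."""
    target = fingerprint(unknown_text)
    profiles = {author: average_fingerprint(texts) for author, texts in corpus.items()}
    return min(profiles, key=lambda author: distance(target, profiles[author]))

# Toy corpus with made-up sentences; real work would use full plays or poems.
corpus = {
    "A": ["the the the cat of the dog", "the bird and the the tree"],
    "B": ["not to be with that in mind", "to go not with that to in"],
}
guess = closest_author("the fox and the the hound", corpus)
```

With texts this short the result means little; the sketch only shows the shape of the procedure: compute a profile per candidate, compute the same statistics for the unknown document, and pick the nearest profile.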
Eugene received an A.B. degree in Physics from the University of Chicago and a Ph.D. in Computer Science from M.I.T. He has published four books: Computational Semantics, with Yorick Wilks (1976); Artificial Intelligence Programming (now in a second edition), with Chris Riesbeck, Drew McDermott, and James Meehan (1980, 1987); Introduction to Artificial Intelligence, with Drew McDermott (1985); and Statistical Language Learning (1993). He is a Fellow of the American Association for Artificial Intelligence and was previously a Councilor of the organization. Eugene was recently honored with a lifetime achievement award from the ACL.
Eugene is interested in programming computers to understand language so that they will be able to perform such tasks as answering questions and holding a conversation. This is far beyond our current capabilities, so research proceeds by dividing the problem into manageable subparts. His approach is called "statistical language learning": he and his students write programs that collect statistical information about language from large amounts of text, then apply those statistics to new examples. For example, much of his recent research has been on statistical models of syntactic parsing, that is, identifying parts of speech and learning the rules for sentence formation, an exercise akin to the sentence diagramming that most of us did in school. Most researchers believe parsing is a small but important step toward true language understanding.
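The "collect statistics, then apply them" pattern can be illustrated with a deliberately tiny Python sketch. It is far simpler than a statistical parser and is not Eugene's method: it only counts how often each word carries each part-of-speech tag in some labeled text, then tags new words with their most frequent tag. The training sentences, tag names, and function names are assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Tiny hand-made training set of (word, tag) pairs, invented for illustration.
TAGGED_SENTENCES = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("the", "DET"), ("run", "NOUN"), ("ends", "VERB")],
    [("dogs", "NOUN"), ("run", "VERB")],
    [("they", "PRON"), ("run", "VERB")],
]

def collect_statistics(tagged_sentences):
    """Count how often each word appears with each tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return counts

def tag(words, counts, default="NOUN"):
    """Give each word its most frequent training tag; unseen words get `default`."""
    tagged = []
    for word in words:
        seen = counts.get(word.lower())
        best = seen.most_common(1)[0][0] if seen else default
        tagged.append((word, best))
    return tagged

stats = collect_statistics(TAGGED_SENTENCES)
result = tag(["the", "dogs", "run"], stats)
```

Here "run" is tagged as a verb because it appears as a verb twice and as a noun once in the training data. A real parser conditions on much richer context, but the division of labor is the same: statistics are gathered once from text and then reused on sentences the program has never seen.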