NSF has awarded three more Information Technology Research (ITR) grants to researchers in this department.
First, Michael Black was granted $446K over three years for ``The Computer Science of Biologically Embedded Systems'', work done in conjunction with John Donoghue (Biomed-Neuroscience) and Lucien Bienenstock (Division of Applied Mathematics). Michael's abstract states, ``Biologically embedded systems that directly couple artificial computational devices with neural systems are emerging as a new area of information technology research. The physical structure and adaptability of the human brain make these biologically embedded systems quite different from computational systems typically studied in Computer Science.
``Fundamentally, biologically embedded systems must make inferences about the behavior of a biological system based on measurements of neural activity that are indirect, ambiguous, and uncertain. Moreover, these systems must adapt to short- and long-term changes in neural activity of the brain. These problems are addressed by a multidisciplinary team in the context of developing a robot arm that is controlled by simultaneous recordings from neurons in the motor cortex of a subject. The goal is to model the behavior of these neurons probabilistically as a function of arm motion and then reconstruct continuous arm trajectories based on the neural activity. To do so, the project will exploit mathematical and computational techniques from computer vision, image processing, and machine learning.
``This work will enhance scientific knowledge about how to design and build new types of hybrid human/computer systems, will explore new devices to assist the severely disabled, will address the information technology questions raised by these biologically embedded systems, and will contribute to the understanding of neural coding.''
Second, Eli Upfal, working with Michael Mitzenmacher at Harvard, has been awarded $524K over five years for research on ``Algorithmic Issues in Large-Scale Dynamic Networks''. Eli summarizes his work as follows: ``We propose to develop a theoretically well-founded framework for the design and analysis of algorithms for large-scale dynamic networks, in particular, for the Web and related dynamic networks, such as the underlying Internet topology and Internet-based peer-to-peer ad hoc networks. We plan to develop rigorous mathematical models that capture key characteristics and can make reliable predictions about features such as connectivity, information content, and dynamics of these networks. We plan to apply this framework to test existing algorithms and construct improved new algorithms.
``The main benefits of developing the mathematical models of the Web structure and dynamics will be the improved theoretical foundation for the design, analysis, and testing of algorithms that operate in the Web environment. The tangible results of this work will therefore be models that can be subjected to experimental verification, analyses of algorithms based upon these models, new algorithms that benefit from these analyses, and, finally, proof-of-concept demonstrations and experimental evaluations of such algorithms.''
Third, Eugene Charniak received $450K over three years for work on ``Learning Syntactic/Semantic Information for Parsing''. Eugene says, ``The research envisioned under this grant concerns the unsupervised learning of structural information about English that is not present in current tree-banks (specifically the various Penn tree-banks). That is, one wants a machine to learn this information without having to create a corpus in which the information is annotated. Generally, unsupervised (as opposed to supervised) learning is of more interest because ultimately such research may shed light on the larger problems of learning a complete grammar for English, because the creation of significant corpora is a very labor-intensive task that should be avoided if at all possible, and because quite often the subdomains in question are areas of theoretical dispute, so obtaining the agreement necessary prior to a corpus-creation project might be difficult.
``This proposal is called `Learning Syntactic/Semantic Information for Parsing' because the structural information to be learned often falls at the boundary between syntax and semantics. For example, is the fact that `Fred' is typically a person's first name a syntactic or semantic fact? Does the fact that `New York Stock Exchange' has as part of its name the location `New York' fall under syntax or semantics? What about the similarity between the expressions `[to] market useless items' and `the market for useless items'? These are some of the topics that come up in this research.
``As for the `for Parsing' portion of the title, the intention is to learn the above kinds of information in a form that current statistical parsers can use so that they can output more finely structured parses. However, this is not meant to suggest that parsing is the sole use for this sort of information -- exactly the opposite is the case. For example, more and more systems for automatically extracting information from free text use coreference detection and `named-entity recognition' (e.g., recognizing that `New York' is a location but `New York Stock Exchange' is an organization). There is evidence to suggest that both coreference and named-entity recognition can be improved with the finer level of analysis to be made possible by this research. Or again, `language models' (programs that assign a probability to strings in a language) are standard parts of all current speech-recognition systems. There is now evidence suggesting that finer-grained syntactic analysis can improve current language models. Thus this research will enable a wide variety of systems to make better use of language input and thus make these systems more accessible to a diverse user pool.''