Recent Computer Science Ph.D. graduates (AY 2008-2009)

Recent Computer Science Ph.D. graduates (AY 2008-2009)


Martin Allen: Agent Interactions in Decentralized Environments; Shlomo Zilberstein, Advisor; Feb. 2009; Mellon Post Graduate Fellow, Connecticut College.

The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful formal model for multiagent problems where cooperative action is optimal, but agents act based on local data alone. Unfortunately, Dec-POMDPs are intractable: NEXP-complete in the worst case, with experiments indicating optimal solutions are infeasible. Research therefore often focuses on special subclasses, restricting the interaction between agents. In some cases, structured interactions provably reduce complexity. Empirical results suggest this also holds where formal proofs are lacking.

I establish novel complexity results for some popular restricted-interaction models, showing worst-case intractability to be more wide-spread than previously known. I therefore propose new information-theoretic ways of analyzing the actual difficulty of Dec-POMDPs. Empirical results show that these measures correlate with problem difficulty, allowing us to apply optimal algorithms more widely, and intelligently tuning performance of approximate methods as well.

Benjamin Carterette: Low-Cost and Robust Evaluation of Information Retrieval Systems; James Allan, Advisor; Sept. 2008; Assistant Professor, University of Delaware.

Research in Information Retrieval has progressed against a background of rapidly increasing corpus size and heterogeneity, with every advance in technology quickly followed by a desire to organize and search more unstructured, more heterogeneous, and even bigger corpora. But as retrieval problems get larger and more complicated, evaluating the ranking performance of a retrieval engine gets harder: evaluation requires human judgments of the relevance of documents to queries, and for very large corpora the cost of acquiring these judgments may be insurmountable. This cost limits the types of problems researchers can study as well as the data they can be studied on.

I present methods for understanding performance differences between retrieval engines in the presence of missing and noisy relevance judgments. The work introduces a model of the cost of experimentation that incorporates the cost of human judgments as well as the cost of drawing incorrect conclusions about differences between engines in both the training and testing phases of engine development. Through adopting a view of evaluation that is more concerned with distributions over performance differences rather than estimates of absolute performance, the expected cost can be minimized so as to reliably differentiate between engines with less than 1% of the human effort that has been used in past experiments without sacrificing the ability to reuse the judgments in future experiments.

Rachel Cobleigh: PROPEL: An Approach Supporting User Guidance in Developing Precise and Accessible Property Specifications; Lori A. Clarke and George S. Avrunin, Advisors; Sept. 2008; Usability Specialist, The MathWorks, Inc.

Property specifications concisely describe an aspect of system behavior. Although each property has a narrow focus, it can be difficult to specify a property correctly. There are subtle, but important, details in desired system behavior that can often be overlooked. Property specifications should be precise enough to support automated analyses and plain enough to be easily understood by system developers. Property specifications can be written in mathematical formalisms, which provide precision, but are difficult to understand. In practice, property specifications are usually written in natural language, but such informality often makes them ambiguous and they cannot be used in many types of automated analyses.

We developed a PROPerty ELucidator tool (PROPEL) which aims to make the job of specifying and understanding properties easier by providing templates that build on commonly-occurring property patterns. These templates indicate variations in the subtle details that must be considered. PROPEL provides three alternative property views: graphical finite-state automata for precision; "disciplined" natural language, for understandability; and question trees, for guidance in selecting a template.

To evaluate PROPEL, we used it to specify properties in five case studies in the medical domain. The results indicate that our approach is promising because it was effective at specifying most of the properties we encountered.


Paul Dickson: Automatic Capture and Indexing of Lectures; W. Richards Adrion and Allen R. Hanson, Advisors; Sept. 2008; Visiting Assistant Professor, Hampshire College.

This dissertation describes the design, implementation, and testing of Presentations Automatically Organized from Lectures (PAOL), a content capture system that automatically captures lectures. PAOL-captured presentations include a video, a thumbnail-based table of contents, and windows containing the material currently being presented on computer and whiteboard. These presentations are searchable using the table of contents and give the viewer access to all material presented in the lecture. The capture process occurs transparently to the lecturer with the only requirement being that the lecturer wear a wireless microphone. No training or installation of special software is required. These presentations are created to give students access to class content after the class and through the table of contents allow students to review specific portions of the material presented.

PAOL is divided into 3 subsystems: video creation, computer capture, and whiteboard capture. Image processing techniques are used to create a video of the lecturer from fixed-view high-resolution cameras focused on the whiteboard. Image processing techniques are also used to identify significant changes on the computer and whiteboard in order to form the table of contents. The whiteboard images are also processed to remove the instructor, heighten contrast, and improve clarity.


Ao Feng: Incident Threading in News; James Allan, Advisor; Sept. 2008; Software Development Engineer, Amazon

With an overwhelming volume of news reports currently available, there is an increasing need for automatic techniques to analyze and present news to a general reader in a meaningful and efficient manner. We explore incident threading as a possible solution to this problem. All text that describes the occurrence of a real-world happening is merged into a news incident, and incidents are organized in a network with dependencies of predefined types.

Earlier attempts at this problem have assumed that a news story covers a single topic. We move beyond that limitation to introduce passage threading, which processes news at the passage level. First, we develop a new testbed for this research and extend the evaluation methods to address new granularity issues. Then, a three-stage algorithm is described that identifies on-subject passages, groups them into incidents, and establishes links between related incidents. Finally, we observe significant improvement over earlier work when we optimize the harmonic mean of the appropriate evaluation measures. The resulting performance exceeds the level that a calibration study shows is necessary to support a reading comprehension task.

Yu Gu: Scalable Techniques for Network Control and Evaluation; Donald F. Towsley, Advisor; Sept. 2008; Research Staff Member, NEC Lab

The Internet has been experiencing significant changes in its scale, capacity, user population, traffic volume and the variety and quantity of online applications and devices. Its continuing evolution demands scalable technologies that provide sustained support for a network with increasing scale and complexity.

In this thesis, we propose several scalable techniques that apply to network control and evaluation. Specifically, we consider the following three problems: First, what is the appropriate congestion control algorithm for networks as capacities get higher and higher and workloads evolve? Second, how does one efficiently measure one-way packet loss rates along paths in a large network? And third, how does one simulate large high bandwidth IP networks so as to obtain detailed packet level information?


Gary Holness: A Statistical Approach to Improving Accuracy in Classifier Ensembles; Paul E. Utgoff, Advisor; Sept. 2008; Lead Research Scientist, Lockheed Martin Advanced Technology Laboratories

Popular ensemble classifier induction algorithms, such as bagging and boosting, construct the ensemble by optimizing component classifiers in isolation. The controllable degrees of freedom in an ensemble include the instance selection and feature selection for each component classifier. Zenobi et al. demonstrated that ensemble construction should consider a classifier's contribution to ensemble accuracy and diversity even at the expense of individual classifier performance. To tradeoff individual accuracy against ensemble accuracy and diversity, a component classifier inducer requires knowledge of the choices made by the other ensemble members.

We introduce an approach, called DiSCO, that exercises direct control over the tradeoff between diversity and error by sharing ensemble-wide information on instance selection during training. In this work, we explore a method for training the component classifiers collectively by sharing information about training set selection. This allows our algorithm to build ensembles whose component classifiers select complementary error distributions that maximize diversity while minimizing ensemble error directly. Treating ensemble construction as an optimization problem, we explore approaches using local search, global search, and stochastic methods. In ensemble classification research, how to use diversity to build effective classifier teams is an open question. We also provide a method that uses entropy as a measure of diversity to train an ensemble classifier.


Ozgur Simsek: Behavioral building blocks for autonomous agents: description, identification, and learning; Andrew G. Barto, Advisor; Sept. 2008; Postdoctoral Research Fellow, Max Planck Institute.

The broad problem I address in my dissertation is design of autonomous agents that can learn how to achieve desired behaviors in large, complex environments. I focus on one essential design component: the ability to form new behavioral units (or skills) from existing ones. For instance, how can a robot (one that needs to manipulate objects) figure out that grasping is a useful behavior, learn how to perform this behavior, and use it as a behavioral building block in the future? Similarly, how can a tic-tac-toe player go through an analogous process to form a behavior for creating a fork on the board?

In addressing this problem, the main assumption I make is that the agent's interaction with its environment forms a Markov decision process. I make three primary contributions. First, I propose a concrete definition of what makes a useful skill. This definition uses a graphical representation of the agent's interaction with its environment and is based on an analysis of shortest paths on this graph. Second, I introduce several online algorithms for identifying such skills. And third, I formulate the exploration problem (of how to efficiently acquire a desired skill) as an optimization problem and introduce a reinforcement learning algorithm that solves it approximately.


Mark Smucker: Evaluation of Find-Similar with Simulation and Network Analysis; James Allan, Advisor; Sept. 2008; Assistant Professor, University of Waterloo

In this dissertation, we show how the addition of a simple user interaction mechanism, find-similar, can improve information retrieval (IR) quality by making it easier for users to navigate from relevant documents to other relevant documents. Find-similar allows a user to request documents similar to a given document. In the first part of the dissertation, we measure find-similar's retrieval potential through simulation of a user's behavior with hypothetical user interfaces. We show that find-similar has the potential to improve the retrieval quality of a state-of-the-art IR system by 23% and match the performance of relevance feedback. We also show how find-similar responds to varying initial conditions and acts to compensate for poor retrieval quality. In the second part of the dissertation, we characterize find-similar by measuring the quality of the document networks formed by find-similar's document-to-document similarity measure. Find-similar effectively creates links between documents that allow the user to navigate documents by similarity. We show that find-similar's similarity measure affects the navigability of the document network and how a query-biased similarity measure can improve find-similar. We develop measures of network navigability and show that find-similar should make the World Wide Web more navigable. Taken together, the simulation of find-similar and the measurement of the navigability of document networks shows how find-similar as a simple user interaction mechanism can improve a user's ability to find relevant documents.


Xuerui Wang: Structured Topic Models: Jointly Modeling Words and Their Accompanying Modalities; Andrew McCallum, Advisor; May 2009; Scientist, Yahoo! Labs.

The abundance of textual data in the information age poses an immense challenge for us: how to perform large-scale inference to understand and utilize this overwhelming amount of information. We develop effective and efficient statistical topic models for massive text collections by taking care of extra information from other modalities in addition to the text itself. Text documents are not just text; various additional information is naturally interleaved with text. Most of the previous work, however, pays attention to only one modality at a time and ignores the others. I present a series of probabilistic topic models to show how we can bridge multiple modalities of information, in a united fashion, for various tasks. Interestingly, joint inference over multiple modalities leads to many findings that cannot be discovered from just one modality alone, which are clear evidence that we can better understand and utilize massive text collections when additional modalities are considered and modeled jointly with text.


Ting Yang: Operating System Support for Modern Applications; Emery Berger and J. Eliot Moss, Advisors; May 2009; Software Engineer, Intel Corp.

Computer systems now run drastically different workloads than they did two decades ago. The enormous advances in both hardware and software technologies dramatically reshape applications and their behaviors, thus introducing more challenges to system resource management. However, existing general-purpose operating systems have not kept up. Their resource managers (CPU, memory and disk I/O) still work independently, leading to various performance issues. Garbage-collected applications suffer under memory pressure because the virtual memory manager does not provide enough information for them to adjust heap size. Interactive applications may lose responsiveness because one resource is overloaded.

To deliver robust performance, an operating system has to coordinate its resource managers, as well as cooperate with those in the user space (e.g. garbage collector and thread manager). To support garbage-collected applications, we present CRAMM, a system that enables them to adaptively adjust heap size to maintain high throughput by cooperating with the underlying virtual memory manager. To support highly interactive workloads, we present Redline, a system that uses lightweight specifications to drive CPU scheduling and to coordinate memory and disk I/O management to maintain system's responsiveness. Our experiences show that such coordination can dramatically improve application performance under heavy contention, sometimes by orders of magnitude.