Harvard Computer Science Technical Reports for 2007

TR-01-07 [tr-01-07.pdf (159 K)]
Sevan G. Ficici and Avi Pfeffer . 2007. ``Modeling how Humans Reason about Others with Partial Information .''
Computer agents participate in many collaborative and competitive multi-agent domains in which humans make decisions. For computer agents to interact successfully with people in such environments, an understanding of human reasoning is beneficial. In this paper, we investigate the question of how people reason strategically about others under uncertainty and the implications of this question for the design of computer agents. Using a situated partial-information negotiation game, we conduct human-subjects trials to obtain data on human play. We then construct a hierarchy of models that explores questions about human reasoning: Do people explicitly reason about other players in the game? If so, do people also consider the possible states of other players for which only partial information is known? Is it worth trying to capture such reasoning with computer models and subsequently utilize them in computer agents? We further address these questions by constructing computer agents that use our models; we deploy our agents in further human-subjects trials for evaluation. Our results indicate that people do reason about other players in our game and that the computer agents that best model human play obtain superior scores.
TR-02-07 [tr-02-07.pdf (174 K)]
Sevan G. Ficici and Avi Pfeffer . 2007. ``Simultaneously Modeling Humans' Preferences and their Beliefs about Others' Preferences .''
In strategic multi-agent decision making, it is often the case that a strategic reasoner must hold beliefs about other agents and use these beliefs to inform its decision making. The behavior thus produced by the reasoner reflects an interaction between the reasoner's beliefs about other agents and the reasoner's own preferences. In this paper, we are interested to investigate human reasoning, particularly the interaction between a human's utility function and the beliefs the human holds to reason about other agents. A significant challenge faced by model designers, therefore, is how to model such a reasoner's behavior so that the reasoner's preferences and beliefs can each be identified and distinguished from each other. In this paper, we introduce a model of strategic human reasoning that allows us to distinguish between the human's utility function and the human's beliefs about another agent's utility function as well as the human's beliefs about how that agent might interact with yet other agents. We show that our model is uniquely identifiable. We then illustrate the performance of our model in a multi-agent negotiation game.
TR-03-07 [tr-03-07.ps.gz (277 K), tr-03-07.pdf (224 K)]
Ya'akov Gal, Barbara J. Grosz, Avi Pfeffer, Stuart Shieber, and Alex Allain. 2007. ``The Influence of Contexts on Decision-making.''
Many environments in which people and computer agents interact involve deploying resources to accomplish tasks and satisfy goals. This paper investigates the way that the contextual setting in which decisions are made affects the behavior of people and the performance of computer agents that interact with people in such environments. It presents experiments that measured negotiation behavior in two types of contextual settings. One provided a task context that made explicit the relationship between goals, tasks and resources, The other provided a completely abstract context in which the payoffs for all decision choices were listed. Results show that people are more helpful, less selfish, and less competitive when making decisions in task contexts than when making them in completely abstract contexts. Further, their overall performance was better in task contexts. A predictive computational model that was trained on data obtained in task contexts outperformed a model that was trained under abstract contexts. These results indicate that modeling the way people make decisions in context is essential for the design of computer agents that will interact with people.
TR-04-07 [tr-04-07.pdf (175 K)]
Charlie Frogner and Avi Pfeffer . 2007. ``Heuristics for Automatically Decomposing a Stochastic Process for Factored Inference.''
Dynamic Bayesian networks are factored representations of stochastic processes. Despite their factoredness, exact inference in DBNs is generally intractable. One approach to approximate inference involves factoring the variables in the process into components. In this paper we study eĀ±cient methods for automatically decomposing a DBN into weakly- interacting components so as to minimize the error in inference entailed by treating them as independent. We investigate heuristics based on two views of weak interaction: mutual information and the degree of separability ([Pf01] and [Pf06]). It turns out, however, that measuring the degree of separability exactly is probably intractable. We present a method for estimating the degree of separability that includes a mechanism for trading off efficiency and accuracy. We use the afore- mentioned heuristics in two clustering frameworks to find weakly-interacting components, and we give an empirical comparison of the results in terms of the error encountered in the belief state across the whole system, as well as that in the belief states across single components.
Sasha Rush. 2007. ``.''

TR-06-07 [tr-06-07.pdf (174 K)]
Rani Nelken and Stuart M. Shieber. 2007. ``Lexical Chaining and Word-Sense-Disambiguation .''
Lexical chains algorithms attempt to find sequences of words in a document that are closely related semantically. Such chains have been argued to provide a good indication of the topics covered by the document without requiring a deeper analysis of the text, and have been proposed for many NLP tasks. Different underlying lexical semantic relations based on WordNet have been used for this task. Since links in WordNet connect synsets rather than words, open word-sense disambiguation becomes a necessary part of any chaining algorithm, even if the intended application is not disambiguation. Previous chaining algorithms have combined the tasks of disambiguation and chaining by choosing those word senses that maximize chain connectivity, a strategy which yields poor disambiguation accuracy in practice.

We present a novel probabilistic algorithm for finding lexical chains. Our algorithm explicitly balances the requirements of maximizing chain connectivity with the choice of probable word-senses. The algorithm achieves better disambiguation results than all previous ones, but under its optimal settings shifts this balance totally in favor of probable senses, essentially ignoring the chains. This model points to an inherent conflict between chaining and word-sense-disambiguation. By establishing an upper bound on the disambiguation potential of lexical chains, we show that chaining is theoretically highly unlikely to achieve accurate disambiguation.

Moreover, by defining a novel intrinsic evaluation criterion for lexical chains, we show that poor disambiguation accuracy also implies poor chain accuracy. Our results have crucial implications for chaining algorithms. At the very least, they show that disentangling disambiguation from chaining significantly improves chaining accuracy. The hardness of all-words disambiguation, however, implies that finding accurate lexical chains is harder than suggested by the literature.

TR-07-07 [tr-07-07.pdf (337 K)]
Hamilton Y. Chong and Steven J. Gortler. 2007. ``Scene Optimized Shadow Mapping.''
Shadow mapping possesses an aliasing issue that is difficult to control. Recently a number of techniques, beginning with Perspective Shadow Maps have been proposed to choose the best possible 4 by 4 matrix for the lights projection matrix. These new methods offer no guarantees of optimality, and can perform very poorly for some light configurations. Here we describe a method that chooses the 4 by 4 matrix using an optimization frame- work. We then describe numerical experimental results comparing our method to some of the previously suggested techniques.
TR-08-07 [tr-08-07.pdf (1227 K)]
Simson L. Garfinkel. 2007. ``An Evaluation of Amazon's Grid Computing Services: EC2, S3 and SQS .''
Amazon.com's Elastic Compute Cloud (EC2), Simple Storage Service (S3) and Simple Queue Service (SQS) offer enterprise-class computing, storage and coordination facilities to any organization or individual in the world with a valid credit card. This paper details our experience working with these commodity grid computing services between November 2006 and May 2007, including an analysis of the overall system's API and ease-of-use; an analysis of EC2's management and security facilities; an end-to-end performance analysis of S3's throughput and latency as observed from Amazon's EC2 cluster and other locations on the Internet; and an analysis of the SQS operation and performance. We conclude with a report of our experience moving a large-scale research application from dedicated hardware to the Amazon offering. We find that this collection of Amazon Web Services (AWS) has great promise but are hobbled by service consistency problems, the lack of a Service Level Agreement (SLA), and a problematic Web Services Licensing Agreement (WSLA).
TR-09-07 [tr-09-07.pdf (335 K)]
Aleksandar Nanevski, Paul Govereau, and Greg Morrisett . 2007. ``Type-theoretic Semantics for Transactional Concurrency .''
We propose a dependent type theory that combines programming, specification and reasoning about higher-order concurrent programs with shared higher-order transactional memory. /par We build on our previous work on Hoare Type Theory (HTT), which is extended here with types that correspond to Hoare-style specifications for transactions. The new types have the form CMD I P x A Q, and classify concurrent programs that may execute in a shared state with invariant I, and local state precondition P. Upon termination, such programs return a result of type A, and local state changed according to the postcondition Q. Throughout the execution, shared state always satisfies the invariant I, except at specific critical sections which are executed atomically; that is, sequentially, without interference from other processes. Atomic sections may violate the invariant, but must restore it upon exit.

We follow the idea of Separation Logic, and adopt "small footprint" specifications, whereby each process is associated with a precondition that tightly describes its state requirement. /par HTT is a program logic and a dependently typed programming language at the same time. To consider it as a logic, we prove that it is sound and compositional. To consider it as a programming language, we define its operational semantics, and show adequacy with respect to the specifications.

TR-10-07 [tr-10-07.pdf (739 K)]
Yuanchen Zhu and Steven J. Gortler. 2007. ``3D Deformation Using Moving Least Squares.''
We present a 3d deformation method based on Moving Least Squares that extends the work by Schaefer et al. [Schaefer et al. 2006] to the 3d setting. The user controls the deformation by manipulating a set of point handles. Locally, the deformation takes the form of either a rigid transformation or optionally a similarity transformation, and tends to preserve local features. Our derivation of the closed-form solution is based on singular value decomposition, and is applicable to deformation in arbitrary dimensions, as opposed to the planar case in [Schaefer et al. 2006]. Our prototype implementation allows interactive deformation of meshes of over 100k vertices.

For the application of 3d mesh deformation, we further introduce a weighting scheme that determines the influence of point handles on vertices based on approximate mesh geodesics. In practice, the new scheme gives much better deformation results for limbed character models, compared with simple Euclidean distance based weighting. The new weighting scheme can be of use to the traditional skinny based deformation technique as well.

TR-11-07 [tr-11-07.pdf (1288 K)]
Bor-rong Chen, Geoffrey Peterson, Geoff Mainland, and Matt Welsh . 2007. ``LiveNet: Using Passive Monitoring to Reconstruct Sensor Network Dynamics .''
Understanding the behavior of deployed sensor networks is difficult as they become more sophisticated and larger in scale. Much of the difficulty comes from the lack of tools to provide a global view on the network dynamics. This paper describes LiveNet, a set of tools and techniques for reconstructing complex dynamics of live sensor network deployments. LiveNet is based on the use of passive sniffers co-deployed with the network. We address several challenges: merging multiple sniffer traces, determining coverage of sniffers, inference of missing information for path reconstruction and high-level analyses with application-specific knowledge. To validate LiveNet'saccuracy, we conduct controlled experiments on an indoor testbed. Finally, wepresent data from a real deployment using LiveNet. The results show thatLiveNet is able to to reconstruct network topology, bandwidth usage, routingpaths, identify hot-spot nodes, and disambiguate failures observed at application level without instrumenting application code.
TR-12-07 [tr-12-07.pdf (123 K)]
Avi Pfeffer. 2007. ``A General Importance Sampling Algorithm for Probabilistic Programs.''
Highly expressive probabilistic modeling languages are capable of describing many interesting models. Some of these models are quite complex, so approximate inference algorithms are needed. One approach to approximate inference is importance sampling, but this can be hard to do in expressive languages because of the many deterministic relationships between concepts. As a result, a sampling algorithm has to make commitments to choices long before their effects can be determined. This paper presents an importance sampling algorithm based on the principle of using the structure of a model to infer as much as possible about a decision before making a commitment. The algorithm is presented for the IBAL language, a highly expressive and general probabilistic modeling language based on the functional programming paradigm. The paper demonstrates using a musical example how easy it is to encode interesting new models in IBAL. Results show that the importance sampling algorithm is able to make interesting and useful inferences, and is far superior to a rejection sampling algorithm. The paper presents proof of concept on the musical example that the algorithm is capable of handling real applications.
TR-13-07 [tr-13-07.pdf (984 K)]
Rohan Narayana Murty, Abhimanyu Gosain, Matthew Tierney, Andrew Brody, Amal Fahad, Josh Bers, and Matt Welsh. 2007. ``CitySense: A Vision for an Urban-Scale Wireless Networking Testbed.''
In this paper, we present the vision for an open, urban-scale wireless networking testbed, called CitySense, with the goal of supporting the development and evaluation of novel wireless systems that span an entire city. CitySense is currently under development and will consist of about 100 Linux-based embedded PCs outfitted with dual 802.11a/b/g radios and various sensors, mounted on buildings and streetlights across the city of Cambridge. CitySense takes its cue from citywide urban mesh networking projects, but will differ substantially in that nodes will be directly programmable by end users. The goal of CitySense is explicitly not to provide public Internet access, but rather to serve as a new kind of experimental apparatus for urban-scale distributed systems and networking research efforts. In this paper we motivate the need for CitySense and its potential to support a host of new research and application developments. We also outline the various engineering challenges of deploying such a testbed as well as the research challenges that we face when building and supporting such a system.
TR-14-07 [tr-14-07.pdf (296 K)]
Justin Werfel and Radhika Nagpal. 2007. ``Towards a common comparison framework for global-to-local programming of self-assembling robotic systems.''
Self-assembling robotic systems are a class of modular robots where many identically-programmed modules mix randomly and attach together to assemble into complex shapes. Recently several groups have proposed global-to-local compilers for such systems and for related robotic systems; the compilers take a global shape description and automatically generate appropriate local module attachment rules. A current challenge is understanding how these approaches compare in terms of complexity, parallelism, and expressivity.

In this paper we present some initial steps towards a common framework for comparing algorithmic approaches to self-assembly. We specify a generic 2D self-assembly framework and then use this framework to compare self-assembly algorithms based on two distinct global-to-local compilers: the MakeGraphGrammar approach for generating topological graphs and the CollectiveConstruction approach for mobile robots that build structures using communicating blocks. We show that (1) both approaches generate module programs with the same high complexity, O(total modules); (2) both approaches have dramatically different best-case parallelism: while the number of steps required to build a shape with CollectiveAssembly is bounded by the diameter of that shape, MakeGridGrammar has a best case of no more than four parallel steps to complete any shape it can handle; (3) both approaches can provably generate 2D shapes without defects, but the class of shapes that it can handle is much larger for CollectiveConstruction and poses less stringent module requirements.

TR-15-07 [tr-15-07.pdf (253 K)]
Stephen Chang, Adam Kirsch, and Michael Lyons. 2007. ``Energy and Storage Reduction in Data Intensive Wireless Sensor Network Applications .''
Recording large data sets in current wireless sensor networks is difficult if not impossible, as sensor network nodes feature extremely limited capabilities. Meager amounts of memory limit storage and slow network transmission rates limit data transfer. Even if enough data is recorded, network lifetime may be brief due to energy consumption. In this paper, we discuss using Bloom filters to reduce the memory and network transmission requirements for sending or storing large amounts of data. Using our approach, we can gather more information from our wireless sensor network and simultaneously extend the network lifetime.