# Harvard Computer Science Technical Reports for 2005

TR-01-05 [tr-01-05.ps.gz (825 K), tr-01-05.pdf (181 K)]
Sukyoung Ryu and Norman Ramsey. 2005. Source-level Debugging for Multiple Languages With Modest Programming Effort.''
We present techniques that enable source-level debugging for multiple languages at the cost of only modest programming effort. The key idea is to avoid letting debugging requirements constrain the internal structure of the compiler. Constraints are minimized primarily by hiding the source-language type system and target-machine representations from the debugger. This approach enables us to support a new language and compiler while reusing existing elements: a multi-language, multi-platform debugger; the compiler's implementation of source-language types and expressions; information already present in the compiler's private data structures; and our \emph{compile-time support library}, which helps the compiler meet its obligations to the debugger without exposing language-dependent details. We evaluate our approach using two case studies: the production compiler exttt{lcc} and an instructional compiler for MiniJava.
TR-02-05 [tr-02-05.ps.gz (450 K), tr-02-05.pdf (369 K)]
Adam Kirsch and Michael Mitzenmacher. 2005. Building a Better Bloom Filter.''
A technique from the hashing literature is to use two hash functions h_1(x) and h_2(x) to simulate additional hash functions of the form g_i(x) = h_1(x) + i h_2(x). We demonstrate that this technique can be usefully applied to Bloom filters and related data structures. Specifically, only two hash functions are necessary to effectively implement a Bloom filter without any loss in the asymptotic false positive probability. This leads to less computation and potentially less need for randomness in practice.
TR-03-05 [tr-03-05.ps.gz (711 K), tr-03-05.pdf (433 K)]
Brenda Ng, Avi Pfeffer, and Richard Dearden. 2005. Factored Sampling For Efficient Tracking of Large Hybrid Systems.''
This work presents a new approach to monitoring large dynamic systems. The approach is based on factored particles, which adapts particle filtering by factoring the system into weakly interacting subsystems and maintaining particles over the factors, thus allowing much larger systems to be tracked. Our approach, hybrid factored sampling, works with systems that involve both discrete and continuous variables, including systems where discrete variables depend on continuous parents. The framework lends itself to asynchronous inference---each factor can be reasoned about independently, and the factors joined only when there exists sufficient correlation between them. This allows us to reason about each factor at its appropriate time granularity. In addition, hybrid factored sampling exploits the factorization to provide tractable look-ahead prediction, allowing sampling from the posterior probability given new observations, and considerably improving performance. Empirical results show that hybrid factored sampling is an efficient and versatile method for inference in large hybrid systems.
TR-04-05 [tr-04-05.ps.gz (236 K), tr-04-05.pdf (174 K)]
Ya'akov Gal, Shavit Talman, Meirav Hadad , and Sarit Kraus. 2005. Adapting to Agents' Personalities in Negotiation.''
To establish cooperative relationships, agents must be willing to engage in helpful behavior and to keep their commitments with agents who reciprocate this behavior. However, in uncertain and dynamic environments, it is difficult to identify the degree of helpfulness of other agents. This paper approaches this problem by characterizing agents' helpfulness in terms of cooperation and reliability. An agent chooses an action based on other agents' helpfulness as well as the dependency relationship between the agent and others. This model was evaluated in a negotiation game in which players needed to exchange resources to reach their goals, but did not have information about each other's resources. Results showed that the model allowed agents to identify and to adapt to others' varying degree of helpfulness, even while they are constantly changing their strategy. Moreover, agents who vary their cooperativeness and reliability depending on those traits of others, can outperform agents who do not, as well as increase the social welfare of the group.
TR-05-05 [tr-05-05.ps.gz (393 K), tr-05-05.pdf (324 K)]
Paul Govereau. 2005. Type Generativity in Higher-Order Module Systems.''
We present a higher-order module system similar to those found in Standard ML and Objective Caml. Our system allows both generative and non-generative types. Unlike other systems, the generativity of a type is reflected directly in the signature of the module in which it is declared, allowing a more direct analysis of type abstraction and generativity. Our module system can express both generative and applicative functors, and allows mixing of generative and non-generative types within a single module. This gives the programmer greater control over type generativity, and brings a new perspective to the relationship between the two styles of abstraction.
TR-06-05 [tr-06-05.ps.gz (233 K), tr-06-05.pdf (169 K)]
Christian Stefansen. 2005. SMAWL: A SMAll Workflow Language Based on CCS .''
There is an ongoing debate in the workflow community about the relative merits of Petri nets and pi-calculus for workflow modeling. Recently, van der Aalst presented some challenges to model workflow in pi-calculus. This paper responds to those challenges by showing how to code the 20 most commonplace workflow patterns in CCS (a subset of pi-calculus), and describes two new workflow patterns that were identified in the process. It then presents a CCS-based language, SMAWL, as a result of this work and shows how the Recording Star example can be expressed in SMAWL. The applicability of pi-calculus to the workflow modeling domain is briefly discussed and a new overlaying operator is discussed with applications to workflow descriptions.
TR-07-05 [tr-07-05.ps.gz (419 K), tr-07-05.pdf (350 K)]
Ian Fischer and Craig Gotsman. 2005. Fast Approximation of High Order Voronoi Diagrams and Distance Transforms on the GPU.''
We present an implementation of the tangent-plane algorithm for computing the kth-order Voronoi diagram of a set of sites in image space. Correct and efficient implementation of this algorithm using graphics hardware is possible only with the use of an appropriate shader program on the GPU. This is achieved by rendering in k passes the parallel projection of the top k levels of an arrangement of planes tangent to the paraboloid z = x2+y2. Each level of the arrangement corresponds to the so-called kth-nearest point diagram, which is interesting in its own right. Composition of the images of the k top levels results in the kth-order Voronoi diagram. The diagram facilitates efficient computation of the k nearest neighbors of an arbitrary query point. We describe our implementation of the algorithm in OpenGL and Cg, and its optimizations. We also show how to efficiently compute the distance transform of the given sites using the GPU, based on the first-order Voronoi diagram.
TR-08-05 [tr-08-05.pdf (863 K), tr-08-05.ps.gz (5900 K)]
Victor Shnayder, Bor-rong Chen, Konrad Lorincz, Thaddeus R. F. Fulford-Jones, and Matthew Welsh. 2005. Sensor Networks for Medical Care.''
Sensor networks have the potential to greatly impact many aspects of medical care. By outfitting patients with wireless, wearable vital sign sensors, collecting detailed real-time data on physiological status can be greatly simplified. However, there is a significant gap between existing sensor network systems and the needs of medical care. In particular, medical sensor networks must support multicast routing topologies, node mobility, a wide range of data rates and high degrees of reliability, and security.

This paper describes our experiences with developing a combined hardware and software platform for medical sensor networks, called CodeBlue. CodeBlue provides protocols for device discovery and publish/subscribe multihop routing, as well as a simple query interface that is tailored for medical monitoring. We have developed several medical sensors based on the popular MicaZ and Telos mote designs, including a pulse oximeter, EKG and motion-activity sensor. We also describe a new, miniaturized sensor mote designed for medical use.

We present initial results for the CodeBlue prototype demonstrating the integration of our medical sensors with the publish/subscribe routing substrate. We have experimentally validated the prototype on our 30-node sensor network testbed, demonstrating its scalability and robustness as the number of simultaneous queries, data rates, and transmitting sensors are varied. We also study the effect of node mobility, fairness across multiple simultaneous paths, and patterns of packet loss, confirming the system's ability to maintain stable routes despite variations in node location and data rate.

TR-09-05 [tr-09-05.ps.gz (354 K), tr-09-05.pdf (151 K)]
Alexandra Fedorova, Margo Seltzer, Christopher Small, and Daniel Nussbaum. 2005. Performance of Multithreaded Chip Multiprocessors And Implications For Operating System Design.''
An operating system's design is often influenced by the architecture of the target hardware. While uni-processor and multiprocessor architectures are well understood, such is not the case for multithreaded chip multiprocessors (CMT) - a new generation of processors designed to improve performance of memory-intensive applications. The first systems equipped with CMT processors are just becoming available, so it is critical that we now understand how to obtain the best performance from such systems.

The goal of our work is to understand the fundamentals of CMT performance and identify the implications for operating system design. We have analyzed how the performance of a CMT processor is affected by contention for the processor pipeline, the L1 data cache, and the L2 cache, and have investigated operating system approaches to the management of these performance-critical resources. Having found that contention for the L2 cache can have the greatest negative impact on processor performance, we have quantified the potential performance improvement that can be achieved from L2-aware OS scheduling. We evaluated a scheduling policy based on the balance-set principle and found that it has a potential to reduce miss ratios in the L2 by 19-37% and improve processor throughput by 27-45%. To achieve a similar improvement in hardware requires doubling the size of the L2 cache.

TR-10-05 [tr-10-05.ps.gz (579 K), tr-10-05.pdf (467 K)]
Geoffrey Goodell, Scott Bradner, and Mema Roussopoulos. 2005. "Blossom: A Decentralized Approach to Overcoming Systemic Internet Fragmentation".''
The Internet is systemically fragmented. We consider the causes of fragmentation, including both technical concerns, such as middleboxes and routing failure, as well as political concerns, such as incomplete peering and the structure of Internet governance. While fragmentation may be desirable in certain circumstances and for various reasons, it can also be problematic, violating central Internet design principles and rendering routine tasks difficult. We motivate the need for a system designed to facilitate connectivity throughout the Internet, providing the benefits of locality, universal access, and distributed management, while interoperating with the existing infrastructure. Finally, we describe our prototype implementation that enables overcoming fragmentation in our network testbed.
TR-11-05 [tr-11-05.ps.gz (332 K), tr-11-05.pdf (259 K)]
Norman Ramsey. 2005. ML Module Mania: A Type-Safe, Separately Compiled, Extensible Interpreter.''
The new embedded interpreter Lua-ML combines extensibility and separate compilation without compromising type safety. The interpreter's types are extended by applying a sum constructor to built-in types and to extensions, then tying a recursive knot using a two-level type; the sum constructor is written using an ML functor. The interpreter's initial basis is extended by composing initialization functions from individual extensions, also using ML functors. Lua-ML adds a detailed example to the debate over how much power is needed in a modules language.
TR-12-05 [tr-12-05.ps.gz (4856 K), tr-12-05.pdf (497 K)]
Avi Pfeffer. 2005. The Design and Implementation of IBAL: A General-Purpose Probabilistic Programming Language.''
This paper describes IBAL, a high level representation language for probabilistic AI. IBAL integrates several aspects of probability-based rational behavior, including probabilistic reasoning, Bayesian parameter estimation and decision theoretic utility maximization. IBAL is based on the functional programming paradigm, and is an ideal rapid prototyping language for probabilistic modeling. The paper presents the IBAL language, and presents a number of examples in the language. It then discusses the semantics of IBAL, presenting the semantics in two different ways. Finally, the inference algorithm of IBAL is presented. Seven desiderata are listed for inference, and it is shown how the algorithm fulfils each of them.
TR-13-05 [tr-13-05.ps.gz (349 K), tr-13-05.pdf (216 K)]
Kevin Redwine and Kelly Heffner. 2005. Varia: Inference Driven Optimization.''
We have designed a prototype compiler optimization infrastructure called Varia and demonstrated its potential to explore the space of optimizations. In Varia everything is represented as logic: instructions and blocks are axioms, analyses and transformations are inference rules, and optimization proceeds by forward-chaining deduction. Varia makes adding and combining optimizations simple---every rule may be eligible to fire at each step, so optimizations are automatically combined. We have coded five optimizations in our system and demonstrate that Varia can combine them. We describe how to translate normal iterative data-flow analyses into Varia, briefly discuss the performance of the rules engine, and finally, propose changes in the engine to make it more suitable for optimization.
TR-14-05 [tr-14-05.ps.gz (1886 K), tr-14-05.pdf (1255 K)]
Geetika Tewari, Craig Gotsman, and Steven J. Gortler. 2005. Meshing Genus-1 Point Clouds using Discrete One-Forms.''
We present an algorithm to mesh point clouds sampled from a closed manifold surface of genus 1. The method relies on a doubly-periodic global parameterization of the point cloud to the plane, so no segmentation of the point cloud is re-quired. Based on some recent techniques for parameterizing higher genus meshes, when some mild conditions on the sampling density are satisfied, the algorithm generates a closed toroidal manifold which interpolates the input and is geometrically similar to the sampled surface.

Keywords: mesh generation, surface reconstruction, reverse engineering, point cloud, torus, parameterization.

TR-15-05 [tr-15-05.ps.gz (281 K), tr-15-05.pdf (266 K)]
Alexandra Fedorova, Margo Seltzer, and Michael D. Smith. 2005. Modeling the Effects of Memory Hierarchy Performance On Throughput of Multithreaded Processors.''
Understanding the relationship between the performance of the on-chip processor caches and the overall performance of the processor is critical for both hardware design and software program optimization. While this relationship is well understood for conventional processors, it is not understood for new multithreaded processors that hide a workload's memory latency by executing instructions from several threads in parallel. In this paper we present a model for estimating processor throughput as a function of the cache hierarchy performance. Our model has a closed-form solution, is robust against a range of workloads and input parameters, and gives estimates of processor throughput that are within 13% of measured values for heterogeneous workloads. We demonstrate how this model can be used in an operating system scheduler tailored for multithreaded processor systems.
TR-16-05 [tr-16-05.ps.gz (1880 K), tr-16-05.pdf (1434 K)]
Amal Ahmed, Matthew Fluet, and Gregory Morrisett. 2005. A Step-Indexed Model of Substructural State.''
The concept of a unique'' object arises in many emerging programming languages such as Clean, CQual, Cyclone, TAL, and Vault. In each of these systems, unique objects make it possible to perform operations that would otherwise be prohibited (e.g., deallocating an object) or to ensure that some obligation will be met (e.g., an opened file will be closed). However, different languages provide different interpretations of uniqueness'' and have different rules regarding how unique objects interact with the rest of the language.

Our goal is to establish a common model that supports each of these languages, by allowing us to encode and study the interactions of the different forms of uniqueness. The model we provide is based on a substructural variant of the polymorphic lambda-calculus, augmented with four kinds of mutable references: unrestricted, relevant, affine, and linear. The language has a natural operational semantics that supports deallocation of references, strong (type-varying) updates, and storage of unique objects in shared references. We establish the strong soundness of the type system by constructing a novel, semantic interpretation of the types.

TR-17-05 [tr-17-05.ps.gz (741 K), tr-17-05.pdf (1812 K)]
Jonthan Ledlie and Margo Seltzer. 2005. Stable and Accurate Network Coordinates.''
Synthetic coordinate systems that mirror latencies between physical hosts have become a part of the toolbox networking researchers would like to use in real deployments. However, the most promising algorithm for building these coordinate systems, Vivaldi, breaks down when run under real world conditions. Previous work on network coordinates has examined their performance in simulation through the use of a latency matrix, which summarizes each link with a single latency. In a deployment, instead of perceiving a single latency for each link, nodes see a stream of distinct observations that may vary by as much as three orders-of-magnitude. With no means to discern an appropriate latency for each link, coordinate systems are prone to high error and instability in live deployments.

Two simple enhancements improved Vivaldi's accuracy by 54 percent and coordinate stability by 96 percent when run on a real large-scale network. First, we use a non-linear low pass filter to ascertain a clear underlying signal from each link. These filters primarily improve accuracy. Second, we introduce a distinction between system- and application-level coordinates. We evaluate a set of change-detection heuristics that allow coordinates to evolves at the system-level and only initiate an application-level update after a coordinate has undergone a significant change. These application-level coordinates retain the filter's high accuracy and dramatically increase coordinate stability.

TR-18-05 [tr-18-05.ps.gz (332 K), tr-18-05.pdf (1464 K)]
Margo Seltzer, Kiran-Kumar Muniswamy-Reddy, David A. Holland, Uri Braun, and Jonathan Ledlie . 2005. Provenance-Aware Storage Systems .''
Provenance is a type of meta-data that describes the history or ancestry of an object. Although provenance is typically manually generated and stored in a stand-alone database, we make the case that it must be managed by the storage system.

In this paper, we describe provenance-aware storage systems (PASS), a new class of storage system that automatically tracks provenance. A PASS takes responsibility for recording provenance meta-data for the objects stored on it and maintaining that provenance over time. We predict that within the next decade, all storage systems will be expected to be provenance-aware.

We describe a PASS prototype, demonstrate that tracking provenance does not incur significant overhead, and present comments from a prospective user indicating that provenanceaware storage helps scientists get their jobs done better than is currently possible.

TR-19-05 [tr-19-05.ps.gz (14876 K), tr-19-05.pdf (8614 K)]
Guillermo D. Canas and Steven J. Gortler. 2005. Surface Remeshing in Arbitrary Codimensions.''
We present a method for remeshing surfaces that is both general and efficient. Existing efficient methods are restrictive in the type of remeshings they produce, while methods that are able to produce general types of remeshings are generally based on iteration, which prevents them from producing remeshes at interactive rates. In our method, the input surface is directly mapped to an arbitrary (possibly high-dimensional) range space, and uniformly remeshed in this space. Because the mesh is uniform in the range space, all the quantities encoded in the mapping are bounded, resulting in a mesh that is simultaneously adapted to all criteria encoded in the map, and thus we can obtain remeshings of arbitrary characteristics. Because the core operation is a uniform remeshing of a surface embedded in range space, and this operation is direct and local, this remeshing is efficient and can run at interactive rates.
TR-20-05 [tr-20-05.ps.gz (553 K), tr-20-05.pdf (588 K)]
Rebecca Nesson, Stuart M. Shieber, and Alexander Rush . 2005. Induction of Probabilistic Tree-Insertion Grammars.''
Increasingly, researchers developing statistical machine translation systems have moved to incorporate syntactic structure in the models that they induce. These researchers are motivated by the intuition that the limitations in the finite-state translation models exemplified by IBM's Model 5'' follow from the inability to use phrasal and hierarchical information in the interlingual mapping. What is desired is a formalism that has the substitution-based hierarchical structure provided by context-free grammars, with the lexical relationship potential of $n$-gram models, with processing efficiency no worse than CFGs. Further, it should ideally allow for discontinuity in phrases, and be synchronizable, to allow for multilinguality. Finally, in order to support automated induction, it should allow for a probabilistic variant. We introduce probabilistic synchronous tree-insertion grammars (PSTIG) as such a formalism. In this paper, we define a restricted version of PSTIG, and provide algorithms for parsing, parameter estimation, and translation. As a proof of concept, we successfully apply these algorithms to a toy problem, corpus-based induction of a statistical translator of arithmetic expressions from postfix to partially parenthesized infix.
TR-21-05 [tr-21-05.ps.gz (25943 K), tr-21-05.pdf (3932 K)]
Jill Suzanne Nickerson. 2005. Reference Specification in Multilingual Document Production.''
To produce documents in multiple languages automatically requires a language-independent representation of the documents' meaning. For a person to build this language-independent representation by communicating in natural language with a computer system, the problem of reference must be addressed. This problem, inherent in natural language, presents itself not only in the specification of the language-independent representation, but also in the generation of documents with the meaning contained in this representation.

This thesis presents methods to make both the specification of entities in the user interface and the generation of expressions to refer to these entities in documents more natural and provides empirical evidence demonstrating the efficacy of these methods. More specifically, this thesis describes the development of three types of reference mechanisms: a statistical model that uses domain and lexical knowledge to organize new options in the interface; techniques for controlling coreference specification that take advantage of discourse structure and genre features; and automatically learned models for generating expressions to refer to new and already mentioned entities in a particular domain. The evaluation of these reference mechanisms establishes that specifying new entities using an interface informed by computational linguistic processing reduces the amount of time required to refer to entities in the interface; exploiting discourse structure and genre features is more helpful than traditional knowledge-editing interfaces for referring to entities in the interface that are already contained in the knowledge representation; and using learned linguistic information to generate referring expressions in documents leads to expressions that more closely match the decisions of people.

TR-22-05 [tr-22-05.ps.gz (1818 K), tr-22-05.pdf (2041 K)]
Bor-rong Chen, Kiran-Kumar Muniswamy-Reddy, and Matt Welsh. 2005. Lessons Learned from Implementing Ad-Hoc Multicast Routing in Sensor Networks.''
The mobile ad-hoc networking (MANET) community has proposed a wide range of protocols for unicast and multicast routing through mobile wireless devices. Many of these protocols have been studied only under simulation using simplistic radio models, and do not consider issues such as bandwidth or memory limitations. In contrast, the sensor network community has demanded solutions that work on hardware with limited resources, with a focus on routing through stationary nodes to a single base station.

Still, several emerging sensor network applications involve mobile nodes with communication patterns requiring any-to-any routing topologies. We should be able to build upon the MANET work to implement these systems. However, translating these protocols into real implementations on resource-constrained sensor nodes raises a number of challenges. In this paper, we present the lessons learned from implementing one such protocol, Adaptive Demand-Driven Multicast Routing (ADMR), on CC2420-based motes using the TinyOS operating system. ADMR was chosen because it supports multicast communication, a critical requirement for many pervasive and mobile applications. To our knowledge, ours is the first non-simulated implementation of ADMR. Through extensive measurement on a 30-node sensor network testbed, we present the performance of TinyADMR under a wide range of conditions. We highlight the real-world impact of path selection metrics, radio asymmetry, protocol overhead, and limited routing table size.

TR-23-05 [tr-23-05.ps.gz (695 K), tr-23-05.pdf (6664 K)]
Ian Fischer and Craig Gotsman . 2005. Fast Visualization of Depth Contours Using Graphics Hardware.''
Depth contours are a well-known technique for visualizing the distribution of multidimensional point data sets. We present an image-space algorithm for drawing the depth contours of a set of planar points. The algorithm is an improvement on existing algorithms based on the duality principle from computational geometry, implemented with 3D graphics rendering techniques. Our improvement takes advantage of properties of the dual arrangement of the input point set to significantly reduce the amount of computation, thus is asymptotically faster than its predecessors.
TR-24-05 [tr-24-05.ps.gz (814 K), tr-24-05.pdf (440 K)]
Aleksandar Nanevski and Greg Morrisett . 2005. Dependent Type Theory of Stateful Higher-Order Functions .''
In this paper we investigate a logic for reasoning about programs with higher-order functions and effectful features like non-termination and state with aliasing. We propose a dependent type theory HTT (short for Hoare Type Theory), where types serve as program specifications. In case of effectful programs, the type of Hoare triples {P}x:A{Q} specifies the precondition P, the type of the return result A, and the postcondition Q.

By Curry-Howard isomorphism, a dependent type theory may be viewed as a functional programming language. From this perspective, the type of Hoare triples is a monad, and HTT is a monadic language, whose pure fragment consists of higher-order functions, while the effectful fragment is a full Turing-complete imperative language with conditionals, loops, recursion and commands for stateful operations like allocation, lookup and mutation of location content.