Harvard Computer Science Technical Reports for 1990

TR-01-90
Donald Beaver. 1990. ``Perfect Privacy for Two-Party Protocols.''
We examine the problem of computing functions in a distributed manner based on private inputs, revealing the value of a function without revealing any additional information about the inputs. A function $f(x_1,\ldots,x_n)$ is $t$-private if there is a protocol whereby $n$ parties, each holding an input value, can compute $f$, and no subset of $t$ or fewer parties gains any information other than what is computable from $f(x_1,\ldots,x_n)$. The class of $t$-private {\em boolean} functions for $t \geq \lceil \frac{n}{2} \rceil$ was described in [ck89]. We give a characterization of 1-private functions for two parties, without the restriction to the boolean case. We also examine a restricted form of private computation in the $n$-party case, and show that addition is the only $t$-privately computable function in that model. Incorrect proofs of this characterization appeared in [Kushilevitz, 1989] and an earlier technical report [Beaver, 1989]. We present a different proof which avoids the errors of those works. This report supersedes Harvard Technical Report TR-11-89.
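For concreteness, the privacy condition can be stated in its standard information-theoretic form (the notation $T$ and $\mathrm{VIEW}_T$ is ours, not the report's): a protocol computing $f$ is $t$-private if for every coalition $T$ with $|T| \leq t$ and every pair of input vectors $\vec{x}, \vec{y}$ that agree on the inputs of $T$ and satisfy $f(\vec{x}) = f(\vec{y})$,
$$\mathrm{VIEW}_T(\vec{x}) \;\equiv\; \mathrm{VIEW}_T(\vec{y}),$$
where $\mathrm{VIEW}_T(\vec{x})$ denotes the joint distribution of messages and coin tosses seen by the members of $T$ on input $\vec{x}$, and $\equiv$ denotes equality of distributions.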
TR-02-90
David Albert. 1990. ``Cooperative Dialogues While Playing Adventure.''
To collect examples of the use of hedging phrases in informal conversation, three pairs of subjects were (separately) asked to cooperate in playing the computer game {\em Adventure} . While one typed commands into the computer, the two engaged in conversation about their options and best strategy. The conversation was recorded on audio tape, and their computer session was saved in a disk file. Subsequently, the audio tape was transcribed and merged with the computer record of the game, to produce a combined transcript showing what was typed along with a simultaneous running commentary by the participants. This report contains the complete transcripts and a discussion of the collection and transcription methodology.
TR-03-90
Robert Muller. 1990. ``M-LISP: A Representation Independent Dialect of LISP with Reduction Semantics.''
In this paper we propose to reconstruct LISP from first principles with an eye toward reconciling S-expression LISP's metalinguistic facilities with the kind of operational semantics advocated by Plotkin [pl81]. After reviewing the original definition of LISP we define the abstract syntax and the operational semantics of the new dialect, M-LISP, and show that its equational theory is consistent. Next we develop the operational semantics of an extension of M-LISP which features an explicitly callable {\em eval} and {\em fexprs} (i.e., procedures whose arguments are passed {\em by-representation}). Since M-LISP is independent of any representation of its programs it has no {\em quotation} operator or any of its related forms. To compensate for this we encapsulate the shifting between mention and use, which is performed globally by {\em quote}, within the metalinguistic constructs that require it. The resulting equational system is shown to be inconsistent. We leave it as an open problem to find confluent variants of the metalinguistic constructs.
TR-04-90
Robert Muller. 1990. ``Syntax Macros in M-LISP: A Representation Independent Dialect of LISP with Reduction Semantics.''
In this paper we consider syntactic abstraction in M-LISP, a dialect of LISP which is independent of any representation of its programs. Since it is independent of McCarthy's original Meta-expression representation scheme, M-LISP has no {\em quotation} form or any of its related forms {\em backquote}, {\em unquote}, or {\em unquote-splicing}. Given that LISP macro systems depend on the S-expression representation model, it is not obvious how to provide syntax extensions in M-LISP. Our approach is based on an adaptation of the {\em Macro-by-Example} [kowa87] and {\em Hygienic} macro-expansion algorithms [kofrfedu86]. The adaptation to the quotation-free syntactic structure of M-LISP yields a substantially different model of syntax macros. The most important difference is that $\lambda$ binding patterns become apparent when an abstraction is first (i.e., partially) transcribed in the syntax tree. This allows us to define tighter restrictions on the capture of identifiers. This is not possible in S-expression dialects such as Scheme, since $\lambda$ binding patterns are not apparent until the tree is completely transcribed.
TR-05-90
Robert Muller. 1990. ``Semantics Prototyping in M-LISP (Extended Abstract).''
In this paper we describe a new semantic metalanguage which simplifies prototyping of programming languages. The system integrates Paulson's semantic grammars within a new dialect of LISP, M-LISP, which has somewhat closer connections to the $\lambda$-calculus than other LISP dialects such as Scheme. The semantic grammars are expressed as attribute grammars. The generated parsers are M-LISP functions that can return denotational (i.e., higher-order) representations of abstract syntax. We illustrate the system with several examples and compare it to related systems.
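The idea of a generated parser that returns a denotational (i.e., higher-order) representation can be illustrated outside M-LISP. The sketch below, written in Python rather than M-LISP and entirely our own, parses a toy arithmetic grammar and returns closures from environments to values as the ``meaning'' of each phrase.
\begin{verbatim}
# A toy illustration (not the report's system): a recursive-descent
# parser whose semantic actions return denotations -- Python closures
# mapping an environment to a value -- rather than syntax trees.

import re

TOKEN = re.compile(r"\s*(\d+|[a-z]+|[+*()])")

def tokenize(src):
    pos, toks = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad input at {pos}")
        toks.append(m.group(1))
        pos = m.end()
    return toks

def parse_expr(toks):
    # expr ::= term ('+' term)*
    denot, toks = parse_term(toks)
    while toks and toks[0] == "+":
        rhs, toks = parse_term(toks[1:])
        denot = (lambda l, r: lambda env: l(env) + r(env))(denot, rhs)
    return denot, toks

def parse_term(toks):
    # term ::= atom ('*' atom)*
    denot, toks = parse_atom(toks)
    while toks and toks[0] == "*":
        rhs, toks = parse_atom(toks[1:])
        denot = (lambda l, r: lambda env: l(env) * r(env))(denot, rhs)
    return denot, toks

def parse_atom(toks):
    tok, rest = toks[0], toks[1:]
    if tok == "(":
        denot, rest = parse_expr(rest)
        return denot, rest[1:]          # drop ')'
    if tok.isdigit():
        return (lambda n: lambda env: n)(int(tok)), rest
    return (lambda name: lambda env: env[name])(tok), rest

meaning, _ = parse_expr(tokenize("2 * (x + 3)"))
print(meaning({"x": 4}))   # prints 14
\end{verbatim}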
TR-06-90
Azer Bestavros. 1990. ``ESPRIT Executable Specification of Parallel Real-time Interactive Tasks.''
The vital role that embedded systems are playing and will continue to play in our world, coupled with their increasingly complex and critical nature, demands a rigorous and systematic treatment that recognizes their unique requirements. The Time-constrained Reactive Automaton (TRA) is a formal model of computation that admits these requirements. Using the TRA model, an embedded system is viewed as a set of {\em asynchronously} interacting automata (TRAs), each representing an {\em autonomous} system entity. TRAs are {\em input enabled}; they communicate by signaling events on their {\em output channels} and by reacting to events signaled on their {\em input channels}. The behavior of a TRA is governed by {\em time-constrained causal relationships} between {\em computation-triggering} events. The TRA model is {\em compositional} and allows time, control, and computation {\em non-determinism}. In this paper we present ESPRIT, a specification language that is entirely based on the TRA model. We have developed a compiler that allows ESPRIT specifications to be executed in simulated time, thus providing a valuable validation tool for embedded system specifications. We are currently developing another compiler that would allow the execution of ESPRIT specifications in real-time, thus making it possible to write real-time programs directly in ESPRIT.
TR-07-90
Harry R. Lewis. 1990. ``A Logic of Concrete Time Intervals (Extended Abstract).''
This paper describes (1) a finite-state model for asynchronous systems in which the time delays between the scheduling and occurrence of the events that cause state changes are constrained to fall between fixed numerical upper and lower time bounds; (2) a branching-time temporal logic suitable for describing the temporal and logical properties of asynchronous systems, for which the structures of (1) are the natural models; and (3) a functional verification system for asynchronous circuits, which generates, from a boolean circuit with general feedback and specified min/max rise and fall times for the gates, a finite-state structure as in (1), and then exhaustively checks a formal specification of that circuit in the language (2) against that finite-state model.
TR-08-90
Ehud Reiter. 1990. ``Generating Descriptions that Exploit a User's Domain Knowledge.''
Natural language generation systems should customize object descriptions according to the extent of their user's domain and lexical knowledge. The task of generating customized descriptions is formalized as a task of finding descriptions that are {\it accurate} (truthful), {\it valid} (fulfill the speaker's communicative goal), and {\it free of false implicatures} (do not give rise to unintended conversational implicatures) with respect to the current user model. An algorithm that generates descriptions that meet these constraints is described, and the computational complexity of the generation problem is discussed.
TR-09-90
Ehud Reiter. 1990. ``The Computational Complexity of Avoiding Conversational Implicatures.''
Referring expressions and other object descriptions should be maximal under the Local Brevity, No Unnecessary Components, and Lexical Preference preference rules; otherwise, they may lead hearers to infer unwanted conversational implicatures. These preference rules can be incorporated into a polynomial time generation algorithm, while some alternative formalizations of conversational implicature make the generation task NP-Hard.
TR-10-90
Ehud Baruch Reiter. 1990. ``Generating Appropriate Natural Language Object Descriptions.''
Natural language generation (NLG) systems must produce different utterances for users with different amounts of domain and lexical knowledge. An utterance that is meant to be read by an expert should use technical vocabulary, and avoid explicitly mentioning facts the expert can immediately infer from the rest of the utterance. In contrast, an utterance that is meant to be read by a novice should avoid specialized vocabulary, and may be required to explicitly mention facts that would be obvious to an expert. An NLG system that does not customize utterances according to its user's domain and lexical knowledge may generate text that is incomprehensible to a novice, or text that leads an expert to infer unwanted {\it conversational implicatures} (Grice 1975). This thesis examines the problem of generating attributive descriptions of individuals, that is, object descriptions that are intended to inform the user that a particular object has certain attributes. It proposes that such descriptions will be appropriate for a particular user if they are {\it accurate}, {\it valid}, and {\it free of false implicatures} with respect to a user model that represents that user's relevant domain and lexical knowledge. Descriptions are represented as definitions of KL-ONE-type (Brachman and Schmolze 1985) classes, and a description is called {\it accurate} if it defines a class that subsumes the object being described; {\it valid} if every attribute the system wishes to communicate is either part of the description or a default attribute that is inherited by the class defined by the description; and {\it free of false implicatures} if it is maximal under three preference rules: No Unnecessary Components, Local Brevity, and Lexical Preference.
TR-11-90
Yuli Zhou and Robert Muller. 1990. ``Domain Theory for Nonmonotonic Functions.''
We prove several lattice-theoretical fixpoint theorems based on the classical Knaster-Tarski theorem. These theorems give sufficient conditions for a system of generally nonmonotonic functions on a complete lattice to define a unique fixpoint. The primary objective of this paper is to develop a domain-theoretic framework to study the semantics of general logic programs as well as various rule-based systems where the rules define nonmonotonic functions on lattices.
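For context, the classical theorem that these results build on can be stated as follows (a standard formulation, not quoted from the report): if $(L, \leq)$ is a complete lattice and $f : L \rightarrow L$ is monotone, then $f$ has a least fixpoint
$$\mu f \;=\; \bigwedge \{\, x \in L \;:\; f(x) \leq x \,\},$$
and the set of all fixpoints of $f$ is itself a complete lattice. The report's contribution lies in identifying conditions under which a unique fixpoint still exists when monotonicity fails.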
TR-12-90
Joe Marks. 1990. ``A Syntax and Semantics for Network Diagrams.''
The ability to automatically design graphical displays of data will be important for the next generation of interactive computer systems. The research reported here concerns the automated design of network diagrams, one of the three main classes of symbolic graphical display (the other two being chart graphs and maps). Previous notions of syntax and semantics for network diagrams are not adequate for automating the design of this kind of graphical display. I present here a new formulation of syntax and semantics for network diagrams that is used in the ANDD (Automated Network Diagram Designer) system. The syntactic formulation differs from previous work in two significant ways: perceptual-organization phenomena are explicitly represented, and syntax is described in terms of constraints rather than as a grammar of term-rewriting rules. The semantic formulation is based on an application- independent model of network systems that can be used to model many real-world applications. The paper includes examples that show how these concepts are used by ANDD to automatically design network diagrams.
TR-13-90
Joe Marks and Ehud Reiter. 1990. ``Avoiding Unwanted Conversational Implicatures in Text and Graphics.''
We have developed two systems, FN and ANDD, that use natural language and graphical displays, respectively, to communicate information about objects to human users. Both systems must deal with the fundamental problem of ensuring that their output does not carry unwanted and inappropriate conversational implicatures. We describe the types of conversational implicatures that FN and ANDD can avoid, and the computational strategies the two systems use to generate output that is free of unwanted implicatures.
TR-14-90
Karen E. Lochbaum, Barbara J. Grosz, and Candace L. Sidner. 1990. ``Models of Plans to Support Communications: An Initial Report.''
Agents collaborating to achieve a goal bring to their joint activity different beliefs about ways in which to achieve the goal and the actions necessary for doing so. Thus, a model of collaboration must provide a way of representing and distinguishing among agents' beliefs and of stating the ways in which the intentions of different agents contribute to achieving their goal. Furthermore, in collaborative activity, collaboration occurs in the planning process itself. Thus, rather than modelling plan recognition, per se, what must be modelled is the {\it augmentation} of beliefs about the actions of multiple agents and their intentions. In this paper, we modify and expand the SharedPlan model of collaborative behavior (Grosz & Sidner 1990). We present an algorithm for updating an agent's beliefs about a partial SharedPlan and describe an initial implementation of this algorithm in the domain of network management.
TR-15-90
Yuh-Dauh Lyuu. 1990. ``An Information Dispersal Approach to Issues in Parallel Processing.''
Efficient schemes for the following issues in parallel processing are presented: fast communication, low congestion, fault tolerance, simulation of ideal parallel computation models, synchronization in asynchronous networks, low sensitivity to variations in component speed, and on-line maintenance. All our schemes employ Rabin's information dispersal idea. We also develop an efficient information dispersal algorithm (IDA) based on the Fast Fourier Transform and an IDA-based voting scheme to enforce fault tolerance. Let $N$ denote the size of the hypercube network. We present a randomized communication scheme, FSRA (for ``Fault-tolerant Subcube Routing Algorithm''), that routes in $2 \cdot \log N + 1$ time using only constant-size buffers and with probability of success $1 - N^{-\Theta(\log N)}$. (All logs are to the base 2.) FSRA also tolerates $O(N)$ random link failures with high probability. Similar results are also obtained for the de Bruijn and the butterfly networks (without fault tolerance in the latter case). FSRA is employed to simulate, without using hashing, a class of CRCW PRAM (concurrent-read concurrent-write parallel random access machine) programs with a slowdown of $O(\log N)$ with almost certainty if combining is used. A fault-tolerant simulation scheme for general CRCW PRAM programs is also presented. A simple acknowledgement synchronizer can make all our routing schemes in this dissertation run on asynchronous networks without loss of efficiency. We further show that the speed of any component--be it a processor or a link--has only linear impact on the run-time of FSRA; that is, the extra delay in run-time is only proportional to the drift in the component's delay and is independent of the size of the network. On-line maintainability makes the machine more available to the user. We show that, under FSRA, a constant fraction of the links can be disabled with essentially no impact on the routing performance. This result immediately suggests several efficient maintenance procedures. Based on the above results, a fault-tolerant parallel computing system, called HPC (for ``hypercube parallel computer''), is sketched at the end of this dissertation.
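Rabin's information dispersal idea, on which all of these schemes rely, can be sketched independently of the dissertation: a message is encoded into $n$ pieces, each $1/m$ the size of the original, such that any $m$ pieces suffice to reconstruct it. The sketch below uses a Vandermonde encoding over the prime field GF(257) rather than the FFT-based algorithm developed in the report; all names are illustrative.
\begin{verbatim}
# Minimal information-dispersal sketch (Vandermonde variant over GF(257)),
# illustrating the idea behind Rabin's IDA; not the FFT-based algorithm
# of the dissertation.

P = 257  # prime modulus; byte values 0..255 fit below it

def encode(data, n, m):
    """Split `data` (list of ints mod P, length a multiple of m) into
    n pieces, any m of which suffice to reconstruct it."""
    chunks = [data[i:i + m] for i in range(0, len(data), m)]
    pieces = []
    for x in range(1, n + 1):                    # evaluation point of piece x
        row = [pow(x, j, P) for j in range(m)]   # Vandermonde row (1, x, ..., x^{m-1})
        piece = [sum(r * c for r, c in zip(row, chunk)) % P for chunk in chunks]
        pieces.append((x, piece))
    return pieces

def decode(pieces, m):
    """Reconstruct the original data from any m pieces."""
    xs = [x for x, _ in pieces[:m]]
    A = [[pow(x, j, P) for j in range(m)] for x in xs]
    Ainv = invert(A)
    cols = list(zip(*[p for _, p in pieces[:m]]))  # one column per chunk
    data = []
    for col in cols:
        data.extend(sum(Ainv[i][k] * col[k] for k in range(m)) % P
                    for i in range(m))
    return data

def invert(A):
    """Gauss-Jordan inversion of an m x m matrix over GF(P)."""
    m = len(A)
    M = [row[:] + [int(i == j) for j in range(m)] for i, row in enumerate(A)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if M[r][col])
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], P - 2, P)         # modular inverse via Fermat
        M[col] = [v * inv % P for v in M[col]]
        for r in range(m):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(a - f * b) % P for a, b in zip(M[r], M[col])]
    return [row[m:] for row in M]

data = [104, 101, 108, 108, 111, 33]             # "hello!" as byte values
pieces = encode(data, n=5, m=3)                  # 5 pieces, any 3 reconstruct
assert decode(pieces[1:4], m=3) == data          # rebuild from pieces 2, 3, 4
\end{verbatim}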
TR-16-90
Thomas R. Hancock. 1990. ``Identifying $\mu$-Formula Decision Trees with Queries.''
We consider a learning problem for the representation class of $\mu$-formula decision trees, a generalization of $\mu$-formulas and $\mu$-decision trees. (The ``$\mu$'' form of a representation has the restriction that no variable appears more than once.) The learning model is one of exact identification by oracle queries, where the learner's goal is to discover an unknown function by asking membership queries (is the function true on some specified input?) and equivalence queries (is the function identical to some hypothesis we present, and if not, what is an input on which they differ?). We present an identification algorithm using these two types of queries that runs in time polynomial in the number of variables, and show that no such polynomial time algorithm exists that uses either membership or equivalence queries alone (in the latter case under the stipulation that the hypotheses are drawn from the same representation class). We further extend the algorithm to identify a broader class in which the formulas are taken over a more powerful basis that includes arbitrary threshold gates.
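The query model itself (though not the report's identification algorithm) is easy to make concrete; in the sketch below the two oracle types are simulated against a hidden target over a small number of variables, with an exhaustive search standing in for a real equivalence oracle. All names are illustrative.
\begin{verbatim}
# Illustration of the exact-identification query model only; the actual
# polynomial-time algorithm for mu-formula decision trees is in the report.

from itertools import product

class Oracle:
    def __init__(self, target, num_vars):
        self._target = target            # hidden boolean function
        self._n = num_vars

    def membership(self, x):
        """Is the hidden function true on assignment x?"""
        return self._target(x)

    def equivalence(self, hypothesis):
        """Return (True, None) if hypothesis == target everywhere,
        else (False, counterexample)."""
        for x in product([False, True], repeat=self._n):
            if hypothesis(x) != self._target(x):
                return False, x
        return True, None

# Hidden target: a tiny read-once ("mu") formula, x0 AND (x1 OR NOT x2).
oracle = Oracle(lambda x: x[0] and (x[1] or not x[2]), num_vars=3)

print(oracle.membership((True, False, False)))    # True
ok, cex = oracle.equivalence(lambda x: x[0])      # propose the hypothesis "x0"
print(ok, cex)                                    # False (True, False, True)
\end{verbatim}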
TR-17-90
Mark Friedell and Sandeep Kochhar. 1990. ``Design and Modeling with Schema Grammars.''
Graphical scene modeling is usually a time-consuming, tedious, and expensive manual activity. This paper proposes an approach to partially automating the process through a paradigm for cooperative user-computer scene modeling that we call {\em cooperative computer-aided design} (CCAD). Formal grammars, referred to as {\em schema grammars}, are used to imbue the modeling system with an elementary ``understanding'' of the kinds of scenes to be created. The grammar interpreter constructs part or all of the scene after accepting from the user partially completed scene components and descriptions of scene properties that must be incorporated into the final scene. This approach to modeling harnesses the power of the computer to construct scene detail, thereby freeing the human user to focus on essential creative decisions. This paper describes the structure and interpretation of schema grammars, and provides techniques for controlling the combinatorial explosion that can result from the undirected interpretation of the grammars. CCAD is explored in the context of two experimental systems---FLATS, for architectural design, and LG, for modeling landscapes.
TR-18-90
Sandeep Kochhar. 1990. ``Cooperative Computer-Aided Design: A Paradigm for Automating the Design and Modeling of Graphical Objects.''
Design activity is often characterized by a search in which the designer examines various alternatives at several stages during the design process. Current computer-aided design (CAD) systems, however, provide very little support for this exploratory aspect of design. My research provides the foundation for {\em cooperative computer-aided design (CCAD)}---a novel CAD technology that intersperses partial exploration of design alternatives by the computer with guiding design operations by the system user. CCAD combines the strengths of manual and automated design and modeling by allowing the user to make creative decisions and provide specialized detail, while exploiting the power of the machine to explore many design alternatives and create detail that is resonant with the user's design decisions. In the CCAD paradigm, the user expresses initial design decisions in the form of a partial design and a set of properties that the final design must satisfy. The user then initiates the generation by the system of alternative {\em partial} developments of the initial design, subject to a ``language'' of valid designs. The results are then structured in a spatial framework through which the user moves to explore the alternatives. The user selects the most promising partial design, refines it manually, and then requests further automatic development. This process continues until a satisfactory complete design is created. I present the interpretation of schema grammars---a class of generative grammars for manipulating graphical objects---as the fundamental generative mechanism underlying CCAD. I describe in detail several mechanisms for providing user control over the generative process, thereby controlling the combinatorial explosion inherent in an unrestricted, undirected interpretation of generative grammars. I explore graphical browsing as a facility for efficiently perusing the set of design alternatives generated by the system. I also describe FLATS ({\bf F}loor plan {\bf LA}you{\bf T} {\bf S}ystem)---a prototype CCAD system for the design of small architectural floor plans.
TR-19-90
William A. Woods. 1990. ``Understanding Subsumption and Taxonomy: A Framework for Progress.''
This paper continues a theme begun in my paper, ``What's in a Link,'' -- seeking a solid foundation for network representations of knowledge. Like its predecessor, its goal is to clarify issues and establish a framework for progress. The paper analyzes the concepts of subsumption and taxonomy and synthesizes a framework that integrates and clarifies many previous approaches and goes beyond them to provide an account of abstract and partially defined concepts. The distinction between definition and assertion is reinterpreted in a framework that accommodates probabilistic and default rules as well as universal claims and abstract and partial definitions. Conceptual taxonomies in this framework are shown to be useful for indexing and organizing information and for managing the resolution of conflicting defaults. The paper introduces a distinction between intensional and extensional subsumption and argues for the importance of the former. It presents a classification algorithm based on intensional subsumption and shows that its typical case complexity is logarithmic in the size of the knowledge base.
TR-20-90
William A. Woods and James G. Schmolze. 1990. ``The KL-ONE Family.''
The knowledge representation system KL-ONE has been one of the most influential and imitated knowledge representation systems in the Artificial Intelligence community. Begun at Bolt Beranek and Newman in 1978, KL-ONE pioneered the development of taxonomic representations that can automatically classify and assimilate new concepts based on a criterion of terminological subsumption. This theme generated considerable interest in both the formal community and a large community of potential users. The KL-ONE community has since expanded to include many systems at many institutions and in many different countries. This paper introduces the KL-ONE family and discusses some of the main themes explored by KL-ONE and its successors. We give an overview of current research, describe some of the systems that have been developed, and outline some future research directions.
TR-21-90
Hagit Attiya and Marios Mavronicolas. 1990. ``Efficiency of Semi-Synchronous versus Asynchronous Networks.''
The $s$-session problem is studied in {\em asynchronous} and {\em semi-synchronous} networks. Processes are located at the nodes of an undirected graph $G$ and communicate by sending messages along links that correspond to the edges of $G$. A session is a part of an execution in which each process takes at least one step; an algorithm for the $s$-session problem guarantees the existence of at least $s$ disjoint sessions. The existence of many sessions guarantees a degree of interleaving which is necessary for certain computations. It is assumed that the (real) time for message delivery is at most $d$. In the asynchronous model, it is assumed that the time between any two consecutive steps of any process is in the interval $[0,1]$; in the semi-synchronous model, the time between any two consecutive steps of any process is in the interval $[c,1]$ for some $c$ such that $0 < c \leq 1$, the {\em synchronous} model being the special case where $c=1$. In the {\em initialized} case of the problem, all processes are initially synchronized and take a step at time 0. For the asynchronous model, an upper bound of $diam(G)(d+1)(s-1)$ and a lower bound of $diam(G)d(s-1)$ are presented; $diam(G)$ is the {\em diameter} of $G$. For the semi-synchronous model, an upper bound of $1+\min\{\lfloor \frac{1}{c} \rfloor +1, diam(G)(d+1)\}(s-2)$ is presented. The main result of the paper is a lower bound of $1+\min\{\lfloor \frac{1}{2c} \rfloor, diam(G)d\}(s-2)$ for the time complexity of any semi-synchronous algorithm for the $s$-session problem, under the assumption that $d \geq \frac{d}{\min\{\lfloor\frac{1}{2c}\rfloor, diam(G)d\}} + 2$. These results imply a time separation between semi-synchronous (in particular, synchronous) and asynchronous networks. Similar results are proved for the case where delays are not uniform. In the {\em uninitialized} case of the problem, all processes but one, the {\em initiator}, start in a {\em quiescent} state which they may leave upon receiving a message. Similar results are proved for this case.
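To make the claimed separation concrete (this instantiation is ours, derived directly from the bounds above, and assumes $diam(G)(d+1) \geq 2$): in the synchronous case $c=1$ the semi-synchronous upper bound becomes
$$1+\min\{\lfloor \tfrac{1}{c} \rfloor +1,\; diam(G)(d+1)\}(s-2) \;=\; 1 + 2(s-2),$$
which is independent of both $d$ and $diam(G)$, whereas any asynchronous algorithm requires time at least $diam(G)\,d\,(s-1)$, which grows without bound as the message-delivery bound $d$ grows.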
TR-22-90
Azer Bestavros and Thomas Cheatham. 1990. ``Efficient execution of homogeneous tasks with unequal run times on the Connection Machine.''
Many scientific applications require the execution of a large number of identical {\em tasks}, each on a different set of data. Such applications can easily benefit from the power of SIMD architectures ({\em e.g.}, the Connection Machine) by having the array of processing elements (PEs) execute the task in parallel on the different data sets. It is often the case, however, that the task to be performed involves the repetitive application of the same sequence of steps, {\em a body}, for a number of times that depends on the input or computed data. If the usual {\em task-level synchronization} is used, the utilization of the array of PEs degrades substantially. In this paper, we propose a {\em body-level synchronization} scheme that boosts the utilization of the array of PEs while keeping the required overhead to a minimum. We mathematically analyze the proposed technique and show how to optimize its performance for a given application. Our technique is especially efficient when the number of tasks to be executed is much larger than the number of physical PEs available.
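The contrast between the two synchronization schemes can be sketched with a small data-parallel simulation; the code below (our illustration, not the report's analysis or its Connection Machine implementation) counts the SIMD steps needed under each scheme when each task iterates its body a data-dependent number of times.
\begin{verbatim}
# Our illustration of task-level vs. body-level synchronization on a
# SIMD array; iteration counts per task are data-dependent.

import random

def task_level_steps(iters, num_pes):
    """Each batch of num_pes tasks runs to completion before the next
    batch starts, so a batch costs as many body steps as its longest task."""
    batches = [iters[i:i + num_pes] for i in range(0, len(iters), num_pes)]
    return sum(max(batch) for batch in batches)

def body_level_steps(iters, num_pes):
    """All PEs execute one body iteration per SIMD step; a PE that finishes
    its task picks up the next pending task and runs it from the next step."""
    pending = list(iters)
    remaining = [pending.pop(0) if pending else 0 for _ in range(num_pes)]
    steps = 0
    while any(remaining):
        steps += 1
        for pe in range(num_pes):
            if remaining[pe] > 0:
                remaining[pe] -= 1
                if remaining[pe] == 0 and pending:
                    remaining[pe] = pending.pop(0)   # grab a new task
    return steps

random.seed(0)
iters = [random.randint(1, 50) for _ in range(4096)]   # 4096 tasks
print(task_level_steps(iters, num_pes=256))            # noticeably larger
print(body_level_steps(iters, num_pes=256))            # close to sum(iters)/256
\end{verbatim}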
TR-23-90
Cecile T. Balkanski. 1990. ``Modelling Act-Type Relations in Collaborative Activity.''
Intelligent agents collaborating to achieve a common goal direct a significant portion of their effort to planning together. Because having a plan to perform a given task involves having knowledge about the ways in which the performance of a set of actions will lead to the performance of that task, the representation of actions and of relations among them is a central concern. This paper provides a formalism for representing act-type relations and complex act-type constructors in multiagent domains. To determine a set of relations that would span the space of complex collaborative actions, I analyzed a videotape of two people building a piece of furniture, and identified a group of relations that are adequate to represent the relationships among the actions occurring in the data. The definitions provided here are more complex than those used in earlier studies; in particular, I refine and expand Pollack's (1986) set of act-type relations (generation and enablement), provide constructors for building complex act-types from (simpler) act-types (simultaneity, conjunction, sequence and iteration), and make it possible to represent the joint actions of multiple agents.
TR-24-90
Donald Rozinak Beaver. 1990. ``Security, Fault Tolerance, and Communication Complexity in Distributed Systems.''
We present efficient and practical algorithms for a large, distributed system of processors to achieve reliable computations in a secure manner. Specifically, we address the problem of computing a general function of several private inputs distributed among the processors of a network, while ensuring the correctness of the results and the privacy of the inputs, despite accidental or malicious faults in the system. Communication is often the most significant bottleneck in distributed computing. Our algorithms maintain a low cost in local processing time, are the first to achieve optimal levels of fault-tolerance, and most importantly, have low communication complexity. In contrast to the best known previous methods, which require large numbers of rounds even for fairly simple computations, we devise protocols that use small messages and a constant number of rounds {\em regardless} of the complexity of the function to be computed. Through direct algebraic approaches, we separate the {\em communication complexity of secure computing} from the {\em computational complexity} of the function to be computed. We examine security under both the modern approach of computational complexity-based cryptography and the classical approach of unconditional, information-theoretic security. We develop a clear and concise set of definitions that support formal proofs of claims to security, addressing an important deficiency in the literature. Our protocols are provably secure. In the realm of information-theoretic security, we characterize those functions which two parties can compute jointly with absolute privacy. We also characterize those functions which a weak processor can compute using the aid of powerful processors without having to reveal the instances of the problem it would like to solve. Our methods include a promising new technique called a {\em locally random reduction} , which has given rise not only to efficient solutions for many of the problems considered in this work but to several powerful new results in complexity theory.
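The flavor of a locally random reduction can be conveyed by a standard low-degree polynomial example (a textbook instantiation given here for illustration, not a construction quoted from the dissertation). A weak processor holding a private point $x \in F^n$, for a sufficiently large finite field $F$, wants the value $p(x)$ of a publicly known polynomial $p$ of total degree $d$. It picks a random $r \in F^n$, defines the line $\gamma(t) = x + t\,r$, and sends the single point $\gamma(t_i)$ to the $i$-th of $d+1$ powerful processors, where $t_1, \ldots, t_{d+1}$ are fixed distinct nonzero field elements. Each reply is a value of the univariate polynomial
$$q(t) \;=\; p(x + t\,r), \qquad \deg q \leq d, \qquad q(0) = p(x),$$
so the weak processor recovers $p(x)$ by interpolation. Since $\gamma(t_i)$ is uniformly distributed over $F^n$ for each fixed $t_i \neq 0$, no single powerful processor learns anything about $x$.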
TR-25-90
Athanasios M. Tsantilas. 1990. ``Communication Issues in Parallel Computation.''
This thesis examines the problem of interprocessor communication in realistic parallel computers. In particular, we consider the problem of permutation routing and its generalizations in the mesh, hypercube and butterfly networks. Building on previous research, we derive lower bounds for a wide class of deterministic routing algorithms which imply that such algorithms create heavy traffic congestion. In contrast, we show that randomized routing algorithms result in both efficient and optimal upper bounds in the above networks. Experiments were also performed to test the behaviour of the randomized algorithms. These experiments suggest interesting theoretical problems. We also examine the problem of efficient interprocessor communication in a model suggested by recent advances in optical computing. The main argument of this thesis is that communication can be made efficient if randomization is used in the routing algorithms.