Harvard Computer Science Technical Reports for 1987

TR-02-87
Michael O. Rabin. 1987. ``Efficient Dispersal of Information for Security Load Balancing and Fault Tolerance..''
We develop an Information Dispersal Algorithm (IDA) which breaks a file $F$ of length $L=\mid F\mid$ into $n$ pieces of $F_ {i} , 1 leq i leq n$, each of length $\mid F_ {i} \mid=L/m$ , so that every $m$ pieces suffice for reconstructing $F$. Dispersal and reconstruction are computationally efficient. The sum of the lengths $\mid F_ {i} \mid$ is $(n/m) dot L$. Since $n/m$ can be chosen to be close to $1$, the IDA is space efficient. IDA has numerous applications to secure and reliable storage of information in computer networks and even on single disks, to fault-tolerant and efficient transmission of information in networks, and to communications between processors in parallel computers. For the latter problem we get provably time efficient and highly fault-tolerant routing on the $n$ - cube, using just constant size buffers.
TR-03-87
Michael Kearns and Ming Li. 1987. ``Learning in the Presence of Malicious Errors..''
We study an extension to the Valiant model of machine learning from examples in which errors may be present in the sample data. We give strong bounds on the rate of error tolerable when the errors are of a ``malicious'' nature, and show that it is crucial to take both positive and negative examples when such errors exist. Quantitative comparisons between the malicious model of errors and a model of uniform random noise introduced by Angluin and Laird are made. A greedy heuristic for a new generalization of set covering that is of independent interest is given, with nontrivial applications to learning with errors. We also give a reduction from learning to set cover.
TR-04-87
Abdelsalam Heddaya, Meichun Hsu, and William Weihl. 1987. ``Two Phase Gossip: Managing Distributed Event Histories..''
We describe a distributed protocol that operates on replicated data objects---of arbitrary abstract types---represented in terms of the history of object events, rather than in terms of object values. In general, a site that stores a replica of an object has only partial knowledge of the object's event history. We call such a replica an {\it object representative} . The goal of the protocol is to limit the sizes of the histories by checkpointing and discarding old events. We treat separately the three functions of (1) propagating events to sites that do not know about them, (2) checkpointing, and (3) discarding old events. A site rolls forward its checkpoint state as far as it {\it knows} its local version of the object's history to be complete, but it discards old events only as far as it knows that all other sites {\it know} their local histories to be complete. Our protocol propagates events among sites in {\it gossip} messages exchanged in the background in a {\it two phase} manner. Each site maintains several timestamp vectors indicating the extent of its knowledge of other sites' knowledge of events. These timestamp vectors are included in gossip messages and used to decide the extent of global-completeness of each site's local version of the event history. We formally define the correctness criteria for our protocol in terms of the completeness properties of histories.
TR-05-87
Michael O. Rabin and J.D. Tygar. 1987. ``An Integrated Toolkit for Operating System Security..''

TR-06-87
Benny Chor and Mihaly Gereb-Graus. 1987. ``On the influence of single participant in coin flipping schemes..''
We prove that in a one round fair coin flipping scheme with {\it n} participants, either the {\it average} influence of all participants is at least $3/n - o(l/n),$ or there is at least one participant whose influence is $Omega left( n^ {-5/6} \right).$
TR-07-87
Mihaly Gereb-Graus and Danny Krizanc. 1987. ``The Complexity of Parallel Comparison Merging..''
We prove a worst case lower bound of $Omega(log log n)$ for randomized algorithms merging two sorted lists of length $n$ in parallel using $n$ processors on Valiant's parallel computation tree model. We show how to strengthen this result to a lower bound for the expected time taken by any algorithm on the uniform distribution. Finally, bounds are given for the average time required for the problem when the number of processors is less than and greater than $n$.
TR-08-87
Milena Mihail. 1987. ``The Approximation of the Permanent is Still Open - A Flaw in Broader's Proof-.''
In [B] Broder claims to have obtained a polynomial time randomized algorithm for approximating the permanent of dense 0-1 matrices. In this note we point out a major flaw in the analysis of the algorithm: the process C2 defined in [B] as a coupling is in fact {\it not} such. Thus, it remains open whether Broder's algorithm gives satisfactory estimates.
TR-09-87
Marek Chrobak and Ming Li. 1987. ``$k+1$ Heads are Better Than $k$ for PDA's.''
We prove the following conjecture stated by Harrison and Ibarra in 1968 [HI, p.462]: There are languages accepted by $(k$+1)-head 1- way deterministic pushdown automata $((k$+1)-DPDA) but not by $k$- head 1-way pushdown automata $(k$-PDA), for every $k$. (Partial solutions for this conjecture can be found in [M1,M2,C].) On the assumption that their conjecture holds, [HI] also derived some other consequences. Now all those consequences become theorems. For example, the class of languages accepted by $k$-PDA's is not closed under intersection and complementation. Several other interesting consequences also follow: CFL$\not\subsetequp_ {k} $DPDA($k$) and FA(2)$\not\subsetequp_ {k} $DPDA($k$), where DPDA($k$)=\ {L$\mid$L is accepted by a $k$-DPDA} and FA(2)=\ {L$\mid$L is accepted by 2-head FA} . Our proof is constructive (that is, not based on diagonalization). Before, the ``$k$+1 versus $k$ heads'' problem was solved by diagonalization and translation methods [I2,M2,M3, M4,S] for stronger machines (2 -way, etc), and by traditional counting arguments [S2,IK,YR,M1] for weaker machines ($k$-FA-$k$-head counter machines, etc).
TR-10-87
Ming Li and Paul M.B. Vitenyi. 1987. ``Tape versus Queue and Stacks: The Lower Bounds.''
Several new optimal or nearly optimal lower bounds are derived on the time needed to simulate queue, stacks (stack = pushdown store) and tapes by one off-line single-head tape-unit with one- way input, for both the deterministic case and the nondeterministic case. The techniques rely on algorithmic information theory (Kolmogorov complexity).
TR-11-87
Barbara J. Grosz and Candace L. Sidner. 1987. ``Plans for Discourse.''

TR-12-87
Donald Beaver. 1987. ``Oblivious Secret Computation.''
Recently, methods for secret, distributed, and fault-tolerant computation have been developed. Those techniques show how to ``compile'' a function to produce a protocol in which a distributed system evaluates that function at secret arguments, revealing the value of the function but not the values of the arguments, nor the values in intermediate steps of the computation. We extend these methods in two ways: first, we show that the {\it function} itself need not be revealed. That is, we show how to evaluate a function secretly, without revealing any information about the function itself, other than a bound on the time and space needed to compute it. This result has fundamental implications for distributed system security and fault-tolerant computing. Secondly, we show that revealing the {\it result} of a secret computation is not necessary, and we give useful applications in which the values of secretly computed functions are never revealed. One such application extends the ideas of oblivious transfer to the distributed environment. Using this extension, we solve classical, two-party oblivious transfer {\it without using cryptography} , unlike all previous solutions.
TR-13-87
Donald Beaver. 1987. ``Polynomially Sized Boolean Circuits Are Not Learnable.''
Polynomial-sized boolean circuits provide a powerful and general framework to describe many naturally occurring concepts. Because of the wide range of functions which can be described or computed by boolean circuits, an algorithm to learn arbitrary polynomial- sized circuits would have broad implications for artificial intelligence and learning theory. We prove, however, a conjecture made by Valiant that, if one-way functions exist, then the class of polynomial-sized boolean circuits is not probably-approximately learnable. We also discuss how this result might be strengthened by eliminating the assumption of one-way functions.
TR-14-87
Danny Krizanc. 1987. ``Oblivious Routing with Limited Buffer Capacity.''
The problem of oblivious routing in fixed connection networks with a limited amount of space available to buffer packets is studied. We show that for a $n$ processor network with a constant number of connections and a constant number of buffers any deterministic pure source-oblivious strategy realizing all partial permutations requires $Omega(n)$ time. The consequence of this result for well-known networks is discussed.