Parsing Schemata Klaas Sikkel Berlin and Heidelberg: Springer Verlag 1997 Parsing schemata are not parsing algorithms; in fact, they are not {\em algorithms\/} at all. They are more abstract specifications that can be implemented as parsing algorithms; but they do not specify essential algorithmic properties such as data structures or control mechanisms. Rather, they are based on a logical deduction system: a parsing schema specifies a domain of permissible entities (trees, or items); a set of initial entities (axioms); and a set of inference rules from which new entities can be deduced. This level of abstraction is very useful in investigating different parsing strategies, comparing various algorithms and formulating the common techniques of parsing algorithms for different formalisms. The beauty of Sikkel's book (that is based on his 1993 dissertation) lies in his success to provide a simple yet powerful language for describing various algorithms for various formalisms, all geared at one aim: parsing. The fifteen chapters of the book are organized as three parts: Exposition, that serves as an introduction to parsing in general and parsing schemata in particular; Foundation, which defines parsing schemata for the simple and well-known case of context free grammars; and Application, which applies the basic ideas to different formalisms (such as Unification Grammars) and parsing techniques (such as left corner, head corner, LR and parallel parsing). Of the Exposition part, the Introduction is well written. After a short description of the motivation for the work, the author describes the essence of parsing schemata informally; even the na\"{\i}ve reader can understand the ideas that are later elaborated on. This is why the second chapter, The Primordial Soup Framework, is redundant: on one hand, the ideas of the primordial soup could be explained in less detail as part of the introduction; on the other hand, they don't extend smoothly to the concept of parsing schemata. This chapter can be avoided with no loss of information. The Foundation part starts with context free grammars in chapter three. The terminology that is used in the rest of the book is set up in this chapter which is, albeit technical, easy to read. In particular, the first parsing scheme is shown, a tree-based one for context free grammars. This is the only scheme that is defined over trees -- starting from chapter four, schemes are defined over the more general notion of items. Parsing schema are defined as a specification of the more general notion of deduction systems, that are formally defined in this chapter. In particular, the notion of correctness is formally introduced, anticipating a later discussion of correctness of parsing schemata. Chapter four is mainly concerned with the widely used notion of {\em items}. Items are defined as (restricted) equivalence classes over trees. Following a very technical -- and to my mind, superfluous -- discussion of quotient deduction systems, item-based parsing schemata are presented. The importance of the algebraic discussion is that it yields a clear, formal generalization for items, and an understanding of their ontological status: as different algorithms use different items, this generalization is important. The chapter concludes with several examples of parsing schemata, corresponding to well-known algorithms such as Earley or Left-corner parsing. The rest of the Foundations part is dedicated to relations among different parsing schemata: {\em Refinement\/} and {\em generalization\/} are discussed in chapter five, {\em filtering\/} -- in chapter six (these should have been one larger chapter). As such relations are essentially mappings between deduction systems, they can be -- and indeed are -- formally defined. A refined parsing scheme has, roughly, more items and more deduction steps. Generalization is an extension of a schema to a larger class of grammars. For example, it is shown that bottom-up Earley parsing is a generalization of the CYK algorithm. Filtering is an optimization that reduces the number of deduction steps in a parsing system. Chapter six shows how redundancies can be eliminated both statically (at compile time) and dynamically (at run time), without harming the correctness of the schemata. {\em Step contraction}, which is the inverse of step refinement, is introduced as the most powerful filter. Once such relations among schemata are established, they can be used for proving the correctness of a certain scheme, given the correctness of others. This is a very useful tool in formally verifying the correctness of parsing schemata. Here lies the importance of Sikkel's work: at the end of Chapter six, several parsing schemata, corresponding to different parsing algorithms, are summarized. Related to each other by the above-mentioned relations, these schemata form a grid of mathematically defined objects. Once the formal tools for specifying parsing schemata are constructed, the Application part extends them in different directions. Chapters seven to nine deal with unification grammars. Sikkel limits the discussion to grammars with a context free backbone only, and this decision is explained, especially in the discussion of two-pass parsing in Chapter nine. Still it is interesting to see how the framework could be extended to grammatical formalisms such as HPSG (Pollard and Sag 1994) or Categorial Grammars (Haddock {\em et al.} 1987), in which no finitely ambiguous context free backbone exists. While Chapter seven could have been eliminated (being an informal introduction to unification grammars, parts of it could have been inserted in the relevant places in the sequel), the rest of the text is superbly written. The author manages to present the most intriguing problems in a clear way, and after a very rigorous introduction, setting up the notation, one realizes that parsing schemata for unification grammars can be defined in much the same way as for context-free grammars. It is the first work to formally define and study the notion of {\em composite feature structures}, that are used implicitly by several applications. A minor comment is that Sikkel only discusses in detail the simplest case (namely, PATR grammars); an elaborate example of a parsing schema for a more up-to-date formalism, supporting typed feature structures and cyclicity, could be in place. The discussion of related issues such as unification algorithms, disjunctive feature structures and restriction in chapter nine, while brief, sheds more light on the framework and is indeed important. The rest of the book can be viewed as three separate papers. Chapters ten and eleven are dedicated to Left-corner and Head-corner parsing. Parsing schemata are most naturally conceived of as abstractions of chart parsing algorithms; therefore, their application to these two algorithms is rather straightforward. The importance of the abstraction of the simpler left-corner schema is that the concept of items is generalized here: they are no longer required to be equivalence classes of trees; more liberal definitions are possible. Furthermore, the relations between schemata, defined earlier, are applied to relate the new schema to ones presented earlier. The same idea is used in chapter eleven, where it is shown that the Head-corner schemata are generalizations of Left-corner ones. The two chapters give a very detailed and clear descriptions of the motivation for the two algorithms. Chapters twelve and thirteen discuss LR parsing. The informal presentations of the basic LR algorithm, and especially Tomita's algorithm, are aligned with the high standards that Sikkel himself set in previous chapters. Since these algorithms are not based on a chart, defining appropriate schemata is a less trivial task. However, it can be achieved, and furthermore, the obtained schema for Tomita's algorithm turns out to be very similar to conventional Earley parsing. This result is used in chapter thirteen, where a parallel variant for Tomita parsing is defined, that relates to the simple LR(0) just as the simple parallelization of Earley's algorithm relates to its original version. It is a very nice demonstration of the usefulness of parsing schemata as a means for relating different parsing algorithms. The last application, Boolean circuit parsing, presented in chapter fourteen, has mostly theoretical importance: it shows how parsing schemata can be encoded in hardware, by using neural-net-like Boolean circuits with a very large number of processors, thus achieving logarithmic parsing times. Throughout the entire Application part of the book, the author keeps a high degree of mathematical rigor, but at the same time provides the necessary motivation in informal discussions. As chapters ten to fourteen are actually three self-contained papers, there are unnecessary redundancies, especially in the introductions: the same notations are defined and explained in several places. Sikkel's book is an important work because it manages to capture the similarities -- and the differences -- among various parsing algorithms in an abstract, mathematically rigorous way. The concept of parsing schemata, and in particular the several relations among different schemata, will undoubtedly be used for many more applications than those that are presented in this book. On top of that, Sikkel managed to collect a vast number of parsing algorithms, and to describe them all in a clear and bright way, providing abstract descriptions (through parsing schemata) for many of them. The bibliography list is almost complete: it lists virtually all parsing-related papers that were ever published. What is strikingly missing is a confrontation of parsing schemata with the {\em parsing as deduction\/} paradigm, and in particular with the work of Shieber {\it et al.} (1994), which is not mentioned in the book. Nevertheless, this book is a must for everyone who has interest in parsing, both of formal and of natural languages. Now that the content of the book has been praised, some remarks must be made concerning its form. The style of the author moves between excessive, repetitive informal explanations and concentrations of long, technical, mathematical material. For example, the notion of filtering, discussed in chapter six, is introduced in the Overview section; in the introduction to the Foundation part; in the introduction to chapter five; in the last paragraph of chapter five; and, of course, in the introduction to chapter six. Much of it is redundant and should have been eliminated. On the other hand, one can find three pages long lists of definitions without a word concerning their motivation or use, as in section 4.4. In the introduction the author mentions that ``one cannot engage in increasingly technical and formal reasoning and along the way forget about the motivation behind the theory'' (p.\ 5). Reading through the first pages of section 5.1, where -- for a given function $f$ -- its extensions $f^{'}, f^{''}, \hat{f}, \hat{f^{'}}$ and $\hat{f^{''}}$ are defined with no mention of their use, one wonders whether the author has kept his implied promise. In short, the book could benefit a lot from a good, experienced editor. An editor could spot the many typos and the several occasions of wrong references; eliminate embarrassing sentences such as ``An equivalence relation is transitive, reflexive and antisymmetric'' (p.\ 51); or ``In section 4.2 we have defined...'', as the first line in section 4.2; avoid references to numbered examples and definitions that go back one hundred pages and more (p.\ 156), or at least add the page numbers; etc. A minor, yet annoying point: the quality of the print, in two different copies I have checked, is poor. Especially at the lower parts of the pages, the print is dizzy and sometimes even hard to read. References Haddock {\em et al.} 1987: Haddock, Nicholas, Klein, Ewan and Morill, Glyn, {\em Categorial Grammar, Unification and Parsing}, Volume 1 of Working Papers in Cognitive Science, Center for Cognitive Science, University of Edinburgh, 1987 Pollard and Sag 1994: Pollard, Carl and Sag, Ivan A., {\em Head-Driven Phrase Structure Grammar}, University of Chicago Press and CSLI Publications, 1994 Shieber {\em et al.} 1994: Shieber, Stuart M., Schabes, Yves and Pereira, Fernando C.\ N., Principles and Implementation of Deductive Parsing, Technical Report TR-11-94, Center for Research in Computing Technology, Division of Applied Sciences, Harvard University, April 1994 Reviewed by: Shuly Wintner Universit\"at Tuebingen Seminar f\"ur Sprachwissenschaft Wilhelmstr.\ 113, 72074 T\"ubingen, Germany \verb+shuly@sfs.nphil.uni-tuebingen.de+