\title{Computational Syntactic Analysis of Hebrew Sentences}
\author{Shuly Wintner}
\begin{center}
 \Large\bf Abstract
\end{center}

Computer ``understanding'' of natural language text, that is, the
ability of a computer to process natural language input and to produce
output that resembles a human response, is composed of four phases:
\begin{description}
\item[Morphological Analysis] -- whose goal is to identify the lexical items in
the text and to collect the grammatical information they carry;
\item[Syntactic Analysis] -- whose goal is to recover the structure by which
phrases, sentences, paragraphs and even larger units of text are
composed, and to produce a parse tree of these structures;
\item[Semantic Analysis] -- whose goal is to associate each structure in the
text with some meaning, drawn from some semantic domain;
\item[Pragmatic Processing] -- which enriches the output of the previous phase
by using information on the speaker, the context, the environment and
other non-linguistic resources.
\end{description}

These stages of processing need not be distinct: some of them can be
merged, and under certain conditions some may be skipped. Nevertheless,
this is the general model for computerized processing of natural
language.
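
The four-phase model above can be pictured as a chain of functions, each
consuming the output of the previous one. The following is only a minimal
sketch under stated assumptions: the toy English lexicon, the single
sentence pattern and all function names are hypothetical illustrations and
are not taken from the system described in this work.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon: surface form -> (lemma, part of speech).
LEXICON = {
    "the": ("the", "DET"),
    "dogs": ("dog", "N"),
    "bark": ("bark", "V"),
}


@dataclass
class Token:
    surface: str
    lemma: str
    pos: str


def morphological_analysis(text):
    """Phase 1: identify lexical items and their grammatical information."""
    return [Token(w, *LEXICON[w]) for w in text.lower().split()]


def syntactic_analysis(tokens):
    """Phase 2: build a shallow parse tree; only DET N V is accepted here."""
    tags = [t.pos for t in tokens]
    if tags != ["DET", "N", "V"]:
        raise ValueError(f"no parse for tag sequence {tags}")
    det, noun, verb = tokens
    return ("S", ("NP", det, noun), ("VP", verb))


def semantic_analysis(tree):
    """Phase 3: map the tree to a toy predicate-argument structure."""
    _, (_, _, noun), (_, verb) = tree
    return {"predicate": verb.lemma, "agent": noun.lemma}


def pragmatic_processing(meaning, context):
    """Phase 4: enrich the meaning with non-linguistic information."""
    return {**meaning, **context}


def understand(text, context):
    """Run the four phases in sequence."""
    tokens = morphological_analysis(text)
    tree = syntactic_analysis(tokens)
    meaning = semantic_analysis(tree)
    return pragmatic_processing(meaning, context)


result = understand("The dogs bark", {"speaker": "narrator"})
print(result)
```

Keeping each phase behind its own function makes it easy to merge or skip
stages, as noted above, without changing the others.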

Natural language understanding has many applications: automatic
translation of texts from one language to another; automatic
summarization of texts; natural language interfaces to data systems and
robots; and so on. Different applications require different levels of
understanding, but since syntactic analysis is such a basic phase of
natural language processing, it is likely to be a part of every such
system.

In recent years many linguistic theories have been developed which
describe how a grammar for a natural language should be defined. As an
outcome of these theories, {\em grammatical formalisms} have been
developed. Many of these formalisms were combined with parsing
algorithms and implemented as {\em parser compilers} -- systems in
which the grammar writer defines the rules of the language according to
the formalism, and a parser for that language is automatically
generated.
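
The idea of generating a parser from a grammar can be illustrated with a
deliberately naive sketch. It assumes a plain context-free grammar given as
a Python dictionary and uses backtracking top-down parsing; it is not the
algorithm of any of the systems discussed in this work, and the function
\texttt{compile\_grammar} and the toy English grammar are hypothetical.

```python
def compile_grammar(rules, start):
    """Return a parse function for a grammar given as {lhs: [rhs tuples]}.

    Symbols that are not keys of `rules` are treated as terminals.
    The grammar must not be left-recursive (a limit of this naive sketch).
    """

    def expand(symbol, words, i):
        # Yield (tree, next_position) for every way `symbol` matches at i.
        if symbol not in rules:
            if i < len(words) and words[i] == symbol:
                yield symbol, i + 1
            return
        for rhs in rules[symbol]:
            yield from expand_seq(symbol, rhs, words, i, ())

    def expand_seq(lhs, rhs, words, i, children):
        # Match the remaining right-hand-side symbols one after another.
        if not rhs:
            yield (lhs, *children), i
            return
        for child, j in expand(rhs[0], words, i):
            yield from expand_seq(lhs, rhs[1:], words, j, children + (child,))

    def parse(sentence):
        # Keep only the analyses that cover the whole sentence.
        words = sentence.split()
        return [t for t, end in expand(start, words, 0) if end == len(words)]

    return parse


# Hypothetical toy grammar, written as data by the "grammar writer".
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("Det", "N", "PP")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",)],
    "N": [("man",), ("dog",), ("telescope",)],
    "V": [("saw",)],
    "P": [("with",)],
}

parse = compile_grammar(GRAMMAR, "S")
trees = parse("the man saw the dog with the telescope")
for tree in trees:
    print(tree)
```

The grammar writer only edits \texttt{GRAMMAR}; the parser itself is produced
by \texttt{compile\_grammar}. The example sentence receives more than one
tree because the prepositional phrase can attach either to the noun phrase
or to the verb phrase.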

In the first part of this work we survey several environments for
developing grammars. Different models are compared and evaluated, the
criterion being the suitability of each model for defining a
computational grammar for Hebrew. Three such models are described in
detail, namely PATR, Generalized LR Parser/Compiler and Slot
Grammar. For each model we list some of its advantages as well as some
of the features that make it less appropriate for writing a grammar
for Hebrew. On this basis we list some of the features of a `good'
computational model for developing grammars for natural languages.

The larger part of the project was devoted to writing a grammar for
the Hebrew language. The rules and the lexicon of the grammar are
described in detail, as are many of the phenomena the grammar
handles. The grammar deals with simple, subordinate and coordinate
sentences as well as interrogative sentences. Several structures are
treated thoroughly, among them noun phrases, verb phrases and
adjectival phrases; relative clauses, object and adjunct clauses; many
types of adjuncts; subcategorization of verbs; coordination; numbers;
etc. For each phrase the parser produces a description of the
structure tree of the phrase as well as a representation of the
syntactic relations in it.

Many examples of Hebrew phrases are presented, together with the
structures the parser assigns to them. In cases where more than one
parse is produced, the reasons for the ambiguity are described. In
conclusion, we suggest some directions for extending and improving the
project.