Syntactical Analysis of Georgian Texts
Jemali Antidze, Nana Gulua, David Mishelashvili
I.Vekua Scientific Institute of Applied Mathematics, Tbilisi State University
2 University, Tbilisi, 380043, Georgia
Phone: 99532 305079
E-mail: antidze@viam.hepi.edu.ge
Abstract
The article describes an algorithm for the syntactical analysis of Georgian texts and its implementation.
The first version was published in [1], and a detailed description was given in the dissertation
of N. Gulua [2]. The new version is based on the PC-PATR formalism [3] and is implemented
under OS Linux, version 6.2. The linguistic approach is based on specific features of the
Georgian language, in particular the role of the verb form in the formation of a Georgian sentence.
Keywords: syntactical analysis, formal grammar, parse tree, feature structure, morphological
analysis.
The article describes an algorithm for the syntactical analysis of Georgian texts and its
implementation. The first version was published in [1], and a detailed description was given in the
dissertation of N. Gulua [2]. The new version is based on the PC-PATR formalism [3] and is
implemented under OS Linux, version 6.2. The linguistic approach is based on specific
features of the Georgian language, in particular the role of the verb form in the formation of a
Georgian sentence. Our approach facilitates recognition of the relations between the subject, the objects
and the predicate of a sentence irrespective of their order. In addition, a Georgian verb has many forms,
so the form must be identified from its root and affixes before syntactical analysis.
To use the PC-PATR formalism, it is therefore necessary to establish in advance the feature
structure of each word of a sentence. This is not feasible without morphological analysis; otherwise,
we would need to include every word form in the dictionary, which would greatly increase the size of the
dictionary. We have therefore carried out the morphological analysis with a special approach, and the
result is represented in a form acceptable to PC-PATR. We established the feature structure for
each lexical item, and these feature structures are used widely in composing the restrictions on the
rules. The restrictions also ensure the semantic compatibility of the words of a sentence. The result of the
syntactical analysis is the parse tree of a sentence. At present, our morphological analysis does not
yet build a word from its root and affixes, but in the future we plan to use
PC-KIMMO for this purpose. The program has been tested on scientific texts, and experiments
continue with a view to further improvement.
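As an illustration only, the following Python sketch shows how feature structures attached to lexical items, combined with restrictions taken from the verb form, can identify the subject and the object of a sentence irrespective of word order. It is not the authors' PC-PATR grammar. The lexicon entries, the feature names (cat, case, subj, obj), the transliterated word forms and the functions unify and assign_roles are all hypothetical and chosen for this example. The sketch assumes a transitive verb form that requires an ergative subject and a nominative direct object.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaves).

    Returns the merged structure, or None if two values clash.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None  # clash inside a shared feature
                result[key] = merged
            else:
                result[key] = b_val
        return result
    # Atomic values unify only when they are equal.
    return a if a == b else None


# Hypothetical lexicon: in the approach described above, these feature
# structures would be produced by morphological analysis, not listed by hand.
LEXICON = {
    "katsma": {"cat": "N", "case": "erg", "lemma": "man"},
    "tserili": {"cat": "N", "case": "nom", "lemma": "letter"},
    "datsera": {
        "cat": "V",
        "lemma": "write",
        # Restrictions that the verb form places on its arguments (assumed).
        "subj": {"cat": "N", "case": "erg"},
        "obj": {"cat": "N", "case": "nom"},
    },
}


def assign_roles(words):
    """Return predicate, subject and object lemmas, or None if analysis fails."""
    entries = [LEXICON[w] for w in words]
    verbs = [e for e in entries if e["cat"] == "V"]
    if len(verbs) != 1:
        return None
    verb = verbs[0]
    nouns = [e for e in entries if e["cat"] == "N"]
    roles = {"pred": verb["lemma"]}
    for role in ("subj", "obj"):
        # A noun fills a role when it unifies with the verb's restriction.
        matches = [n for n in nouns if unify(verb[role], n) is not None]
        if len(matches) != 1:
            return None
        roles[role] = matches[0]["lemma"]
    return roles


if __name__ == "__main__":
    print(assign_roles(["katsma", "tserili", "datsera"]))
    print(assign_roles(["tserili", "datsera", "katsma"]))
```

Both calls yield the same role assignment, because the roles are decided by unification with the verb's restrictions and not by position in the sentence.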
References
[1] J. Antidze, N. Gulua. On Selection of Georgian Text Analysis Formalism. Bulletin of the
Georgian Academy of Sciences, 162, 2, 2000.
[2] N. Gulua. Formalized Description of Georgian Texts, its Software and its Application to the
Construction of a Teaching System. PhD Dissertation, Tbilisi, 1999.
[3] Stephen McConnel. PC-PATR Reference Manual, version 1.2.2, 2000.