Analysis of Georgian text for the
construction of a teaching system.
J. Antidze, N. Gulua - Institute of Applied Mathematics,
1. Introduction. For a teaching system to accumulate knowledge of the subject
being taught, understand a pupil's question and formulate an appropriate
answer, it must be able to perform automatic morphological, syntactic and
semantic analysis of text [1]. The present work discusses the morphological
analysis of Georgian words and issues of dictionary structure, and proposes
ways of resolving them.
2. Morphological analysis of Georgian words. In computer morphological
analysis of Georgian words, the main attention must be paid to the
representation of verbs in the dictionary and to the identification of the
precise form of a verb, i.e. its parsing into morphemes, and, vice versa, to the
formation of a precise verb form from the lexical unit of the verb and the
knowledge attached to that lexical unit in the dictionary. Morphological
analysis of the other parts of speech is not complicated, and the way to carry
it out is well known [2].
2.1. First, let us discuss what kind of knowledge about the Georgian language
must be represented in the knowledge base. This knowledge may be divided into
the following parts: lexicographic knowledge, knowledge of syntax, knowledge
of semantics and knowledge of pragmatics [3,4]. For each part, both the form
of the record and its content must be determined. Here we discuss only
lexicographic knowledge, using the representation of Georgian verbs as an
example. By the form of the record we mean the technique of frames. To see
what kind of information must be recorded in the frames, let us consider the
classification of Georgian verbs from the point of view of verb form
composition [5]. If we take only the verbal root as the initial unit for
creating verb forms, we face problems in separating the lexical meanings that
correspond to that root: it is well known that if the prefix is not fixed in
advance, the number of homonymous forms increases substantially, and the
algorithm for identifying them considerably complicates morphological
analysis. That is why we hold that the lexical meaning of a verb must be fixed
by root and prefix together. Even so, fixing both does not completely exclude
homonyms. For example, the verb "ageba" (to build) may mean "dzeglis ageba"
(to build a monument) or "khanjalze ageba" (to stab with a dagger). Thus a
prefix and a root must be put into the frame, and together they basically fix
the lexical meaning of the verb. The next important piece of information
concerns the formation of
active and passive verb forms. The relevant morphemes are the vowel prefixes
a, i, e, u or their absence (#), and d for the d-passive. Two cases must be
distinguished here: a morpheme is never used for the given meaning, or a
particular verb form simply lacks the morpheme. Lexical meanings are divided
into classes accordingly, which allows us to determine the number of persons
of a verb form from whether or not a particular morpheme is attached. Some
roots do not take prefixes or other morphemes serving as a prefix, while in
other cases a prefix is added to a root but has lost its main function. Such
information is necessary for identifying the morphological categories of verb
forms. In addition, there are cases when several roots are used for the same
lexical meaning of a verb, or when the verb forms are built in a special way,
as with verbs of motion. Information of this kind must be represented in the
frame, since the knowledge about verbs held in a frame is used both for
identifying a verb form and for generating one. That is why verbal roots must
be divided into classes that take the specific conjugation features of the
verbs into account. Likewise, the person, theme and row signs must be divided
into classes according to the classes of the roots with which they are used.
Generally,
when composing a verb form, or decomposing it according to the rules, yields a
homonymous case, the adopted classification of morphemes must be corrected so
that a new rule can be elaborated to avoid the homonym. Two further kinds of
information are important: the order (from left to right) in which members of
the morpheme classes occur in a verb form, and the cases in which the
occurrence of a member of one class, or of members of a group of classes,
excludes the presence of a member of another class.
For example, the presence of a vowel prefix in a verb form excludes the
"d"-passive sign. Thus, for each pair (root and prefix), the dictionary must
indicate which morphemes attach to the pair and to which class each of those
morphemes belongs.
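The frame described in this paragraph can be pictured with a minimal Python
sketch. It stores a (prefix, root) pair together with the morphemes the pair
admits, and checks the one exclusion stated in the text: a vowel prefix rules
out the "d"-passive sign. The class name VerbFrame, its field names and the
sample values given for "ageba" are illustrative assumptions, not the actual
dictionary format or real dictionary data.

```python
from dataclasses import dataclass, field

NO_VOWEL_PREFIX = "#"  # the text writes "#" for the absence of a vowel prefix


@dataclass
class VerbFrame:
    """Hypothetical frame for one lexical meaning of a verb."""
    prefix: str          # verb prefix, e.g. "a"
    root: str            # verbal root, e.g. "g"
    gloss: str           # lexical meaning fixed by (prefix, root)
    # Admitted vowel-prefix values: a subset of {"a", "i", "e", "u", "#"}.
    vowel_prefixes: set = field(default_factory=set)
    d_passive: bool = False  # whether the "d"-passive sign is admitted at all

    def admits(self, vowel_prefix: str, has_d_passive: bool) -> bool:
        """Check one morpheme combination against the frame."""
        if vowel_prefix not in self.vowel_prefixes:
            return False
        if has_d_passive and not self.d_passive:
            return False
        # Rule from the text: a vowel prefix excludes the "d"-passive sign.
        if has_d_passive and vowel_prefix != NO_VOWEL_PREFIX:
            return False
        return True


# Placeholder values for illustration only; not taken from a real dictionary.
frame = VerbFrame(prefix="a", root="g", gloss="to build",
                  vowel_prefixes={"#", "a"}, d_passive=True)

print(frame.admits("a", False))  # vowel prefix alone
print(frame.admits("a", True))   # vowel prefix together with "d"-passive
print(frame.admits("#", True))   # "d"-passive without a vowel prefix
```

Keeping the admitted morphemes inside the frame lets the same record serve
both directions mentioned in the text: rejecting impossible decompositions
during analysis and restricting the combinations tried during synthesis.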
2.2. As mentioned in the previous paragraph, a particular pair of root and
prefix determines both the lexical meaning of the verb forms derived from it
and the rule for composing a particular verb form from the pair or, vice
versa, for establishing which morphemes a given verb form consists of and how
its morphological categories are to be determined. All pairs that create verb
forms in the same way, or, vice versa, whose verb forms are decomposed into
morphemes by the same rule, are said to form a sort of conjugation. For
example, the verb "ageba" (to build) has the pair (a,g). A rule can be
elaborated for composing any verb form from this pair and the morphemes
attached to it (which morphemes are attached, and which are mutually
incompatible, is given in the dictionary). All pairs to which the same rule
applies as to the pair (a,g) make up one sort of conjugation. To describe the
formalism of the verb form calculus, let us
discuss the groups of morphemes that generally occur in verb forms,
enumerated in the order of their occurrence (from left to right) in a verb
form: 1. prefix; 2. prefix sign of person; 3. vowel prefix; 4. root;
5. "d"-passive; 6. sign of contact; 7. theme sign; 8. row sign; 9. sign of
person; 10. sign of plural. Let us call each group a sort and refer to it by
its serial number. For example, sort 1 consists of the morphemes known as verb
prefixes. For certain subclasses of verb forms a certain order of sorts is
typical, which we call a composed sort. For example, for the (a,g) conjugation
sort, the composed sort (1,2,3,4,8,9,10) determines a certain set of verb
forms, both for the verb "ageba" (to build) and for the verbs that create verb
forms by the same rule. The first member, 1, of the composed sort
(1,2,3,4,7,8,9,10) for the verb "ageba" (to build) indicates the following
feature of this verb: when the prefix (e.g. "a") does not occur in a verb
form, the form belongs to the present row (present tense) or is a non-complete
verb form of series II or III. Similarly, if the third member is [a], the verb
forms are two-personal. This approach makes it possible to formulate the rules
simply for a whole conjugation sort, for both analysis and synthesis. The
second important issue is how such rules should be
recorded. Recording them in natural language is almost impossible, and even
where it is possible such rules are difficult to understand. A special graph
representation is advisable here. These graphs are a compact form of the
graphs of [2,6], and they register verb forms precisely. In our case the
information presented in a node needs to be modified. In particular, when we
want to check whether a feature f has a particular value v, we write f=v; when
we want to assign the value v to f, we write f:=v. Both the feature f and the
value v may be a variable or a constant. For example, for the conjugation sort
(a,g) the rule for determining the row of series I is the following:
The numbers from 1 to 6 name the rows of series I. [...] signifies the future
row, whilst the numbers to the left of "=" are sorts. Such a notation may also
be used for a precise calculus of verb forms, which will allow us to register
precisely the admissible verb forms of Georgian verbs in a way that is
comprehensible even to those who do not know the Georgian language [7]. For
instance, just as the conjugation of French verbs is presented in the form of
tables [8], the conjugation of Georgian verbs may be presented in the form of
the graphs discussed above.
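The notions of sort and composed sort, and the node notation f=v (test) and
f:=v (assignment), can be illustrated with a short Python sketch. The sort
numbers and names come from section 2.2. The function names is_composed_sort,
check and assign, the dictionary used as a feature store, and the toy rule at
the end are assumptions made for illustration. This is not the graph formalism
of [2,6], and the toy rule simplifies the statement about forms without a
prefix by keeping only the present-row case.

```python
# Sort numbers and names as enumerated in section 2.2.
SORTS = {
    1: "prefix", 2: "prefix sign of person", 3: "vowel prefix", 4: "root",
    5: '"d"-passive', 6: "sign of contact", 7: "theme sign", 8: "row sign",
    9: "sign of person", 10: "sign of plural",
}


def is_composed_sort(seq):
    """A composed sort lists known sorts in left-to-right (increasing) order."""
    known = all(s in SORTS for s in seq)
    ordered = all(a < b for a, b in zip(seq, seq[1:]))
    return known and ordered


def check(features, f, v):
    """f=v : test whether feature f currently has the value v."""
    return features.get(f) == v


def assign(features, f, v):
    """f:=v : give feature f the value v."""
    features[f] = v


print(is_composed_sort((1, 2, 3, 4, 8, 9, 10)))  # composed sort from the text
print(is_composed_sort((4, 1, 9)))               # sorts out of order

# Hypothetical node: if sort 1 (prefix) is empty, set the row feature.
# Simplification: the text also allows non-complete forms of series II and III.
features = {1: None}
if check(features, 1, None):
    assign(features, "row", "present")
print(features["row"])
```

Separating the test f=v from the assignment f:=v mirrors the two uses of a
node: a test constrains which path through a rule is taken, while an
assignment records a morphological category once the path has been chosen.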
References:
1. J. Antidze, N. Gulua. "On a method towards composition
of teaching system", Proceedings of Sukhumi branch of
2. J. Antidze. Experimental algorithm of machine translation from
Georgian into Russian. Dissertation for the degree of
Candidate of Physical and Mathematical Sciences.
3. EDR Electronic Dictionary Technical
Guide, Japan Electronic Dictionary Research Institute, Ltd., Tokyo, 1993.
4. M. Gross. La construction de
dictionnaires électroniques. Annales des Télécommunications, Tome 44, N 1-2, 1989.
5. A. Shanidze. The grammar of the Georgian language,
I, Morphology.
6. J. Antidze. The construction of dictionary for
machine translation from the Georgian language. The communication of the
7. G. Gogolashvili, Ts. Kvantaliani, D. Shengelia. The dictionary of the Georgian verb roots,
8. J. et J.-P. Caput. Dictionnaire
des verbes français, Paris, 1969.