Inflection Rules For Marathi To English in Rule Based Machine Translation
Corresponding Author:
Namrata G Kharate
Department of Computer Engineering
Vishwakarma Institute of Information and Technology
India
Email: [email protected]
1. INTRODUCTION
Machine translation is one of the key applications of natural language processing (NLP).
Institutions and organizations in India have started working on machine translation systems for Indian
languages and have obtained satisfactory results [1], [2]. Communication plays an important role in people’s
lives. Many languages are used for communication around the world, and good literary works are available
in every language. It is not possible to learn all of these languages, so there is a need to develop effective
machine translation systems targeting multiple languages. English is used by a large share of the
world’s population for official work, literary work, and all sorts of communication. Marathi is the primary
language of the Indian state of Maharashtra. About 71 million people speak Marathi,
and a variety of literature and novels is available in Marathi; hence there is a need for Marathi to English
translation [3]. Researchers have published work on a number of language pairs, and some standard
tools are also available for translation [4]-[6]. However, more contribution is needed for Marathi to
English translation. As the structure and the grammar differ between the source and target languages, the
restructuring and grammatical rules need to be observed correctly. This paper mainly discusses the
inflectional rules for the Marathi-English language pair. The rules are discussed with examples, since rules
play an important role in rule based machine translation. This paper covers the literature review related to
inflections, the importance of adpositions in linguistics, the proposed work, inflectional rules, the research
method, results, and discussion.
Tidke and Sugandhi [7] presented an implementation of inflection for English to Marathi
translation for parts of speech such as nouns, pronouns, verbs, and adjectives. Wren and Martin [8] wrote a
book on English grammar in which various rules are given for English word inflection. Conway [9] discussed
the problem of English plurals and claimed that, even at the lexical level, it can be a complex matter to
correctly inflect the individual words of a sentence to reflect their number, person, mood, and case. According
to the related work, not enough has been done on inflection for Indian languages. In this paper we work
on the Marathi language. As Marathi is very rich in morphology, it is somewhat difficult to study.
Adpositions are words which occur before or after a word, phrase, or clause and are necessary to complete
the meaning of a given sentence. Adpositions are mainly categorized as prepositions, postpositions, and
circumpositions.
Prepositions: Prepositions are words which occur before the complement [9]. Prepositions
occur in the English language. These prepositions are usually converted into postpositions in the Marathi
language.
Example: The birds are sitting on the tree. Here ‘on’ is the preposition.
Postpositions: Postpositions are words which occur after the complement [10]. Postpositions
occur in the Marathi language. Example: पक्षी झाडावर बसलेले आहेत. Here ‘वर’ is the postposition.
Circumpositions: Circumpositions are words which appear both before and after the complement.
Circumpositions are used in the English language. Example: I will play regularly from now on. Here ‘from ...
on’ is the circumposition.
English has SVO structure and Marathi has SOV
structure. Languages which follow SVO structure use prepositions. Hence, when translating a
Marathi sentence into an English sentence, it is necessary to change the postpositions of Marathi,
the source language of this research, into prepositions of English, the
target language of this research. Translating postpositions into prepositions is therefore a main problem, which
needs to be solved by inflecting the nouns, verbs, and cases (Vibhakti). Depending upon the suffix attached
to a Marathi word, the position of the corresponding English words changes during translation.
2. PROPOSED METHOD
In this paper we design inflection rules for the POS tags (noun, pronoun, adjective, and verb) of
the Marathi language. As shown in Figure 1, the output of tokenization [11] and stemming is provided to
morphology analysis. We use a shallow parser to retrieve the part of speech tags and their morphology
analysis. Morphology analysis describes multiplicity, gender, person, and tense of the verb. Before implementing
the inflection module, we have to define rules for the inflection of each POS tag. Generating the appropriate
inflection is needed so that each word appears in its correct form in English [12], [13]. Words can be
classified into two types based on inflection [14], [15]: inflectional words and non-inflectional words. The
inflectional words are noun, pronoun, adjective, and verb. The non-inflectional words are adverb, preposition,
interjection, and conjunction. Words are inflected on the basis of gender (masculine, feminine,
neuter), multiplicity (singular, plural), tense (present, past, future), person (first, second, third), and case
(genitive/possessive case).
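The classification above can be sketched as a small lookup that decides whether a word should be passed to the inflection rules at all. This is only an illustration: the names `INFLECTIONAL_POS`, `NON_INFLECTIONAL_POS`, `Features`, and `needs_inflection` are hypothetical and are not taken from the implementation described in this paper.

```python
from dataclasses import dataclass
from typing import Optional

# POS classes as listed in the text.
INFLECTIONAL_POS = {"noun", "pronoun", "adjective", "verb"}
NON_INFLECTIONAL_POS = {"adverb", "preposition", "interjection", "conjunction"}


@dataclass
class Features:
    """Grammatical features on which inflection depends (hypothetical container)."""
    gender: Optional[str] = None        # "masculine", "feminine", "neuter"
    multiplicity: Optional[str] = None  # "singular", "plural"
    tense: Optional[str] = None         # "present", "past", "future"
    person: Optional[int] = None        # 1, 2, 3
    case: Optional[str] = None          # e.g. "possessive"


def needs_inflection(pos: str) -> bool:
    """Return True if words with this POS tag are inflected."""
    pos = pos.lower()
    if pos in INFLECTIONAL_POS:
        return True
    if pos in NON_INFLECTIONAL_POS:
        return False
    raise ValueError(f"unknown POS tag: {pos!r}")
```

A translation pipeline could call `needs_inflection` once per token and skip the rule lookup for adverbs, prepositions, interjections, and conjunctions.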
In English, most nouns form the
plural by appending -s, but this approach fails on many special cases such as class → classes, story →
stories, and box → boxes. For this reason, purely suffix-based approaches are used, as given in Table 1.
The suffixes most often added in noun plural inflection in English are: -s, -es, -ves,
-ies, -en, -ee, -e, and -ices. Conway [9] discussed the problem of English plurals and claimed that, even at
the lexical level, it can be a complex matter to correctly inflect the individual words of a sentence to reflect
their number, person, mood, and case. Of the three noun cases, inflection occurs only in the possessive case.
The possessive case is used to denote authorship, origin, and ownership. Inflection of nouns in the possessive
case is carried out by adding -’s or -s’ to the end of a noun. Table 2 gives the noun case inflection.
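A minimal sketch of suffix-based plural and possessive inflection is given below. Since Tables 1 and 2 are not reproduced here, the rules and the exception lists (`IRREGULAR_PLURALS`, `F_TO_VES`) are assumptions based on ordinary English spelling conventions, not the rule set used in this work, and they are far from exhaustive.

```python
import re

# Hypothetical exception lexicon; a real system would load a much larger list.
IRREGULAR_PLURALS = {
    "ox": "oxen",
    "child": "children",
    "man": "men",
    "index": "indices",
    "quiz": "quizzes",
}
# Nouns whose final -f/-fe becomes -ves (illustrative subset only).
F_TO_VES = {"leaf", "knife", "wolf", "life", "wife", "shelf", "thief"}


def pluralize(noun: str) -> str:
    """Return the plural of a singular English noun using suffix rules."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if noun in F_TO_VES:
        stem = noun[:-2] if noun.endswith("fe") else noun[:-1]
        return stem + "ves"
    if re.search(r"(s|x|z|ch|sh)$", noun):   # class -> classes, box -> boxes
        return noun + "es"
    if re.search(r"[^aeiou]y$", noun):       # story -> stories
        return noun[:-1] + "ies"
    return noun + "s"                        # default rule


def possessive(noun: str, plural: bool = False) -> str:
    """Add -'s, or only an apostrophe for plurals that already end in s."""
    if plural and noun.endswith("s"):
        return noun + "'"
    return noun + "'s"
```

For example, `pluralize("story")` yields `stories`, and `possessive(pluralize("boy"), plural=True)` yields `boys'`. Words that break the spelling rules (for instance `stomach`) would need entries in the exception lexicon.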
Example: एक मार्ग पुण्यावरून गोव्याला जातो. → One road goes from Pune to Goa.
In the above example the suffix ला appears as a postposition in Marathi, whereas the word ‘to’ appears as a
preposition in English. Thus, postposition processing involves attaching a preposition before the prepositional
object. The preposition also varies according to the suffix attached to the postpositional object. In Marathi
there are seven cases, each having its own functional meaning and suffixes. Different prepositions
are used according to the suffix attached [18], as given in Table 3.
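The postposition-to-preposition step can be illustrated with a longest-suffix lookup. The sketch below uses only the three suffix and preposition pairs that appear in the examples of this paper (वर → on, वरून → from, ला → to); the complete mapping of Table 3 is not reproduced, and the function names are hypothetical. It also assumes the word has already been identified as a noun by the parser, since the same endings can occur on other word classes.

```python
# Suffix -> preposition pairs taken from the examples in the text only.
SUFFIX_TO_PREPOSITION = {
    "वरून": "from",
    "वर": "on",
    "ला": "to",
}


def split_case_suffix(word: str):
    """Return (stem, preposition) for the longest matching suffix, else (word, None)."""
    for suffix in sorted(SUFFIX_TO_PREPOSITION, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], SUFFIX_TO_PREPOSITION[suffix]
    return word, None


def prepositional_phrase(marathi_word: str, english_noun: str) -> str:
    """Place the English preposition before the translated noun."""
    _, preposition = split_case_suffix(marathi_word)
    return f"{preposition} {english_noun}" if preposition else english_noun


if __name__ == "__main__":
    print(prepositional_phrase("पुण्यावरून", "Pune"))
    print(prepositional_phrase("गोव्याला", "Goa"))
```

Trying the longer suffix first matters here: वरून must be tested before a shorter suffix could match by accident. The stem returned by `split_case_suffix` is the oblique form of the noun, so a dictionary lookup would still need to map it back to the root.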
Int J Artif Intell, Vol. 10, No. 3, September 2021: 780 - 788
Int J Artif Intell ISSN: 2252-8938 783
Past tense and past participle forms of regular verbs are generated by adding -ed, for example walk-
walked-walked. Irregular verbs do not form their past tense and past participle by adding -ed. There are
mainly three types of irregular verbs. In the first type, all three forms, i.e., base form, past
tense, and past participle, are the same, e.g., put – put – put. In the second type, the second and third
forms are the same, e.g., sit – sat – sat. In the third type, all three forms are different, e.g., drink
– drank – drunk. All this indicates that inflection of verbs in English requires more consideration than
simply adding the affixes -s, -ing, and -ed. Conjugation of a verb by combination with parts of other verbs, e.g.,
an auxiliary verb, plays a vital role in the translation of a Marathi sentence to English [17]. Verb tense is decided
according to when the action in a sentence happens, i.e., in the present, future, or past. There are four forms in
each tense type. Regular verbs follow a standard rule when conjugated according to tense. Conjugation of the
regular verb is indicated in Table 6. V1 stands for the base form of the verb, V2 for the past tense, V3 for
the perfect or passive (past) participle, and Ving for the progressive participle. For the Marathi
language, the type of tense is identified from the suffix attached to the verb and the auxiliary verb used, as
indicated in Table 7. Table 6 shows the rules for verb conjugation in tenses according to the suffix attached
to the Marathi verb.
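The following sketch shows one way to generate the V1, V2, V3, and Ving forms and to combine them with auxiliaries for the four forms of each tense. Tables 6 and 7 are not reproduced here, so the conjugation patterns are those of standard English grammar rather than the exact rule set of this paper. The irregular-verb table contains only the verbs named in the text, and all function names are hypothetical.

```python
# Irregular verbs named in the text: base -> (past tense, past participle).
IRREGULAR_VERBS = {
    "put": ("put", "put"),
    "sit": ("sat", "sat"),
    "drink": ("drank", "drunk"),
}
# Progressive forms that double the final consonant (illustrative, not exhaustive).
DOUBLING_ING = {"put": "putting", "sit": "sitting"}


def verb_forms(v1):
    """Return the V1, V2, V3 and Ving forms of a base verb."""
    if v1 in IRREGULAR_VERBS:
        v2, v3 = IRREGULAR_VERBS[v1]
    else:
        # Regular verbs: add -ed (only -d after a final e). Spelling changes
        # such as carry -> carried are not handled in this sketch.
        v2 = v3 = (v1 + "d") if v1.endswith("e") else (v1 + "ed")
    if v1 in DOUBLING_ING:
        ving = DOUBLING_ING[v1]
    elif v1.endswith("e") and not v1.endswith("ee") and len(v1) > 2:
        ving = v1[:-1] + "ing"
    else:
        ving = v1 + "ing"
    return {"V1": v1, "V2": v2, "V3": v3, "Ving": ving}


def third_person_s(v1):
    """Simple present form for a third person singular subject."""
    if v1.endswith(("s", "x", "z", "ch", "sh", "o")):
        return v1 + "es"
    if v1.endswith("y") and len(v1) > 1 and v1[-2] not in "aeiou":
        return v1[:-1] + "ies"
    return v1 + "s"


def conjugate(v1, tense, aspect, person=3, plural=False):
    """Conjugate v1 for tense (present/past/future) and one of four aspects."""
    f = verb_forms(v1)
    third_sg = person == 3 and not plural
    if tense == "present":
        be = "are" if (plural or person == 2) else ("am" if person == 1 else "is")
        have = "has" if third_sg else "have"
        simple = third_person_s(v1) if third_sg else v1
    elif tense == "past":
        be = "were" if (plural or person == 2) else "was"
        have = "had"
        simple = f["V2"]
    elif tense == "future":
        be, have, simple = "will be", "will have", "will " + v1
    else:
        raise ValueError(f"unknown tense: {tense!r}")
    table = {
        "simple": simple,
        "continuous": f"{be} {f['Ving']}",
        "perfect": f"{have} {f['V3']}",
        "perfect_continuous": f"{have} been {f['Ving']}",
    }
    return table[aspect]
```

With this sketch, `conjugate("go", "present", "simple")` gives `goes`, matching the verb in the example sentence "One road goes from Pune to Goa", and `conjugate("sit", "present", "perfect")` gives `has sat`.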
3. RESEARCH METHOD
To implement inflection, information about each word is required, i.e.,
POS tags, gender tags, tense, multiplicity, and degree. These are identified by the shallow parser developed by
IIIT, Hyderabad, India, which provides the system with the morphological analysis of a Marathi sentence. The
parser provides output in Shakti standard format [19], [20]. It provides the root word, POS tag, tense, gender,
multiplicity, direct or oblique case, suffix, Vibhakti, and other details important for identifying the role of the
word in the sentence. The output is represented as a sequence of abbreviated features, with each attribute
having a fixed position and meaning in the sequence. The following eight fields occur in the morph output: <fsaf =
'root, lcat, gend, num, pers, case, vibh, suff'>.
Root indicates the root word of the word morphed.
Lcat gives the lexical category of the word. The values it can take are: noun (n), pronoun (pn), verb (v),
adjective (adj), adverb (adv), and number (num).
Gend gives the gender of the word in context. The values it can take are masculine (m), feminine (f), and
neuter (n).
Num indicates whether the word is singular (sg) or plural (pl).
Pers indicates whether the word is in the first (1), second (2), or third (3) person.
Case indicates whether the noun is in the direct or the oblique case, depending on the sentence and usage.
Vibh is the Vibhakti of the word.
Suff identifies the suffix of the word, if it has one.
For example: पर्यटकांना NN <fsaf='पर्यटक, n, m, pl, o, ना,ना' name="पर्यटकांना">
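The fixed-position feature string described above can be read into named fields as sketched below. The field order follows the eight fields listed in the text. The function name `parse_fsaf` and the sample token are assumptions for illustration only: the sample uses eight comma-separated slots with the person slot left empty, and this is not the parser's documented interface.

```python
import re
from typing import Dict

# Field names in the order given in the text.
FIELDS = ("root", "lcat", "gend", "num", "pers", "case", "vibh", "suff")


def parse_fsaf(token: str) -> Dict[str, str]:
    """Extract the fsaf attribute from a token and map its values to field names."""
    match = re.search(r"fsaf\s*=\s*'([^']*)'", token)
    if match is None:
        raise ValueError("no fsaf attribute found")
    values = [value.strip() for value in match.group(1).split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


if __name__ == "__main__":
    # Assumed eight-slot variant of the example, with an empty person field.
    sample = "<fsaf='पर्यटक,n,m,pl,,o,ना,ना' name=\"पर्यटकांना\">"
    features = parse_fsaf(sample)
    print(features["root"], features["num"], features["vibh"])
```

Rejecting strings with the wrong number of slots, rather than guessing which field is missing, keeps a misaligned feature from silently selecting the wrong inflection rule.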
Cases and tenses are identified from word endings as defined in the database, for example as shown
in Table 3. Using the information retrieved above, we apply the inflection rules discussed in the
inflection module to obtain the correct inflection. The inflected words are then mapped to the SVO structure of
English to generate the correct translation [21]. We use 25,000 Marathi-English sentences from the tourism
domain, obtained from TDIL.
Example 2
Marathi sentence: अमृतसर सुवर्ण मंदिराचे शहर आहे. The result is in Table 11.
Example 3
Marathi sentence: जल महालाची रचना इ.स 1534 बहादुरशाह गुजरातीने केली होती. The result is in
Table 12.
Example 4
Marathi sentence: केंद्रपाडा जिल्हा लोहमार्गाला जोडलेला नाही. The result is in Table 13. In all of the
above cases, the examples show the inflection of pronouns, nouns, and verbs according to the inflection rules
discussed and defined in the tables of the inflection module.
5. CONCLUSION
In the field of machine translation for Indian languages, a great amount of work has been done, but
for Marathi the research is limited. Little work has been done on rule based Marathi to English machine
translation. This paper addresses Marathi to English translation with proper inflection and reports
88-90% accuracy. It provides a detailed description of the rules required for inflecting
words in machine translation from Marathi to English. These rules help to produce appropriate translations,
as confirmed by the results.
REFERENCES
[1] A. Godase and S. Govilkar, “Machine Translation Development for Indian Languages and its Approaches,”
International Journal on Natural Language Computing, vol. 4, no. 2, pp. 55–74, 2015,
doi: 10.5121/ijnlc.2015.4205.
[2] M. A. S. Khan, S. Amada, and T. Nishino, “Sublexical Translations for low-resource language,” Proceedings of
Workshop on Machine Translation and Parsing in Indian Languages (MTPIL-2012), 24th International
Conference on Computer Linguistics (Coling12), 2012, pp. 39-52.
[3] Census of India, “Abstract of speakers’ strength of languages and mother tongues–2001”. [Online]. Available:
https://censusindia.gov.in/Census_Data_2001/Census_Data_Online/Language/Statement1.htm.
[4] K. T. Belerao, V. S. Wadne, and S. V. Phulari, “A novel approach for Interlingual example-based translation of
English to Marathi,” International Journal of Emerging Technology and Advanced Engineering, vol. 4, no 3,
pp. 414-417, 2014.
[5] S. Dave, J. Parikh, and P. Bhattacharyya, “Interlingua-based English–Hindi machine translation and language
divergence,” Machine Translation, vol. 16, no. 4, pp. 251–304, 2001, doi: 10.1023/A:1021902704523.
[6] N. Patel, B. Patel, R. Parikh, and B. Bhatt, “A Survey: Word Sense Disambiguation,” International Journal of
Advance Foundation and Research in Computer (IJAFRC), vol. 2, 2015.
[7] C. Tidke, S. Binayakya, S. Patil, and R. Sugandhi, “Inflection Rules for English to Marathi Translation,”
International Journal of Computer Science and Mobile Computing, IJCSMC, vol. 2, no. 4, pp. 7-18, 2013.
[8] P. Wren and H. Martin, High School English Grammar and Composition. S Chand Publication, 2016.
[9] D. Conway, “An algorithmic approach to English pluralization,” in Proceedings of the Second Annual Perl
Conference, San Jose, CA, USA, 1998.
[10] G. R. Navalkar, The Student’s Marathi Grammar. Asian Educational Services, 2001.
[11] R. Sugandhi, D. Pisharoty, P. Sidhaye, H. Utpat, S. Wandkar, and R. Khope, “Managing Tokens for Machine
Translation from English to Marathi,” International Journal of Engineering, Science and Research (IJESR), vol. 1,
no. 3, 2011.
[12] A. R. Joshi and N. Sasikumar, “Constructive approach to teach inflections in Marathi language,” Proceedings of
National Conference on Advances in Technology and Recent Developments, Mumbai, India, 2008, pp. 10-16.
[13] S. B. CharugatraTidke and P. Shivani, “Inflection Rules for English to Marathi Machine Translation,” IJCSMC,
vol. 2, no. 4, pp. 7–18, 2013.
[14] A. R. Joshi, M. Sasikumar, “Constructive approach to teach inflections in Marathi language,”
www.cdacmumbai.in/design/corporate_site/.../pdf.../CATIML1.pdf
[15] D. Walker and R. Amsler, “The Use of Machine Readable Dictionaries in Sublanguage Analysis,” in Analyzing
Language in Restricted Domains, Grishman and Kittredge (eds), LEA Press, pp. 69-83, 1986.
[16] G. V. Garje, G. K. Kharate, and H. Kulkarni, “Transmuter: An Approach to Rule-based English to Marathi
Machine Translation,” International Journal of Computer Applications, vol. 98, no. 21, pp. 33–37, 2014.
[17] N. G. Kharate and V. H. Patil, “Challenges in Rule Based Machine Translation from Marathi to English,” 5th
International Conference on Advances in Computer Science and Information Technology (ACSTY-2019), 2019,
pp. 45-54, doi: 10.5121/csit.2019.91005.
[18] N. G. Kharate and D. H. Patil, “Handling Challenges in Rule Based Machine Translation from Marathi to English,”
International Journal on Natural Language Computing, vol. 8, no. 4, pp. 39–49, 2019,
doi: 10.5121/ijnlc.2019.8404.
[19] B. Kulkarni, P. D. Deshmukh, and K. V. Kale, “Syntactic and Structural Divergence in English-to-Marathi
Machine Translation,” IEEE 2013 International Symposium on Computational and Business Intelligence, New
Delhi, 2013, pp. 191-194, doi: 10.1109/ISCBI.2013.46.
[20] A. Bharati, R. Sangal, and D. M Sharma, “SSF: Shakti Standard Format Guide,” 2007.
[21] A. Adapanawar, A. Garje, P. Thakare, P. Gundawar, and P. Kulkarni, “Rule based English to Marathi translation of
Assertive sentence,” International Journal of Scientific & Engineering Research, vol. 4, no. 5, 2013.
[22] W. Black et al., "Introducing the Arabic WordNet Project," Proceedings of the 3rd Global Wordnet Conference,
2006, pp. 295-300.
[23] M. Gupta, S. Yadav, S. Sharma, and S. Yadav, “Word Sense Disambiguation Using HindiWordNet and Lesk
Approach,” IPASJ International Journal of Computer Science (IIJCS), vol. 1, no. 6, pp. 12-17, 2013.
[24] R. Sinha and R. Mihalcea, "Unsupervised Graph-based Word Sense Disambiguation Using Measures of Word
Semantic Similarity," International Conference on Semantic Computing (ICSC 2007), 2007, pp. 363-369,
doi: 10.1109/ICSC.2007.87.
[25] K. Neeraja and B. P. Rani, “Approaches for Word Sense Disambiguation: Current State of The Art,” International
Journal of Electronics Communication and Computer Engineering, vol. 6, no. 5, pp. 197-201, 2015.
BIOGRAPHIES OF AUTHORS
Dr. Varsha H. Patil is Professor, University of Pune, and Head, Computer Science
Department, at Matoshri College of Engineering & Research Centre, Nashik. She is also
serving as the Vice Principal of the same institute. She has received her Ph.D. in Computer
Engineering from BVCOE, Bharati Vidyapeeth Pune. She has received her M.E in Computer
Engineering from COEP, Pune. She has more than 20 years of teaching experience and several
research papers, published in national and international journals of repute, to her credit. She is
a member of the Board of Studies of Computer Engineering at the University of Pune.