Inflection rules for Marathi to English in rule based machine translation

Received Apr 30, 2020; Revised May 22, 2021; Accepted Jun 8, 2021

Machine translation is an important application of natural language processing. It translates text from a source language to a target language while preserving the meaning of the sentence. A large amount of research is going on in the area of machine translation. However, this research remains specific to particular source and target language pairs, because languages differ syntactically and morphologically. Appropriate inflections result in correct translation. This paper elaborates the rules for inflecting the parts of speech and implements the inflection for Marathi to English translation. The inflection of nouns, pronouns, verbs, and adjectives is carried out on the basis of the semantics of the sentence. The results are discussed with examples.


INTRODUCTION
Machine translation is one of the key applications of natural language processing (NLP). Institutions and organizations in India have started working on machine translation systems for Indian languages and have obtained satisfactory results [1], [2]. Communication plays an important role in people's lives. Many languages are used for communication around the world, and good literary works are available in every language. It is not possible to learn all of these languages, so there is a need to develop effective machine translation systems that target multiple languages. English is the language used by the majority of the world population for official work, literary work, and all sorts of communication. Marathi is the primary language of the Indian state of Maharashtra. About 71 million people speak Marathi, and a variety of literature and novels are available in Marathi; hence there is a need for Marathi to English translation [3]. Researchers have published work mostly related to specific pairs of languages, and some standard tools are also available for translation [4]-[6]. However, more contribution is needed for Marathi to English translation. As the structure and the grammar differ between the source and target languages, the restructuring and grammatical rules need to be observed correctly. This paper mainly discusses the inflectional rules related to the Marathi-English language pair. Rules play an important role in rule-based machine translation, and they are discussed here with examples. This paper includes the literature review related to inflections, the importance of adpositions in linguistics, the proposed work, inflectional rules, research method, results, and discussion.
Tidke and Sugandhi [7] presented the implementation of inflection for English to Marathi translation for parts of speech such as nouns, pronouns, verbs, and adjectives. Wren and Martin [8] wrote a book on English grammar in which various rules are given for English word inflection. Conway [9] has discussed the problem of English plurals and claimed that even at the lexical level, it can be a complex matter to

PROPOSED METHOD
In this paper we design inflection rules for the POS tags (noun, pronoun, adjective, and verb) of the Marathi language. As shown in Figure 1, the output of tokenization [11] and stemming is provided to morphological analysis. We use a shallow parser to retrieve the part-of-speech tags and the morphological analysis, which describes the multiplicity, gender, person, and tense of each word. Before implementing the inflection module, we have to define rules for the inflection of each POS tag. Generating the appropriate inflection is needed so that each word appears in its correct form in English [12], [13].
Words can be classified into two types based on inflection [14], [15]: inflectional words and non-inflectional words. The inflectional words are noun, pronoun, adjective, and verb. The non-inflectional words are adverb, preposition, interjection, and conjunction. The words are inflected on the basis of changing gender (masculine, feminine, neuter), multiplicity (singular, plural), tense (present, past, future), person (first, second, third), and case (genitive/possessive case).
[...] plural by appending -s, but this approach fails on many special cases such as class → classes, story → stories, and box → boxes. So, there are some pure suffix-based approaches, as given in Table 1.
The suffixes most often added in noun plural inflections in English are: -s, -es, -ves, -ies, -en, -ee, -e, and -ices. Conway [9] has discussed the problem of English plurals and claimed that even at the lexical level, it can be a complex matter to correctly inflect the individual words of a sentence to reflect their number, person, mood, and case. Out of the three noun cases, inflection occurs only in the possessive case. The possessive case is used to denote authorship, origin, and ownership. Inflection of nouns in the possessive case is carried out by adding -'s or -s' to the end of a noun. Table 2 includes the noun case inflection.

Table 2. Noun case inflection
Condition | Rule | Examples
Noun-singular | Add 's | The boy's school
Noun-plural and ends with 's' | Add ' | Boys' school; Horses' tails
Noun-plural but does not end with 's' | Add 's | Men's club; Children's books
Two nouns are closely connected | Add 's to the second noun | Karim and Salim's Bakery
Nouns telling distance/space/weight | Add 's | I want a day's leave. Shila will be back in a month's time.
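The plural and possessive rules described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the sibilant ending list, and the small irregular-plural dictionary are assumptions made for the example, and only a subset of the suffixes listed above is covered.

```python
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")
VOWELS = "aeiou"

# Hypothetical exception list; a real system would need a much larger lexicon.
IRREGULAR_PLURALS = {"man": "men", "child": "children", "ox": "oxen"}


def pluralize(noun: str) -> str:
    """Return the plural of a lowercase English noun using suffix rules."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    if noun.endswith(SIBILANT_ENDINGS):
        return noun + "es"               # class -> classes, box -> boxes
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"         # story -> stories
    return noun + "s"                    # default rule


def possessive(noun: str, plural: bool) -> str:
    """Apply the possessive rule to an already inflected noun."""
    if plural and noun.endswith("s"):
        return noun + "'"                # boys -> boys'
    return noun + "'s"                   # boy -> boy's, men -> men's
```

For example, `possessive(pluralize("boy"), plural=True)` first builds the plural and then adds only an apostrophe, whereas an irregular plural such as "men" still takes 's.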
Postpositions in Marathi occur as prepositions in English [16]. Translating a Marathi sentence to an English sentence requires conversion of postpositions to prepositions [17]. For example: एक मार्ग पुण्यावरून गोव्याला जातो → One road goes from Pune to Goa.
In the above example the suffix ला occurs as a postposition in Marathi, whereas the word "to" occurs as a preposition in English. Thus, postposition processing involves attaching a preposition before the prepositional object. The choice of preposition depends on the suffix attached to the postpositional object. In Marathi there are seven cases, each having its own functional meaning and suffixes. Different prepositions are used according to the suffix attached [18], as given in Table 3.
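The postposition-to-preposition step can be illustrated with a small suffix lookup. This sketch is an assumption-laden simplification: the dictionary holds only the two suffixes that appear in the example sentence (the full mapping would come from Table 3), and the function name `split_postposition` is hypothetical.

```python
# Only the two suffixes from the example sentence are listed here;
# the complete suffix-to-preposition mapping is assumed to come from Table 3.
POSTPOSITION_TO_PREPOSITION = {
    "वरून": "from",
    "ला": "to",
}


def split_postposition(word: str):
    """Return (stem, preposition), or (word, None) if no known suffix matches."""
    # Try longer suffixes first so a short suffix cannot shadow a longer one.
    for suffix in sorted(POSTPOSITION_TO_PREPOSITION, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], POSTPOSITION_TO_PREPOSITION[suffix]
    return word, None
```

Plain string matching like this would also fire on words that merely happen to end in the same characters, so in practice the suffix is better taken from the morphological analysis of the word than from the surface string.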

Verb inflection
Inflection of verbs in English is called conjugation. The conjugation of a verb gives the different verb forms, either by inflection or by combination with parts of other verbs (auxiliary verbs), which show mood, tense, number, and person. English verbs are inflected for tense. A verb lexeme has at most five forms, i.e., the base form, the third person singular form, the past tense, the progressive participle, and the perfect or passive participle. In fact, most verbs have only four forms, because the past tense and the perfect (or passive) participle forms are the same. This is true for all regular verbs. In the third person singular there are a few variations. In the present third person singular, the suffix -s is added to both regular and irregular verbs. If a verb ends with a sibilant consonant, the suffix -es is added, and if a verb ends with -y preceded by a consonant, the -y is changed to -i- and then the suffix -es is added. Table 4 includes the verb third person singular form inflection.
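The three third person singular rules above translate directly into a short function. The following is a minimal sketch under stated assumptions: the name `third_person_singular` and the list of sibilant endings are illustrative, and the small exception dictionary (not part of the rules above) is added because a few common verbs such as "be" and "have" do not follow the suffix rules.

```python
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")
VOWELS = "aeiou"

# Hypothetical exception table for verbs that do not follow the suffix rules.
IRREGULAR_3SG = {"be": "is", "have": "has", "do": "does", "go": "goes"}


def third_person_singular(verb: str) -> str:
    """Return the present third person singular form of a lowercase base verb."""
    if verb in IRREGULAR_3SG:
        return IRREGULAR_3SG[verb]
    if verb.endswith(SIBILANT_ENDINGS):
        return verb + "es"               # pass -> passes, watch -> watches
    if len(verb) > 1 and verb.endswith("y") and verb[-2] not in VOWELS:
        return verb[:-1] + "ies"         # carry -> carries
    return verb + "s"                    # walk -> walks, play -> plays
```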
There are some variations for the progressive participle. The suffix -ing is added to all verbs to get the progressive participle form. Most verbs add -ing to the end without changing the spelling, but for some verbs the spelling changes slightly in the present participle form, depending on how the verb ends. The rules for the different verb endings are indicated in Table 5.
Past tense and past participle forms are generated by adding -ed to regular verbs, for example walk, walked, walked. Irregular verbs do not follow this pattern. There are mainly three types of irregular verbs. In the first type, all three forms, i.e., the base form, the past tense, and the past participle, are the same, e.g., put, put, put. In the second type, the second and third forms are the same, e.g., sit, sat, sat. In the third type, all three forms are different, e.g., drink, drank, drunk. All this indicates that inflection of verbs in English requires more consideration than simply adding the affixes -s, -ing, and -ed.
Conjugation of a verb by combination with parts of other verbs, e.g., auxiliary verbs, plays a vital role in the translation of a Marathi sentence to English [17]. Verb tense is decided according to when the action in the sentence happens, i.e., in the present, future, or past. There are four forms in each tense type. Regular verbs follow a standard rule when conjugated according to tense. Conjugation of the regular verb is indicated in Table 6. V1 stands for the base form of the verb, V2 for the past tense, V3 for the perfect or passive participle, and Ving for the progressive participle. For Marathi, the type of tense is identified from the suffix attached to the verb and the auxiliary verb used, as indicated in Table 7. Table 6 shows rules for verb conjugation in tenses according to suffix attached to Marathi verb.
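As an illustration of the V2, V3, and Ving forms discussed above, the sketch below combines a lookup for irregular verbs with suffix rules for regular ones. It is an assumption-based example rather than the rule set of Table 5: the function names are hypothetical, the irregular dictionary contains only the three verbs mentioned in the text, and consonant doubling (run → running, stop → stopped) is deliberately left out.

```python
VOWELS = "aeiou"

# Only the three irregular verbs named in the text; a real lexicon is far larger.
IRREGULAR_VERBS = {
    "put": ("put", "put"),        # all three forms identical
    "sit": ("sat", "sat"),        # V2 and V3 identical
    "drink": ("drank", "drunk"),  # all three forms different
}


def past_forms(verb: str):
    """Return (V2, V3): the past tense and the perfect/passive participle."""
    if verb in IRREGULAR_VERBS:
        return IRREGULAR_VERBS[verb]
    if verb.endswith("e"):
        form = verb + "d"                # love -> loved
    elif len(verb) > 1 and verb.endswith("y") and verb[-2] not in VOWELS:
        form = verb[:-1] + "ied"         # carry -> carried
    else:
        form = verb + "ed"               # walk -> walked
    return form, form


def progressive_participle(verb: str) -> str:
    """Return the Ving form; consonant doubling is not handled in this sketch."""
    if verb.endswith("ie"):
        return verb[:-2] + "ying"        # die -> dying
    if len(verb) > 2 and verb.endswith("e") and not verb.endswith(("ee", "oe", "ye")):
        return verb[:-1] + "ing"         # make -> making
    return verb + "ing"                  # walk -> walking, see -> seeing
```

Checking the irregular dictionary before applying any suffix rule keeps the exceptions in data rather than in code, so the lexicon can grow without touching the rules.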

Adjective inflection
There are three forms of an adjective in English grammar. They are called the degrees of comparison, i.e., positive degree, comparative degree, and superlative degree. The positive degree of an adjective is the adjective in its simple form. Adjectives are inflected to get the comparative and superlative forms.
Generally, the comparative and superlative forms of adjectives are generated by adding the suffixes -er and -est to the positive form, respectively. There are some exceptional rules, as shown in Table 8. For a few adjectives the comparative and superlative are not formed from the positive, for example: good, better, best. It can be concluded that adjective inflection in English is also more complicated than following simple rules of grammar.
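The degrees of comparison can be sketched in the same style. This example is illustrative only: the function name `degrees` and the exception dictionary are assumptions, the exceptional rules of Table 8 are not reproduced, and cases such as consonant doubling (big → bigger) or periphrastic forms with "more" and "most" are not handled.

```python
VOWELS = "aeiou"

# Adjectives whose comparative and superlative are not built from the positive.
IRREGULAR_ADJECTIVES = {
    "good": ("better", "best"),
    "bad": ("worse", "worst"),
}


def degrees(adjective: str):
    """Return (comparative, superlative) for a lowercase positive adjective."""
    if adjective in IRREGULAR_ADJECTIVES:
        return IRREGULAR_ADJECTIVES[adjective]
    if adjective.endswith("e"):
        return adjective + "r", adjective + "st"      # brave -> braver, bravest
    if len(adjective) > 1 and adjective.endswith("y") and adjective[-2] not in VOWELS:
        stem = adjective[:-1] + "i"                   # happy -> happi-
    else:
        stem = adjective
    return stem + "er", stem + "est"                  # tall -> taller, tallest
```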

Pronoun inflection
A pronoun is a word that can be substituted for a noun or a noun phrase. Pronoun inflection is similar to noun inflection. The words are inflected on the basis of changing gender i.e., masculine, feminine and neuter; multiplicity i.e., singular, plural; and case i.e., nominative, accusative, and possessive. Pronoun inflection rules are given in Table 9.
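Because English pronouns form a small closed class, their inflection is more naturally expressed as a lookup than as suffix rules. The sketch below is an assumption for illustration and does not reproduce Table 9: the key layout (person, number, gender), the tag values "sg", "pl", "m", "f", "n", and the function name are all hypothetical.

```python
# (person, number, gender) -> (nominative, accusative, possessive)
PRONOUNS = {
    (1, "sg", None): ("I", "me", "my"),
    (1, "pl", None): ("we", "us", "our"),
    (2, "sg", None): ("you", "you", "your"),
    (2, "pl", None): ("you", "you", "your"),
    (3, "sg", "m"): ("he", "him", "his"),
    (3, "sg", "f"): ("she", "her", "her"),
    (3, "sg", "n"): ("it", "it", "its"),
    (3, "pl", None): ("they", "them", "their"),
}
CASES = ("nominative", "accusative", "possessive")


def inflect_pronoun(person: int, number: str, case: str, gender=None) -> str:
    """Select the English pronoun form for the given grammatical features."""
    # Gender only distinguishes forms in the third person singular.
    key_gender = gender if (person == 3 and number == "sg") else None
    forms = PRONOUNS[(person, number, key_gender)]
    return forms[CASES.index(case)]
```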

RESEARCH METHOD
Implementing the inflection requires information about each word, i.e., POS tag, gender, tense, multiplicity, and degree, which is obtained from the shallow parser developed by IIIT, Hyderabad, India. It provides the system with the morphological analysis of a Marathi sentence. The parser provides its output in Shakti Standard Format [19], [20]. It provides the root word, POS tag, tense, gender, multiplicity, direct or oblique case, suffix, Vibhakti, and other details important for identifying the role of the word in the sentence. The output is represented as a sequence of abbreviated features, with each attribute having a fixed position and meaning in the sequence. The following eight fields occur in the morph output: <fs af='root,lcat,gend,num,pers,case,vibh,suff'>. Using the retrieved information, we can apply the various inflection rules discussed in the inflection module to get the correct inflection. The inflected words are then mapped to the SVO structure of English to generate the correct translation [21]. We have 25,000 Marathi-English sentences from the tourism domain from TDIL.
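Since each attribute has a fixed position in the feature string, the morph output can be unpacked positionally. The following is a minimal sketch, assuming the eight-field layout given above; the class name `Morph`, the function name `parse_af`, and the sample feature string are made up for illustration and are not actual parser output.

```python
import re
from typing import NamedTuple


class Morph(NamedTuple):
    """The eight positional fields of the morph output."""
    root: str
    lcat: str
    gend: str
    num: str
    pers: str
    case: str
    vibh: str
    suff: str


# Matches the quoted value of the af attribute inside a feature structure.
AF_PATTERN = re.compile(r"af\s*=\s*'([^']*)'")


def parse_af(feature_structure: str) -> Morph:
    """Extract the eight comma-separated fields from a feature structure."""
    match = AF_PATTERN.search(feature_structure)
    if match is None:
        raise ValueError("no af attribute found")
    fields = [field.strip() for field in match.group(1).split(",")]
    if len(fields) != len(Morph._fields):
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    return Morph(*fields)


# Hypothetical input, not real parser output.
sample = parse_af("<fs af='ghar,n,n,sg,3,d,0,0'>")
```

Named fields such as `sample.num` and `sample.gend` can then drive the choice of inflection rule without relying on positional indexing.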

RESULTS AND DISCUSSION
For testing, we considered 7000 sentences from the tourism domain. The output of our system, i.e., the inflected words, was compared with the reference sentences from the data set, and an accuracy of 88-90% was observed. Word sense disambiguation was also considered during testing [5], [6], [22]-[25]. Four test cases are discussed.
Example 1. Marathi sentence: शहराचे वातावरण पर्यटकांना आनंद देते. The result is in Table 10. The results for the other test cases are given in Table 11 through Table 13.
In all of the above cases, the examples show the inflection of pronouns, nouns, and verbs according to the inflection rules discussed and defined in the tables of the inflection module.

CONCLUSION
In the field of machine translation for Indian languages, a great amount of work has been done, but for Marathi the research is limited. There is no work done on rule-based Marathi to English machine translation. This paper addresses Marathi to English translation with proper inflection and reports 88-90% accuracy. It provides a detailed description of the rules required for inflecting words for machine translation from Marathi to English. Ultimately, this helps in producing appropriate translations, which was confirmed by the results.