Browsing Dr.Santhosh Kumar G by Subject "English Malayalam translation"

Dyuthi/Manakin Repository

Now showing items 1-3 of 3

A Classification of Sandhi Rules for Suffix Separation in Malayalam

Santhosh Kumar, G; Sheena Kurian, K; Mary, Priya Sebastian (Cochin University of Science And Technology, 2009)

Abstract:

Suffix separation plays a vital role in improving the quality of training in Statistical Machine Translation (SMT) from English into Malayalam. The morphological richness and agglutinative nature of Malayalam make it necessary to retrieve the root word from its inflected form during training. The suffix separation process accomplishes this by examining Malayalam words and applying sandhi rules. This paper presents various handcrafted rules designed for the suffix separation process in English-Malayalam SMT. The rules are classified by the Malayalam syllable that precedes the suffix in the inflected form of the word (check_letter). Suffixes beginning with vowel sounds, such as ആൽ, ഉടെ, and ഇൽ, are the main focus. By examining the check_letter in a word, the suffix separation rules can be applied directly to extract the root word. The quick lookup table provided in this paper can be used as a guideline for implementing suffix separation in Malayalam.

URI:

http://dyuthi.cusat.ac.in/purl/4185

Files in this item: 1

Files	Size
A Classificatio ... eparation in Malayalam.pdf	(420.0Kb)
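
The check_letter lookup described in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the words are romanized rather than written in Malayalam script, and the three rules in `RULES` are toy examples of common sandhi patterns, not the lookup table from the paper. The function name `separate_suffix` is likewise hypothetical.

```python
# Hypothetical, romanized toy rules; NOT the lookup table from the paper.
# Romanized stand-ins for vowel-initial suffixes.
SUFFIXES = ("ude", "al", "il")

# check_letter -> (text to strip from the end of the stem, text to append)
RULES = {
    "tt": ("tt", "m"),  # marattil  -> maram + il
    "y": ("y", ""),     # kuttiyude -> kutti + ude (glide removed)
    "n": ("", ""),      # ramanal   -> raman + al  (stem unchanged)
}


def separate_suffix(word):
    """Return (root, suffix), or (word, None) when no rule applies."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if not word.endswith(suffix):
            continue
        stem = word[: -len(suffix)]
        # Try longer check_letters first so "tt" wins over a shorter match.
        for check_letter in sorted(RULES, key=len, reverse=True):
            if stem.endswith(check_letter):
                strip, append = RULES[check_letter]
                root = stem[: len(stem) - len(strip)] + append
                return root, suffix
    return word, None


if __name__ == "__main__":
    for w in ("marattil", "kuttiyude", "ramanal", "pustakam"):
        print(w, "->", separate_suffix(w))
```

A real implementation would operate on Malayalam Unicode text, where vowel signs and conjuncts make the notion of a "syllable" more involved than simple string suffixes.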

Extension schemes for the Alignment Model of English-Malayalam Statistical Machine Translator

Santhosh Kumar, G; Mary, Priya Sebastian; Sheena Kurian, K (IEEE, 2012)

Abstract:	In Statistical Machine Translation (SMT) from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical models. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set up among the sentence pairs of the source and target languages before they are used for training. This paper deals with techniques that can be adopted to improve the alignment model of SMT. Incorporating part-of-speech information into the bilingual corpus has eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved advantageous while setting up the alignments. The presence of Malayalam words with predictable translations has further contributed to reducing the insignificant alignments. Moreover, reducing the unwanted alignments has led to better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU, and WER evaluation metrics.
Description:	2012 International Conference on Advances in Computing and Communications
URI:	http://dyuthi.cusat.ac.in/purl/4160

Files in this item: 1

Files	Size
Extension schem ... cal Machine Translator.pdf	(219.2Kb)
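
The two pruning ideas named in the abstract above (part-of-speech agreement, and anchoring on named entities or cognates) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: the coarse tags `N` and `V`, the romanized example sentence, the function names, and the use of a string-similarity threshold as a stand-in for named-entity and cognate detection are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a, b):
    """Character-level similarity in [0, 1]; a crude stand-in for cognate detection."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def prune_alignments(src_tagged, tgt_tagged, cognate_threshold=0.8):
    """Return candidate (source, target) word pairs after two pruning steps."""
    # Step 1: keep only pairs whose coarse POS tags agree.
    pairs = [
        (s, t)
        for (s, s_pos), (t, t_pos) in product(src_tagged, tgt_tagged)
        if s_pos == t_pos
    ]
    # Step 2: treat look-alike pairs (assumed names or cognates) as anchors.
    anchors = {(s, t) for s, t in pairs if similarity(s, t) >= cognate_threshold}
    anchored_src = {s for s, _ in anchors}
    anchored_tgt = {t for _, t in anchors}
    # Step 3: drop pairs that compete with an anchor for the same word.
    return [
        (s, t)
        for s, t in pairs
        if (s, t) in anchors
        or (s not in anchored_src and t not in anchored_tgt)
    ]


if __name__ == "__main__":
    # Hypothetical tagged sentence pair (Malayalam side romanized).
    english = [("Rama", "N"), ("ate", "V"), ("mango", "N")]
    malayalam = [("raman", "N"), ("mampazham", "N"), ("thinnu", "V")]
    print(prune_alignments(english, malayalam))
```

In this toy sentence pair, the nine possible word pairs shrink to five after the POS check and to three once the name is anchored, which is the kind of reduction in insignificant alignments the abstract reports.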

Techniques to Improve the word alignments in Statistical Machine Translation from English to Malayalam

Santhosh Kumar, G; Mary, Priya Sebastian; Sheena Kurian, K (2010)

Abstract:

In Statistical Machine Translation (SMT) from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam sentence using statistical components such as a translation model, a language model, and a decoder. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set up among the sentence pairs of the source and target languages before they are used for training. This paper deals with techniques that can be adopted to improve the alignment model of SMT. Incorporating part-of-speech information into the bilingual corpus has eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved advantageous while setting up the alignments. Moreover, reducing the unwanted alignments has led to better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU, and WER evaluation metrics.

URI:

http://dyuthi.cusat.ac.in/purl/4187

Files in this item: 1

Files	Size
Techniques to I ... m English to Malayalam.pdf	(368.5Kb)
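
Two of the evaluation metrics named in the abstract above can be sketched in a few lines. The sketch assumes whitespace tokenization and a single reference translation per sentence. WER is computed as word-level edit distance divided by the reference length, and the F-measure shown is a plain unigram-overlap variant, which may differ from the definition used in the paper. The function names and the romanized example sentences are hypothetical; BLEU is omitted for brevity.

```python
from collections import Counter


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)


def unigram_f_measure(reference, hypothesis):
    """Harmonic mean of unigram precision and recall (one possible F-measure)."""
    ref, hyp = Counter(reference.split()), Counter(hypothesis.split())
    overlap = sum((ref & hyp).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ref = "raman mampazham thinnu"
    hyp = "raman mampazham kazhichu"
    print(word_error_rate(ref, hyp), unigram_f_measure(ref, hyp))
```

Lower WER and higher F-measure both indicate output closer to the reference; reporting several metrics together, as the abstract does, guards against any single metric's blind spots.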
