Proceedings of International Conference on Applied Innovation in IT
2019/03/06, Volume 7, Issue 1, pp.79-85
The Improvement of Machine Translation Quality with Help of Structural Analysis and Formal Methods-Based Text Processing
Anna Mylnikova, Aigul Akhmetgaraeva
Abstract: This article considers how to improve the quality of machine translation between two languages by structuring linguistic patterns and by identifying situations that the suggested approach cannot process and that therefore require individual handling. According to the BLEU metric, the described approach increases the quality of machine translation by 0.1 on average and reduces post-processing time by identifying idioms and words with context-dependent meanings during translation. The experimental dataset was built from pairs of texts available online that cover the events of the FIFA World Cup 2018 and well-known idioms.
Keywords: Machine Translation, Machine-Aided Translation, Language Pair, Classification, BLEU Scores, the Frequency of Vocabulary Use, Algorithm, the Evaluation of Translation Quality
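For readers unfamiliar with the BLEU metric named in the abstract, the following is a minimal sketch of an unsmoothed, sentence-level BLEU score against a single reference, on a 0 to 1 scale. The function names `ngrams` and `sentence_bleu`, the whitespace tokenization, the lowercasing, and the maximum n-gram order of 4 are assumptions made for illustration; the abstract does not state which BLEU configuration or tooling the authors used, and the example sentences are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference, in [0, 1].

    Illustrative only: tokenization is a plain whitespace split, which is
    an assumption and not the setup described in the paper.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            # Without smoothing, a single zero precision makes BLEU zero.
            return 0.0
        log_precisions.append(math.log(matched / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the team won the match in russia"
    print(sentence_bleu("the team won the match in russia", reference))
    print(sentence_bleu("the team won the match in moscow", reference))
```

An exact match scores 1, and a candidate that shares no 4-gram with the reference scores 0 in this unsmoothed form, so short or idiomatic sentences are scored harshly. Corpus-level BLEU, which pools n-gram counts over all sentences before computing precisions, is the more common way to report averages such as the 0.1 gain mentioned in the abstract.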