Proceedings of International Conference on Applied Innovation in IT
2022/03/09, Volume 10, Issue 1, pp.93-97

The Computer Program for the Treatment of Big Data in the Field of Literature Science

Liliia Bodnar, Kateryna Shulakova, Olena Tyurikova

Abstract: Processing large databases is important for solving many pressing problems in science and technology. In this paper, we present a computer program (Conan 3.0) developed for processing large text arrays, although other applications are possible. We applied the program to the analysis of large texts on the basis of Zipf's laws. The task addressed in this work concerns the laws of language evolution; in particular, we traced correlations in the development of different Slavic languages. We assumed that Zipf's constant is an important characteristic of a language. Calculating the changes in Zipf's constant over the 18th, 19th and 20th centuries for the Ukrainian, Russian and Polish languages revealed no significant changes. The small fluctuations in Zipf's constant for these languages do not correlate with one another.
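
As an illustration of the kind of calculation the abstract refers to, the following Python sketch builds a rank-frequency table for a text and estimates a Zipf constant from it. It is not the Conan 3.0 implementation, which this page does not describe. The function names `rank_frequency` and `zipf_constant`, the tokenization rule, and the definition used here (the mean of rank multiplied by relative frequency) are all assumptions made for the example.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Tokenization is an assumption: a word is any run of Unicode letters,
    compared case-insensitively. Ties are broken alphabetically.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_constant(text, max_rank=None):
    """Estimate a Zipf constant as the mean of rank * relative frequency.

    Assumed definition (hypothetical, not taken from the paper): under
    Zipf's first law, rank * frequency / total_words is roughly constant,
    so its mean over the first max_rank ranks serves as the estimate.
    """
    table = rank_frequency(text)
    total = sum(count for _, _, count in table)
    if max_rank is not None:
        table = table[:max_rank]
    if not table:
        raise ValueError("no words to analyse")
    products = [rank * count / total for rank, _, count in table]
    return sum(products) / len(products)
```

To compare periods or languages in the spirit of the abstract, one could call `zipf_constant` on a corpus for each century and each language and then inspect how the resulting values vary.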

Keywords: Program for Big Data Treatment, Zipf’s Laws, Evolution of Languages

DOI: 10.25673/76936







           ISSN 2199-8876
           Copyright © 2013-2021 Leonid Mylnikov, © 2022 at Anhalt University of Applied Sciences. All rights reserved.