Proceedings of International Conference on Applied Innovation in IT
2022/03/09, Volume 10, Issue 1, pp.93-97

The Computer Program for the Treatment of Big Data in the Field of Literature Science

Liliia Bodnar, Kateryna Shulakova, Olena Tyurikova

Abstract: Processing large databases is important for many pressing problems in science and technology. In this paper, we present a computer program (Conan 3.0) for processing large text arrays; other applications are also possible. We applied the program to the analysis of large texts on the basis of Zipf's laws. The task addressed in this work concerns the laws of language evolution; in particular, we traced correlations in the development of different Slavic languages. We assumed that Zipf's constant (ZC) is an important characteristic of a language. Calculating the ZC for Ukrainian, Russian and Polish texts of the 18th, 19th and 20th centuries revealed no significant changes over that period, and the small fluctuations in the ZC for these languages do not correlate with one another.

Keywords: Program for Big Data Treatment, Zipf's Laws, Evolution of Languages

DOI: 10.25673/76936
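
The Zipf analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the Conan 3.0 implementation, whose internals are not given here. It assumes one common formulation of Zipf's law, f(r) = C * N / r, where f(r) is the count of the word at rank r and N is the total number of words, so that Zipf's constant C can be estimated as the average of r * f(r) / N. The function names `rank_frequency`, `zipf_constant` and `zipf_exponent` are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent."""
    # \w+ matches Unicode word characters in Python 3 (letters, digits,
    # underscore), so Cyrillic and Polish text is tokenised as well.
    words = re.findall(r"\w+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_constant(counts, max_rank=None):
    """Average of rank * relative frequency over the top ranks.

    Assumed definition: under f(r) = C * N / r, the product r * f(r) / N
    is roughly the same constant C for every rank r.
    """
    total = sum(counts)
    top = counts[:max_rank] if max_rank else counts
    products = [rank * freq / total for rank, freq in enumerate(top, start=1)]
    return sum(products) / len(products)


def zipf_exponent(counts):
    """Least-squares slope of log(count) against log(rank), sign flipped."""
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(freq) for freq in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

To compare periods or languages in the spirit of the study, one would call `zipf_constant(rank_frequency(corpus))` on each corpus and compare the resulting values. The `max_rank` cut-off is an assumption of this sketch; restricting the average to the most frequent words reduces the influence of the long tail of words that occur only once.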





Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0

ISSN 2199-8876
Publisher: Edition Hochschule Anhalt
Location: Anhalt University of Applied Sciences