Abstract
This article addresses the problem of identifying propaganda in text data. Developing methods and techniques for this analysis is an important task, because the volume of propaganda is enormous and cannot be analyzed manually by specialists. Studying how propaganda changes over time helps to understand which areas of life it targets most heavily, how its rhetoric evolves, and what impact it may have. All of this indicates the relevance of research in this area.
The objects of the research are the content of electronic media news, users, and the interrelations between them.
The purpose of this work is to improve the accuracy of text classification by combining existing data mining methods with the most effective text preprocessing techniques and powerful machine learning algorithms for classification problems.
The methods considered are those used to classify text in spam filtering, contextual advertising, news categorization, and the creation of subject catalogs.
Given this volume, the process of searching, filtering, and structuring text data must be automated. This is done with automated text classification, a machine learning task from the field of natural language processing. Text classification has practical applications in many areas, for example spam filtering, contextual advertising, news categorization, and the creation of thematic catalogs. Most automatic text classification methods rest on the assumption that the texts of each thematic category contain certain features whose presence or absence indicates that a text belongs to that category. The task of a classification method is to select the best such features and to formulate the rules by which a text is assigned to a category, and to conduct interactive drilling.
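The feature-based assumption described above can be illustrated with a minimal sketch. It is not the method evaluated in the paper: it assumes a bag-of-words representation and a multinomial Naive Bayes rule with add-one smoothing, and the names `tokenize`, `train`, `classify`, and the toy `SAMPLES` data are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training data: (text, category) pairs.
SAMPLES = [
    ("win a free prize now", "spam"),
    ("free offer click now", "spam"),
    ("parliament passed the budget vote", "news"),
    ("the minister announced a new budget", "news"),
]

def tokenize(text):
    """Preprocessing step: lowercase the text and extract word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def train(samples):
    """Count how often each word (feature) occurs in each category."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the category with the highest Naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Prior: share of training documents in this category.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

if __name__ == "__main__":
    model = train(SAMPLES)
    print(classify("free prize offer", model))
    print(classify("budget vote in parliament", model))
```

Here the presence of words such as "free" or "budget" acts as the feature that indicates category membership. A real system would need a far larger labeled corpus and more careful preprocessing than this sketch assumes.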
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Gavrilenko, O., Oliinyk, Y., Khanko, H. (2020). Analysis of Propaganda Elements Detecting Algorithms in Text Data. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds) Advances in Computer Science for Engineering and Education II. ICCSEEA 2019. Advances in Intelligent Systems and Computing, vol 938. Springer, Cham. https://doi.org/10.1007/978-3-030-16621-2_41
DOI: https://doi.org/10.1007/978-3-030-16621-2_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16620-5
Online ISBN: 978-3-030-16621-2
eBook Packages: Intelligent Technologies and Robotics (R0)