Abstract
Email has become an important communication medium. Official, personal, social, promotional and other messages arrive in our mailboxes every day. Research has found that the average office worker receives 121 emails per day. When the inbox is flooded, some messages go unattended; classifying emails into priority folders would address the problem of unattended or unanswered mail. In this paper, we identify the key features used in email classification: temporal, behavioral, single-email multinomial-valued, content, and local and global features. We also study the datasets, techniques and tools used in various email classification tasks, namely spam, phishing, multifolder and machine-generated email classification. Different email classifiers provide different mechanisms for classification. Challenges in email classification are discussed. The study finds that the J48 classification algorithm works best for spam and ham email classification. Among the email service providers compared, Microsoft Outlook filters mail based on many criteria.
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shroff, N., Sinhgala, A. (2021). Email Classification Techniques—A Review. In: Kotecha, K., Piuri, V., Shah, H., Patel, R. (eds) Data Science and Intelligent Applications. Lecture Notes on Data Engineering and Communications Technologies, vol 52. Springer, Singapore. https://doi.org/10.1007/978-981-15-4474-3_21
DOI: https://doi.org/10.1007/978-981-15-4474-3_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4473-6
Online ISBN: 978-981-15-4474-3
eBook Packages: Intelligent Technologies and Robotics (R0)