Email Classification Techniques—A Review

Shroff, Namrata; Sinhgala, Amisha

doi:10.1007/978-981-15-4474-3_21

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 52)

Abstract

Email has become a significant communication medium. Official, personal, social, promotional and other messages arrive in our mailboxes every day. Research has found that the average office worker receives 121 messages per day. Because the inbox is flooded with messages, some emails go unattended; classifying messages into priority folders would address the problem of unattended or unanswered mail. In this paper, we identify the key features used in email classification: temporal, behavioral, single-email multinomial-valued, content, and local and global features. We also study the datasets, techniques and tools used in various email classification tasks, such as spam, phishing, multi-folder and machine-generated email classification. Different email classifiers provide different mechanisms for classification. Challenges in email classification are discussed. The study finds that the J48 classification algorithm works best for spam and ham email classification. Compared with other email service providers, Microsoft Outlook filters mail based on many criteria.
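
As a rough illustration of the content features and the J48 classifier mentioned in the abstract: J48 is Weka's implementation of the C4.5 decision tree, which chooses each split by gain ratio. The sketch below computes that criterion for a single word-presence feature. The four toy emails, their labels and the function names are hypothetical and are not taken from the chapter or its experiments.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(emails, labels, word):
    """Gain ratio of splitting the emails on whether `word` occurs in them."""
    # Content feature: does the word appear as a token in the email body?
    present = [word in text.lower().split() for text in emails]
    total = len(labels)
    # Weighted entropy of the labels after splitting on the feature.
    remainder = 0.0
    for flag in (True, False):
        group = [lab for lab, p in zip(labels, present) if p == flag]
        if group:
            remainder += len(group) / total * entropy(group)
    info_gain = entropy(labels) - remainder
    # Split information penalises features that fragment the data.
    split_info = entropy(present)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy corpus, for illustration only.
emails = [
    "win a free prize now",
    "free offer claim your prize",
    "meeting agenda for monday",
    "project report attached",
]
labels = ["spam", "spam", "ham", "ham"]

for candidate in ("free", "meeting"):
    print(candidate, round(gain_ratio(emails, labels, candidate), 4))
```

On this toy corpus the word "free" separates the two classes perfectly, so it scores higher than "meeting". A full C4.5 tree would apply the same criterion recursively and then prune, which this sketch does not attempt.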

Author information

Authors and Affiliations

Gujarat Technological University, Chandkheda, Gujarat, India
Namrata Shroff
S.V.I.T Vasad, Vasad, India
Amisha Sinhgala

Corresponding author

Correspondence to Namrata Shroff.

Editor information

Editors and Affiliations

Faculty of Engineering, Symbiosis Institute of Technology, Pune, India
Ketan Kotecha
Department of Computer Science, Università degli Studi di Milano, Milan, Italy
Vincenzo Piuri
Gandhinagar Institute of Technology, Gandhinagar, Gujarat, India
Hetalkumar N. Shah
Department of Computer Engineering, Gandhinagar Institute of Technology, Gandhinagar, Gujarat, India
Rajan Patel

Cite this paper

Shroff, N., Sinhgala, A. (2021). Email Classification Techniques—A Review. In: Kotecha, K., Piuri, V., Shah, H., Patel, R. (eds) Data Science and Intelligent Applications. Lecture Notes on Data Engineering and Communications Technologies, vol 52. Springer, Singapore. https://doi.org/10.1007/978-981-15-4474-3_21

DOI: https://doi.org/10.1007/978-981-15-4474-3_21
Published: 18 June 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4473-6
Online ISBN: 978-981-15-4474-3
eBook Packages: Intelligent Technologies and Robotics (R0)
