Efficient Query Clustering Technique and Context Well-Informed Document Clustering

Rani, Manukonda Sumathi; Babu, Geddati China

doi:10.1007/978-981-13-3600-3_25


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 900)


Abstract

Data clustering plays a crucial role in extracting useful information according to user interest. Traditional query clustering algorithms work on a collection of previously available data from the query stream. However, topics of interest, query popularity, and query meaning change from day to day, and the task is made harder because queries are short, incomplete, and ambiguous. Existing clustering methods such as k-means or DBSCAN are not guaranteed to perform well in such an environment. There is high demand for enhanced algorithms that can predict cluster characteristics as new data is added to the existing collection, without performing a complete re-clustering. We therefore propose a new enhancement paradigm for query clustering and context well-informed document clustering. Analysis of users' click-through logs combined with hierarchical agglomerative clustering can achieve good results, but it is computationally expensive. To overcome this problem, the proposed enhancement model maintains both query and document cluster quality. The model is updated at regular intervals with newly produced information and can be applied in a distributed environment. The suggested paradigm can also be applied to the output of hierarchical query clustering algorithms that produce both query clusters and document clusters. The proposed system not only aims at accuracy but can also show a remarkable speedup.
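
The idea of absorbing new queries without a complete re-clustering can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that each query is represented by the set of documents clicked for it, that similarity is measured with the Jaccard coefficient, and that a fixed threshold decides whether a query joins an existing cluster. The function names, the threshold value, and the example URLs are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def add_query(clusters, query, clicked_docs, threshold=0.3):
    """Assign one new query to the most similar cluster, or open a new one.

    Existing clusters are never re-clustered; only the chosen one is updated.
    The threshold of 0.3 is an illustrative assumption, not a tuned value.
    """
    clicked = set(clicked_docs)
    best, best_sim = None, 0.0
    for cluster in clusters:
        sim = jaccard(clicked, cluster["docs"])
        if sim > best_sim:
            best, best_sim = cluster, sim
    if best is not None and best_sim >= threshold:
        best["queries"].append(query)   # grow the query cluster
        best["docs"] |= clicked         # grow the matching document cluster
    else:
        clusters.append({"queries": [query], "docs": clicked})
    return clusters


# Hypothetical click-through log: (query, clicked documents).
clusters = []
log = [
    ("jaguar car", ["cars.example/jaguar", "auto.example/xf"]),
    ("jaguar xf price", ["auto.example/xf", "cars.example/jaguar"]),
    ("jaguar animal", ["zoo.example/jaguar"]),
]
for query, docs in log:
    add_query(clusters, query, docs)
```

In this sketch each cluster holds a query group and the document group clicked for it, so both kinds of cluster grow together as the log is processed. Each new query costs one pass over the existing clusters, whereas hierarchical agglomerative clustering would have to be rerun over the whole log.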



Author information

Authors and Affiliations

Department of CSE, Keshav Memorial Institute of Technology (KMIT), Hyderabad, India
Manukonda Sumathi Rani
Department of MBA, Bandari Srinivas Institute of Technology (BSIT), Hyderabad, India
Geddati China Babu


Corresponding author

Correspondence to Manukonda Sumathi Rani.

Editor information

Editors and Affiliations

Department of Computer Science and Software Engineering, Monmouth University, West Long Branch, NJ, USA
Jiacun Wang
Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangaluru, Karnataka, India
G. Ram Mohana Reddy
Department of Computer Science and Engineering, JNTUH College of Engineering Hyderabad, Hyderabad, Telangana, India
V. Kamakshi Prasad
Department of Electronics and Communication Engineering, Malla Reddy College of Engineering and Technology, Secunderabad, Telangana, India
V. Sivakumar Reddy


About this paper

Cite this paper

Rani, M.S., Babu, G.C. (2019). Efficient Query Clustering Technique and Context Well-Informed Document Clustering. In: Wang, J., Reddy, G., Prasad, V., Reddy, V. (eds) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, vol 900. Springer, Singapore. https://doi.org/10.1007/978-981-13-3600-3_25


DOI: https://doi.org/10.1007/978-981-13-3600-3_25
Published: 17 January 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-3599-0
Online ISBN: 978-981-13-3600-3
eBook Packages: Intelligent Technologies and Robotics (R0)
