Abstract
In a standard cluster analysis, such as k-means, in addition to clusters locations and distances between them it is important to know if they are connected or well separated from each other. The main focus of this paper is discovering the relations between the resulting clusters. We propose a new method which is based on pairwise overlapping k-means clustering, that in addition to means of clusters provides the graph structure of their relations. The proposed method has a set of parameters that can be tuned in order to control the sensitivity of the model and the desired relative size of the pairwise overlapping interval between means of two adjacent clusters, i.e., level of overlapping. We present the exact formula for calculating that parameter. The empirical study presented in the paper demonstrates that our approach works well not only on toy data but also compliments standard clustering results with a reasonable graph structure on a real datasets, such as financial indices and restaurants.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Jardine, N., Sisbon, R.: Mathematical Taxonomy. John Wiley, London (1971)
Andersen, R., Gleich, D.F., Mirrokni, V.: Overlapping clusters for distributed computation. In: WSDM 2012. ACM, New York (2012)
Khandekar, R., Kortsarz, G., Mirrokni, V.: Advantage of overlapping clusters for minimizing conductance. In: Proceedings of the 10th Latin American International Conference on Theoretical Informatics, LATIN 2012, pp. 494–505. Springer, Heidelberg (2012)
Szalay-Bekő, M., Palotai, R., Szappanos, B., Kovács, I.A., Papp, B., Csermely, P.: Moduland plug-in for cytoscape. Bioinformatics 28(16), 2202–2204 (2012)
Gregory, S.: A fast algorithm to find overlapping communities in networks. In Proceedings of the 2008 European Conference on Machine Learning and Knowledge Discovery in Databases - Part I, ECML PKDD 2008, pp. 408–423. Springer, Heidelberg (2008)
Obadi, G., Drazdilova, P., Hlavacek, L., Martinovic, J., Snasel, V.: A tolerance rough set based overlapping clustering for the DBLP data. In: IEEE/WIC/ACM, WI-IAT 2010, pp. 57–60. IEEE Computer Society (2010). https://doi.org/10.1109/WI-IAT.2010.286
Meilă, M., Heckerman, D.: An experimental comparison of model-based clustering methods. Mach. Learn. 42(1–2), 9–29 (2001). https://doi.org/10.1023/A:1007648401407
Battle, A., Segal, E., Koller, D.: Probabilistic discovery of overlapping cellular processes and their regulation. In: RECOMB 2004. ACM, New York (2004). https://doi.org/10.1145/974614.974637
Banerjee, A., Krumpelman, C., Ghosh, J., Basu, S., Mooney, R.J.: Model-based overlapping clustering. In: KDD 2005. ACM, New York (2005). https://doi.org/10.1145/1081870.1081932
Fu, Q., Banerjee, A.: Bayesian overlapping subspace clustering. In: ICDM 2009, pp. 776–781. IEEE Computer Society, Washington (2009). https://doi.org/10.1109/ICDM.2009.132
Shafiei, M.M., Milios, E.E.: Latent dirichlet co-clustering. In: ICDM 2006. IEEE Computer Society, Washington (2006). https://doi.org/10.1109/ICDM.2006.94
Cleuziou, G.: A generalization of k-means for overlapping clustering. In: Research Report NRR-2007-15, LIFO - University of Orleans (2007)
Cleuziou, G.: Osom: a method for building overlapping topological maps. Pattern Recogn. Lett. 34(3), 239–246 (2013). https://doi.org/10.1016/j.patrec.2012.10.013
Whang, J., Dhillon, I.S., Gleich, D.: Non-exhaustive, overlapping k-means. In: SIAM International Conference on Data Mining (SDM), May 2015
Bauman, E., Muchnik, I.: Restructuring algorithm in the graph approximation problem. Autom. Remote Control 37(6), 920–926 (1976)
Bezdek, J.C.: Pattern Recognition with Fuzzy Objective Function Algorithms. Kluwer Academic Publishers, Norwell (1981)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003). http://dl.acm.org/citation.cfm?id=944919.944937
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Bauman, E., Bauman, K. (2019). Discovering the Graph Structure in Clustering Results. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Advances in Information and Communication Networks. FICC 2018. Advances in Intelligent Systems and Computing, vol 886. Springer, Cham. https://doi.org/10.1007/978-3-030-03402-3_34
Download citation
DOI: https://doi.org/10.1007/978-3-030-03402-3_34
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03401-6
Online ISBN: 978-3-030-03402-3
eBook Packages: Intelligent Technologies and RoboticsIntelligent Technologies and Robotics (R0)