Abstract
This paper presents a modified K-Mean technique for clustering data that are not situated around a single point center. When clusters are elongated, the traditional K-Mean technique cannot yield meaningful results. Modifying the technique so that a center can be a line segment allows elongated clusters to be extracted for analysis. To this end, the distance function is modified to measure the distance between a point and a set (the line segment). The modified technique extends readily to multidimensional data, where the center takes the form of a hyperplane; clusters of data situated around the hyperplane can then be extracted and described by a regression model. The technique is applied to economic data from Chile, where the clusters are shown to have irregular shapes and where regression models are commonly used to represent data sets.
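The point-to-segment distance described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and segments given by two endpoints; the function names `point_segment_distance` and `assign_to_segments` and the sample data are hypothetical and are not taken from the paper. Only the assignment step is shown, because the abstract does not describe how the segment centers are updated.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a and b.

    Assumption: points are equal-length sequences of floats (any dimension).
    """
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    ab_len_sq = sum(c * c for c in ab)
    if ab_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide, so it acts as a point center.
        t = 0.0
    else:
        # Project p onto the line through a and b, then clamp to the segment.
        t = sum(x * y for x, y in zip(ap, ab)) / ab_len_sq
        t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, closest)))


def assign_to_segments(points, segments):
    """Assignment step: label each point with the index of its nearest segment."""
    labels = []
    for p in points:
        dists = [point_segment_distance(p, a, b) for a, b in segments]
        labels.append(dists.index(min(dists)))
    return labels


# Hypothetical data: two horizontal segment centers and three points.
segments = [((0.0, 0.0), (10.0, 0.0)), ((0.0, 5.0), (10.0, 5.0))]
points = [(5.0, 1.0), (5.0, 4.0), (12.0, 0.5)]
labels = assign_to_segments(points, segments)
```

Clamping the projection parameter `t` to the interval [0, 1] is what distinguishes a segment center from an infinite line: a point beyond an endpoint is measured to that endpoint rather than to the line's extension.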
Acknowledgment
Part of this study was supported by the Chilean R&D Agency CONICYT, under the research grant FONDEF IT15I10042 for the duration of 2016–2018. Economic data used in this paper were obtained from the Central Bank of Chile.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Pham, T.T. (2020). Clustering of Economic Data with Modified K-Mean Technique. In: Arai, K., Bhatia, R. (eds) Advances in Information and Communication. FICC 2019. Lecture Notes in Networks and Systems, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-030-12388-8_44
DOI: https://doi.org/10.1007/978-3-030-12388-8_44
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12387-1
Online ISBN: 978-3-030-12388-8
eBook Packages: Intelligent Technologies and Robotics (R0)