Optimization Problem of k-NN Classifier for Missing Values Case

  • Chapter
Interval-Valued Methods in Classifications and Decisions

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 378))

Abstract

In this chapter we present a method for dealing with missing values in data sets. The method uses interval-valued fuzzy calculus, and we show that it outperforms previously known methods. The results may be useful in a variety of computer support systems, especially those devoted to supporting medical diagnosis.

You have to teach your algorithm what it can do and what it cannot do because, otherwise, there is a risk that the algorithms will learn the tricks of the old cartels.

Margrethe Vestager



Author information

Corresponding author

Correspondence to Urszula Bentkowska.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Bentkowska, U. (2020). Optimization Problem of k-NN Classifier for Missing Values Case. In: Interval-Valued Methods in Classifications and Decisions. Studies in Fuzziness and Soft Computing, vol 378. Springer, Cham. https://doi.org/10.1007/978-3-030-12927-9_4
