Accurate, Timely, Reliable: A High Standard and Elusive Goal for Traveler Information Data Quality

  • Conference paper
Advances in Information and Communication (FICC 2019)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 69)

Abstract

In this paper, we demonstrate the difficulty of conducting spatio-temporal data quality control for sensor data. Our motivation is the provision of quality traveler information by departments of transportation. We show that assessing the accuracy of air temperature data requires robust methods that go beyond the identification of outliers and inliers to mitigate the impact of bad data and bad metadata. We give a representative approach and demonstrate the challenges of assessment, particularly in the presence of incorrect data quality labels and the absence of ground truth for this air temperature data. Our approach is model-based and can be used not only to distinguish outliers from inliers, but also to estimate the degree of outlyingness. It can be used to identify not only bad data in general but also bad metadata. We evaluate our approach against other methods that use interpolation to model the data. We use an area under the ROC curve (AUROC) analysis to compare methods when data quality labels are provided. We use mean squared error and t-tests to compare methods both when labels are provided and when they are not. We measure scalability using computation time.
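As a rough illustration of the kind of evaluation the abstract describes, the sketch below estimates each station's air temperature from its neighbors, treats the absolute residual as a degree of outlyingness, and scores that ranking with AUROC against quality labels. It uses inverse distance weighting, which is one common interpolation method and is assumed here purely for illustration; it is not the paper's model. The station coordinates, readings, labels, and function names are all hypothetical.

```python
import math

def idw_estimate(points, values, target_idx, power=2.0):
    """Leave-one-out inverse-distance-weighted estimate for one station."""
    tx, ty = points[target_idx]
    num = den = 0.0
    for i, ((x, y), v) in enumerate(zip(points, values)):
        if i == target_idx:
            continue  # never use the station's own reading
        d = math.hypot(x - tx, y - ty)
        if d == 0.0:
            return v  # co-located neighbor: use its value directly
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def outlyingness(points, values):
    """Absolute residual between each observation and its neighbor-based estimate."""
    return [abs(values[i] - idw_estimate(points, values, i))
            for i in range(len(values))]

def auroc(scores, is_bad):
    """Probability that a bad observation outscores a good one (ties count 0.5)."""
    pos = [s for s, b in zip(scores, is_bad) if b]
    neg = [s for s, b in zip(scores, is_bad) if not b]
    if not pos or not neg:
        raise ValueError("need at least one bad and one good label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical stations (x, y), air temperatures, and quality labels.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
values = [10.0, 10.5, 9.8, 10.2, 25.0]   # the last reading is deliberately bad
is_bad = [False, False, False, False, True]

scores = outlyingness(points, values)
print("outlyingness:", [round(s, 3) for s in scores])
print("AUROC:", auroc(scores, is_bad))
```

In this toy data the bad reading receives the largest score, but it also inflates the residuals of its neighbors because it contaminates their estimates. That side effect hints at why the abstract calls for robust methods rather than plain interpolation.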

Author information

Corresponding author

Correspondence to Douglas Galarus.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Galarus, D., Turnbull, I., Campbell, S., Pearce, J., Koon, L., Angryk, R. (2020). Accurate, Timely, Reliable: A High Standard and Elusive Goal for Traveler Information Data Quality. In: Arai, K., Bhatia, R. (eds) Advances in Information and Communication. FICC 2019. Lecture Notes in Networks and Systems, vol 69. Springer, Cham. https://doi.org/10.1007/978-3-030-12388-8_41
