Information Quality: The Potential of Data and Analytics to Generate Knowledge
Provides an important framework for data analysts in assessing the quality of data and its potential to provide meaningful insights through analysis.
Analytics and statistical analysis have become pervasive topics, mainly due to the growing availability of data and analytic tools. Technology, however, fails to deliver insights with added value if the quality of the information it generates is not assured. Information Quality (InfoQ) is a tool developed by the authors to assess the potential of a dataset to achieve a goal of interest, using data analysis. Whether the information quality of a dataset is sufficient is of practical importance at many stages of the data analytics journey, from the pre-data collection stage to the post-data collection and post-analysis stages. It is also critical to various stakeholders: data collection agencies, analysts, data scientists, and management.
- Explains how to integrate the notions of goal, data, analysis and utility that are the main building blocks of data analysis within any domain.
- Presents a framework for integrating domain knowledge with data analysis.
- Provides a combination of both methodological and practical aspects of data analysis.
- Discusses issues surrounding the implementation and integration of InfoQ in both academic programmes and business / industrial projects.
- Showcases numerous case studies in a variety of application areas such as education, healthcare, official statistics, risk management and marketing surveys.
- Presents a review of software tools from the InfoQ perspective along with example datasets on an accompanying website.
This book will be beneficial for researchers in academia and in industry, analysts, consultants, and agencies that collect and analyse data, as well as for students on undergraduate and postgraduate courses involving data analysis.
About the authors
Quotes about the book
About the companion website
PART I THE INFORMATION QUALITY FRAMEWORK
1 Introduction to information quality
2 Quality of goal, data quality, and analysis quality
3 Dimensions of information quality and InfoQ assessment
4 InfoQ at the study design stage
5 InfoQ at the postdata collection stage
PART II APPLICATIONS OF InfoQ
6 Education
7 Customer surveys
8 Healthcare
9 Risk management
10 Official statistics
PART III IMPLEMENTING InfoQ
11 InfoQ and reproducible research
12 InfoQ in review processes of scientific publications
13 Integrating InfoQ into data science analytics programs, research methods courses, and more
14 InfoQ support with R
15 InfoQ support with Minitab
16 InfoQ support with JMP
Ron S. Kenett, KPA Ltd. and University of Torino, Turin, Italy
Ron S. Kenett is Chairman and CEO of the KPA Group and KPA Ltd., Research Professor at the University of Turin, Italy, International Professor Associate at the Center for Research in Risk Engineering, NYU-Poly, New York, USA, and Visiting Professor at the Faculty of Economics, University of Ljubljana, Slovenia. He has over 25 years of experience in restructuring and improving the competitive position of organizations by integrating statistical methods, process analysis, supporting technologies and modern human resource management systems. He is Editor in Chief of the Wiley Encyclopedia of Statistics in Quality and Reliability, a Fellow of the Royal Statistical Society, a Senior Member of the American Society for Quality, Past President of the Israeli Statistical Association, and Past President of ENBIS, the European Network for Business and Industrial Statistics. He is also the 2013 Greenfield Medalist of the Royal Statistical Society.
Galit Shmueli, Indian School of Business, India
Galit Shmueli is SRITNE Chaired Professor of Data Analytics and Associate Professor of Statistics & Information Systems at the Indian School of Business. She is best known for her research and teaching in business analytics, with a focus on statistical and data mining methods for contemporary data and applications in information systems and healthcare. Dr. Shmueli's research has been published in the statistics, management, information systems, and marketing literature. She has authored over seventy journal articles, books, textbooks and book chapters, including the popular textbooks Data Mining for Business Intelligence and Practical Time Series Forecasting. Dr. Shmueli is an award-winning teacher and speaker on data analytics. She has taught at Carnegie Mellon University, University of Maryland, the Israel Institute of Technology, Statistics.com and the Indian School of Business.