"Not all data are created equal": Inequality in utility and implications for data management

Adir Even, G. Shankaranarayanan

Research output: Contribution to conference › Paper › peer-review

2 Scopus citations

Abstract

With the rapid growth of organizational data resources and associated costs, understanding the effect of data management decisions on cost-benefit trade-offs becomes critical. This study links these trade-offs to the utility contribution of data resources. The magnitude of inequality in utility, the extent to which data records differ in their business-value contribution, is shown to be a fundamental driver of key data management decisions involving data quality management, system and database design, data acquisition and retention policies, and pricing of information products. By adapting and developing statistical tools for modeling and measuring inequality, this study analyzes some typical utility distributions and demonstrates the possible effects of inequality on utility-cost trade-offs and the overall net-benefit. A low magnitude of inequality is shown to result in "all-or-nothing" decisions. High inequality is likely to involve a more refined treatment of data resources based on their utility contribution, and will possibly require the development of management policies that differentiate these resources accordingly.
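The link between inequality and "all-or-nothing" versus differentiated decisions can be illustrated with a minimal sketch. The abstract does not say which inequality measures or cost models the paper uses, so everything here is an assumption made for illustration: the Gini coefficient as the inequality measure, a fixed cost per retained record, and the hypothetical helper names `gini` and `best_retention`.

```python
def gini(utilities):
    """Gini coefficient of non-negative utilities (0 = all records equal)."""
    values = sorted(utilities)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula over the ascending-sorted values.
    weighted = sum((i + 1) * v for i, v in enumerate(values))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


def best_retention(utilities, cost_per_record):
    """Return (k, net_benefit) for keeping the k highest-utility records.

    Assumes a fixed cost per retained record (an illustrative cost model,
    not one taken from the paper).
    """
    ranked = sorted(utilities, reverse=True)
    best_k, best_net = 0, 0.0
    cumulative = 0.0
    for k, utility in enumerate(ranked, start=1):
        cumulative += utility
        net = cumulative - cost_per_record * k
        if net > best_net:
            best_k, best_net = k, net
    return best_k, best_net


equal = [1, 1, 1, 1]      # no inequality: every record contributes the same
skewed = [0, 0, 0, 10]    # high inequality: one record carries all the utility

print(gini(equal), best_retention(equal, 0.5))
print(gini(skewed), best_retention(skewed, 0.5))
```

Under these assumptions, the equal dataset is either kept in full (when cost per record is below the uniform utility) or dropped in full (when it is above), while the skewed dataset is best served by keeping only its single high-utility record.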

Original language: English
State: Published - 1 Dec 2006
Externally published: Yes
Event: 11th International Conference on Information Quality, ICIQ 2006 - Cambridge, MA, United States
Duration: 10 Nov 2006 to 12 Nov 2006

Conference

Conference: 11th International Conference on Information Quality, ICIQ 2006
Country/Territory: United States
City: Cambridge, MA
Period: 10/11/06 to 12/11/06

Keywords

  • Data Management
  • Data Warehouse
  • Database
  • Design
  • Information Value
  • Utility

ASJC Scopus subject areas

  • Information Systems
  • Safety, Risk, Reliability and Quality
