Automated quality assurance of continuous data

M Last, A Kandel

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

Most real-world databases contain some amount of inaccurate data. The reliability of critical attributes can be evaluated from the values of other attributes in the same data table. This paper presents a new fuzzy-based measure of data reliability in continuous attributes. We partition the relational schema of a database into a subset of input (predicting) attributes and a subset of target (dependent) attributes. A data mining model, called an information-theoretic connectionist network, is constructed for predicting the values of a continuous target attribute. The network calculates the degree of reliability of the actual target values in each record by using their distance from the predicted values. The approach is demonstrated on the voting data from the 2000 Presidential Elections in the US.
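
The idea of scoring each record by the distance between its actual and predicted target value can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: a per-group mean stands in for the information-theoretic connectionist network, the exponential decay in `reliability` (with its hypothetical `beta` parameter) stands in for the fuzzy measure, and the records are synthetic toy values.

```python
from collections import defaultdict
from math import exp


def fit_group_means(records, input_key, target_key):
    """Stand-in predictor (assumption): mean target value per input value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        sums[r[input_key]] += r[target_key]
        counts[r[input_key]] += 1
    return {k: sums[k] / counts[k] for k in sums}


def reliability(actual, predicted, value_range, beta=5.0):
    """Hypothetical reliability degree in (0, 1].

    Equals 1 when the actual value matches the prediction and decays
    with the distance normalized by the target's value range.
    """
    if value_range == 0:
        return 1.0  # all target values identical, so nothing deviates
    distance = abs(actual - predicted) / value_range
    return exp(-beta * distance)


def score_records(records, input_key, target_key):
    """Return one reliability degree per record, in input order."""
    model = fit_group_means(records, input_key, target_key)
    targets = [r[target_key] for r in records]
    value_range = max(targets) - min(targets)
    return [
        reliability(r[target_key], model[r[input_key]], value_range)
        for r in records
    ]


# Synthetic toy records (not real data): the third value is an outlier.
records = [
    {"group": "A", "y": 100.0},
    {"group": "A", "y": 110.0},
    {"group": "A", "y": 400.0},
    {"group": "B", "y": 50.0},
    {"group": "B", "y": 54.0},
]

for record, score in zip(records, score_records(records, "group", "y")):
    print(record, round(score, 3))
```

In this toy setup the record farthest from its group's predicted value receives the lowest degree, which is the behaviour the abstract describes; the actual shape of the measure and the predictor used in the chapter may differ.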
Original language: English
Title of host publication: Systematic Organisation of Information in Fuzzy Systems
Editors: P. Melo-Pinto, H.-N. Teodorescu, T. Fukuda
Pages: 89-104
Volume: 184
State: Published - Apr 2003

Publication series

Name: NATO Science Series, III: Computer and Systems Sciences
ISSN (Print): 1387-6694
ISSN (Electronic): 1879-8276

Keywords

  • Data reliability
  • data quality
  • information-theoretic networks
  • fuzzy databases
  • Data mining
