TY - GEN
T1 - A new method of multiple imputation for completely (or almost completely) missing data
AU - Bolotin, Arkady
PY - 2010/12/1
Y1 - 2010/12/1
N2 - One of the important questions a researcher must answer when assessing data quality in preparation for a data mining procedure is whether missing observations in the dataset are missing at random, and whether some form of imputation is needed. If all (or almost all) observations of a variable are missing, they cannot be classified as missing at random. Therefore, most known imputation methods cannot be applied to this variable. This paper studies a particular way of creating imputations in datasets containing completely (or almost completely) missing variables. As shown in the paper, if no external data are available, the maximum entropy distribution is the only reasonable probability distribution for producing proper imputations for such variables. Two examples from real-life epidemiological studies demonstrate this approach.
AB - One of the important questions a researcher must answer when assessing data quality in preparation for a data mining procedure is whether missing observations in the dataset are missing at random, and whether some form of imputation is needed. If all (or almost all) observations of a variable are missing, they cannot be classified as missing at random. Therefore, most known imputation methods cannot be applied to this variable. This paper studies a particular way of creating imputations in datasets containing completely (or almost completely) missing variables. As shown in the paper, if no external data are available, the maximum entropy distribution is the only reasonable probability distribution for producing proper imputations for such variables. Two examples from real-life epidemiological studies demonstrate this approach.
KW - Maximum entropy distributions
KW - Missing variables
KW - Non-random missingness
UR - http://www.scopus.com/inward/record.url?scp=79959883997&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:79959883997
SN - 9789604742431
T3 - International Conference on Mathematical and Computational Methods in Science and Engineering - Proceedings
SP - 34
EP - 45
BT - Advances in Mathematical and Computational Methods - 12th WSEAS International Conference on Mathematical and Computational Methods in Science and Engineering, MACMESE'10
T2 - 12th WSEAS International Conference on Mathematical and Computational Methods in Science and Engineering, MACMESE'10
Y2 - 3 November 2010 through 5 November 2010
ER -