Probabilistic approaches to overcome content heterogeneity in data integration: A study case in systematic lupus erythematosus

on behalf of the MASTER plans Consortium

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Integrating data from different sources into a homogeneous dataset increases the opportunities to study human health. However, disparate data collections are often heterogeneous, which complicates their integration. In this paper, we focus on the issue of content heterogeneity in data integration. Traditional approaches for resolving content heterogeneity map all source datasets to a common data model that includes only shared data items, and thus omit all items that vary between datasets. Based on an example of three datasets in Systemic Lupus Erythematosus, we describe and experimentally evaluate a probabilistic data integration approach which propagates the uncertainty resulting from content heterogeneity into statistical inference, avoiding the need to map to a common data model.
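The contrast drawn in the abstract can be illustrated with a toy sketch. It is not the authors' method: the records, the variable names (`rash`, `skin_involvement`), the conditional probability table and the helper functions are all hypothetical, chosen only to show how a common data model discards non-shared items while a probabilistic view keeps them and carries the resulting uncertainty into an estimate by sampling.

```python
import random
import statistics

# Hypothetical toy records; names and values are illustrative only.
dataset_a = [{"age": 34, "rash": 1}, {"age": 51, "rash": 0}, {"age": 29, "rash": 1}]
dataset_b = [{"age": 45, "skin_involvement": 1}, {"age": 38, "skin_involvement": 0}]

# Assumed probability of rash given the coarser item
# (an invented number, not taken from the paper).
P_RASH_GIVEN_SKIN = {1: 0.7, 0: 0.0}


def shared_items(*datasets):
    """Common-data-model view: keep only items present in every record."""
    keys = [set(rec) for ds in datasets for rec in ds]
    return set.intersection(*keys)


def rash_probability(record):
    """Probability that rash is present, given what the record contains."""
    if "rash" in record:
        return float(record["rash"])
    return P_RASH_GIVEN_SKIN[record["skin_involvement"]]


def sample_prevalence(records, n_draws=2000, seed=0):
    """Draw plausible completed datasets; return the rash prevalence of each draw."""
    rng = random.Random(seed)
    probs = [rash_probability(r) for r in records]
    return [
        sum(rng.random() < p for p in probs) / len(probs)
        for _ in range(n_draws)
    ]


records = dataset_a + dataset_b
draws = sample_prevalence(records)
mean_prevalence = statistics.mean(draws)  # point estimate across draws
spread = statistics.pstdev(draws)         # uncertainty due to heterogeneity
```

In this sketch `shared_items(dataset_a, dataset_b)` retains only `age`, so a common data model could not estimate rash prevalence at all, whereas the sampled draws yield an estimate together with a spread that reflects the uncertain records in `dataset_b`.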

Original language: English
Title of host publication: Digital Personalized Health and Medicine - Proceedings of MIE 2020
Editors: Louise B. Pape-Haugaard, Christian Lovis, Inge Cort Madsen, Patrick Weber, Per Hostrup Nielsen, Philip Scott
Publisher: IOS Press
Pages: 387-391
Number of pages: 5
ISBN (Electronic): 9781643680828
State: Published - 16 Jun 2020
Externally published: Yes
Event: 30th Medical Informatics Europe Conference, MIE 2020 - Geneva, Switzerland
Duration: 28 Apr 2020 to 1 May 2020

Publication series

Name: Studies in Health Technology and Informatics
Volume: 270
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365

Conference

Conference: 30th Medical Informatics Europe Conference, MIE 2020
Country/Territory: Switzerland
City: Geneva
Period: 28/04/20 to 1/05/20

Keywords

  • Biomedical data harmonisation
  • Content heterogeneity
  • Missing data
  • Probabilistic data integration

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
  • Health Information Management
