On Hidden Disorder in XML Tags

Eli Rohn, Robb Klashner

Research output: Contribution to conference › Paper › peer-review


Automatic integration of structured and semi-structured data has been a goal of nearly thirty years of integration research. Integration attempts based on the eXtensible Markup Language (XML) have not succeeded either. The lack of progress demonstrates that early assumptions were naïve. This paper shows that XML schemas possess natural language characteristics that pose significant barriers to full automatic integration of semi-structured data, an aspect of XML evolution that has not been addressed in the literature. XML exhibits many of the same ambiguities as natural language, so automating XML integration poses a challenge similar in magnitude to automated natural language translation. Correct and complete integration must satisfy meaning-preservation constraints, e.g., a mapping function that is invertible, proof-preserving, vocabulary-preserving and structure-preserving. In theory, the natural language characteristics of XML can be used to predict and explain the level of automated XML integration that is possible.
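To make the meaning-preservation constraints more concrete, the following is a minimal sketch of how two of them might be checked for a tag-to-tag mapping. It is not the authors' method: the abstract gives no formal definitions, so "invertible" is assumed here to mean that no two source tags share a target tag, and "structure-preserving" is assumed to mean that every parent/child tag pair in the source maps onto a parent/child pair in the target. The documents, tag names, and helper functions are all hypothetical.

```python
import xml.etree.ElementTree as ET


def parent_child_pairs(xml_text):
    """Return the set of (parent_tag, child_tag) pairs found in a document."""
    root = ET.fromstring(xml_text)
    return {(parent.tag, child.tag) for parent in root.iter() for child in parent}


def is_invertible(mapping):
    """Assumed definition: invertible only if no two source tags share a target."""
    return len(set(mapping.values())) == len(mapping)


def is_structure_preserving(mapping, source_xml, target_xml):
    """Assumed definition: each source parent/child pair maps to a target pair."""
    target_pairs = parent_child_pairs(target_xml)
    return all(
        (mapping.get(parent), mapping.get(child)) in target_pairs
        for parent, child in parent_child_pairs(source_xml)
    )


# Hypothetical documents: the source uses two synonymous tags for a buyer.
SOURCE = (
    "<order>"
    "<client><name>Ann</name></client>"
    "<customer><name>Bob</name></customer>"
    "</order>"
)
TARGET = "<purchase><buyer><fullName>Ann</fullName></buyer></purchase>"

# "client" and "customer" are collapsed onto the single target tag "buyer".
AMBIGUOUS = {
    "order": "purchase",
    "client": "buyer",
    "customer": "buyer",
    "name": "fullName",
}

invertible = is_invertible(AMBIGUOUS)
structure_ok = is_structure_preserving(AMBIGUOUS, SOURCE, TARGET)
print("invertible:", invertible)
print("structure-preserving:", structure_ok)
```

In this toy case the mapping keeps the nesting intact but cannot be inverted, because the synonym pair loses the distinction between the two source tags. That is one simple way a natural-language-style ambiguity in tag names could violate a meaning-preservation constraint.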

Original language: English
Number of pages: 6
State: Published - 1 Jan 2004
Externally published: Yes
Event: 10th Americas Conference on Information Systems, AMCIS 2004 - New York, United States
Duration: 6 Aug 2004 to 8 Aug 2004


Conference: 10th Americas Conference on Information Systems, AMCIS 2004
Country/Territory: United States
City: New York


  • Computer Applications
  • Computing Methodologies
  • Natural Language Processing
  • XML Integration

ASJC Scopus subject areas

  • Library and Information Sciences
  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications

