Evaluation of an automated knowledge-based textual summarization system for longitudinal clinical data, in the intensive care domain

Ayelet Goldstein, Yuval Shahar, Efrat Orenbuch, Matan J. Cohen

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Objectives: To examine the feasibility of automatically creating meaningful free-text summaries of longitudinal clinical records, using a new general methodology that we had recently developed; and to assess the potential benefits to the clinical decision-making process of using such a method to generate draft letters that clinicians can then enhance manually.

Methods: We had previously developed a system, CliniText (CTXT), for automated free-text summarization of longitudinal medical records, using a clinical knowledge base. In the current study, we created an Intensive Care Unit (ICU) clinical knowledge base, assisted by two ICU clinical experts in an academic tertiary hospital. The CTXT system generated free-text summary letters from the data of 31 different patients, which were compared to the respective original physician-composed discharge letters. The main evaluation measures were (1) relative completeness, quantifying the data items missed by one of the letters but included by the other, and their importance; (2) quality parameters, such as readability; and (3) functional performance, assessed by the time needed by three clinicians reading each of the summaries to answer five key questions based on the discharge letter (e.g., “What are the patient's current respiratory requirements?”), and by the correctness of the clinicians’ answers.

Results: Completeness: In 13/31 (42%) of the letters, the number of important items missed in the CTXT-generated letter was less than or equal to the number of important items missed by the MD-composed letter. Each of the MD-composed letters missed at least two important items that were mentioned by the CTXT system (a mean of 7.2 ± 5.74). In addition, the standard deviation in the number of missed items in the MD letters (STD = 15.4) was much higher than the standard deviation in the CTXT-generated letters (STD = 5.3). Quality: The MD-composed letters obtained a significantly better grade in three out of four measured parameters. However, the standard deviation in the quality of the MD-composed letters was much greater than that of the CTXT-generated letters (STD = 6.25 vs. STD = 2.57, respectively). Functional evaluation: The clinicians answered the five questions on average 40% faster (p < 0.001) when using the CTXT-generated letters than when using the MD-composed letters. In four out of the five questions, the clinicians’ correctness was equal to or significantly better (p < 0.005) when using the CTXT-generated letters than when using the MD-composed letters.

Conclusions: An automatic knowledge-based summarization system, such as the CTXT system, has the capability to model complex clinical domains, such as the ICU, and to support interpretation and summarization tasks such as the creation of a discharge summary letter. Based on the results, we suggest that the use of such systems could potentially enhance the standardization of the letters, significantly increase their completeness, and reduce the time needed to write the discharge summary. The results also suggest that using the resultant structured letters might reduce the time other clinicians need to reach decisions and enhance the quality of those decisions.
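The relative-completeness measure described in the abstract (items missed by one letter but included by the other, together with their importance) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Item` type, the `relative_completeness` function, and the sample items are all hypothetical, and the sketch assumes each letter has already been reduced to a set of data items that an expert has marked as important or not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A data item mentioned in a letter (hypothetical representation)."""
    name: str
    important: bool  # assumed to be assigned by a clinical expert


def relative_completeness(ctxt_items, md_items):
    """Count items each letter misses relative to the other letter."""
    ctxt, md = set(ctxt_items), set(md_items)
    missed_by_ctxt = md - ctxt  # in the MD letter only
    missed_by_md = ctxt - md    # in the CTXT letter only
    return {
        "missed_by_ctxt": len(missed_by_ctxt),
        "missed_by_md": len(missed_by_md),
        "important_missed_by_ctxt": sum(i.important for i in missed_by_ctxt),
        "important_missed_by_md": sum(i.important for i in missed_by_md),
    }


# Made-up items for illustration only; not data from the study.
ctxt_letter = [
    Item("ventilation mode", True),
    Item("creatinine trend", True),
    Item("fluid balance", False),
]
md_letter = [
    Item("ventilation mode", True),
    Item("family meeting", False),
]

result = relative_completeness(ctxt_letter, md_letter)
print(result)
```

Using set differences keeps the measure symmetric: the same function reports what each letter omits relative to the other, which matches the "relative" framing of the measure.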

Original language: English
Pages (from-to): 20-33
Number of pages: 14
Journal: Artificial Intelligence in Medicine
Volume: 82
State: Published - 1 Oct 2017

Keywords

  • ICU
  • Medical informatics
  • Natural Language Generation
  • Quantitative evaluation
  • Temporal abstraction
  • Textual summarization

ASJC Scopus subject areas

  • Medicine (miscellaneous)
  • Artificial Intelligence
