TY - JOUR
T1 - Evaluation of uterine cervix segmentations using ground truth from multiple experts
AU - Gordon, Shiri
AU - Lotenberg, Shelly
AU - Long, Rodney
AU - Antani, Sameer
AU - Jeronimo, Jose
AU - Greenspan, Hayit
N1 - Funding Information:
We would like to thank Dr. Simon K. Warfield for his notes and support with the STAPLE implementation software. We also would like to acknowledge the medical expert contributions to this work from the National Institutes of Health/American Society for Colposcopy and Cervical Pathology (NIH-ASCCP) Research Group. This research was supported by the Intramural Research Program of the U.S. National Institutes of Health, National Library of Medicine, and Lister Hill National Center for Biomedical Communications.
PY - 2009/4/1
Y1 - 2009/4/1
AB - This work is focused on the generation and utilization of a reliable ground truth (GT) segmentation for a large medical repository of digital cervicographic images (cervigrams) collected by the National Cancer Institute (NCI). NCI invited twenty experts to manually segment a set of 939 cervigrams into regions of medical and anatomical interest. Based on this unique data, the objectives of the current work are to: (1) Automatically generate a multi-expert GT segmentation map; (2) Use the GT map to automatically assess the complexity of a given segmentation task; (3) Use the GT map to evaluate the performance of an automated segmentation algorithm. The multi-expert GT map is generated via the STAPLE (Simultaneous Truth and Performance Level Estimation) algorithm, which is a well-known method to generate a GT segmentation from multiple observations. A new measure of segmentation complexity, which relies on the inter-observer variability within the GT map, is defined. This measure is used to identify images that were found difficult to segment by the experts and to compare the complexity of different segmentation tasks. An accuracy measure, which evaluates the performance of automated segmentation algorithms, is presented. Two algorithms for cervix boundary detection are compared using the proposed accuracy measure. The measure is shown to reflect the actual segmentation quality achieved by the algorithms. The methods and conclusions presented in this work are general and can be applied to different images and segmentation tasks. Here they are applied to the cervigram database, including a thorough analysis of the available data.
KW - Cervical cancer
KW - Evaluation of segmentation
KW - Image segmentation
KW - Multi-expert ground truth
KW - Segmentation complexity
KW - Uterine cervix images
UR - http://www.scopus.com/inward/record.url?scp=60749091702&partnerID=8YFLogxK
U2 - 10.1016/j.compmedimag.2008.12.002
DO - 10.1016/j.compmedimag.2008.12.002
M3 - Article
C2 - 19217754
AN - SCOPUS:60749091702
SN - 0895-6111
VL - 33
SP - 205
EP - 216
JO - Computerized Medical Imaging and Graphics
JF - Computerized Medical Imaging and Graphics
IS - 3
ER -