Open access image repositories: high-quality data to enable machine learning research.

Authors
  • Prior, F (1)
  • Almeida, J (2)
  • Kathiravelu, P (3)
  • Kurc, T (4)
  • Smith, K (5)
  • Fitzgerald, T J (6)
  • Saltz, J (4)
  • 1 Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W. Markham St, Little Rock, AR 72205, USA.
  • 2 National Institutes of Health, National Cancer Institute, 9609 Medical Center Drive, Bethesda, MD 20892, USA.
  • 3 Department of Biomedical Informatics, Emory University, 101 Woodruff Circle, #4104, Atlanta, GA 30322, USA.
  • 4 Department of Biomedical Informatics, Stony Brook University, Health Science Center Level 3, Room 043, Stony Brook, NY 11794, USA.
  • 5 Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W. Markham St, Little Rock, AR 72205, USA.
  • 6 Department of Radiation Oncology, University of Massachusetts Medical School, Worcester, MA 01655, USA.
Type
Published Article
Journal
Clinical radiology
Publication Date
Jan 01, 2020
Volume
75
Issue
1
Pages
7–12
Identifiers
DOI: 10.1016/j.crad.2019.04.002
PMID: 31040006
Source
Medline
Language
English
License
Unknown

Abstract

Originally motivated by the need for research reproducibility and data reuse, large-scale, open access information repositories have become key resources for training and testing of advanced machine learning applications in biomedical and clinical research. To be of value, such repositories must provide large, high-quality data sets, where quality is defined as minimising variance due to data collection protocols and data misrepresentations. Curation is the key to quality. We have constructed a large public access image repository, The Cancer Imaging Archive, dedicated to the promotion of open science to advance the global effort to diagnose and treat cancer. Drawing on this experience and our experience in applying machine learning techniques to the analysis of radiology and pathology image data, we will review the requirements placed on such information repositories by state-of-the-art machine learning applications and how these requirements can be met. Copyright © 2019 The Royal College of Radiologists. All rights reserved.
