
NIH Makes Largest Set of Medical Imaging Data Available to Public

The dataset contains over 32,000 medical images that may improve the detection of lesions or new disease and support future deep learning algorithms.

July 23, 2018 - The National Institutes of Health (NIH) Clinical Center has released a dataset of more than 32,000 medical images to help enhance the accuracy of lesion detection.

The dataset, called DeepLesion, contains thoroughly anonymized images representing over 4,400 unique patients. DeepLesion contains significantly more images than other publicly available medical image datasets, which typically have fewer than a thousand lesions.

NIH radiologists annotate clinically meaningful findings on CT images with an electronic bookmark tool, which allows them to mark features of interest and return to these findings later. The bookmarks provide arrows, lines, diameters, and text that can tell the exact size and location of a lesion so that experts can easily identify growth or new disease.
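
As a rough illustration of how a measurement line in a bookmark can encode both size and location, the sketch below derives a lesion diameter and a bounding box from a line's two endpoints. The `LesionBookmark` class, its field names, and the 0.8 mm pixel spacing are hypothetical and chosen for illustration only; they are not the format NIH uses.

```python
import math
from dataclasses import dataclass


@dataclass
class LesionBookmark:
    """Hypothetical record for one measurement line drawn on a CT slice."""
    x1: float  # first endpoint, in pixels
    y1: float
    x2: float  # second endpoint, in pixels
    y2: float
    pixel_spacing_mm: float  # assumed isotropic in-plane spacing

    def diameter_mm(self) -> float:
        # Length of the drawn line, converted from pixels to millimetres.
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1) * self.pixel_spacing_mm

    def bounding_box(self, pad: float = 0.0):
        # Axis-aligned box around the line, optionally padded, as (x_min, y_min, x_max, y_max).
        return (
            min(self.x1, self.x2) - pad,
            min(self.y1, self.y2) - pad,
            max(self.x1, self.x2) + pad,
            max(self.y1, self.y2) + pad,
        )


# Illustrative values only, not taken from the dataset.
mark = LesionBookmark(x1=100, y1=120, x2=130, y2=160, pixel_spacing_mm=0.8)
print(mark.diameter_mm())
print(mark.bounding_box(pad=5))
```

Comparing the diameter of the same lesion across two scans is one simple way such a record could be used to flag growth.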


NIH scientists used this critical metadata to develop the DeepLesion dataset.

Annotating medical images requires extensive clinical experience, but scientists are hoping to change that with DeepLesion.

The dataset is large enough that it could train a deep neural network and enable scientists to create a large-scale lesion detector with one unified framework.

NIH also expects that the data will allow researchers to study the relationships between different types of lesions and make new discoveries. Additionally, DeepLesion offers the possibility for scientists to identify an individual’s lesions more accurately, which will enable them to quickly assess the whole body for cancer risk.

The DeepLesion dataset will build on NIH’s past efforts to improve disease detection and diagnosis. In September 2017, the Clinical Center released over 100,000 anonymized chest x-ray images to the scientific community to improve diagnostic decisions for patients.

The NIH Clinical Center hopes to continue collecting data and to further improve the detection accuracy achieved with the DeepLesion dataset. Once scientists can leverage 3D and lesion-type information, a universal lesion-detecting capability will become more reliable.

In the future, it may also be possible to extend DeepLesion into other imaging modalities, including MRI, as well as combine data from multiple hospitals.