Noname manuscript No. (will be inserted by the editor)
A Systematic Collection of Medical Image Datasets for Deep Learning
Johann Li 1 · Guangming Zhu 1 · Cong Hua 1 · Mingtao Feng 1 · Basheer Bennamoun 2 · Ping Li 3 · Xiaoyuan Lu 3 · Juan Song 1 · Peiyi Shen 1 · Xu Xu 4 · Lin Mei 4 · Liang Zhang 1 · Syed Afaq Ali Shah 5 · Mohammed Bennamoun 6
Received: date / Accepted: date
Abstract The astounding success made by artificial intelligence (AI) in healthcare and other fields proves that AI can achieve human-like performance. However, success always comes with challenges. Deep learning algorithms are data-dependent and require large datasets for training. The lack of data in the medical imaging field creates a bottleneck for the application of deep learning to medical image analysis. Medical image acquisition, annotation, and analysis are costly, and their usage is constrained by ethical restrictions. They also require many resources, such as human expertise and funding. That makes it difficult for non-medical researchers to have access to useful and large medical data. Thus, as comprehensively as possible, this paper provides a collection of medical image datasets with their associated challenges for deep learning research. We have collected information on around three hundred datasets and challenges, mainly reported between 2013 and 2020, and categorized them into four categories: head & neck, chest & abdomen, pathology & blood,
B Corresponding author: Guangming Zhu, Liang Zhang, and Syed Afaq Ali Shah
E-mail: gmzhu@xidian.edu.cn
E-mail: liangzhang@xidian.edu.cn
E-mail: Afaq.Shah@murdoch.edu.au
1 School of Computer Science and Technology, Xidian University, China
2 School of Medicine, The University of Notre Dame, Australia
3 Data and Virtual Research Room, Shanghai Broadband Network Center, China
4 The Third Research Institute of The Ministry of Public Security, China
5 Discipline of Information Technology, Media and Communications, Murdoch University, Australia
6 Department of Computer Science and Software Engineering, The University of Western Australia, Australia
and “others”. Our paper has three purposes: 1) to provide a most up-to-date and complete list that can be used as a universal reference to easily find the datasets for clinical image analysis, 2) to guide researchers on the methodology to test and evaluate their methods’ performance and robustness on relevant datasets, and 3) to provide a “route” to relevant algorithms for the relevant medical topics and their challenge leaderboards.
Keywords Medical image analysis · Deep learning · Datasets · Challenges · Computer-aided diagnosis
1 Introduction
Since the invention of medical imaging technology, the field of medicine has entered a new era. The beginning of medical imaging started with the adoption of X-rays. With further technical advancements, many other imaging methods, including 3D computed tomography (CT), magnetic resonance imaging (MRI), nuclear medicine, ultrasound, endoscopy, and optical coherence tomography (OCT), were also exploited. Directly or indirectly, these imaging modalities have contributed to the diagnosis and treatment of various diseases, and to research on the human body’s structure and intrinsic mechanisms.
Medical images can provide critical insight into the diagnosis and treatment of many diseases. The human body’s different reactions to imaging modalities are used to produce scans of the body. Reflection and transmission are commonly used in medical imaging because the reflection or transmission ratios of different body tissues and substances are different. Some other methods acquire images by changing the energy transferred to the body, e.g., magnetic field changes or the rays radiated from a chemical agent.
arXiv:2106.12864v1 [eess.IV] 24 Jun 2021
Before modern AI was applied to medical image analysis, radiologists and pathologists needed to manually look for the critical “biomarkers” in the patient’s scans. These “biomarkers”, such as tumors and nodules, are the basis for medics to diagnose and devise treatment plans. Such a diagnostic process needs to be performed by medics with extensive medical knowledge and clinical experience. However, problems such as diagnostic bias and the lack of medical resources are prevalent and cannot be avoided. After the recent breakthroughs in AI (which achieves human-like performance, e.g., for image recognition [1,2,3], and can win games such as Go [4] and real-time strategy games [5]), the development of AI-based automatic medical image analysis algorithms has attracted a lot of attention. Recently, the application of AI in medical image analysis has become one of the major research focuses and has attained many significant achievements [6,7,8].
Many researchers have brought their focus to AI-based medical image analysis methods, thinking that AI might be one of the solutions to these challenges (e.g., medical resource scarcity) and taking advantage of the technological progress [9,10,11,12,13]. Traditional medical image analysis focuses on detecting and identifying biomarkers for diagnosis and treatment. AI imitates the medic’s diagnosis through classification, segmentation, detection, regression, and other AI tasks in an automated or semi-automated way.
AI has achieved significant performance on many computer vision tasks. This success is yet to be fully translated to the medical image analysis domain. Deep learning (DL), a branch of AI, is a data-dependent method, as it needs massive training data. However, when DL is applied to medical image analysis, the paucity of labeled data becomes a major challenge and a bottleneck.
Data scarcity is a common problem when applying DL methods to a specific domain, and this problem becomes more severe in the case of medical image analysis. Researchers who apply DL methods to medical image analysis are commonly computer scientists and do not usually have a medical background. They cannot collect data independently because they lack access to medical equipment and patients, and they cannot annotate the acquired data either because they lack the relevant medical knowledge. Furthermore, medical data is owned by institutions that cannot easily make it public due to privacy and ethics restrictions. When researchers evaluate their algorithms on their private data, the results of their research become incomparable.
To address some of these problems, MICCAI, ISBI, AAPM, and other conferences and institutions have launched many DL-related medical image analysis challenges. These aim to design and develop automatic or semi-automatic algorithms and to promote medical image analysis research with computer-aided methods. Concurrently, some researchers and institutions also organize projects to collect medical datasets and publish them for research purposes.
Despite all these developments, it is still challenging for novice medical image analysis researchers to find medical data. This paper addresses this challenge and presents a comprehensive survey of existing medical datasets. The paper also identifies and summarizes medical image analysis challenges, and provides a pathway to identify the most relevant datasets for evaluation and the most suitable methods from the respective challenge leaderboards.
This paper refers to other research papers with a number between square brackets and refers to the datasets listed in the tables with numbers between parentheses.
The following sections present the details of the key datasets and challenges. Section 2 summarizes the datasets and challenges, including the years, body parts, tasks, and other information that is relevant to the dataset development. Section 3 discusses the datasets and challenges of the head and neck. Section 4 covers the datasets and challenges related to the chest and abdomen organs. Section 5 examines the datasets and challenges of pathology- and blood-related tasks. Section 6 introduces other datasets and challenges related to bone, skin, phantom, and animals. We have also created a website with a git repo¹, which shows the list of these datasets and their respective challenges.
2 Medical image datasets
In this section, we provide an overview of the image datasets and challenges. Our collection contains over three hundred medical image datasets and challenges organized between 2004 and 2020. This paper focuses mainly on the ones between 2013 and 2020. Subsections 2.1, 2.2, 2.3, and 2.4 provide information about the year, body parts, modalities, and tasks, respectively. In Subsection 2.5, we introduce the sources from which we have collected these datasets and challenges. Details about the categorization of these image datasets and challenges into four groups are provided in the subsequent sections. We provide a taxonomy of our paper in Figure 1 to help the reader navigate through the different sections.
1 The website with the git repo will be public after the paperis accepted.
Fig. 1: An overall taxonomy to outline the organization of the paper.
2.1 Years
The timeline of these medical image datasets can be split into two, with 2013 as the watershed, following Krizhevsky et al.’s remarkable success in the ILSVRC competition with their AlexNet [14] in 2012. The continuous advancement of deep learning has, to some extent, driven more and more researchers to focus on medical image analysis and indirectly led to an increase in the number of datasets and competitions each year. The number of datasets and challenges before 2013 is irregular according to our statistics. The main reason is that many datasets developed before 2012 were not aimed at computer-aided diagnosis, for example, ADNI-1 (52), although those data could be used for DL. Therefore, we only focus on the datasets and challenges which were released after 2013.
Figure 2A shows the statistics of the datasets and challenges per year between 2013 and 2020. As shown in Figure 2A, the number of related datasets and challenges increased year by year because of the progress and success of DL in computer vision and medical image analysis. This led more and more researchers to focus on medical image analysis with DL-based methods, and more and more datasets and challenges with different body parts and tasks started to appear.
As shown in Figures 2B and 2C, there was not only an increase in the number of datasets and challenges but also in their variety with respect to body parts and types of tasks. The research focus ranges from simple diagnosis or structural analysis (e.g., segmentation and classification) in the early stages to more complex tasks, or combinations of tasks closer to clinical needs, including classification, segmentation, detection, regression, generation, tracking, and registration, as time progresses. The focus of these datasets and challenges has also changed from cancer diagnosis to the entire healthcare system. Meanwhile, the organs studied by researchers also range from single, simple, but important ones, such as the brain and lungs, to many other parts of the human body, accounting for different sizes, shapes, and other characteristics.
2.2 Body parts
With the success of DL, the number of studied body parts has increased, as shown in Figure 2B. We also show the most frequently researched body parts in Figure 2E; the top-5 researched organs include the brain, lung, heart, eye, and liver. These organs have been the focus of research because they are among the most important parts of the human body.
[Figure 2 data: datasets/challenges per year, 2013–2020: 7, 12, 24, 25, 28, 45, 59, 65. Tasks: segmentation 40%, classification 31%, detection 14%, generation 5%, registration 4%, regression 4%, tracking 2%. Modalities: MRI 35%, CT 24%, pathology image 11%, CR 8%, ultrasound 4%, fundus photo 3%, endoscopy 3%, PET 3%, OCT 1%, other 8%. Body parts: brain 24%, lung 15%, heart 8%, eye 5%, bone 5%, liver 4%, neck 4%, prostate 2%, kidney 2%, breast 2%, other 29%.]
Fig. 2: Summary of medical image datasets and challenges from 2013 to 2020. Figure 2A shows the number of datasets and challenges published each year. Figures 2B, C, and D show the year-by-year trends, along with the trends in relative numbers for each of the different categories by year. The numbers listed on the right are the summary counts of each category; the summary counts do not equal the total numbers because 1) some of the categories are not shown, and 2) a dataset is counted twice if it includes two of the categories. Figures 2E, F, and G show the most predominant body parts, data modalities, and main tasks with the percentage of their respective datasets.
In the beginning, the main reason which motivated researchers to focus on these organs and parts was that a simple diagnosis and a structural study greatly helped in the diagnosis and treatment of cancer (a major threat to human life). Many datasets focus on the brain, lung, and other organs without considering DL, and many challenges focus on simple tasks, such as segmentation and classification. Subsequently, AI proved to be competent at tackling more complex tasks, and therefore researchers started to focus on several other organs. For example, eye-related diseases, which cause blindness, incited the collection of eye-related datasets and the release of challenges. Some other datasets and challenges focus on small organs, such as the prostate, which are challenging to analyze due to the low resolution of the images.
2.3 Modalities
There are several types of medical image modalities. As shown in Figure 2F, the frequently used modalities to acquire medical datasets include Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Ultrasound (US), Endoscopy, Positron Emission Tomography (PET), Computed Radiography (CR), Electrocardiography, and Optical Coherence Tomography (OCT). We introduce these main modalities below and provide a summary at the end of this subsection.
Radiography: Radiography is an imaging technique based on the difference in attenuation when X-rays pass through the different organs and tissues of the human body. The primary modalities used include CR and CT. CR is a 2D image, and CT is a volume (3D) image. Radiography is the most commonly used method to image the human body. For example, CR is frequently used to diagnose chest-related diseases, such as pneumonia, tuberculosis, and COVID-19. Meanwhile, 3D CT
plays an important role in diagnosis and treatment related to cancer and lesions. The advantages of radiography are 1) high resolution of the hard tissues (e.g., bones), 2) lower cost, and 3) compatibility with contrast agents; the disadvantages are 1) X-rays are harmful to human health, 2) X-rays are not ideal for distinguishing between healthy tissues and tumors without the help of contrast agents, and 3) the resolution is limited by the radiation intensity. Moreover, as the main component of human bone is calcium, CT plays an important role in many bone-related diagnoses.
Magnetic resonance: MR images display the body’s structure via the difference in the signal released by the different substances of the imaged organ as the magnetic field is changed. MR has many submodalities, such as T1 and T2. For essential organs and tissues, MR is a commonly used imaging method because it is considered non-invasive, effective, and safe. Due to the principle of MR imaging, MR plays an essential role in the diagnosis of the brain, heart, and soft tissues. Because higher-resolution MR images can be obtained by increasing the magnetic field strength, MR is also suitable for small organs or tissues. However, MR images do have disadvantages, such as high cost and incompatibility with metal (e.g., metallic orthopedic implants).
Nuclear medicine: Nuclear medicine captures images through the absorption by the targeted tissue of specific chemical components marked with radioactive isotopes. Tumors and healthy tissues absorb different chemical components, so medics use a specific chemical marked with a radioactive isotope and receive the rays radiated by the chemical. An example is Positron Emission Computed Tomography, i.e., PET, which performs imaging by capturing radiation produced by fluorodeoxyglucose or other similar contrast agents absorbed by the tissue or tumor. Nuclear medicine is good at imaging regions of interest, such as tumors, but its disadvantages are high cost and low resolution.
Ultrasound: Ultrasound operates by acquiring the differences in the absorption and reflection of ultrasound waves when applied to tissues. It is widely used in imaging the heart and fetus because ultrasound causes no damage to these parts and provides real-time imaging. Nevertheless, the main disadvantage is the noise caused by the reflection off irregular shapes of organs and tissues, which interferes with their imaging.
Eye-related modalities: An OCT image is obtained by using low-coherence light to capture 2D and 3D
micrometer-resolution images within optical scattering media to diagnose eye-related diseases. The fundus photo is also used for diagnosis purposes. These two methods are non-invasive, eye-specific imaging modalities.
Pathology: Pathological data is the gold standard in diagnosing diseases. It is acquired by photographing stained tissue slides through a microscope to show cell-level features. Pathology is used in the cell-level diagnosis of cancers and tumors.
Other modalities: Other imaging modalities are common but specific to certain body parts, such as endoscopy, and provide medics with various biomarkers to make critical decisions when diagnosing, treating, and researching.
Overall, MR, CT, and pathology imaging are the most commonly used modalities. MR can provide sharp images of soft tissues without harmful radiation. It is therefore widely used in the imaging of the brain, heart, and many other small organs. CT is an economical and simple imaging approach, and it is widely used for the diagnosis of cancer, e.g., of the neck, chest, and abdomen. A pathology image is different from MR and CT because it is a cell-level imaging method. Pathology is widely used in cancer-related diagnosis.
2.4 Tasks
According to our analysis, our collected datasets and challenges have been used for the tasks of classification, prediction, detection, segmentation, localization, characterization, up-sampling, tracking, registration, regression, estimation, coding, automatic annotation, and others. As Figure 2G shows, we grouped these tasks into seven categories: classification, segmentation, detection, regression, generation, registration, and tracking. The following paragraphs briefly describe each task.
Classification: Classification is used for qualitative analysis. According to pre-defined specific rules, the classification task aims to group medical images or particular regions of an image into two or more distinct categories. The classification task can be used alone for medical image analysis or as a subsequent task after other lower-level tasks, such as segmentation and detection, in order to analyze the results and further extract features. There are many ways to express the classification task, such as detection and prediction. The
detection tasks (which are also sometimes termed classification) are different from the ones introduced in the following paragraph, although sometimes the same word is used synonymously. Typical examples of classification tasks include AD prediction and the attribute classification of pulmonary nodules. AD prediction aims to group MR images into Alzheimer’s disease (AD) and normal cognition (NC). The attribute classification of pulmonary nodules aims to analyze the pathology attributes of pulmonary nodules. Classification performance measures mainly include accuracy, precision, specificity, sensitivity, F-score, ROC, and AUC. All these measures are based on four basic counts: true positive (TP), false positive (FP), true negative (TN), and false negative (FN).
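As a concrete illustration, all of these measures can be derived from the four basic counts; a minimal sketch in Python (the function name and the example counts are ours, not from any surveyed dataset):

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common classification metrics from the four basic counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # recall / true positive rate
    specificity = tn / (tn + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f_score
```

For a hypothetical screening result with tp=8, fp=2, tn=85, fn=5, this yields an accuracy of 0.93 but a sensitivity of only about 0.62, which illustrates why accuracy alone is misleading on the imbalanced cohorts common in medical imaging.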
Segmentation: The segmentation task can be regarded as a pixel-level or voxel-level classification task, with the difference that the segmentation task is constrained by the spatial context. It aims to split an image into different areas or to contour specific regions. The regions can contain a tumor, tissue, or another specific target. The results of the segmentation task consist of areas and boundaries. Since segmentation can be seen as a pixel-level classification, the average precision (AP) can be used as a metric. Other performance metrics include intersection over union (IoU), the Dice index, the Jaccard index, the Hausdorff distance, and the average surface distance.
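For binary masks, the Dice index and IoU reduce to simple overlap counts; a minimal NumPy sketch (the function name is ours, not from any surveyed challenge):

```python
import numpy as np

def dice_and_iou(pred, gt):
    """Dice coefficient and IoU for binary masks of any shape (2D or 3D)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum())
    iou = inter / np.logical_or(pred, gt).sum()
    return float(dice), float(iou)
```

Note that the two metrics are monotonically related (Dice = 2·IoU/(1+IoU)), which is why challenges typically report only one of them alongside a boundary metric such as the Hausdorff distance.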
Detection: The detection task aims to find an object of interest, and it usually also needs to classify such an object (a classification task). In this work, we place in this category the tasks which aim to determine the location of the object of interest with a bounding box or a point. The detection task is sometimes represented as a localization task. A typical example of detection is pulmonary nodule detection, which aims to find the pulmonary nodules in chest CT images and annotate the nodules with bounding boxes. The performance measures used in detection tasks mainly include the intersection over union (IoU), mean Average Precision (mAP), precision and recall, false positive rate, the receiver operating characteristic curve (ROC), and other metrics. For tasks that locate an object without a boundary, the Euclidean distance is the most commonly used measure.
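For axis-aligned bounding boxes, the IoU measure has a closed form; a small illustrative sketch (the (x1, y1, x2, y2) coordinate convention is an assumption, not dataset-specific):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 if boxes do not overlap
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

A predicted box is then typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold (commonly 0.5), which links box IoU back to precision, recall, and mAP.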
Regression: Classification is used for qualitative analysis, while regression is used for quantitative analysis. A typical example is the estimation of the volume of a lesion. For the regression task, the root mean square error (rMSE), the mean absolute error, and the correlation coefficient are the most commonly used metrics.
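These three metrics can be computed directly from paired predictions and ground-truth values; a minimal, purely illustrative Python sketch:

```python
import math

def regression_metrics(pred, true):
    """rMSE, MAE, and Pearson correlation for paired predictions."""
    n = len(pred)
    rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / n)
    mae = sum(abs(p - t) for p, t in zip(pred, true)) / n
    mp, mt = sum(pred) / n, sum(true) / n
    cov = sum((p - mp) * (t - mt) for p, t in zip(pred, true))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_t = sum((t - mt) ** 2 for t in true)
    corr = cov / math.sqrt(var_p * var_t)
    return rmse, mae, corr
```

rMSE penalizes large errors more heavily than MAE, while the correlation coefficient is scale-free; challenges often report more than one of these because they capture different aspects of the error.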
Tracking: The tracking task aims to locate specific targets, but tracking is a dynamic process and is therefore different from the detection task. This means that tracking algorithms need to detect or localize targets in different frames. For medical image analysis, tracking tasks include the tracking of organs and tissues. The tracking is not just of one point; it can also be of an area, e.g., every part of an organ or tissue. An example is the tracking of the lung as the subject breathes.
Generation: The image data generation task has many different aims, but for simplicity we categorize all of these under the “generation task” because they focus on generating image data from other image data. Typical generation tasks include 1) generating a T2-weighted image from T1-weighted images and 2) generating a pathology image stained with one stain from an image stained with another stain.
Registration: The image registration task aims to align one image with another, i.e., to find a transformation (e.g., rotation and translation) that aligns the two images. Registration is a necessary step for computer-aided diagnosis algorithms that use multiple modalities. During medical scanning, the movement of the human body cannot be avoided and is a challenge. At the same time, imaging cannot be performed instantly. As a result, images from different viewpoints cannot be aligned directly, nor can images from two or more modalities. Therefore, researchers rely on registration techniques to solve these alignment problems.
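When matched landmarks are available in the two images, the rotation-and-translation alignment described above has a classical least-squares solution (the Kabsch algorithm); a NumPy sketch of that special case, not tied to any surveyed dataset or to deformable registration:

```python
import numpy as np

def estimate_rigid(src, dst):
    """Least-squares rigid transform (R, t) mapping src landmarks onto dst.

    src, dst: (N, D) arrays of matched landmark coordinates (D = 2 or 3).
    Minimizes sum_i ||R @ src_i + t - dst_i||^2 (Kabsch algorithm).
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)       # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    D = np.diag([1.0] * (src.shape[1] - 1) + [float(d)])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t
```

Real medical registration pipelines generalize this idea with intensity-based similarity measures and deformable transforms, but the rigid case already conveys the core objective of finding the transformation that best superimposes the two images.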
2.5 Source and Term
We collected the datasets and challenges mainly fromThe Cancer Imaging Archive [15], Grand Challenge,Kaggle, OpenNeuro, PhysioNet [16], and Codalab.
The original records of datasets and challenges that we collected numbered four to five hundred; we removed some of them because some datasets are not suitable for DL and AI methods. We then categorized the remaining datasets and challenges into different groups. Categorizing the datasets and challenges is not easy because all of them are derived from clinical research sources. Thus, we used an asymmetric categorization to group these datasets and challenges into four groups, as shown in Figure 1. This means that we did not use the same sub-taxonomy in each category or sub-category.
First, we split the medical datasets and challengesinto two groups: body-level and cell-level (Section
5), according to the imaged body part. The body-level datasets focus on specific tissues, while the cell-level ones focus on cells. Second, we grouped the datasets and challenges of the brain, eye, and neck into one group (Section 3), because these are parts of the head. Third, we organized the datasets and challenges related to the chest and abdomen into the same group (Section 4). These datasets and challenges relate to diagnosis, anatomical segmentation, and treatment. Finally, the datasets and challenges that cannot be categorized into the above groups are grouped under “others” (Section 6); these datasets and challenges are related to the skin, bone, phantoms, and animals.
The introduction of each group and sub-group mainly covers the type of modality, the task, the disease, and the body part. However, not all the groups of datasets can be introduced in that way. For some groups, we introduce the datasets and challenges according to the domain-specific problems. For example, we categorize the pathology datasets into microcosmic and macrocosmic tasks.
3 Head and neck related datasets and challenges
The head and neck are significant parts of the human body because many essential organs, glands, and tissues are located there. Several researchers’ image analysis work relates to the head and neck. To make effective use of computers for research, diagnosis, and treatment, many researchers have released datasets and challenges, for example: 1) the analysis of tissue structure and functions (2, 3, 4, 6) and 2) disease diagnosis (30, 39, 47).
Because the brain controls emotions, actions, and the functions of other organs, the brain area is particularly significant. First, we introduce the datasets and challenges related to the analysis of brain structure, function, imaging, and other basic tasks in Subsection 3.1. Second, we introduce the datasets and challenges related to brain disease diagnosis in Subsection 3.2.
Moreover, since the eyes are crucial to our vision, the computer-aided diagnosis of eye-related diseases is also an important research focus. The eye-related datasets and challenges are covered in Subsection 3.3. We introduce the other datasets and challenges of the neck, and the datasets related to the brain’s behavior and cognition, in Subsection 3.4.
3.1 Structural analysis tasks of the brain
The basic analysis and processing of brain medical images is clinically critical for diagnosis, treatment, and other brain-related analysis tasks. The datasets and challenges we discuss are mainly for segmentation tasks and center around the brain structure. In contrast, some datasets focus on imaging, including MR imaging acceleration, the non-linear registration of different resolutions, and tissue reconstruction. One of the most popular tasks is the segmentation of white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF); the respective datasets and challenges are introduced in Subsection 3.1.1. Meanwhile, the segmentation of other tissues and functional areas is also a focus of research, and the related datasets and challenges are discussed in Subsection 3.1.2. Subsection 3.1.3 describes the other basic tasks. Table 1 shows the datasets and challenges of these basic tasks.
3.1.1 Segmentation of white and gray matter
The segmentation of WM, GM, and CSF has great significance for brain structure research and computer-aided diagnosis, particularly using AI. Similarly, for AI algorithms, it is also of great significance to understand the human brain’s structure. Therefore, MICCAI and others have held many challenges with this research focus, so that researchers could design automatic algorithms to segment magnetic resonance images into different parts. We introduce these datasets and challenges with respect to their modalities and tasks.
Modality: The datasets and challenges which focus on WM, GM, and CSF segmentation usually provide MR images. Challenges (2, 3, 4, 5, 6, 7) mainly provide two modalities, T1 and T2, while datasets (1, 8) only provide T1 for the white matter hyperintensities segmentation task. Note that MR scans are sensitive to the hydrogen atom, and this feature can effectively help image analysts to distinguish between different tissues and parts of the image. Moreover, due to the color of these tissues as imaged by MR, they are named “white matter” and “gray matter”.
Task: The main focus of these datasets and challenges is the segmentation of WM, GM, and CSF. However, they do not only focus on that. Challenges (1, 4, 5, 6, 7) also provide annotations of other parts of the brain, including the basal ganglia, white matter lesions, cerebellum, and infarctions. One of the challenges for segmentation is the presence of lesions, because of their unnatural characterization. Well-annotated data
Table 1: Summary of datasets and challenges for basic brain image analysis.

#   Dataset / Challenge                  Year  Modalities      Focus                                        Task
1   CEREBRuM [17]                        2019  T1              WM, GM, CSF                                  Segmentation
2   iSeg 2019 [18]                       2019  T1, T2          WM, GM, CSF                                  Segmentation
3   iSeg 2017 [19]                       2017  T1, T2          WM, GM, CSF                                  Segmentation
4   MRBrainS18                           2018  T1 (a), T2 (b)  WM, GM, CSF                                  Segmentation
5   NEATBrainS15 [20]                    2015  T1 (a), T2 (b)  WM, GM, CSF                                  Segmentation
6   MRBrainS13 [21]                      2013  T1 (a), T2 (b)  WM, GM, CSF                                  Segmentation
7   Neonatal MRBrainS12 [22]             2012  T1, T2          WM, GM, CSF                                  Segmentation
8   WMH Seg. Chlg. [23]                  2017  T1              WM, GM, CSF                                  Segmentation
9   ENIGMA Cerebellum                    2017  MR              Cerebellum                                   Segmentation
10  Mindboggle [24]                      2012  MR              Brain atlases                                Segmentation
11  Labeled Brain Scans                  2012  MR              Brain atlases                                Segmentation
12  CAUSE07 [25]                         2007  MR              Caudate                                      Segmentation
13  AutoImplant [26]                     2020  CT              Cranioplasty                                 Generation
14  AccelMR 2020 [27]                    2020  T1, T2          Non-linear mapping of different resolutions  Generation
15  MRI WM Reconstruction [28]           2020  MR              White matter reconstruction                  Generation
16  Calgary-Campinas Brain Dataset [29]  2020  MR              Brain image reconstruction                   Generation
17  MRI Reconstruction Challenge         2020  MR              MR reconstruction                            Generation
18  FastMRI [30]                         2018  T1, T2          Accelerating magnetic resonance imaging      Generation
19  MUSHAC 2018                          2018  dMR             DW MRI registration and enhancement          Generation, Registration
20  HARDI 2013                           2013  dMR             Diffusion MRI reconstruction                 Generation
21  HARDI 2012                           2012  dMR             Diffusion MRI reconstruction                 Generation
22  CuRIOUS 2019                         2019  MR, US          Image registration                           Registration
23  CuRIOUS 2018                         2018  MR, US          Image registration                           Registration
24  3D VoTEM                             2018  dMR             Fiber tractography                           Registration
25  DTI Tractography Challenge 2012      2012  dMR             DTI tractography                             Registration
26  DTI Tractography Challenge 2011      2011  dMR             DTI tractography                             Registration

Notes: (a) T1w and T1w-IR; (b) T2-FLAIR.
A Systematic Collection of Medical Image Datasets for Deep Learning 9
can help AI to overcome this problem and achieve more robust results. Challenges (5, 7) use MR images of the neonatal brain and consider tissue volumes as an indicator of long-term neurodevelopmental performance [22].
Performance metric: For the segmentation task, the Dice score is one of the most commonly used metrics, and all of these datasets and challenges adopt it as a performance measure. Besides the Dice score, datasets (4, 6, 8) also use the Hausdorff distance and volumetric similarity; datasets (2, 3) use the average Hausdorff distance and the average surface distance; moreover, dataset (8) also uses sensitivity and the F1-score for performance evaluation.
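These overlap- and volume-based metrics are simple to compute. The sketch below is an illustration rather than any challenge's official evaluation code, and the helper names (`dice_score`, `volumetric_similarity`) are ours:

```python
import numpy as np

def dice_score(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

def volumetric_similarity(pred: np.ndarray, gt: np.ndarray) -> float:
    """1 - |V_pred - V_gt| / (V_pred + V_gt); 1.0 means identical volumes."""
    vp, vg = int(pred.astype(bool).sum()), int(gt.astype(bool).sum())
    return 1.0 - abs(vp - vg) / (vp + vg) if (vp + vg) else 1.0

# Toy 2D "tissue masks": 16 voxels each, 12 of them overlapping.
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[3:7, 2:6] = True
print(round(dice_score(pred, gt), 3))   # 0.75
print(volumetric_similarity(pred, gt))  # 1.0 (equal volumes despite the shift)
```

Note how the two metrics disagree: the shifted prediction keeps a perfect volumetric similarity while the Dice score drops, which is why challenges typically report both.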
3.1.2 Segmentation of functional areas & other tissues
The segmentation of functional areas and tissues is also essential for brain-related research and computer-aided diagnosis. In this subsection, we introduce the datasets and challenges related to the segmentation of functional areas and tissues.
Tissue segmentation: While WM, GM, and CSF segmentation was introduced in Subsection 3.1.1, the segmentation of other brain tissues is also an active research area. Challenges (1, 4, 6, 7) aim to segment brain images into different tissues, including the ventricles, cerebellum, brainstem, and basal ganglia. These challenges provide MR images and voxel-level annotations of the regions of interest, with thirty to forty scans each. Because these regions are essential for brain health, researchers need to overcome the challenges posed by their size and shape in order to segment them. Dataset (9) focuses on cerebellum segmentation from diffusion-weighted images (DWI), while dataset (12) focuses on the segmentation of the caudate from brain MR images.
Functional areas: The segmentation of the human brain cortex into different functional areas is of great significance in education, clinical research, treatment, and other applications. Datasets (10, 11) provide images and annotations for the design of automatic algorithms that segment the brain cortex into different functional areas. Dataset (10) uses the DKT protocol [24], which is modified from the DK protocol [31]; the DKT protocol includes 31 labels, which are listed at https://mindboggle.readthedocs.io/en/latest/labels.html. Dataset (11) is a commercial dataset for research on the segmentation of functional areas of the brain cortex.
3.1.3 Imaging-related tasks
In addition to the segmentation of brain tissues and functional areas, some datasets and challenges also focus on generation, registration, and tractography.
Generation: Datasets and challenges (14, 17, 18) aim to accelerate MR imaging or to generate high-resolution MR images from low-resolution ones. High-resolution imaging usually comes at a higher cost, while low-resolution imaging is cheaper but affects analytical judgment and may lead to an incorrect diagnosis. These challenges provide many low-resolution scans to allow researchers to design algorithms that convert or map low-resolution images onto higher-resolution ones. Another focus is cranioplasty (13): generating the missing part of a broken skull from CT images of skull models. Other datasets and challenges (15, 19, 20, 21) focus on the reconstruction of MR images.
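For such low-to-high resolution mapping tasks, the trivial non-learned reference point is plain interpolation of the low-resolution volume. A minimal numpy sketch of nearest-neighbour upsampling, shown as an illustrative baseline rather than any challenge's protocol (the function name is ours):

```python
import numpy as np

def upsample_nearest(vol: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbour upsampling of an n-D volume: the trivial,
    non-learned baseline that learned super-resolution models should beat."""
    out = vol
    for axis in range(vol.ndim):
        out = np.repeat(out, factor, axis=axis)  # duplicate voxels along each axis
    return out

low = np.arange(8, dtype=float).reshape(2, 2, 2)  # toy low-resolution "scan"
high = upsample_nearest(low, factor=2)
print(high.shape)  # (4, 4, 4)
```

A learned generator is evaluated against the reference high-resolution scan; beating simple interpolation baselines like this one is the minimum bar.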
Registration: Registration between different modalities is another research focus. Challenges (22, 23) focus on the registration between ultrasound data and MR images of the brain. Cross-modality registration is difficult because the subject is not absolutely static; moreover, MR is a 3D volumetric imaging modality, unlike ultrasound, which is a 2D imaging modality. Thus, these challenges focus on establishing the topological relation between preoperative MR images and intraoperative ultrasound. Challenge (19) also focuses on diffusion MR image registration to eliminate differences between vendors' hardware devices and protocols.
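One widely used similarity measure for cross-modality registration is mutual information, which rewards consistent intensity co-occurrence rather than identical intensities. The challenges themselves do not prescribe it; the sketch below (with our own function name and synthetic stand-in images) only illustrates why it suits the MR-vs-ultrasound setting:

```python
import numpy as np

def mutual_information(a: np.ndarray, b: np.ndarray, bins: int = 32) -> float:
    """Mutual information (in nats) between the intensity distributions of
    two spatially aligned images, estimated from a joint histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px = pxy.sum(axis=1, keepdims=True)  # marginal over image a
    py = pxy.sum(axis=0, keepdims=True)  # marginal over image b
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
mr = rng.normal(size=(64, 64))                    # stand-in "MR slice"
us = -2.0 * mr + 0.1 * rng.normal(size=(64, 64))  # related image, inverted contrast
noise = rng.normal(size=(64, 64))                 # unrelated image
print(mutual_information(mr, us) > mutual_information(mr, noise))  # True
```

Even though the second image inverts and rescales the intensities, mutual information stays high, while a direct intensity difference would misleadingly report a poor match.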
Tractography: Tractography is another segmentation task and focuses on the segmentation and imaging of the fiber bundles in the WM. Dataset (24) aims to segment fiber bundles from brain images, including phantom, squirrel monkey, and macaque data, while challenges (25, 26) focus on tractography with DTI, another type of MR imaging.
3.2 Brain diseases related datasets and challenges
Besides structural analysis and image processing tasks, computer-aided diagnosis is also a research focus in healthcare. Medical image analysis plays a critical role in clinical research, diagnosis, and treatment. The datasets and challenges we have included serve two tasks: 1) the segmentation of lesions and tumors and 2) the classification of diseases. For the segmentation task,
the respective datasets and challenges focus on tumor and lesion segmentation of the human brain and mark the lesion's contour for diagnosis and treatment; the relevant details are given in Subsection 3.2.1. For the classification tasks, the datasets and challenges have been used to develop automatic algorithms that classify or predict diseases from medical images; these are presented in Subsection 3.2.2.
3.2.1 Datasets for segmentation of tumors and lesions
Tumors and lesions in the brain affect human health and safety, and image analysis is an effective way to diagnose the relevant diseases. In this subsection, the related datasets and challenges are introduced; they are listed in Table 2.
Glioma datasets and challenges: Gliomas are among the most common brain malignancies in adults. Therefore, many challenges and datasets focus on the segmentation of glioma for its diagnosis and treatment. The BraTS challenge series (30, 31, 32, 33, 34, 35, 36, 37, 38) has been held since 2012 for glioma segmentation. The difficulty of this segmentation task stems from the heterogeneous appearance and shape of gliomas. This heterogeneity is reflected in the tumor's shape, the imaging modalities, and its many histological sub-regions, such as the peritumoral edema, the necrotic core, and the enhancing and non-enhancing tumor core. Therefore, this series of challenges provides multi-modal MR scans to help researchers design and train algorithms to segment tumors and their sub-regions. The tasks of the challenge series include 1) low- and high-grade glioma segmentation (37, 38), 2) survival prediction from pre-operative images (32, 33), and 3) the quantification of segmentation uncertainty (30, 31). Besides the BraTS series, dataset (47) targets the segmentation of low-grade glioma and provides T1-weighted and T2-weighted MR images with each subject's biopsy-proven gene status determined by fluorescence in-situ hybridization (FISH) [46]. Dataset (46) focuses on brain tumor progression and aims to support the design and evaluation of DL-based automatic algorithms for glioblastoma segmentation and further research.
Ischemic stroke lesion datasets and challenges: Similar to tumor segmentation, brain lesion segmentation also focuses on detecting brain abnormalities; the difference is that lesion segmentation deals with damaged tissue. Challenges (39, 40, 41, 42, 48) focus on stroke lesion segmentation, because stroke is also life-threatening and can disable surviving patients.
Stroke is often associated with high socioeconomic costs and disabilities; its manifestation can be triggered by local thrombosis, hemodynamic factors, or embolic causes, and automatic analysis algorithms help to diagnose and treat it. In MR images, the infarct core can be identified with diffusion MR, while the penumbra (which can be treated) can be characterized by perfusion MR. The challenge ISLES 2015 (42) addresses two subtasks, sub-acute ischemic stroke lesion segmentation and acute stroke outcome/penumbra estimation, and provides 50 and 60 multi-modality MR scans for training and validation, respectively. The subsequent year's challenge, ISLES 2016 (41), focuses on the segmentation of lesions and the prediction of the degree of disability; it provides about 70 scans, including clinical parameters and MR modalities such as DWI, ADC, and perfusion maps. The challenge ISLES 2017 (40) focuses on segmentation with acute MR images, while ISLES 2018 (39) focuses on segmentation based on acute CT perfusion data. Moreover, dataset (48) focuses on the segmentation of the brain after stroke for further treatment.
Intracranial hemorrhage related datasets: Intracranial hemorrhage is another type of medical condition that affects our health. Dataset and challenge (43, 44) focus on the detection and segmentation of intracranial hemorrhage to help medics locate hemorrhage regions and decide on a treatment plan. Dataset (45) also provides data for classifying CT images as normal or hemorrhage.
Multiple sclerosis lesion related datasets: Multiple sclerosis lesions are another kind of brain lesion; they are not deadly but can cause disabilities. Datasets and challenges (49, 50, 51) address multiple sclerosis lesion segmentation with multi-modality MR data (T1w, T2w, FLAIR, etc.).
3.2.2 Classification of brain disease
Besides tumor and lesion segmentation, brain disease classification also plays an essential role in healthcare. Brain-related diseases, e.g., Alzheimer's disease (AD) [63,64,65,66] and Parkinson's disease (PD), have a severe effect on patients' health and lives. Therefore, effective diagnosis and early intervention can reduce the damage to patients' health, the burden on their families, and the economic impact on society. In this section, we first introduce the datasets and challenges of AD (52, 53, 54, 55, 56, 62),
Table 2: Summary of datasets and challenges for the brain lesion and tumor segmentation task.

| Index | Dataset/Challenge | Year | Modalities | Focus |
| 27 | CADA-AS | 2020 | MR angiography | Cerebral aneurysm |
| 28 | CADA-RRE | 2020 | MR angiography | Cerebral aneurysm |
| 29 | CADA | 2020 | MR angiography | Cerebral aneurysm |
| 30 | BraTS 2020 [32,33,34] | 2020 | T1, T1Gd, T2, T2-FLAIR | Multi-modality brain tumor segmentation |
| 31 | BraTS 2019 [32,33,34] | 2019 | T1, T1Gd, T2, T2-FLAIR | Multi-modality brain tumor segmentation |
| 32 | BraTS 2018 [34] | 2018 | T1, T1Gd, T2, T2-FLAIR | Multi-modality brain tumor segmentation |
| 33 | BraTS 2017 [35] | 2017 | T1, T1Gd, T2, T2-FLAIR | Multi-modality brain tumor segmentation |
| 34 | BraTS 2016 [32] | 2016 | T1, T1c, T2, T2w-FLAIR | Multi-modality brain tumor segmentation |
| 35 | BraTS 2015 [32] | 2015 | T1, T1c, T2, T2w-FLAIR | Multi-modality brain tumor segmentation |
| 36 | BraTS 2014 [32] | 2014 | T1, T1c, T2, T2w-FLAIR | Multi-modality brain tumor segmentation |
| 37 | BraTS 2013 [32] | 2013 | T1, T1c, T2, T2w-FLAIR | Multi-modality brain tumor segmentation |
| 38 | BraTS 2012 [32] | 2012 | T1, T1c, T2, T2w-FLAIR | Multi-modality brain tumor segmentation |
| 39 | ISLES 2018 | 2018 | DWI, perfusion MR | Ischemic stroke |
| 40 | ISLES 2017 | 2017 | MR (incl. T2w and FLAIR) | Ischemic stroke |
| 41 | ISLES 2016 | 2016 | ADC, perfusion MR | Ischemic stroke |
| 42 | ISLES 2015 [36] | 2015 | T1, T2, FLAIR | Ischemic stroke |
| 43 | CT-ICH [37] | 2020 | CT | Intracranial hemorrhage |
| 44 | RSNA Intracranial Hemorrhage Detection | 2019 | CT | Intracranial hemorrhage |
| 45 | HeadCT-Hemorrhage | 2019 | CT | Intracranial hemorrhage |
| 46 | Brain Tumor Progression [38] | 2018 | T1, T2, FLAIR | Brain tumor |
| 47 | LGG-1p19qDeletion [39,40] | 2017 | T1, T2 | Low-grade gliomas |
| 48 | ATLAS [41] | 2017 | T1 | Anatomical segmentation (stroke lesion) |
| 49 | MSSEG Challenge [42,43] | 2016 | T1, T2, DP/T2, FLAIR | Multiple sclerosis |
| 50 | MS Challenge 2015 [44,45] | 2015 | T1, T2, PDw, FLAIR | Longitudinal multiple sclerosis |
| 51 | MSSeg 2008 | 2008 | T1, T2, FLAIR | Multiple sclerosis |
Table 3: Summary of datasets and challenges for brain disease classification tasks.

| Index | Dataset/Challenge | Year | Modalities | Diseases | Category |
| 52 | ADNI-1 [47] | 2004 | T1, T2, DWI, PET | AD | NC, MCI, AD |
| 53 | ADNI-GO [47] | 2009 | T1, T2, DWI, PET | AD | NC, MCI, AD |
| 54 | ADNI-2 [48] | 2011 | T1, T2, DWI, PET | AD | NC, EMCI, LMCI, AD |
| 55 | ADNI-3 [49] | 2016 | T1, T2, DWI, PET | AD | NC, EMCI, LMCI, AD |
| 56 | OASIS-1 [50] | 2007 | T1 | AD | NC, AD |
| 57 | OASIS-2 [51] | 2009 | T1 | AD | NC, AD |
| 58 | OASIS-3 [52] | 2019 | T1, T2, PET, MRI FLAIR | AD | NC, AD |
| 59 | MRIHS [53,54] | 2019 | T1 | AD | NC, AD |
| 60 | TADPOLE [55] | 2017 | T1, T2, DWI, PET, CSF | AD | NC, MCI, AD |
| 61 | MRI and Alzheimers | 2017 | T1 | AD | NC, AD |
| 62 | CADDementia [56] | 2014 | T1 | AD | NC, MCI, AD |
| 63 | ANT [57,58] | 2019 | T1, events and BOLD | PD | NC, PD |
| 64 | PD De Novo [59,60] | 2018 | T1, BOLD | PD | NC, PD |
| 65 | SCA2 DTI [61,62] | 2018 | T1, DWI | Spinocerebellar ataxia type II | NC, SCA2 |
| 66 | MTOP | 2016 | T1, T2 | Mild traumatic brain injury | Healthy, category I or category II |
and then we introduce other diseases (63, 64, 65). Table3 shows the relevant challenges and datasets.
Alzheimer’s disease: AD affects a person’s behavior,cognition, memory, and daily life activities. Such a pro-gressive neurodegenerative disorder affects the normaldaily life of patients because suffering from such a dis-ease makes patients not know who they are and whatthey should do which then progresses to the point untilthey forget everything they know. The disease takes anunbearable toll on the patient and leads to a high costto their loved ones and to the society. For example, ac-cording to [67], AD became the sixth deadly cause inthe U.S. in 2018 and costs more than two to three hun-dred billion U.S. dollars.
Therefore, researchers are doing everything they can to explore the causes of AD and its treatments. Diagnosis based on medical images has become a research focus because early diagnosis and intervention can significantly affect the progression of this disease. Hence, many researchers work on the classification, i.e., prediction, of AD using brain images. The main datasets are the "Alzheimer's Disease Neuroimaging Initiative (ADNI)" and the "Open Access Series of Imaging Studies (OASIS)".
The ADNI is a series of projects that aim to developclinical, imaging, genetic, and biochemical biomarkersfor the early detection and tracking of AD. It includesfour stages: ADNI-1 (52), ADNI-GO (53), ADNI-2 (54),and ADNI-3 (55). These projects provide image data ofthe brain for researchers, and the modalities of images
include MR (T1 and T2) and PET (FDG, PIB, Florbetapir, and AV-1451). These four stages comprise around 1400 subjects, categorized into normal cognition (NC), mild cognitive impairment (MCI), and AD, where MCI can be further split into early mild cognitive impairment (EMCI) and late mild cognitive impairment (LMCI).
The OASIS is a series of projects aiming to provide freely accessible neuroimaging data of the brain. OASIS has released three datasets, named OASIS-1, OASIS-2, and OASIS-3. All three are related to AD, but they are also used for functional area segmentation and other tasks. OASIS-1 (56) contains 418 subjects aged 18 to 96; among the subjects older than 60, 100 were diagnosed with AD. The dataset includes 434 MR sessions. OASIS-2 (57) contains 150 subjects aged between 60 and 96, each with three or four MR sessions (T1). About 72 subjects were diagnosed as normal, while 51 were diagnosed with AD; in addition, 14 subjects were diagnosed as normal but characterized as AD at a later visit. OASIS-3 (58) includes more than 1000 subjects, more than 2000 MR sessions (T1w, T2w, FLAIR, etc.), and more than 1500 PET sessions (PIB, AV45, and FDG), with 609 normal subjects and 489 AD subjects.
Moreover, there are many other challenges based on ADNI, OASIS, or independent datasets. Challenge (60) is based on ADNI and aims at predicting longitudinal evolution. Dataset (61) is based on OASIS
and it is released on Kaggle for the classification of AD.Challenge (62) is an independent AD-related challengeto classify subjects into NC, MCI, and AD.
Other diseases: Similar to AD, other brain diseases are also important from a diagnosis and treatment perspective. However, the number of datasets and challenges for these diseases is not as large as for AD. A few datasets focus on Parkinson's disease (PD) and spinocerebellar ataxia type II (SCA2). Datasets (63, 64) provide MR images of PD with classification labels. Dataset (65) provides images and classification labels for SCA2. Dataset (66) provides images and annotations for the diagnosis of mild traumatic brain injury.
3.3 Eye related datasets and challenges
As the human’s imaging sensor, the eyes’ health is es-sential for human beings, and eye diseases may leadto blindness. We introduce the relevant challenges anddatasets in this subsection and list them in Table 4.
Datasets according to the modality: With regard to the eye-related datasets and challenges, the main modalities used are the fundus photo (70, 72, 73, 74, 75, 76, 77, 82, 84) and OCT (71, 78, 79, 81, 83). The fundus photo can help medics evaluate the eye's health and locate retinal lesions because it clearly shows the important parts of the eye, such as the blood vessels and the optic disc. OCT is a newer imaging approach that is safe for the eye and shows the retinal tissues in detail. However, it also has disadvantages: it is not suitable for diagnosing microangioma or for planning retinal laser photocoagulation treatment.
Datasets according to the analysis task: These datasetsand challenges can be used for four tasks.
1) Classification tasks focus on deciding whether the subject has a specific disease or is abnormal. Datasets and challenges (69, 70, 71, 74, 75, 76, 82) focus on predicting a single disease, while others (73, 77, 78, 79) focus on diagnosing multiple diseases.
2) Segmentation is another task, which provides more information than classification. Datasets and challenges (70, 72, 74, 77, 81, 83) focus on the segmentation of tissues and lesions for further diagnosis and disease analysis.
3) Datasets and challenges (70, 71, 74, 76, 77, 84) focus on the detection of lesions or other landmarks. These tasks help medics locate key targets, such as areas and tissues, for effective diagnosis, or provide feature details for other automated algorithms.
4) Unlike the other tasks, the last one focuses on the annotation of the tools used in eye-related surgery (80).
Datasets according to focused eye diseases: Researchersmainly focus on these diseases:
– Diabetic retinopathy (73, 75, 77, 78, 79, 82, 84)
– (Age-related) macular degeneration (73, 76, 78)
– Pathologic myopia (73, 74)
– Diabetic macular edema (77, 83)
– Glaucoma (70, 73)
– Cataract (73)
– Closure glaucoma (71)
– Hypertension (73)
Besides these diseases, dataset (80) aims at the anno-tation of images.
3.4 Datasets and challenges of other subjects
Besides the brain’s structural analysis, the image pro-cessing, and the computer-aided diagnosis tasks, an-other important research focus is the human neck be-cause it holds many essential glands and organs. Thissubsection discusses the datasets and challenges of theneck and teeth, covered in Subsection 3.4.1 and Sub-section 3.4.2, respectively. Moreover, many researchersare working on the analysis of behavior and cognitionwith DL-based methods. We discuss the details in Sub-section 3.4.3.
3.4.1 Neck related datasets
The neck is also essential for our health. It holds many glands and organs, and when these become abnormal, effective diagnosis and segmentation play an essential role in their treatment. The related image datasets and challenges are listed in Table 5.
Datasets and challenges (85, 87, 88, 90, 94, 95) focus on the segmentation of glands and of the lesions and tumors in the relevant glands. Dataset (89) focuses on the binary classification of tumor vs. normal. Challenge (86) addresses thyroid gland nodule detection with ultrasound images and videos. Challenge (91) focuses on nerve segmentation in the neck, while challenge (96) focuses on evaluating the carotid bifurcation.
Table 4: Summary of datasets and challenges of eye-disease related tasks.

| Index | Dataset/Challenge | Year | Modality | Diseases | Tasks |
| 67 | RIADD [68] | 2020 | Fundus photo | AMD, DR, and others¹ | Classification |
| 68 | The 2nd DeepDRiD | 2020 | Fundus photo | DR | Classification |
| 69 | REFUGE2 [69] | 2020 | Fundus photo | Glaucoma | Classification, segmentation, detection |
| 70 | REFUGE [69] | 2018 | Fundus photo | Glaucoma | Classification, segmentation, detection |
| 71 | AGE [70] | 2019 | OCT | Closure glaucoma | Classification, detection |
| 72 | DRIVE | 2019 | Fundus photo | – | Segmentation (vessel extraction) |
| 73 | ODIR-2019 | 2019 | Fundus photo | AMD, DR, glaucoma, diabetes, cataract, hypertension, myopia | Classification |
| 74 | PALM | 2019 | Fundus photo | Pathologic myopia | Classification, segmentation, detection |
| 75 | APTOS 2019 | 2019 | Fundus photo | DR | Classification |
| 76 | ADAM [71] | 2018 | Fundus photo | AMD | Classification, detection |
| 77 | IDRiD [72,73,74] | 2018 | Fundus photo | DR, diabetic macular edema | Classification, segmentation, detection |
| 78 | Retinal OCT Images [75,76] | 2018 | OCT | DR, macular degeneration | Classification |
| 79 | ROCC | 2017 | OCT | DR | Classification |
| 80 | CATARACTS [77,78] | 2017 | Video | – | Surgery tool detection and annotation |
| 81 | RETOUCH [79] | 2017 | OCT | – | Segmentation (retinal fluid) |
| 82 | Diabetic Retinopathy Detection [80] | 2015 | Fundus photo | DR | Classification |
| 83 | SegOCT (DME) [81] | 2015 | OCT | Diabetic macular edema | Segmentation |
| 84 | ROC [82] | 2009 | Fundus photo | DR | Detection |

¹ AMD: age-related macular degeneration; DR: diabetic retinopathy. All the diseases of dataset (67) are listed on the official website: https://riadd.grand-challenge.org/Data/.
Table 5: Summary of datasets and challenges of head and neck related diseases.

| Index | Dataset/Challenge | Year | Modalities | Focus | Tasks |
| 85 | MICCAI 2020: HECKTOR | 2020 | CT, PET | Head and neck primary tumors | Segmentation |
| 86 | TN-SCUI 2020 [83] | 2020 | Ultrasound | Thyroid gland nodule diagnosis | Detection |
| 87 | Head Neck Radiomics HN1 [84,85] | 2019 | CT | Head and neck squamous cell carcinoma | Segmentation |
| 88 | AAPM RT-MAC 2019 [86] | 2019 | MR (T2-weighted) | Soft tissue and tumor | Segmentation |
| 89 | Head Neck PET-CT [87,88] | 2017 | CT, PET | Tumor | Classification |
| 90 | Head & Neck AutoSeg Challenge [89] | 2015 | CT | Tumor | Segmentation |
| 91 | Ultrasound Nerve Segmentation | 2016 | Ultrasound | Nerve | Segmentation |
| 92 | Dental X-Ray Analysis 2 | 2015 | Computed radiography | Cephalometric landmarks | Localization |
| 93 | Dental X-Ray Analysis 1 | 2015 | Computed radiography | Caries | Segmentation |
| 94 | Head & Neck AutoSeg 2010 [90] | 2010 | MR | Parotid gland | Segmentation |
| 95 | Head & Neck AutoSeg 2009 | 2009 | CT | Multi-organs and tissues | Segmentation |
| 96 | CLS 2009 [91] | 2009 | CT angiography | Carotid bifurcation | Evaluation |
3.4.2 Cephalometric and teeth related datasets
Challenges (92, 93) focus on the analysis of dental X-ray images. The main tasks of these two challenges are landmark localization and caries segmentation. Challenge (92) provides around 400 cephalometric X-ray images with landmark annotations from two experienced dentists. Challenge (93) provides about 120 bitewing images with experts' annotations of different parts of the teeth.
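Cephalometric landmark localization of this kind is commonly scored by the mean radial error and by the rate of landmarks detected within a distance tolerance. The exact protocols of these challenges may differ; the sketch below is illustrative numpy with our own function names, not their official evaluation code:

```python
import numpy as np

def mean_radial_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Euclidean distance between predicted and reference landmarks.
    Both arrays have shape (n_landmarks, 2), here taken to be in millimetres."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def success_rate(pred: np.ndarray, gt: np.ndarray, tol_mm: float = 2.0) -> float:
    """Fraction of landmarks predicted within `tol_mm` of the reference."""
    d = np.linalg.norm(pred - gt, axis=1)
    return float((d <= tol_mm).mean())

gt = np.array([[10.0, 10.0], [20.0, 5.0], [30.0, 40.0], [12.0, 33.0]])
pred = gt + np.array([[1.0, 0.0], [0.0, 1.5], [3.0, 4.0], [0.0, 0.0]])
print(mean_radial_error(pred, gt))  # 1.875 (errors 1.0, 1.5, 5.0, 0.0 mm)
print(success_rate(pred, gt))       # 0.75: three of four landmarks within 2 mm
```

Reporting both numbers is useful because a single badly missed landmark inflates the mean error while the success rate shows how many landmarks remain clinically usable.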
3.4.3 Behavior and cognition datasets
To understand what we see, hear, smell, and feel, the brain draws on its neurons to process the stimuli and work out the what, where, why, and when of scenarios. Many researchers now use artificial neural networks as a research method to analyze the relationship between brain activity and stimuli. They use functional MR images to scan brain activity, analyze the hemodynamic feedback, and identify the areas of neurons that react. Therefore, the analysis of the brain's reactions to specific stimuli is an important research focus: researchers use DL to detect or decode the stimuli presented to subjects in order to work out the brain's functionality. The related datasets are listed in Table 6.
Some datasets (98, 103, 107) focus on classifying the stimuli or the subject's attributes based on the subject's functional MR images. Dataset (98) aims to identify whether a subject is a beginner or an expert in programming from the brain's reaction to source code. Dataset (103) focuses on distinguishing subjects with depression from subjects without depression using audio stimuli and analysis of the subjects' brain activity. Dataset (107) examines the influence of cannabis on the brain.
Datasets (97, 99, 101, 102, 104, 105, 106) focus on decoding brain activity, i.e., recovering the stimulus from brain responses. Datasets (101, 104, 105) aim to rebuild what subjects have seen from their brain activity in functional MR images using DL-based methods, while datasets (99, 106) work on reconstructing the faces that subjects have seen from functional MR images with similar modalities.
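A standard baseline in such decoding studies is a linear map from voxel responses to stimulus features, fit by ridge regression. The following sketch runs on synthetic data and is an illustration of the idea, not the pipeline of any listed dataset; all names (`ridge_decoder`, the shapes chosen) are ours:

```python
import numpy as np

def ridge_decoder(X: np.ndarray, Y: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression W = (X^T X + lam*I)^-1 X^T Y, mapping
    voxel patterns X (trials x voxels) to stimulus features Y (trials x dims)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

rng = np.random.default_rng(1)
true_W = rng.normal(size=(50, 3))             # hypothetical voxel-to-feature map
X_train = rng.normal(size=(200, 50))          # 200 trials, 50 voxels
Y_train = X_train @ true_W + 0.01 * rng.normal(size=(200, 3))  # noisy features

W = ridge_decoder(X_train, Y_train, lam=0.1)  # fit the decoder
X_test = rng.normal(size=(10, 50))
err = float(np.abs(X_test @ W - X_test @ true_W).max())
print(err < 0.1)   # True: held-out predictions track the noiseless features
```

In real studies, the decoded features are then fed to a generative model to reconstruct the image or face, which is where the deep learning component of these datasets comes in.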
4 Chest and abdomen related datasets andchallenges
There are many vital organs in the chest and abdomen. For example, the heart is responsible for the blood supply, the lungs for breathing, and the kidneys for producing urine to eliminate toxins from the body. Therefore, medical image analysis of the organs in the chest and abdomen is an important research focus. Most tasks concern computer-aided diagnosis, with the classification, detection, and segmentation of lesions being the most targeted.
Many datasets and challenges aim to segment one or more organs in the chest and abdomen for diagnosis or treatment planning. Subsection 4.1 discusses the datasets and challenges for segmentation. Subsection 4.2 introduces the datasets and challenges that focus on the diagnosis of organs in the chest and abdomen, while Subsection 4.3 describes the datasets and challenges of the chest and abdomen that are not catego-
Table 6: Summary of datasets and challenges that are used for behavioral and perception related tasks.

| Index | Dataset/Challenge | Year | Modalities | Tasks | Stimulation |
| 97 | Cognitive control of sensory pain encoding in the pregenual anterior cingulate cortex [92] | 2020 | T1, BOLD | Sensory pain encoding | pain |
| 98 | fMRI dataset on program comprehension and expertise [93,94] | 2020 | T1, BOLD, events | Differences between experts and novices in cortical representations of source code | brain action |
| 99 | Reconstructing Faces from fMRI Patterns using Deep Generative Neural Networks [95] | 2019 | T1 | Face reconstruction | vision |
| 100 | Resting State - TMS [96,97] | 2019 | T1, BOLD | Effect of iTBS on the fronto-striatal network and ROI segmentation | iTBS |
| 101 | Deep Image Reconstruction [98,99] | 2018 | T1, T2, BOLD, events | Image reconstruction from human brain activity | vision |
| 102 | BOLD5000 [100] | 2018 | T1, T2, BOLD, others (DWI, field map) | Brain reaction to vision | vision |
| 103 | Neural Processing of Emotional Musical and Nonmusical Stimuli in Depression [101,102,103] | 2018 | T1, BOLD | Brain reaction to audition | audio |
| 104 | Visual image reconstruction [104] | 2018 | T1, BOLD, events | Image reconstruction from human brain activity | vision |
| 105 | Generic Object Decoding [105] | 2018 | T1, T2, BOLD, events | Image reconstruction from human brain activity | vision |
| 106 | Adjudicating between face-coding models with individual-face fMRI responses [106] | 2018 | BOLD | Decoding faces from brain activity | vision |
| 107 | T1w structural MRI study of cannabis users at baseline and 3 years follow up [107] | 2018 | T1 | Impact of cannabis on the brain | cannabis |
rized above, including regression, tracking, registration,and other tasks related to the chest and abdomen or-gans.
4.1 Datasets for chest & abdomen organ segmentation
This subsection covers the datasets and challenges of the chest and abdomen organs that are used for anatomic segmentation tasks. The anatomic segmentation tasks include organ contour segmentation (Subsection 4.1.1) and organ segmentation (Subsection 4.1.2). Contour segmentation differs from organ segmentation: the former aims to separate an organ from the background or to mark the boundaries between multiple organs and the background, while the latter aims to segment the organ into different parts at the anatomical level. Table 7 presents the datasets and challenges that are used for the segmentation of the chest and abdomen organs.
4.1.1 Datasets of chest and abdomen organs
Organ contour segmentation provides necessary information for surgical preplanning and diagnosis. A well-segmented organ contour provides a precise mask, which helps produce accurate segmentation results for diagnosis, treatment, and operation. This subsection introduces datasets and challenges for the contour segmentation of a single organ and of multiple organs.
Chest & abdomen datasets according to the organ: The datasets and challenges that we have covered involve the following organs and parts:
– Liver (113, 114, 116, 122, 127, 128, 144)
– Lung (113, 116, 118, 120, 128, 139, 143)
– Kidney (110, 113, 114, 127, 128)
– Prostate (116, 126, 137, 138, 142)
– Heart (111, 115, 116, 120)
– Pancreas (116, 123, 127, 128)
Table 7: Summary of datasets and challenges for the chest and abdomen organ segmentation tasks.

| Index | Dataset/Challenge | Year | Modalities | Organs |
| 108 | MNMS Challenge [108] | 2020 | MR | Heart |
| 109 | Automated Segmentation of Coronary Arteries | 2020 | Coronary CT angiography | Coronary arteries |
| 110 | C4KC-KiTS [109,110] | 2019 | CT | Kidney |
| 111 | MS-CMRSeg 2019 [111,112] | 2019 | MR | Heart |
| 112 | CAMUS [113] | 2019 | Ultrasound | Heart |
| 113 | CT-ORG [114,115,116] | 2019 | CT | Liver, lung, kidney, bladder |
| 114 | CHAOS [117,118,119] | 2019 | MR, CT | Liver, kidney, spleen |
| 115 | SegTHOR [120] | 2019 | CT | Heart, aorta, trachea, esophagus |
| 116 | Medical Segmentation Decathlon [121] | 2019 | MR, CT | Liver, lung, prostate, heart, pancreas, colon, and other non-chest or -abdomen organs |
| 117 | PAVES | 2018 | CT | Vein |
| 118 | SHCXR Lung Mask [122,123,124] | 2018 | CR | Lung |
| 119 | AtriaSeg 2018 [125] | 2018 | MR | Heart |
| 120 | Lung CT Segmentation Challenge 2017 [126,127] | 2017 | CT, radiotherapy structure set | Lung, heart, esophagus, spinal cord |
| 121 | AAPM Thoracic AutoSeg | 2017 | CT | Lung and other thoracic organs |
| 122 | LiTS [116] | 2017 | CT | Liver |
| 123 | Pancreas CT [128,129] | 2016 | CT | Pancreas |
| 124 | Breast MRI NACT Pilot [130] | 2016 | MR | Breast |
| 125 | HVSMR 2016 [120,131] | 2016 | MR | Heart |
| 126 | Prostate Diagnosis [132] | 2015 | MR | Prostate |
| 127 | Multi-Atlas Labeling Beyond the Cranial Vault | 2015 | CT | Liver, kidney, adrenal glands, aorta, esophagus, gall bladder, pancreas, splenic/portal veins, spleen, stomach, vena cava |
| 128 | Anatomy3 [133] | 2015 | MR, CT | Liver, lung, kidney, spleen, urinary bladder, rectus abdominis muscle, 1st lumbar vertebra, pancreas, psoas major muscle, gall bladder, sternum, aorta, trachea, adrenal gland |
| 129 | CT Lymph Nodes [134,135,136,137] | 2015 | CT | Lymph nodes |
| 130 | CETUS | 2014 | Ultrasound | Heart |
| 131 | VISCERAL Benchmark2 [138] | 2014 | MR, CT | Liver, lung, kidney, and the same additional organs as (128) |
| 132 | Left Atrium Segmentation Challenge [139] | 2013 | MR, CT | Heart |
| 133 | NCI-ISBI 2013 [115] | 2013 | MR | Prostate |
| 134 | Left Atrium Fibrosis and Scar Segmentation Challenge | 2012 | MR | Left atrium fibrosis and scar |
| 135 | VESSEL12 | 2012 | CT | Lung |
| 136 | CRASS12 | 2012 | CR | Chest structure |
| 137 | PROMISE12 [140] | 2012 | MR | Prostate |
| 138 | Prostate-3T [141] | 2012 | MR | Prostate |
| 139 | LOLA11 | 2011 | CT | Lung |
| 140 | Left Ventricular Segmentation Challenge | 2011 | MR | Heart |
| 141 | IVUS11 [142] | 2011 | Intravascular ultrasound | Vessel |
| 142 | Promise09 | 2009 | MR | Prostate |
| 143 | EXACT09 [143] | 2009 | CT | Lung |
| 144 | SLIVER07 [144] | 2007 | CT | Liver |
– Aorta (115, 127, 128)
– Esophagus (115, 120, 127)
– Spleen (114, 127, 128)
– Adrenal glands (127, 128)
– Bladder (113, 128)
– Gall bladder (127, 128)
– Trachea (115, 128)
– Colon (116)
– Breast (124)
– Lymph (129)
– Spinal cord (120)
– Stomach (127)
Generally, these datasets and challenges focus on the larger organs, such as the liver and the lungs, with the aim of diagnosing tumors and lesions; here, contour segmentation is a pre-processing step. However, it is challenging to segment smaller organs in low-resolution images, particularly for radiotherapy, because an incorrect contour segmentation of these small organs can lead to severe consequences, e.g., organ damage during radiotherapy.
Chest & abdomen datasets according to modality: The most commonly used image modalities for chest and abdomen organ segmentation are MR and CT. As Table 7 shows, many datasets and challenges use MR images. MR images have higher resolution under certain conditions and offer better contrast for soft tissues and organs, such as the heart and prostate. Meanwhile, according to our research, CT is the most widely used modality for organ segmentation and for other chest- and abdomen-related tasks and diagnoses, such as those of the lung and liver, because of its convenience, effectiveness, and low cost.
Chest & abdomen datasets according to focus: The purposes of these datasets and challenges fall into three groups: further analysis, benchmarking, and radiotherapy. Most datasets and challenges that provide annotated organ contours are intended to support further analysis and treatment. One challenge of segmentation is to robustly separate the whole organ from the background without omitting lesions and tumors; some benchmarks (116, 128) are provided for researchers to evaluate their algorithms on this problem. Another challenge, addressed by datasets and challenges (115, 120), is the imbalance between organs of different sizes and shapes, which makes it difficult to segment small organs and provide valuable information for analysis and treatment.
Single chest & abdomen organ contour segmentation: Single-organ contour segmentation tasks usually focus on segmenting a region for subsequent tasks (110, 122, 123, 124, 129, 138, 144) or for anatomical research (118, 126, 133, 137, 142, 143). The difficulty of the former is that lesions and tumors may interfere with separating the organ from the background, while the latter demands more precise segmentation.
Chest & abdomen multi-organ contour segmentation: Multi-organ contour segmentation in the chest and abdomen focuses on separating the organs from each other. Some of these datasets and challenges (113, 114, 116) cover multiple organs, including relatively large organs, which are easier to segment, and relatively small ones, which are more challenging, especially when a model must handle both at the same time. Similarly, some datasets and challenges (115, 120) focus on "organs at risk", i.e., organs that are healthy but might be endangered by radiation therapy. Dataset (127) focuses on multi-atlas-based methods, which are widely used in brain-related research, and dataset (128) aims to provide a benchmark for segmentation algorithms.
4.1.2 Chest & abdomen organ parts segmentation
Different from contour segmentation of chest and abdomen organs, organ-part segmentation aims to divide an organ into its constituent parts. Just as the hand has five fingers, organs are made up of multiple parts; a typical example is the Couinaud classification of liver segments. This subsection introduces the datasets and challenges for organ-part segmentation, which are listed in Table 7.
Heart-related datasets and challenges: Most of these datasets and challenges (112, 119, 125, 130, 132, 134, 140) are related to heart segmentation. The most frequently used modalities are MR and ultrasound, and the aim is to segment the heart into the left atrium, chambers, valves, and other parts. Though MR and ultrasound can effectively image the different tissues of the heart, the heartbeat blurs the images, which makes segmentation more difficult; for ultrasound, the dynamic nature of the images is an additional challenge for segmentation algorithms.
Other chest & abdomen body parts: Challenge (139) provides 55 CT scans and focuses on lung segmentation with labels for its different parts: outside the lungs, the left lung, the upper and lower lobes of the left lung, and the upper, middle, and lower lobes of the right lung. The biggest difficulty is the effect of lung lesions and diseases, such as tuberculosis and pulmonary emphysema, on segmentation performance. Moreover, challenges (135, 141) focus on the segmentation of the lung vessels.
4.2 Datasets for diagnosis of chest & abdomen diseases
Diseases of organs in the chest and abdomen have a significant impact on human health, and many researchers work on this problem by analyzing medical images. Several researchers have designed automatic or semi-automatic algorithms for classification, segmentation, detection, and characterization tasks to help clinicians diagnose these diseases. In this subsection, we describe the datasets and challenges related to the diagnosis of chest and abdomen diseases, which are reported in Tables 8, 9, and 10.
Chest & abdomen datasets according to modality: According to the datasets and challenges we collected, CT is the most commonly used imaging modality for the chest and abdomen because of its image quality and its ability to clearly display tissues and lesions. Some datasets and challenges also provide contrast-enhanced CT images for clearer imaging. Besides CT, other modalities include MR, X-ray digital radiography, PET, and endoscopy. MR images are used in breast-related diagnosis, cardiac tasks, soft-tissue sarcoma detection, and ventilation imaging; because CT resolution is limited by exposure time and radiation dose, MR is a more suitable modality for small or specific organs. PET is almost always used together with other modalities, such as CT and MR. Radiotracer uptake reflects metabolism, so the tracer concentrates in tumors, which is why PET is typically used for tumor-related tasks. Endoscopy images are used for inspection of the stomach, intestines, and other organs.
Chest & abdomen datasets according to classification of diseases: The classification of diseases intends to determine whether a subject is healthy. This task is sometimes called "detection" or "prediction", but it is distinct from the detection task presented below.
The main focus of these datasets is to determine whether a cancer, lesion, or tumor is present, such as soft tissue sarcoma (192), prostate lesions (177, 184), lung cancer (161), and breast cancer (160). Classification is an effective task for diagnosis, particularly computer-aided diagnosis: a quick and early diagnosis allows effective interventions that increase the probability of the patient's recovery before the condition worsens.
Another focus is the classification of specific diseases, mainly pneumothorax (164), cardiac diseases (175), tuberculosis (178), pneumonia (179), and COVID-19, which is discussed at the end of this subsection. The endoscopy-related challenges provide RGB images and videos with the aim of classifying patients as "normal" vs. "abnormal". Dataset (169) focuses on classification based on diagnostic records. These datasets and challenges provide data for researchers to design AI-based algorithms for diagnosing common diseases.
Chest & abdomen datasets for attribute classification: The characterization of tumors and lesions is also called attribute classification: after detection and segmentation, automatic algorithms analyze the attributes of the finding. A typical example is attribute classification for pulmonary nodules and lung cancer (159, 162, 168, 186, 189, 193). These datasets and challenges usually provide CT scans annotated with attributes such as lesion type, spiculation, localization, margin, lobulation, calcification, and cavity, each with two or more categories. Another focus is the characterization of breast-related lesions and tumors (187, 191).
Chest & abdomen datasets for detection: In most research and clinical situations, classification alone is not enough. Medics and researchers usually want the cause of the disease and the localization of the lesion or tumor, and further treatment evaluation, planning, and interpretability are specific concerns for both medics and DL researchers. Thus, detection and segmentation are the tasks receiving the most attention at present. The detection task aims to find a region of interest and localize its position. The regions of interest usually include:
– Lung cancer and tumor (173, 180, 189, 195, 197)
– Pulmonary nodule (162, 174, 193, 197, 200)
– Celiac-related damage (202, 203, 204, 206)
– Other lung lesions (172, 183)
– Polyp (198, 204)
– Cervical cancer (182)
– Liver cancer (188)
Table 8: Summary of datasets and challenges for chest and abdomen organs-related tasks I.

Ref. | Dataset/Challenge | Year | Modalities | Focus | Tasks
145 | CT Diagnosis of COVID-19 [145] | 2020 | CT | COVID-19 | Classification
146 | Covid19Challenge.eu | 2020 | CT | COVID-19 | Classification
147 | ObjectCXR | 2020 | CR | COVID-19 | Classification, Detection
148 | CORD-19 [146] | 2020 | Other | COVID-19 | Classification
149 | Covid Chest X-Ray Dataset [147,148] | 2020 | CR | COVID-19 | Classification
150 | Detection of COVID-19 from Ultrasound [149,150] | 2020 | Ultrasound | COVID-19 | Classification
151 | COVID-Net [151] | 2020 | CR | COVID-19 | Classification
152 | COVID-19 | 2020 | Other | COVID-19 | Classification
153 | COVID-19 Radiography Database [152] | 2020 | CR | COVID-19 | Classification
154 | COVID-19 Chest X-Ray | 2020 | CR | COVID-19 | Classification
155 | COVID-19 CT Segmentation Dataset | 2020 | CT | COVID-19 | Segmentation
156 | COVID-19 Lung CT Lesion Segmentation Challenge 2020 | 2020 | CT | COVID-19 | Segmentation
157 | CT Images in COVID-19 [153,154] | 2020 | CT | COVID-19 | Classification
158 | COVID-19 AR [155,156] | 2020 | CT, CR | COVID-19 | Classification
159 | BIMCV-COVID19 | 2020 | CT, CR | COVID-19 | Classification
160 | BCS-DBT [157,158] | 2020 | Digital breast tomosynthesis | Breast cancer | Classification, Detection
161 | Lung-PET-CT-Dx [159] | 2020 | CT, PT | Lung cancer | Classification
162 | LNDb Challenge | 2020 | CT | Pulmonary nodule | Classification, Detection
163 | A-AFMA-Detection | 2020 | Ultrasound | Amniotic fluid detection | Detection
110 | C4KC-KiTS [109,110] | 2019 | CT | Kidney tumor | Segmentation
164 | SIIM-ACR Pneumothorax Segmentation | 2019 | CR | Pneumothorax | Classification, Segmentation
165 | CT ventilation imaging evaluation 2019 | 2019 | 4D-CT | Ventilation imaging | –
166 | CheXpert [160] | 2019 | CR | Chest X-ray | Classification
167 | NSCLC-Radiomics-Interobserver1 [84,161] | 2019 | CT, Radiotherapy Structure Set | Non-small cell lung cancer | Segmentation
168 | StructSeg2019 | 2019 | CT | Lung cancer & organs-at-risk | Segmentation
169 | MIMIC-CXR [162,163] | 2019 | CR, electronic health records and reports | Chest image analysis | Classification
170 | MIMIC-CXR-JPG | 2019 | CR | Chest X-ray | Classification
Table 9: Summary of datasets and challenges for chest and abdomen organs-related tasks II.

Ref. | Dataset/Challenge | Year | Modalities | Focus | Tasks
171 | PadChest | 2019 | CR | Chest X-ray | Classification
172 | RSNA Pneumonia Detection Challenge | 2018 | CR | Pneumonia | Detection
173 | ImageCLEF2018-Tuberculosis | 2018 | CT | Tuberculosis | Classification
174 | Lung Fused-CT-Pathology [164,165] | 2018 | CT, pathology images | Pulmonary nodule | Detection
175 | ACAD [166] | 2017 | MR | Cardiac diseases | Classification
176 | NSCLC Radiogenomics [160,167] | 2017 | CT, PT | Non-small cell lung cancer | Segmentation
177 | ProstateX | 2018 | MR (T2-weighted, proton-density-weighted, dynamic contrast-enhanced) | Prostate lesion | Classification
178 | Pulmonary Chest X-Ray Abnormalities | 2018 | CR | Tuberculosis | Classification
179 | Chest X-Ray Images (Pneumonia) [75,76] | 2018 | CR | Pneumonia | Classification
122 | LiTS [116] | 2017 | CT | Liver tumor | Segmentation
180 | Data Science Bowl 2017 | 2017 | CT | Lung cancer | Detection
181 | ACRIN-FLT-Breast (ACRIN 6688) [168,169] | 2017 | CT, PT | Breast cancer | –
182 | Cervical Cancer Screening | 2017 | Endoscopy | Cervical cancer | Detection
183 | NIH Chest X-Rays [170] | 2017 | CR | Lung disease | Classification, Detection
184 | SPIE-AAPM-NCI PROSTATEx Challenges [171,172] | 2016 | MR | Prostate lesion | Classification
185 | ImageCLEFmed: The Medical Task 2016 | 2016 | – | Report-combined medical image classification | Classification
186 | LUNA16 [173] | 2016 | CT | Pulmonary nodule | Classification, Detection
187 | The Digital Mammography DREAM Challenge | 2016 | Mammography | Breast cancer | Classification, Detection
188 | Low Dose CT Challenge [174,175] | 2016 | CT | Low-dose CT & liver lesion | Detection
189 | RIDER Lung CT [176,177] | 2015 | CT | Non-small cell lung cancer | Detection
190 | PhantomFDA [178,179] | 2015 | CT | Phantom & pulmonary nodule | –
191 | BREAST-DIAGNOSIS [180] | 2015 | MR | Breast | –
192 | Soft-tissue-Sarcoma [181] | 2015 | CT, MR, PT | Soft tissue sarcoma | Classification
193 | SPIE-AAPM Lung CT Challenge [182,183,184] | 2014 | CT | Pulmonary nodule | Classification
194 | NSCLC-Radiomics [84] | 2014 | CT | Non-small cell lung cancer | Segmentation
Table 10: Summary of datasets and challenges for chest and abdomen organs-related tasks III.

Ref. | Dataset/Challenge | Year | Modalities | Focus | Tasks
133 | Automated Segmentation of Prostate Structures [115] | 2013 | MR | Prostate | Segmentation
195 | NSCLC Radiogenomics: Initial Stanford Study of 26 Cases [167,185] | 2013 | CT, PT | Non-small cell lung cancer | Detection
196 | Ventricular Infarct Segmentation | 2012 | MR | Left ventricular myocardial infarction segmentation | Segmentation
197 | LIDC-IDRI [186,187] | 2011 | CT, CR | Lung cancer & pulmonary nodule | Detection
198 | CT COLONOGRAPHY [188,189] | 2011 | CT | Polyp & colonography | Detection
199 | RIDER Collections | 2011 | CT, MR, PT | Cancer | –
200 | Automatic Nodule Detection 2009 [190] | 2009 | CT | Pulmonary nodule | Detection
201 | VOLCANO09 | 2009 | CT | Pulmonary nodule | Segmentation
202 | EndoCV2021 [191,192] | 2020 | Endoscopy | Colon (polyp, cancer), oesophagus (Barrett's, dysplasia, cancer), and stomach | Segmentation, Detection
203 | EAD2020 | 2020 | Endoscopy | Artefact region | Segmentation, Detection
204 | EDD2020 | 2020 | Endoscopy | Colon (polyp, cancer), oesophagus (Barrett's, dysplasia, cancer), and stomach | Segmentation, Detection
205 | SARAS-ESAD [193] | 2020 | Endoscopy | Surgeon action detection | Classification, Detection
206 | EAD2019 [194] | 2019 | Endoscopy | Artefact region | Segmentation, Detection
207 | AIDA-E Subchallenge 1 | 2016 | Endoscopy | Mucosa damage in celiac disease | Classification
208 | AIDA-E Subchallenge 2 | 2016 | Endoscopy | Mucosa in Barrett's esophagus | Classification
209 | AIDA-E Subchallenge 3 | 2016 | Endoscopy | Mucosa in gastric chromoendoscopy | Classification
– Breast cancer (187)
– Actions and artefacts of surgeons (205, 206)
Chest & abdomen datasets for segmentation: Segmentation is a refinement of the detection task because it provides pixel-level labels in addition to location. Pixel-level annotations help researchers design algorithms for accurate and effective quantification, volume calculation, and other pixel-level analysis and diagnosis of tumors and lesions (e.g., monitoring tumor size). Among the datasets and challenges we collected, most aim at segmenting tumors and lesions from CT of:
– Lung cancer (167, 176)
– Kidney tumor (110)
– Pulmonary nodule (201)
– Pneumothorax (164)
– Liver tumor (122)
– Polyp (204)
Furthermore, challenges (203, 206) focus on the segmentation of artefacts in endoscopic images.
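As a concrete illustration of the pixel-level quantification mentioned above, the sketch below computes a lesion's volume from a binary segmentation mask and the scan's voxel spacing. The mask, spacing values, and function name are hypothetical, chosen only to show the arithmetic.

```python
import numpy as np

def lesion_volume_ml(mask: np.ndarray, spacing_mm: tuple) -> float:
    """Volume of the foreground voxels in millilitres."""
    voxel_mm3 = float(np.prod(spacing_mm))   # volume of one voxel in mm^3
    return mask.sum() * voxel_mm3 / 1000.0   # 1 ml = 1000 mm^3

# A toy 3-D mask with a 10x10x10-voxel "tumor" (invented data).
mask = np.zeros((64, 64, 64), dtype=np.uint8)
mask[20:30, 20:30, 20:30] = 1

# Assumed CT spacing: 1 mm x 1 mm in-plane, 2 mm slice thickness.
print(lesion_volume_ml(mask, (1.0, 1.0, 2.0)))  # 1000 voxels * 2 mm^3 = 2.0 ml
```

Tracking the same quantity across follow-up scans is how tumor growth is monitored in practice.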
COVID-19: In 2020, COVID-19 became a research focus, having caused more than 100 million infections and two million deaths. Many datasets and challenges target this devastating disease and provide data to help researchers develop deep learning models that detect COVID-19 in various medical image modalities.
In terms of modalities, most of these datasets and challenges use either CT or CR images, and some provide both. One exception is dataset (150), which uses ultrasound images. These datasets provide image annotations labeled by radiologists.
Most of these datasets and challenges are related to classification tasks. Datasets (145, 146, 147, 157, 159) directly focus on distinguishing COVID-19 from normal subjects. In contrast, datasets and challenges (149, 150, 151, 153, 154) focus on distinguishing COVID-19 from a few similar diseases that can also cause lung opacity or related symptoms, such as Middle East Respiratory Syndrome (MERS), Severe Acute Respiratory Syndrome (SARS), and Acute Respiratory Distress Syndrome. Other datasets and challenges (148, 152) approach the diagnosis task with natural language processing, genomics, or clinical methods.
Similarly, some other datasets (147, 155, 156) focuson the segmentation or detection of COVID-19 relatedlesions, such as ground-glass opacity, air-containingspace, and pleural effusion.
4.3 Datasets for other chest and abdomen-relatedtasks
Besides the classification, detection, and segmentationtasks, there are also several other tasks which are thecurrent focus of research. In the following, we presentthe datasets and challenges related to these tasks, andreport them in Table 11.
Chest & abdomen datasets for regression: Similar to attribute classification, regression aims to compute or measure target attributes from given images, but its outputs are continuous. A typical example is fetal biometric measurement (217, 227): these challenges provide ultrasound images so that researchers can design algorithms that measure such attributes to estimate gestational age and monitor the fetus's growth. Another example is cardiac measurement (212, 213, 214, 218, 221, 226, 231): these datasets and challenges provide MR or ultrasound images for analyzing the heart's attributes to detect heart disease.
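For the cardiac-measurement challenges, a typical regression target can be derived directly from segmented volumes. The sketch below computes left-ventricular ejection fraction from per-phase volumes; the numbers and function name are invented for illustration.

```python
def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """EF (%) = (end-diastolic volume - end-systolic volume) / EDV * 100."""
    return (edv_ml - esv_ml) / edv_ml * 100.0

# Hypothetical LV volumes (ml) across the cardiac cycle, e.g. obtained by
# segmenting each phase of a cine MR sequence.
phase_volumes = [118.0, 130.0, 95.0, 62.0, 55.0, 70.0, 104.0]
edv, esv = max(phase_volumes), min(phase_volumes)
print(round(ejection_fraction(edv, esv), 1))  # (130 - 55) / 130 * 100 = 57.7
```

A regression model trained on such data would predict EF (or the per-phase volumes) directly from the images.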
Chest & abdomen datasets for tracking: Tracking is a critical task because the body and its organs move during imaging, and for organs such as the heart the characteristics of this motion are informative. Challenges (222, 224) provide ultrasound data for tracking the liver, for example to follow up on surgery and treatment. Datasets and challenges (215, 228, 229) focus on tracking the heart and provide ultrasound images for tracking and analysis.
Chest & abdomen datasets for registration: Challenge (216) focuses on CT registration of the lungs and provides CT scans acquired with and without contrast agents. Meanwhile, challenges (220, 225) focus on registration between different modalities of the heart and provide MR, CT, and other images for registering scans of the beating heart.
Datasets for other chest & abdomen related tasks: Challenges (210, 223) focus on localizing specific landmarks, including the amniotic fluid and the heart, using ultrasound and MR images. Challenge (211) focuses on the classification of surgery videos, and dataset (232) on the reconstruction of the coronary artery.
5 Datasets and challenges for pathology andblood
Though radiography, MR imaging, and other imagingmodalities have been used as the basis for diagnosis,
Table 11: Summary of datasets and challenges of other medical applications in chest and abdomen.

Ref. | Dataset/Challenge | Year | Modalities | Focus | Tasks
210 | A-AFMA-Localization | 2020 | US | Amniotic fluid localization | Detection
211 | SurgVisDom 2020 | 2020 | Endoscopy | Surgical task classification | Classification
212 | CRT-EPIGGY19 | 2019 | Other (see the official description: http://crt-epiggy19.surge.sh/datasets.html) | Heart modeling | Regression
213 | LVQUAN19 | 2019 | MR | Full quantification of cardiac LV | Regression
214 | LVQUAN18 | 2018 | MR | Full quantification of cardiac LV | Regression
215 | EchoNet-Dynamic | 2017 | US | Heart tracking | Tracking
216 | LUMIC | 2018 | CT pulmonary angiography | CT registration with phantom images | Registration
217 | HC18 [?,195] | 2018 | US | Fetal head circumference | Regression
218 | SLAWT [196] | 2016 | CT, MR | Left atrial wall thickness | Regression
219 | Statistical Atlases and Computational Modelling of the Heart | 2016 | CT, MR | Left atrium wall thickness | Regression
220 | MMWHS17 | 2017 | CT, MR | Multi-modality heart registration | Registration
221 | 2nd Annual Data Science Bowl | 2016 | MR | Cardiac ejection fraction | Regression
222 | CLUST 2015 | 2015 | US | Liver tracking | Tracking
223 | Landmark Detection Challenge | 2015 | MR | Landmark location | Detection
224 | CLUST 2014 | 2014 | US | Liver tracking | Tracking
225 | Motion Correction Challenge 2014 | 2014 | MR | Heart motion correction | Registration
226 | Coronary Artery Stenoses Detection and Quantification Evaluation [197] | 2012 | CT angiography | Coronary artery stenoses detection and quantification | Regression
227 | Challenge US: ISBI 2012 [198] | 2012 | US | Fetal biometric measurements | Regression
228 | Motion Tracking Challenge | 2011 | US | Heart motion tracking | Tracking
229 | Cardiac Motion Analysis Challenge 2011 | 2011 | US | Heart motion tracking | Tracking
230 | EMPIRE10 | 2010 | CT | Lung registration | Registration
231 | LV Mechanics Challenge | 2009 | MR | Modeling | Regression
232 | Rotterdam Coronary Artery Algorithm Evaluation Framework | 2009 | CT | Reconstruction | Generation
pathology images are also used as a gold standard for diagnosis, particularly of tumors and lesions. Digital pathology images are generally obtained by collecting tissue samples, slicing, staining, and imaging. Pathology imaging is therefore one of the mainstream image modalities used for diagnosis.
The focus of these datasets and challenges includes 1) the identification and segmentation of basic elements (e.g., cells and nuclei) in pathology images, and 2) blood-based diagnosis from images. In this section, we present datasets and challenges for pathology images (Subsection 5.1) and for blood images (Subsection 5.2).
5.1 Datasets & challenges for pathology
Pathology images are used as the basis of cancer diagnosis. Pathologists and automatic algorithms analyze them for specific features, such as cancer cells and cells undergoing mitosis. Many organizations and researchers provide datasets and challenges that address both microcosmic pathology and the whole-slide-image (WSI) level. The relevant datasets and challenges are listed in Table 12.
Imaging datasets & challenges: In most situations, WSI is used in pathology diagnosis. Unlike CT or MR images, a pathology image is an optical image, similar to a photograph taken by a camera; one major difference is that a pathology image is formed by transillumination, whereas an ordinary photograph is formed by reflection. Another difficulty is image size. A WSI is stored in a multi-resolution pyramid structure: a single multi-resolution WSI is generally assembled from many small high-resolution image patches and may contain billions of pixels. WSI thus serves as a virtual microscope for diagnosis and clinical research, and many challenges use WSIs, such as (237, 245, 246, 254, 255, 256). In some situations, however, WSI is not suitable for the analysis task, for example cell segmentation. Therefore, pathology image patches are used in several other challenges, such as (233) for visual question answering, (259) for mitosis classification, and (238, 248) for multi-organ nucleus detection and segmentation.
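A minimal sketch of the multi-resolution pyramid idea, with a toy NumPy array standing in for a slide (real WSIs are read with libraries such as OpenSlide): each level is a 2x mean-pooled copy of the previous one, so low-magnification overviews and full-detail patches come from the same structure.

```python
import numpy as np

def build_pyramid(image: np.ndarray, levels: int):
    """Return a list of progressively half-resolution versions of `image`."""
    pyramid = [image]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        h, w = prev.shape[0] // 2 * 2, prev.shape[1] // 2 * 2  # even crop
        # 2x2 mean pooling: group pixels into 2x2 blocks and average them.
        down = prev[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        pyramid.append(down)
    return pyramid

# A toy 1024x1024 "slide"; real WSIs reach billions of pixels.
slide = np.random.rand(1024, 1024)
pyramid = build_pyramid(slide, levels=4)
print([lvl.shape for lvl in pyramid])
# [(1024, 1024), (512, 512), (256, 256), (128, 128)]
```

A viewer (or an algorithm) picks the coarsest level that covers its field of view and only touches the full-resolution level for small regions, which is what makes billion-pixel slides tractable.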
Datasets for stain: Slides made from human tissue are colorless and must be stained. Commonly used stains include hematoxylin, eosin, and diaminobenzidine. Usually, two or more stains are combined on a slide, most commonly hematoxylin & eosin (H&E) and hematoxylin & diaminobenzidine (H-DAB).
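Stains can also be computationally unmixed. The sketch below applies classical color deconvolution using the published Ruifrok-Johnston stain vectors for hematoxylin, eosin, and DAB; the toy patch is invented, and production code would more likely call a library routine such as scikit-image's `rgb2hed`.

```python
import numpy as np

# Rows: optical-density vectors for hematoxylin, eosin, DAB
# (Ruifrok & Johnston values, as also used by scikit-image).
STAIN_MATRIX = np.array([
    [0.650, 0.704, 0.286],
    [0.072, 0.990, 0.105],
    [0.268, 0.570, 0.776],
])

def separate_stains(rgb: np.ndarray) -> np.ndarray:
    """uint8 HxWx3 image -> HxWx3 per-stain concentration map (H, E, DAB)."""
    od = -np.log10((rgb.astype(float) + 1.0) / 256.0)  # optical density
    flat = od.reshape(-1, 3) @ np.linalg.inv(STAIN_MATRIX)
    return flat.reshape(rgb.shape)

# Invented purple-ish patch (hematoxylin-stained nuclei look blue-purple).
patch = np.full((8, 8, 3), (120, 80, 160), dtype=np.uint8)
stains = separate_stains(patch)
print(stains.shape)  # one channel per stain
```

Separating the stain channels is a common pre-processing step before nucleus detection, since hematoxylin marks nuclei while eosin marks cytoplasm and stroma.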
Pathology datasets according to disease: Pathology slides are widely used in the diagnosis of many diseases, especially cancer. Cancer cells and tissues have different shapes from their normal counterparts, so diagnosis via pathology is the gold standard. Many datasets and challenges, such as (238, 248, 260), do not address a specific disease, while many others target specific diseases, such as breast cancer (239, 244), myeloma (262, 268), cancers of the digestive system (241), cervical cancer (257, 258), lung cancer (247), thyroid cancer (253), and osteosarcoma (240).
Pathology datasets according to task: Generally speaking, the tasks in these datasets and challenges fall into two categories: microcosmic tasks and WSI-level tasks. The latter target disease diagnosis, typically as a classification task; beyond simple classification, many datasets and research methodologies address more complex tasks, such as segmenting tumor-cell areas (238, 248, 249) and detecting pathological features (241, 255). The microcosmic tasks derive from clinical analysis: identifying cells and detecting mitoses to extract key features from pathology images that support further diagnosis. The following subsections expand on the microcosmic and WSI-level tasks, respectively.
5.1.1 Microcosmic related datasets
Microcosmic tasks focus on extracting microcosmic features (e.g., nucleus features) for further diagnosis and for WSI-level tasks. In this subsection, we introduce the related datasets and challenges.
Data: Unlike WSI-level datasets, those focusing on microcosmic tasks usually provide small, high-resolution patch-level images. These patches are suitable for annotating microcosmic-level objects and for resource-limited algorithms. The image size varies with the task: for segmentation and detection of cells and nuclei, images are usually on the order of a thousand pixels square, so that they contain a suitable number of cells or nuclei; for individual-cell analysis (e.g., mitosis determination), the image usually covers a single cell; for other tasks (e.g., patch-level classification), the size varies from dataset to dataset.
Table 12: Summary of datasets and challenges for pathology-related image analysis.

Ref. | Dataset/Challenge | Year | Focus | Tasks | Stain
233 | PathologyVQA [199] | 2020 | Visual question answering | Visual question answering | H&E
234 | PANDA Challenge | 2020 | Prostate cancer grade assessment | Classification | H&E
235 | PAIP 2020 | 2020 | Colorectal cancer | Classification, Segmentation | H&E
236 | MICCAI 2020 CRPCC | 2020 | Combined radiology and pathology classification | Classification | H&E
237 | HEROHE | 2020 | HER2 | Classification | H&E
238 | MoNuSAC [200] | 2020 | Multi-organ nucleus | Segmentation, Detection | H&E
239 | Post-NAT-BRCA [201,202] | 2019 | Cell | Classification, Segmentation, Cell counting | H&E
240 | Osteosarcoma data for Viable and Necrotic Tumor Assessment [203,204,205,206,207] | 2019 | Osteosarcoma pathology image | Classification | H&E
241 | DigestPath2019 [208] | 2019 | Signet ring cell | Classification, Segmentation, Detection | H&E
242 | LYON19 [209] | 2019 | Lymphocyte | Detection | IHC (immunohistochemistry)
243 | ANHIR [210,211] | 2019 | Pathology image registration | Non-linear image registration | –
244 | Breast Metastases to Axillary Lymph Nodes [212,213] | 2019 | Breast cancer metastases to lymph nodes | Classification | H&E
245 | Gleason 2019 | 2019 | Gleason grade/score prediction | Classification | H&E
246 | PAIP 2019 | 2019 | Segmentation of liver cancer & tumor burden | Segmentation, Tumor burden prediction | H&E
247 | ACDC@LUNGHP [214] | 2019 | Lung cancer | Classification, Segmentation | H&E
248 | MoNuSeg [215] | 2018 | Multi-organ nucleus | Segmentation | H&E
249 | Data Science Bowl 2018 | 2018 | Cell | Segmentation | H&E
250 | PatchCamelyon [216,217] | 2019 | Pathology image patch | Classification | H&E
251 | ICIAR 2018 BACH [218] | 2018 | Patch of breast cancer | Classification | H&E
252 | Colorectal Histology MNIST | 2018 | Colorectal pathology patch | Classification | H&E
253 | Challenge for TMA in Thyroid Cancer Diagnosis | 2017 | Thyroid cancer | Classification | H&E
254 | CAMELYON17 [217,219,220] | 2017 | Breast cancer metastases | Classification, Segmentation | H&E
255 | CAMELYON16 [217] | 2016 | Breast cancer metastases | Classification, Detection | H&E
256 | GlaS | 2015 | Colorectal cancer pathology patch | Segmentation | H&E
257 | 2nd OCCIS Challenge | 2015 | Overlapping cervical cell | Segmentation | Papanicolaou
258 | OCCIS | 2014 | Overlapping cervical cell | Segmentation | Papanicolaou
259 | MITOS-ATYPIA-14 | 2014 | Mitosis | Classification | H&E
260 | Cell Tracking Challenge [221] | 2020 | Cell | Tracking | Electron microscope
261 | Particle Tracking Challenge | 2012 | Particle | Tracking | Electron microscope
Pathology datasets for cell detection & segmentation: Cells are essential elements of a pathology image, and analyzing them is one of the most effective ways to extract features for diagnosis. Pathologists analyze the size, shape, pattern, and stain color of cells, using their knowledge and expertise to classify them as normal or abnormal. Thus, many datasets and challenges focus on the segmentation and detection of cells. Cells and nuclei may be placed neatly on the slide, but during slide preparation they can overlap or land in random positions; challenges (257, 258) therefore focus on segmenting and detecting overlapping cells and nuclei. The shape and size of cells differ between organs and pose different recognition and analysis challenges, so challenges (238, 248) focus on multi-organ cell and nucleus segmentation.
Pathology datasets for patch-level classification: Generally, a WSI is too large for every cell and every inter-cell relationship to be analyzed. DL-based methods can instead extract the essential information from patch-level images through feature learning, and many datasets and challenges address this problem. Those that provide patch-level images mainly focus on classification, segmentation, or detection tasks. Thanks to the quality of learned features, DL has reached state-of-the-art performance in many areas of computer vision, so some datasets and challenges focus on the patch itself rather than the cell: the tasks range from segmentation, detection, and classification of cells to direct classification of the patch. Challenges (242, 250, 252, 253) focus on patch-level image classification, for example determining whether metastatic or other tissue is present.
Datasets for other pathology tasks: Besides the detection and segmentation of cells and patch-level classification, there are other microcosmic tasks. Challenge (259) focuses on mitosis detection for nuclear atypia scoring: the atypical shape, size, and internal organization of cells reflect the progress of cancer, and the more advanced the cancer, the more atypical the cells look. Challenge (260) focuses on cell tracking, i.e., how cells change shape and move as they interact with their surrounding environment, which is key to understanding the mechanobiology of cell migration and its many implications for normal tissue development and disease. Challenge (233) focuses on visual question answering over pathology images, where the model is trained to pass the pathologist's examination.
5.1.2 Datasets for WSI-level tasks
WSI-level pathology tasks focus on the diagnosis of cancer and on pathology image processing. A WSI contains all the information needed to establish an accurate diagnosis, and automatic algorithms can analyze a slide quickly, which is useful especially in developing countries that lack experienced pathologists. However, directly analyzing a WSI is challenging for both pathologists and algorithms because its size can reach 100,000 x 100,000 pixels. To address this, most current datasets and challenges focus on the classification and segmentation of biomarkers, cells, and other regions of interest. At the end of this subsection, we introduce other datasets and challenges related to regression and to the localization of tumors and biomarkers.
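The standard workaround for the 100,000 x 100,000-pixel problem is patch-based processing: score fixed-size tiles independently, then aggregate the tile scores into a slide-level prediction. A hedged sketch of that recipe, with a stand-in scoring function in place of a trained classifier and an invented toy slide:

```python
import numpy as np

def iter_patches(slide: np.ndarray, size: int):
    """Yield non-overlapping size x size tiles of a 2-D slide array."""
    h, w = slide.shape[:2]
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield slide[y:y + size, x:x + size]

def slide_probability(slide: np.ndarray, size: int, score_patch) -> float:
    # Max-pooling aggregation: any positive patch makes the slide positive.
    return max(score_patch(p) for p in iter_patches(slide, size))

toy_slide = np.zeros((512, 512))
toy_slide[300:340, 300:340] = 1.0   # a small "tumor" region

# Stand-in classifier: fraction of bright pixels in the patch.
prob = slide_probability(toy_slide, 128, lambda p: float(p.mean()))
print(prob > 0)  # the tumor-bearing patch dominates the slide score
```

Max-pooling is only one aggregation choice; averaging, top-k pooling, or learned attention over patches are common alternatives in the literature.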
Datasets for classification of WSI: The prime goal of examining pathological images, especially WSIs, is to diagnose cancer, so classifying WSIs of such size with limited computing resources is itself a research challenge. Datasets and challenges (234, 236, 237, 245) focus on predicting cancer or scoring WSIs, such as Gleason grading or HER2 evaluation, while others (244, 254, 255) focus on the classification of metastasized cancer.
Datasets for segmentation and detection of WSI: DL-based methods are often seen as black boxes that process pathology images: their performance is state-of-the-art, but their interpretability remains limited. From the pathologists' point of view, datasets and challenges (235, 241, 246, 254, 255) therefore pose segmentation and detection tasks to identify the critical elements behind a particular diagnosis, such as cancer-cell areas and signet ring cells.
Datasets for other WSI tasks: Besides classificationand detection, there are a few other tasks based onWSI. This includes the registration of pathology im-ages (243) for data pre-processing and the localizationof lymphocytes (242).
5.2 Blood-related datasets
Blood image analysis is the basis for diagnosing many diseases. In contrast to pathology images, images of blood samples mainly contain blood cells,
28 Johann Li 1 et al.Ta
ble13:S
ummaryof
datasets
andchalleng
esof
bloo
d-relatedim
agean
alysis
tasks.
Ref
eren
ceD
atas
et/C
halle
nge
Yea
rFo
cus
Tas
ksSt
ain
Inde
xClassificatio
nSe
gmentatio
nDetectio
nOther
262
SegP
C2021
2020
Myelomaplasmacell
XJenn
er-G
iemsa
263
Mito
EM
Cha
lleng
e[222]
2020
Mito
chon
dria
X(electronmicroscop
e)
264
Sing
le-cellM
orph
ological
Dataset
ofLe
uko-
cytes[223,224]
2019
Blast
cells
inacute
myeloid
leuk
aemia
XX
-
265
B-A
LLClassificatio
n[225,226,227]
2019
Immatureleuk
emic
blasts
XX
-266
Malaria
Bou
ndingBoxes
2019
Cells
inbloo
dX
XGiemsa
267
LYST
O2019
Lymph
ocytes
X-
268
MiM
MSB
ILab
Dataset
[228,229,230]
2019
Cell
XJenn
er-G
iemsa
269
SN-A
MDataset
[229,231,232,233]
2019
Stainno
rmalization
X1
Jenn
er-G
iemsa
270
CNMC
2019
Dataset
[226]
2019
Immatureleuk
emic
cellclassi-
fication
X-
271
Blood
CellImages
2018
Cells
inbloo
dX
Jenn
er-G
iemsa
1Stainno
rmalization.
and these datasets and challenges are aimed at blood-related cancer and cell counting. Similar to pathologyimages, these datasets and challenges also focus on thesegmentation, detection, and classification of cells. Therelevant datasets and challenges are listed in Table 13.
One of the primary tasks of these datasets is the classification of cells, i.e., identifying the different types of cells. Dataset (271) focuses on classifying red blood cells, white blood cells, platelets, and other cells, while dataset (264) focuses on the classification of malignant and non-malignant cells. Other datasets and challenges, namely (268) (multiple myeloma segmentation), (263) (mitochondria segmentation), and (266) (malaria detection), focus on the segmentation and detection of blood cells and biomarkers.
6 Other datasets
Although we have categorized the datasets and challenges into three parts, "head and neck", "chest and abdomen", and "pathology and blood", several other datasets cannot be categorized under these three areas. In this section, we introduce the datasets and challenges categorized under "other", which means that they do not fit under the above categories but are still relevant to DL methods. The topics of this section include bone (Subsection 6.1), skin (Subsection 6.2), phantom (Subsection 6.3), and animal (Subsection 6.4).
6.1 Bone-related datasets
Medical image analysis of bone is currently a major research focus. Radiography is the most effective way to image bones, because X-rays are sensitive to the calcium that makes up human bones. The segmentation of bone, the detection of abnormalities, and their characterization are meaningful clinical and research tasks. Therefore, the following subsections discuss the datasets and challenges for classification, segmentation, and other tasks, and Table 14 reports these datasets and challenges.
Bone datasets for classification: Classification tasks for bone-related computer-aided diagnosis are the focus of many researchers. Though classification cannot locate the regions of interest, it can still help orthopedists judge whether a patient is healthy, as in dataset (283). The diagnosis of tears and abnormalities is also a research focus, such as meniscal tears (279), vertebral fractures (282), and knee abnormalities (279).
A Systematic Collection of Medical Image Datasets for Deep Learning 29
Table 14: Summary of datasets and challenges of bone-related image analysis tasks.

Index Dataset/Challenge Year Modalities (MR CT CR) Focus Tasks
272 KNOAP2020 2020 X X Knee osteoarthritis X
273 MICCAI 2020 RibFrac Challenge [234] 2020 X Rib fracture detection and classification X X1
274 Spinal Cord MRI Public Database 2020 X Bone imaging X2
275 VerSe20 2020 X Vertebra X
276 VerSe19 [235,236] 2019 X Vertebra X
277 Pelvic Reference Data [237] 2019 X Pelvic images X3
278 AASCE19 2019 X Spinal curvature X4
279 MRNet Dataset [238] 2018 X Knee X
280 MURA [239] 2018 X Abnormality in musculoskeletal X
281 xVertSeg Challenge 2016 X Fractured vertebrae X X
282 CSI 2016 2016 X Intervertebral disc and vertebral fracture X X X1
283 Bone Texture Characterization [240] 2014 X Bone crisps X
284 Spine and Vertebra Segmentation Challenge [235,236,241] 2014 X Spine X
285 SKI10 2010 X Cartilage and bone in knee X
1 Detection 2 Generation 3 Registration 4 Regression
Bone datasets for segmentation: The segmentation of bone images plays a vital role in clinical diagnosis and treatment. Computer-aided segmentation algorithms and orthopedists need to segment the different parts of the bone from a given image and make a sound judgment to provide a more adequate treatment. The difficulty with such tasks is the low resolution of the images compared with other image modalities. The focus of these datasets and challenges includes the spine (282, 284), vertebrae (275, 276, 281), and knee cartilage (285).
Other bone-related tasks: Besides the classification and segmentation tasks, the bone datasets and challenges also include imaging (274), registration (277), spinal curvature estimation (278), labeling (276), and abnormality detection (280).
6.2 Skin-related datasets
Skin cancer is one of the most common types of cancer, and melanoma is one of its most lethal forms. To diagnose skin cancer, dermoscopy is used to image the skin, and classification, segmentation, and detection tasks are employed. The most relevant datasets and challenges are reported in Table 15.
Aiming at the computer-aided diagnosis of melanoma, ISIC released datasets and a series of challenges for clinical training and for the development of automatic algorithms. The ISIC challenges include the 2017 (290), 2018 (289), and 2019 (288) editions. Challenges (289, 290) include three sub-challenges, lesion segmentation, lesion attribute detection, and lesion classification, with thousands of dermoscopic images. Challenge (288) focuses on the classification of melanoma, melanocytic nevus, basal cell carcinoma, actinic keratosis, benign keratosis, dermatofibroma, vascular lesion, squamous cell carcinoma, and others. Challenge (287), i.e., ISIC 2020, focuses on the classification of melanoma to better support dermatological clinical work, with 33,126 scans of more than 2,000 patients.
Moreover, challenge (286) focuses on diabetic foot ulcers (DFU). It provides more than 2,000 images of feet photographed with regular cameras under a consistent light source and annotated by experts, for training and testing automatic detection and classification algorithms.
6.3 Phantom-related datasets
A phantom is an object made of specific materials that is mainly used to evaluate medical imaging equipment. Phantoms
Table 15: Summary of datasets and challenges for skin, phantom, and animal related image analysis tasks.
Index Dataset/Challenge Year Modalities (MR CT RGB OT) Focus
286 DFU 2020 [242] 2020 X Diabetic foot ulcer detection
287 SIIM-ISIC Melanoma Classification [243] 2020 X Melanoma classification
288 ISIC 2019 [244,245] 2019 X Classification of 9 diseases
289 ISIC 2018 [244,245] 2018 X Lesion segmentation and attribution classification of melanoma and disease classification
290 ISIC 2017 [246] 2017 X Lesion segmentation and attribution classification of melanoma and disease classification
291 MATCH 2020 X Tumor tracking in markerless lung
292 MRI-DIR [247,248] 2018 X X Multi-modality registration with phantom
293 CC-Radiomics-Phantom-3 [249,250] 2018 X Phantom in different machines
294 CC-Radiomics-Phantom-2 [251] 2018 X Feature variability assessment with phantom
295 Credence Cartridge Radiomics Phantom CT Scans [252] 2017 X Phantom research
296 PET-Seg Challenge 2016 X X1 Phantom registration and research
297 ISMRM 2015 2015 X2 Fiber imaging from phantom
298 RIDER Phantom PET-CT [253] 2015 X X1 Phantom research & registration
299 Mouse awake rest [254] 2020 X Mouse brain segmentation
300 EndoVis 2019 SCARED 2019 X3 Depth estimation from endoscopic data
301 MouseLemurAtlas MRIraw [255,256] 2019 X Mouse lemur brain image segmentation
302 CPTAC-GBM [257] 2018 X X Glioblastoma multiforme
303 BigNeuron 2016 X4 Animal neuron reconstruction
304 Apples-CT 2020 X Apple reconstruction and segmentation
305 QUBIQ 2020 X X Quantification of uncertainties in biomedical image quantification
306 AnDi [258] 2020 X2 Anomalous diffusion
307 The Open Knowledge-Based Planning Challenge 2020 X Dose distribution prediction
308 Learn2Reg [259] 2020 X X Image registration
309 SNEMI3D 2013 X5 Segmentation of neurites
310 SNEMI2D 2012 X5 Segmentation of neuronal structures
1 PET 2 DWI 3 Endoscopy 4 General microscopy 5 Electron microscope
can be used for the registration of different pieces of equipment and as data for the development of automatic algorithms. Registration is essential for clinical diagnosis; for instance, it reduces the differences between medical devices with or without the same modalities (293, 294, 295, 298). Phantoms also become useful when data is scarce. Some image analysis tasks or experiments require surgically inserted fiducial markers, which are costly and risky; a phantom, in contrast, has low cost and risk and is easy to image and annotate (291, 296, 297). The related datasets and challenges are reported in Table 15.
6.4 Animal-related datasets
Medical image analysis of animal data is a relatively small research area. However, it is not limited by the privacy and strict ethics restrictions that apply to human medical images. The datasets and challenges we found focus on animal brain segmentation (299, 301), depth estimation from endoscopy (300), and multi-modality registration (292). The relevant datasets and challenges are reported in Table 15.
7 Discussions
The success of AI algorithms such as DL has led to their widespread use in several fields, including medical image analysis. Researchers with different knowledge and backgrounds tackle image-based clinical tasks using computer vision tools to design automatic algorithms for different applications [11,12,260,261,262,263,264]. Though AI algorithms can successfully handle many tasks, several unsolved problems and challenges hinder the development of AI-based medical image analysis.
7.1 Problems and challenges
DL-based algorithms learn from input images of real data through gradient descent. Large-scale annotated datasets and a powerful DL model are key to developing successful DL systems. For example, the success of AlexNet [14], GoogLeNet [2], and ResNet [3] is based on powerful models with millions of parameters. At the same time, a large-scale dataset such as ImageNet [265] is necessary to train the DL model and tune such a large number of parameters. However, when these methods are applied to medical image analysis, many domain-specific problems and challenges appear. This subsection discusses some of them.
7.1.1 Data scarcity
The biggest challenge in the development of DL models is data scarcity. Unlike in other areas, medical image datasets are usually smaller in scale due to many limitations, e.g., ethical restrictions.
The commonly used datasets in traditional computer vision are larger in scale than medical image datasets. For example, the handwritten digits dataset MNIST [266] includes a training set of 60,000 examples and a test set of 10,000 examples; the ImageNet dataset [265] includes millions of images for training and testing; Microsoft COCO [267] includes hundreds of thousands of images with millions of annotated instances. In contrast, many medical image datasets only include hundreds or at most thousands of images. For example, the challenge BraTS 2020 (30) includes four hundred subjects with different modalities for each subject; the challenge REFUGE (70) provides about 1,200 images of the eye; the challenge LUNA 16 (186) provides 888 CT scans; our recently published dataset of pulmonary lesions [268] provides just 694 scans; and the challenge CAMELYON 17 (254) contains just over 1,000 WSI pathology images.
There are multiple reasons for the lack of data. The main one is the restricted access to medical images by non-medical researchers, i.e., barriers between disciplines. The root causes of these barriers are the cost and difficulty of annotation and the access restrictions imposed by ethics and privacy.
Access to data: As mentioned in the introduction, the direct cause of data scarcity is that most non-medical researchers are not allowed to access medical data directly. Though a great deal of medical data is generated worldwide every day, most non-medical researchers have no authorization to access clinical data. The easily accessible data are publicly available datasets, but these are rarely large enough to properly train a DL model.
Ethical reasons: The ethics of medical data usage is a major bottleneck and limitation for researchers, particularly computer scientists. Medical data stored in databases often contains sensitive or private information, such as name, age, gender, and ID number. In some cases, the records of medical images can be used to identify a patient; for example, if an MR scan includes the face, an intruder could identify the patient for malicious purposes. In most countries and regions, it is illegal to distribute such data with private information without the patients' permission, and nobody would usually consent to such distribution. Therefore, it is practically impossible for deep learning researchers to gain authorization to access these datasets.
Even to gain access to desensitized data, DL researchers still need to pass ethical reviews.
Annotation: Another root cause is the difficulty of annotating medical images. Unlike other computer vision areas, the annotation of medical images requires specialized professions and knowledge. For example, in autonomous driving, annotating objects such as vehicles and pedestrians imposes no specific requirements on annotators, because most of us can easily distinguish a car or a human. When annotating medical images, however, domain-specific knowledge is essential: an untrained person might at best distinguish clearly abnormal from normal tissue, but it is impossible for a non-specialist to mark a lesion's contour or diagnose a disease.
This difficulty cannot easily be solved even when professionals are employed to annotate the data. First, the cost of annotating medical data is huge. Once researchers and their organization have obtained some data, they need to spend more money to employ clinicians for labeling. Such annotation costs are enormous, particularly where medical resources are scarce or medical costs are high. For example, the challenge PALM (74) provides about 1,200 images with annotations, but its organizers could involve only two clinical medics. Second, the physician who annotates the data is required to have rich clinical and diagnostic experience, which further reduces the number of people suitable for this task. Third, to avoid subjectivity, each image needs to be annotated by two or more physicians, which raises the question of what to do when the annotators' labels disagree. In many challenges, the organizer employs several junior physicians to annotate and a senior physician to adjudicate when the junior physicians' annotations differ. For example, in the challenge AGE (71), each annotation is determined as the mean of four independent ophthalmologists in a group and is then manually verified by a senior glaucoma expert.
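One simple baseline for merging several annotators' segmentations (not the AGE challenge's exact protocol, which averages and then has an expert verify) is per-pixel majority voting. The sketch below is illustrative; `majority_vote` and the tiny masks are made up for the example.

```python
import numpy as np

def majority_vote(masks):
    """Fuse binary segmentation masks from several annotators.

    A pixel is labeled foreground when more than half of the annotators
    marked it. Challenges often add a senior expert's adjudication on top.
    """
    stack = np.stack(masks).astype(float)
    return (stack.mean(axis=0) > 0.5).astype(np.uint8)

# Three hypothetical annotators' masks for the same tiny image.
a = np.array([[1, 1, 0], [0, 1, 0]], dtype=np.uint8)
b = np.array([[1, 0, 0], [0, 1, 1]], dtype=np.uint8)
c = np.array([[1, 1, 0], [1, 1, 0]], dtype=np.uint8)
fused = majority_vote([a, b, c])
```

More elaborate fusion schemes (e.g., weighting annotators by estimated reliability) follow the same pattern of combining per-pixel votes.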
7.1.2 Limitation of medical data
The characteristics of medical images themselves posedifficulties for the medical image analysis tasks.
There are many types and modalities of images used in medical image analysis. As in computer vision, the modalities include both 2D and 3D. However, medical images have several other differences. Though the average scale of a medical image dataset is smaller than that of computer vision datasets, each sample is, on average, larger than a typical computer vision sample.
For 2D images, CR, WSI, and other modalities have a larger variance in resolution and color than other computer vision fields. Some modalities need more bits to encode a pixel, while others produce extremely large images. For example, CAMELYON 17 (254) only includes about a thousand pathology images, but the whole dataset is about three terabytes. Such datasets with few but very large samples pose a challenge for AI algorithms, and designing algorithms that can learn under limited resources (e.g., a small number of labeled samples) and remain useful for clinical diagnosis has become a research focus.
3D medical images such as CT and MRI are dense volumetric data, in contrast to sparse data such as the point clouds used in autonomous driving. As in the BraTS serial challenges (30, 31, 32, 33, 34, 35, 36, 37, 38), many researchers face the challenge of designing algorithms that can effectively learn from multi-modal datasets.
These characteristics of medical images require well-designed algorithms that are robust enough to fit the data well without overfitting. However, that further increases the need for data and resources; learning suitable features from a small-sample dataset remains a challenge.
7.2 No silver bullet
The ideal scenario would be to find or invent a method or algorithm that simultaneously solves all of these problems. However, there is no silver bullet. The problems and challenges related to the data and the adopted methods cannot be entirely resolved, and sometimes a new problem arises as another is solved. Nevertheless, many ideas have been introduced to address the current problems, and we introduce them in this subsection.
With respect to the problems and challenges mentioned above, researchers are working in two research directions: 1) more effective models that need less data, and 2) more practical approaches to access data. For learning from small datasets, researchers use approaches such as few-shot learning and transfer learning. To access more data, researchers adopt three main approaches, namely federated learning, lifelong learning, and active learning.
7.2.1 Practical learning from small samples
Many medical image datasets have a small number of samples. For example, challenge MRBrainS13 (6) only includes 20 subjects for training and testing, while challenge KiTS 19 (110) has about two hundred subjects. Therefore, many researchers are looking for practical approaches to learn from small samples.
Few-shot learning and zero-shot learning: Few-shot learning addresses one of the critical pain points of DL-based medical image analysis, i.e., the development of DL models with less data. Humans can learn effectively from a few samples: unlike standard deep learning methods, a human learns to diagnose a disease from images without needing to view tens of thousands of them (i.e., from only a few shots). Meta-learning, also called learning to learn, is one solution to the few-shot learning problem, as it can learn meta-features from a small amount of data. The number of images in most medical datasets and challenges is small compared to regular computer vision datasets. Mondal et al. [269] use few-shot learning and a GAN to segment medical images, modifying the GAN for semi-supervised learning in a few-shot setting. Similar to few-shot learning, zero-shot learning aims at novel samples. Rezaei et al. [270] provide a review of zero-shot learning ranging from autonomous vehicles to COVID-19 diagnosis. However, zero-shot and few-shot learning also have disadvantages, such as domain gaps, overfitting, and limited interpretability.
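To make the few-shot setting concrete, here is a minimal prototypical-network-style classifier in plain NumPy: class prototypes are the mean embeddings of a few labeled "support" samples, and query samples are assigned to the nearest prototype. The 2-D "embeddings" and the 2-way/3-shot episode are purely illustrative; a real method would learn the embedding function (e.g., with meta-learning, as the text describes).

```python
import numpy as np

def prototypes(support_x, support_y):
    """Compute one mean-embedding prototype per class from support samples."""
    classes = np.unique(support_y)
    protos = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_x, classes, protos):
    """Assign each query to the class of its nearest prototype (Euclidean)."""
    d = np.linalg.norm(query_x[:, None, :] - protos[None, :, :], axis=-1)
    return classes[d.argmin(axis=1)]

# A 2-way, 3-shot episode in a toy 2-D embedding space.
support_x = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                      [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]])
support_y = np.array([0, 0, 0, 1, 1, 1])
classes, protos = prototypes(support_x, support_y)
pred = classify(np.array([[0.05, 0.05], [0.95, 0.9]]), classes, protos)
```

The appeal for medical imaging is that only a handful of labeled examples per class are needed at test time, once a good embedding has been learned elsewhere.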
A Systematic Collection of Medical Image Datasets for Deep Learning 33
Knowledge transfer: Transfer learning is another method, one that recognizes and applies knowledge and skills learned from a previous task. For example, white and gray matter segmentation and multi-organ segmentation are both segmentation tasks, yet the networks are usually trained independently; almost nobody trains one network on the two tasks at once. That does not mean, however, that the two tasks are unrelated. Besides zero-shot and few-shot learning, transfer learning, or knowledge transfer, is another way to reuse knowledge from a previously learned task. It can be applied between two similar tasks and between different domains. Its most significant advantage is that a large-scale dataset can be used to pre-train the neural network, which is then fine-tuned and transferred to the main task on a small dataset.
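The pre-train-then-fine-tune recipe can be sketched without any DL framework. In the illustrative example below, a fixed random projection stands in for a backbone pre-trained on a large dataset (its weights stay frozen), and only a small logistic-regression "head" is trained on the few target-task samples. All names, sizes, and data are made up for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a 'pretrained' backbone: a frozen random ReLU projection.
W_frozen = rng.normal(size=(16, 4))

def features(x):
    return np.maximum(x @ W_frozen, 0.0)  # frozen features, never updated

def train_head(f, y, lr=0.5, steps=1000):
    """Train only a new logistic-regression head on top of frozen features."""
    w, b = np.zeros(f.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(f @ w + b, -30.0, 30.0)))
        w -= lr * f.T @ (p - y) / len(y)  # gradient of the logistic loss
        b -= lr * (p - y).mean()
    return w, b

# Tiny 'target task': two Gaussian clusters in 16-D, 20 samples each.
x = np.vstack([rng.normal(0.0, 1.0, (20, 16)), rng.normal(3.0, 1.0, (20, 16))])
y = np.array([0.0] * 20 + [1.0] * 20)

f = features(x)
mu, sd = f.mean(axis=0), f.std(axis=0) + 1e-8  # standardize for stable training
w, b = train_head((f - mu) / sd, y)
pred = (((features(x) - mu) / sd) @ w + b > 0).astype(int)
```

In a real medical pipeline the frozen part would be a convolutional backbone pre-trained on a large dataset, and fine-tuning might also unfreeze its last few layers.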
7.2.2 Effective access to more samples
Besides finding practical approaches to learn from small samples, many researchers have been working on active learning and federated learning (which aims to use data without access to sensitive information). This also reduces the annotation costs of deep learning algorithms.
Federated learning: Federated learning provides another way to access data. As discussed previously, the limitations on accessing data stem from privacy and other concerns. Instead of directly sharing data, federated learning shares the model, protecting privacy from being leaked. Combined with other privacy-protection methods, federated learning can effectively use the data of each independent data center or medical center.
However, federated learning has two disadvantages: annotation and implementation. The annotation problem cannot be solved by model sharing and requires other methods. The main challenge is implementation, as only a few institutions have attempted federated learning so far. For example, Intel and other institutions have applied federated learning to brain tumor-related tasks in their research [271]. The main challenges in their implementation include:
1) the implementation and proof of privacy protection,
2) the methodology for sharing and updating millions of model parameters, and
3) preventing attacks on DL algorithms and leaks of private data on the Internet or computing nodes.
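The aggregation step at the heart of the common FedAvg baseline (shown here as a generic illustration, not necessarily the cited study's exact method) is just a dataset-size-weighted average of each center's parameters; only parameters, never images, leave a center. The function and center names below are hypothetical.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client parameter lists (FedAvg aggregation).

    Each client trains locally and uploads only its parameters; the server
    combines them weighted by the client's local dataset size.
    """
    total = sum(client_sizes)
    return [
        sum(n * w[k] for w, n in zip(client_weights, client_sizes)) / total
        for k in range(len(client_weights[0]))
    ]

# Two hypothetical medical centers sharing a one-layer model (weights + bias).
center_a = [np.full((2, 2), 1.0), np.array([0.0])]  # trained on 100 scans
center_b = [np.full((2, 2), 3.0), np.array([1.0])]  # trained on 300 scans
global_model = fedavg([center_a, center_b], client_sizes=[100, 300])
```

A full system repeats this round many times, broadcasting the global model back to the centers, and typically adds secure aggregation on top to address the privacy-protection challenges listed above.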
Natural language processing: Natural language processing (NLP) is also a potential tool to automatically or semi-automatically annotate medical image data. It is a standard procedure for a clinician to write a diagnostic report, particularly after a medical image has been taken. Such large amounts of paired image and text data are therefore useful for medical image analysis after desensitization, and NLP can be used for annotation. Several NLP-based methods, e.g., [272,273,274], have been applied in medical research fields.
Active learning: Active learning aims to reduce the annotation cost by indirectly using the unlabeled data to select the "best" samples to annotate. Generally, data annotation for deep learning requires experts to label the data so that the neural network can learn from it. Active learning does not require many labeled samples at the beginning of training; in other words, it can "help" annotators label their data. It uses the knowledge learned from the labeled data to select which unlabeled samples to annotate, and the newly labeled data is then used to train the network over the next epochs. Active learning [275, 276] is used in medical image analysis in a loop: 1) the algorithm learns from the data annotated by humans; 2) humans annotate the unlabeled data selected by the algorithm; 3) the algorithm adds the newly labeled data to the training set. The advantage is obvious: annotators do not need to annotate all the data they have, and at the same time, the neural network learns faster through this interactive process.
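The three-step loop above can be sketched with uncertainty sampling, a common selection criterion: query the samples whose predicted probability is closest to 0.5. Everything below (the tiny logistic model, the synthetic pool, the "oracle" standing in for a human annotator) is illustrative, not a specific published method.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_logistic(x, y, lr=0.5, steps=300):
    """Tiny logistic-regression learner used as the stand-in model."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-np.clip(x @ w + b, -30.0, 30.0)))
        w -= lr * x.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
    return w, b

def most_uncertain(x, w, b, k):
    """Indices of the k samples whose predicted probability is nearest 0.5."""
    p = 1.0 / (1.0 + np.exp(-np.clip(x @ w + b, -30.0, 30.0)))
    return np.argsort(np.abs(p - 0.5))[:k]

# Unlabeled pool; the 'oracle' (a human annotator) knows the true labels.
pool_x = rng.normal(size=(200, 2))
oracle_y = (pool_x[:, 0] + pool_x[:, 1] > 0).astype(float)

labeled = np.zeros(200, dtype=bool)
labeled[rng.choice(200, size=10, replace=False)] = True  # small seed set

for _ in range(3):                                            # three rounds
    w, b = fit_logistic(pool_x[labeled], oracle_y[labeled])   # 1) learn
    idx = np.flatnonzero(~labeled)
    pick = idx[most_uncertain(pool_x[idx], w, b, k=10)]       # 2) query oracle
    labeled[pick] = True                                      # 3) grow train set

w, b = fit_logistic(pool_x[labeled], oracle_y[labeled])
accuracy = ((pool_x @ w + b > 0) == (oracle_y > 0.5)).mean()
```

After three rounds, only 40 of 200 samples are labeled, yet the model has concentrated the annotation budget near its decision boundary, which is exactly the saving active learning promises.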
8 Conclusion
In this work, we have provided a comprehensive survey of the datasets and challenges for medical image analysis collected between 2013 and 2020. The datasets and challenges were categorized into four themes: head and neck, chest and abdomen, pathology and blood, and others. We provided a summary of the details of these themes and data, and discussed the problems and challenges of medical image analysis together with possible solutions.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (62072358), the Zhejiang University special scientific research fund for COVID-19 prevention and control, and the National Key R&D Program of China under Grant No. 2019YFB1311600.
References
[1] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 2015.
[2] C. Szegedy et al. Going deeper with convolutions. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 07-12-June, pp. 1–9, 2015.
[3] K. He et al. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2016-Decem, pp. 770–778, 2016.
[4] D. Silver et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489, Jan 2016.
[5] O. Vinyals et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 575(7782):350–354, Nov 2019.
[6] O. Ronneberger et al. U-Net: Convolutional networks for biomedical image segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9351, pp. 234–241, 2015.
[7] H. Peng et al. Predicting Isocitrate Dehydrogenase (IDH) mutation status in gliomas using multiparameter MRI radiomics features. Journal of Magnetic Resonance Imaging, 53(5):1399–1407, May 2021.
[8] M. Mehdizadeh et al. Deep feature loss to denoise OCT images using deep neural networks. Journal of Biomedical Optics, 26(04), Apr 2021.
[9] B. Wang et al. AI-assisted CT imaging analysis for COVID-19 screening: Building and deploying a medical AI system. Applied Soft Computing, 98:106897, Jan 2021.
[10] Z. Shi et al. A clinically applicable deep-learning model for detecting intracranial aneurysm in computed tomography angiography images. Nature Communications, 11(1):6090, Dec 2020.
[11] C. P. Mao et al. Altered resting-state functional connectivity and effective connectivity of the habenula in irritable bowel syndrome: A cross-sectional and machine learning study. Human Brain Mapping, 41(13):3655–3666, Sep 2020.
[12] N. Liu et al. Learning the dynamic treatment regimes from medical registry data through deep Q-network. Scientific Reports, 9(1):1495, Dec 2019.
[13] S. Khan et al. A Guide to Convolutional Neural Networks for Computer Vision. Synthesis Lectures on Computer Vision, 8(1):1–207, Feb 2018.
[14] A. Krizhevsky et al. ImageNet classification with deep convolutional neural networks. In Communications of the ACM, volume 60, pp. 84–90, 2017.
[15] K. Clark et al. The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository. Journal of Digital Imaging, 26(6):1045–1057, Dec 2013.
[16] A. L. Goldberger et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation, 101(23), Jun 2000.
[17] D. Bontempi et al. CEREBRUM: a fast and fully-volumetric Convolutional Encoder-decodeR for weakly-supervised sEgmentation of BRain strUctures from out-of-the-scanner MRI. Medical Image Analysis, 62, Sep 2020.
[18] Y. Sun et al. Multi-site infant brain segmentation algorithms: The iSeg-2019 challenge. IEEE Transactions on Medical Imaging, 2021.
[19] L. Wang et al. Benchmark on automatic six-month-old infant brain segmentation algorithms: The iSeg-2017 challenge. IEEE Transactions on Medical Imaging, 38(9):2219–2230, Sep 2019.
[20] NEATBrainS - Image Sciences Institute.
[21] A. M. Mendrik et al. MRBrainS challenge: Online evaluation framework for brain image segmentation in 3T MRI scans. Computational Intelligence and Neuroscience, 2015:1–16, 2015.
[22] I. Išgum et al. Evaluation of automatic neonatal brain segmentation algorithms: The NeoBrainS12 challenge. Medical Image Analysis, 20(1):135–151, Feb 2015.
[23] H. J. Kuijf et al. Standardized assessment of automatic segmentation of white matter hyperintensities and results of the WMH Segmentation Challenge. IEEE Transactions on Medical Imaging, 38(11):2556–2568, Nov 2019.
[24] A. Klein and J. Tourville. 101 labeled brain images and a consistent human cortical labeling protocol. Frontiers in Neuroscience, 6(DEC), 2012.
[25] B. van Ginneken et al. 3D segmentation in the clinic: A grand challenge. International Conference on Medical Image Computing and Computer Assisted Intervention, 10:7–15, 2007.
[26] J. Li et al. A baseline approach for AutoImplant: The MICCAI 2020 Cranial Implant Design Challenge. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12445 LNCS:75–84, Jun 2020.
[27] AccelMR 2020 Prediction Challenge – AccelMR 2020 for ISBI 2020.
[28] MRI White Matter Reconstruction | ISBI 2019/2020 MEMENTO Challenge.
[29] R. Souza et al. An open, multi-vendor, multi-field-strength brain MR dataset and analysis of publicly available skull stripping methods agreement. NeuroImage, 170:482–494, Apr 2018.
[30] J. Zbontar et al. fastMRI: An open dataset and benchmarks for accelerated MRI. arXiv, Nov 2018.
[31] R. S. Desikan et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage, 31(3):968–980, Jul 2006.
[32] B. H. Menze et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Transactions on Medical Imaging, 34(10):1993–2024, Oct 2015.
[33] S. Bakas et al. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 4(1):170117, Dec 2017.
[34] S. Bakas et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS challenge. arXiv, Nov 2018.
[35] S. Bakas et al. MICCAI BraTS 2017: Scope | Section for Biomedical Image Analysis (SBIA) | Perelman School of Medicine at the University of Pennsylvania, 2017.
[36] O. Maier et al. ISLES 2015 - A public evaluation benchmark for ischemic stroke lesion segmentation from multispectral MRI. Medical Image Analysis, 35:250–269, Jan 2017.
[37] M. D. Hssayeni et al. Intracranial hemorrhage segmentation using a deep convolutional model. Data, 5(1):14, Feb 2020.
[38] K. Schmainda and M. Prah. Brain-Tumor-Progression, 2018.
[39] Z. Akkus et al. Predicting deletion of chromosomal arms 1p/19q in low-grade gliomas from MR images using machine intelligence. Journal of Digital Imaging, 30(4):469–476, Aug 2017.
[40] B. Erickson et al. Data From LGG-1p19qDeletion, 2017.
[41] S. L. Liew et al. A large, open source dataset of stroke anatomical brain images and manual lesion segmentations. bioRxiv, 2017.
[42] O. Commowick et al. Objective evaluation of multiple sclerosis lesion segmentation using a data management and processing infrastructure. Scientific Reports, 8(1):13650, Dec 2018.
[43] O. Commowick et al. MICCAI 2016 MS lesion segmentation challenge: supplementary results, 2018.
[44] A. Carass et al. Longitudinal multiple sclerosis lesion segmentation: Resource and challenge. NeuroImage, 148:77–102, Mar 2017.
[45] A. Carass et al. Longitudinal multiple sclerosis lesion segmentation data resource. Data in Brief, 12:346–350, Jun 2017.
[46] D. Scheie et al. Fluorescence in situ hybridization (FISH) on touch preparations: A reliable method for detecting loss of heterozygosity at 1p and 19q in oligodendroglial tumors. American Journal of Surgical Pathology, 30(7):828–837, Jul 2006.
[47] S. G. Mueller et al. The Alzheimer's disease neuroimaging initiative. Neuroimaging Clinics of North America, 15(4):869–877, 2005.
[48] P. S. Aisen et al. Alzheimer’s Disease Neuroimaging Ini-tiative 2 Clinical Core: Progress and plans. Alzheimer’sand Dementia, 11(7):734–739, jul 2015.
[49] M. W. Weiner et al. The Alzheimer’s Disease Neuroimag-ing Initiative 3: Continued innovation for clinical trialimprovement. Alzheimer’s and Dementia, 13(5):561–571,may 2017.
[50] D. S. Marcus et al. Open Access Series of Imaging Stud-ies (OASIS): Cross-sectional MRI data in young, middleaged, nondemented, and demented older adults. Journalof Cognitive Neuroscience, 19(9):1498–1507, sep 2007.
[51] D. S. Marcus et al. Open access series of imaging stud-ies: Longitudinal MRI data in nondemented and de-mented older adults. Journal of Cognitive Neuroscience,22(12):2677–2684, dec 2010.
[52] P. LaMontagne et al. OASIS-3: Longitudinal Neu-roimaging, Clinical, and Cognitive Dataset for Nor-mal Aging and Alzheimer Disease. medRxiv, pp.2019.12.13.19014902, jan 2019.
[53] H. Varmazyar et al. MRI Hippocampus Segmentation using Deep Learning autoencoders. 2020.
[54] S. Malekzadeh. MRI Hippocampus Segmentation | Kaggle, 2019.
[55] R. V. Marinescu et al. The Alzheimer's disease prediction of longitudinal evolution (TADPOLE) challenge: Results after 1 year follow-up. arXiv, feb 2020.
[56] E. E. Bron et al. Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge. NeuroImage, 111:562–579, may 2015.
[57] P. Boord et al. Executive attention networks show altered relationship with default mode network in PD. NeuroImage: Clinical, 13:1–8, 2017.
[58] T. M. Madhyastha et al. Dynamic connectivity at rest predicts attention task performance. Brain Connectivity, 5(1):45–59, feb 2015.
[59] C. Tessa. PD De Novo: Resting State fMRI and Physiological Signals, 2018.
[60] C. Tessa et al. Central modulation of parasympathetic outflow is impaired in de novo Parkinson's disease patients. PLoS ONE, 14(1):e0210324, jan 2019.
[61] H. Mori. Diffusion tensor imaging, 2008.
[62] M. Mascalchi et al. Histogram analysis of DTI-derived indices reveals pontocerebellar degeneration and its progression in SCA2. PLoS ONE, 13(7):e0200258, jul 2018.
[63] Z. Fan et al. U-net based analysis of MRI for Alzheimer's disease diagnosis. Neural Computing and Applications, apr 2021.
[64] B. Khagi et al. Alzheimer's disease Classification from Brain MRI based on transfer learning from CNN. In BMEiCON 2018 - 11th Biomedical Engineering International Conference, 2019.
[65] J. B. Bae et al. Identification of Alzheimer's disease using a convolutional neural network model based on T1-weighted magnetic resonance imaging. Scientific Reports, 10(1):22252, dec 2020.
[66] Z. Tang et al. Interpretable classification of Alzheimer's disease pathologies with a convolutional neural network pipeline. Nature Communications, 10(1):2173, dec 2019.
[67] 2020 Alzheimer's disease facts and figures. Alzheimer's and Dementia, 16(3):391–460, mar 2020.
[68] G. Quellec et al. Automatic detection of rare pathologies in fundus photographs using few-shot learning. Medical Image Analysis, 61, 2020.
[69] J. I. Orlando et al. REFUGE Challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs, 2020.
[70] H. Fu et al. AGE Challenge: Angle Closure Glaucoma Evaluation in Anterior Segment Optical Coherence Tomography. arXiv, 2020.
[71] H. Fu et al. ADAM: Automatic Detection challenge on Age-related Macular degeneration, 2020.
[72] P. Porwal et al. Indian Diabetic Retinopathy Image Dataset (IDRiD), 2018.
[73] P. Porwal et al. IDRiD: Diabetic Retinopathy – Segmentation and Grading Challenge. Medical Image Analysis, 59:101561, jan 2020.
[74] P. Porwal et al. Indian diabetic retinopathy image dataset (IDRiD): A database for diabetic retinopathy screening research. Data, 3(3):25, jul 2018.
[75] D. Kermany. Large Dataset of Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images. Mendeley Data, 3, 2018.
[76] D. S. Kermany et al. Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning. Cell, 172(5):1122–1131.e9, feb 2018.
[77] H. Al Hajj et al. CATARACTS: Challenge on automatic tool annotation for cataRACT surgery. Medical Image Analysis, 52:24–41, feb 2019.
[78] E. Flouty et al. CaDIS: Cataract dataset for image segmentation. arXiv, jun 2019.
[79] H. Bogunovic et al. RETOUCH: The Retinal OCT Fluid Detection and Segmentation Benchmark and Challenge. IEEE Transactions on Medical Imaging, 38(8):1858–1874, aug 2019.
[80] Diabetic Retinopathy Detection. International Journal of Engineering and Advanced Technology, 9(4):1022–1026, 2020.
[81] S. J. Chiu et al. Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema. Biomedical Optics Express, 6(4):1172, apr 2015.
[82] M. Niemeijer et al. Retinopathy online challenge: Automatic detection of microaneurysms in digital color fundus photographs. IEEE Transactions on Medical Imaging, 29(1):185–195, 2010.
[83] H. Gireesha and N. S. Thyroid Nodule Segmentation And Classification In Ultrasound Images, 2015.
[84] H. J. Aerts et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nature Communications, 5(1):4006, sep 2014.
[85] L. Wee and A. Dekker. Data from Head-Neck-Radiomics-HN1, 2019.
[86] C. E. Cardenas et al. Data from AAPM RT-MAC Grand Challenge 2019, 2019.
[87] M. Vallières et al. Radiomics strategies for risk assessment of tumour failure in head-and-neck cancer. Scientific Reports, 7(1):10117, dec 2017.
[88] M. Vallières et al. Data from Head-Neck-PET-CT, 2017.
[89] P. F. Raudaschl et al. Evaluation of segmentation methods on head and neck CT: Auto-segmentation challenge 2015. Medical Physics, 44(5):2020–2036, may 2017.
[90] X. Yang et al. Automated segmentation of the parotid gland based on atlas registration and machine learning: A longitudinal MRI study in head-and-neck radiation therapy. International Journal of Radiation Oncology Biology Physics, 90(5):1225–1233, dec 2014.
[91] K. Hameeteman et al. Evaluation framework for carotid bifurcation lumen segmentation and stenosis grading. Medical Image Analysis, 15(4):477–488, aug 2011.
[92] S. Zhang et al. Cognitive control of sensory pain encoding in the pregenual anterior cingulate cortex. D1 - decoder construction in day 1, d2 - adaptive control in day 2., 2020.
[93] Y. Ikutani et al. Decoding functional category of source code from the brain (fMRI on Java program comprehension), 2020.
[94] Y. Ikutani et al. Expert programmers have fine-tuned cortical representations of source code. bioRxiv, pp. 2020.01.28.923953, 2020.
[95] R. VanRullen and L. Reddy. Reconstructing faces from fMRI patterns using deep generative neural networks. Communications Biology, 2(1), oct 2019.
[96] I. Alkhasli et al. Modulation of fronto-striatal functional connectivity using transcranial magnetic stimulation. Frontiers in Human Neuroscience, 13, jun 2019.
[97] I. Alkhasli et al. Resting State - TMS, 2019.
[98] G. Shen et al. Deep image reconstruction from human brain activity. PLoS Computational Biology, 15(1):e1006633, jan 2019.
[99] G. Shen et al. Deep Image Reconstruction, 2020.
[100] N. Chang et al. BOLD5000: A public fMRI dataset of 5000 images. arXiv, 6(1):49, sep 2018.
[101] R. J. Lepping et al. Neural processing of emotional musical and nonmusical stimuli in depression. PLoS ONE, 11(6):e0156859, jun 2016.
[102] R. J. Lepping et al. Development of a validated emotionally provocative musical stimulus set for research. Psychology of Music, 44(5):1012–1028, sep 2016.
[103] J. R. Dugré et al. Limbic hyperactivity in response to emotionally neutral stimuli in schizophrenia: A neuroimaging meta-analysis of the hypervigilant mind. American Journal of Psychiatry, 176(12):1021–1029, dec 2019.
[104] Y. Miyawaki et al. Visual Image Reconstruction from Human Brain Activity using a Combination of Multiscale Local Image Decoders. Neuron, 60(5):915–929, dec 2008.
[105] T. Horikawa and Y. Kamitani. Generic decoding of seen and imagined objects using hierarchical visual features. Nature Communications, 8(1):15037, aug 2017.
[106] J. D. Carlin and N. Kriegeskorte. Adjudicating between face-coding models with individual-face fMRI responses. PLoS Computational Biology, 13(7):e1005604, jul 2017.
[107] L. Koenders et al. Grey matter changes associated with heavy cannabis use: A longitudinal sMRI study. PLoS ONE, 11(5):e0152482, may 2016.
[108] V. M. Campello et al. Multi-Centre, Multi-Vendor and Multi-Disease Cardiac Segmentation: The M&Ms Challenge. IEEE Transactions on Medical Imaging, 2020.
[109] N. Heller et al. The KiTS19 challenge data: 300 Kidney tumor cases with clinical context, CT semantic segmentations, and surgical outcomes. arXiv, 2019.
[110] N. Heller et al. Data from C4KC-KiTS [Data set], 2019.
[111] X. Zhuang. Multivariate Mixture Model for Myocardial Segmentation Combining Multi-Source Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(12):2933–2946, 2019.
[112] X. Zhuang. Multivariate mixture model for cardiac segmentation from multi-sequence MRI. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9901 LNCS, pp. 581–588, 2016.
[113] S. Leclerc et al. Deep Learning for Segmentation Using an Open Large-Scale Dataset in 2D Echocardiography. IEEE Transactions on Medical Imaging, 38(9):2198–2210, sep 2019.
[114] B. Rister et al. CT-ORG: A Dataset of CT Volumes With Multiple Organ Segmentations, 2019.
[115] N. Bloch. NCI-ISBI 2013 challenge: automated segmentation of prostate structures, 2015.
[116] P. Bilic et al. The liver tumor segmentation benchmark (LiTS). arXiv, abs/1901.0, 2019.
[117] A. E. Kavur et al. CHAOS Challenge - combined (CT-MR) healthy abdominal organ segmentation, 2021.
[118] A. E. Kavur et al. CHAOS Challenge - combined (CT-MR) healthy abdominal organ segmentation. Medical Image Analysis, 69, jan 2021.
[119] A. E. Kavur et al. Comparison of semi-automatic and deep learning-based automatic methods for liver segmentation in living liver transplant donors. Diagnostic and Interventional Radiology, 26(1):11–21, jan 2020.
[120] R. Trullo et al. Multiorgan segmentation using distance-aware adversarial networks. Journal of Medical Imaging, 6(01):1, 2019.
[121] A. L. Simpson et al. A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv, feb 2019.
[122] S. Jaeger et al. Automatic tuberculosis screening using chest radiographs. IEEE Transactions on Medical Imaging, 33(2):233–245, 2014.
[123] S. Candemir et al. Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration. IEEE Transactions on Medical Imaging, 33(2):577–590, 2014.
[124] S. Stirenko et al. Chest X-Ray Analysis of Tuberculosis by Deep Learning with Segmentation and Augmentation. In 2018 IEEE 38th International Conference on Electronics and Nanotechnology, ELNANO 2018 - Proceedings, pp. 422–428, 2018.
[125] Z. Xiong et al. A global benchmark of algorithms for segmenting the left atrium from late gadolinium-enhanced cardiac magnetic resonance imaging. Medical Image Analysis, 67:101832, jan 2021.
[126] J. Yang et al. Autosegmentation for thoracic radiation treatment planning: A grand challenge at AAPM 2017. Medical Physics, 45(10):4568–4581, 2018.
[127] J. Yang et al. Data from Lung CT Segmentation Challenge, 2017.
[128] H. R. Roth et al. Deeporgan: Multi-level deep convolutional networks for automated pancreas segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9349, pp. 556–564. 2015.
[129] H. Roth et al. Data From Pancreas-CT, 2016.
[130] D. Newitt and N. Hylton. Single site breast DCE-MRI data and segmentations from patients undergoing neoadjuvant chemotherapy, 2016.
[131] D. F. Pace et al. Interactive whole-heart segmentation in congenital heart disease. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 9351, pp. 80–88, 2015.
[132] B. N. Bloch et al. Data From PROSTATE-DIAGNOSIS, 2015.
[133] O. Jimenez-Del-Toro et al. Cloud-Based Evaluation of Anatomical Structure Segmentation and Landmark Detection Algorithms: VISCERAL Anatomy Benchmarks. IEEE Transactions on Medical Imaging, 35(11):2459–2475, nov 2016.
[134] A. Seff et al. Leveraging mid-level semantic boundary cues for automated lymph node detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2015.
[135] A. Seff et al. 2D view aggregation for lymph node detection using a shallow hierarchy of linear classifiers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 8673 LNCS, pp. 544–552, 2014.
[136] H. R. Roth et al. A new 2.5D representation for lymph node detection using random sets of deep convolutional neural network observations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 8673 LNCS, pp. 520–527, 2014.
[137] H. R. Roth et al. A new 2.5D representation for lymph node detection in CT., 2015.
[138] A. B. Spanier and L. Joskowicz. Rule-based ventral cavity multi-organ automatic segmentation in CT scans. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 8848, pp. 163–170, 2014.
[139] C. Tobon-Gomez et al. Benchmark for Algorithms Segmenting the Left Atrium From 3D CT and MRI Datasets. IEEE Transactions on Medical Imaging, 34(7):1460–1473, 2015.
[140] G. Litjens et al. Evaluation of prostate segmentation algorithms for MRI: The PROMISE12 challenge. Medical Image Analysis, 2014.
[141] G. Litjens and J. Futterer. Data From Prostate-3T, 2015.
[142] S. Balocco et al. Standardized evaluation methodology and reference database for evaluating IVUS image segmentation. Computerized Medical Imaging and Graphics, 38(2):70–90, mar 2014.
[143] P. Lo et al. Extraction of airways from CT (EXACT'09). IEEE Transactions on Medical Imaging, 31(11):2093–2107, 2012.
[144] T. Heimann et al. Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Transactions on Medical Imaging, 28(8):1251–1265, 2009.
[145] J. Zhao et al. COVID-CT-Dataset: A CT image dataset about COVID-19. arXiv, mar 2020.
[146] L. L. Wang et al. CORD-19: The COVID-19 open research dataset. arXiv, apr 2020.
[147] G. Maguolo and L. Nanni. A Critic Evaluation of Methods for COVID-19 Automatic Detection from X-Ray Images. arXiv, apr 2020.
[148] E. Tartaglione et al. Unveiling COVID-19 from chest X-ray with deep learning: A hurdles race with small data. International Journal of Environmental Research and Public Health, 17(18):1–17, apr 2020.
[149] J. Born et al. Accelerating COVID-19 differential diagnosis with explainable ultrasound image analysis. arXiv, sep 2020.
[150] J. Born et al. POCOVID-Net: Automatic detection of COVID-19 from a new lung ultrasound imaging dataset (POCUS). arXiv, apr 2020.
[151] L. Wang et al. COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Scientific Reports, 10(1):19549, dec 2020.
[152] M. E. Chowdhury et al. Can AI Help in Screening Viral and COVID-19 Pneumonia? IEEE Access, 8:132665–132676, 2020.
[153] S. A. Harmon et al. Artificial intelligence for the detection of COVID-19 pneumonia on chest CT using multinational datasets. Nature Communications, 11(1):4080, dec 2020.
[154] P. An et al. CT Images in COVID-19 [Data set], 2020.
[155] S. Desai et al. Chest imaging representing a COVID-19 positive rural U.S. population, 2020.
[156] S. Desai et al. Chest imaging representing a COVID-19 positive rural U.S. population. Scientific Data, 7(1):414, dec 2020.
[157] M. Buda et al. Detection of masses and architectural distortions in digital breast tomosynthesis: a publicly available dataset of 5,060 patients and a deep learning model. 2020.
[158] M. Buda et al. Breast Cancer Screening – Digital Breast Tomosynthesis (BCS-DBT), 2020.
[159] P. Li et al. A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis [Data set], 2020.
[160] J. Irvin et al. CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, 33:590–597, jan 2019.
[161] L. Wee et al. Data from NSCLC-Radiomics-Interobserver1, 2019.
[162] A. E. Johnson et al. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data, 6(1), 2019.
[163] A. E. W. Johnson et al. The MIMIC-CXR Database, 2019.
[164] M. Rusu et al. Co-registration of pre-operative CT with ex vivo surgically excised ground glass nodules to define spatial extent of invasive adenocarcinoma on in vivo imaging: a proof-of-concept study. European Radiology, 27(10):4209–4217, 2017.
[165] A. Madabhushi and M. Rusu. Fused Radiology-Pathology Lung Dataset, 2018.
[166] O. Bernard et al. Deep Learning Techniques for Automatic MRI Cardiac Multi-Structures Segmentation and Diagnosis: Is the Problem Solved? IEEE Transactions on Medical Imaging, 37(11):2514–2525, nov 2018.
[167] O. Gevaert et al. Non-small cell lung cancer: Identifying prognostic imaging biomarkers by leveraging public gene expression microarray data - Methods and preliminary results. Radiology, 264(2):387–396, aug 2012.
[168] L. Kostakoglu et al. A phase II study of 3-Deoxy-3-18F-fluorothymidine PET in the assessment of early response of breast cancer to neoadjuvant chemotherapy: Results from ACRIN 6688. Journal of Nuclear Medicine, 56(11):1681–1689, nov 2015.
[169] P. Kinahan et al. Data from ACRIN-FLT-Breast, 2017.
[170] X. Wang et al. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, volume 2017-January, pp. 3462–3471, 2017.
[171] G. Litjens et al. Computer-aided detection of prostate cancer in MRI. IEEE Transactions on Medical Imaging, 33(5):1083–1092, may 2014.
[172] G. Litjens et al. ProstateX challenge data, 2017.
[173] A. A. A. Setio et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Medical Image Analysis, 42:1–13, dec 2017.
[174] M. Kachelrieß et al. Flying focal spot (FFS) in cone-beam CT. IEEE Transactions on Nuclear Science, 53(3):1238–1247, 2006.
[175] T. G. Flohr et al. Image reconstruction and image quality evaluation for a 64-slice CT scanner with z-flying focal spot. Medical Physics, 32(8):2536–2547, 2005.
[176] B. Zhao et al. Evaluating variability in tumor measurements from same-day repeat CT scans of patients with non-small cell lung cancer. Radiology, 252(1):263–272, jul 2009.
[177] B. Zhao et al. Data From RIDER_Lung CT, 2015.
[178] M. A. Gavrielides et al. Data From Phantom_FDA, 2015.
[179] M. A. Gavrielides et al. A resource for the assessment of lung nodule size estimation methods: database of thoracic CT scans of an anthropomorphic phantom. Optics Express, 18(14):15244, jul 2010.
[180] B. N. Bloch et al. Data From BREAST-DIAGNOSIS, 2015.
[181] M. Vallières et al. A radiomics model from joint FDG-PET and MRI texture features for the prediction of lung metastases in soft-tissue sarcomas of the extremities. Physics in Medicine and Biology, 60(14):5471–5496, jul 2015.
[182] S. G. Armato et al. LUNGx Challenge for computerized lung nodule classification. Journal of Medical Imaging, 3(4):044506, dec 2016.
[183] S. G. Armato et al. Guest Editorial: LUNGx Challenge for computerized lung nodule classification: reflections and lessons learned. Journal of Medical Imaging, 2(2):020103, jun 2015.
[184] S. G. Armato III et al. SPIE-AAPM-NCI Lung Nodule Classification Challenge, 2015.
[185] S. Napel and S. K. Plevritis. NSCLC Radiogenomics: Initial Stanford Study of 26 Cases. The Cancer Imaging Archive, 2014.
[186] S. G. Armato et al. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38(2):915–931, jan 2011.
[187] S. G. Armato III et al. Data From LIDC-IDRI, 2015.
[188] K. Smith et al. Data From CT_COLONOGRAPHY, 2015.
[189] C. D. Johnson et al. Accuracy of CT Colonography for Detection of Large Adenomas and Cancers. New England Journal of Medicine, 359(12):1207–1217, sep 2008.
[190] B. van Ginneken et al. Comparing and combining algorithms for computer-aided detection of pulmonary nodules in computed tomography scans: The ANODE09 study. Medical Image Analysis, 14(6):707–722, 2010.
[191] S. Ali et al. An objective comparison of detection and segmentation algorithms for artefacts in clinical endoscopy. Scientific Reports, 10(1):2748, dec 2020.
[192] S. Ali et al. Deep learning for detection and segmentation of artefact and disease instances in gastrointestinal endoscopy. Medical Image Analysis, 70, 2021.
[193] V. S. Bawa et al. ESAD: Endoscopic surgeon action detection dataset. arXiv, jun 2020.
[194] S. Ali et al. Endoscopy artifact detection (EAD 2019) challenge dataset. arXiv, may 2019.
[195] T. L. A. van den Heuvel et al. Automated measurement of fetal head circumference using 2D ultrasound images. PLOS ONE, 13(8):e0200412, aug 2018.
[196] R. Karim et al. Algorithms for left atrial wall segmentation and thickness – Evaluation on an open-source CT and MRI image database. Medical Image Analysis, 50:36–53, dec 2018.
[197] H. A. Kirişli et al. Standardized evaluation framework for evaluating coronary artery stenosis detection, stenosis quantification and lumen segmentation algorithms in computed tomography angiography. Medical Image Analysis, 17(8):859–876, dec 2013.
[198] S. Rueda et al. Evaluation and comparison of current fetal ultrasound image segmentation methods for biometric measurements: A grand challenge. IEEE Transactions on Medical Imaging, 33(4):797–813, 2014.
[199] X. He et al. PATHVQA: 30000+ questions for medical visual question answering. arXiv, mar 2020.
[200] R. Verma et al. Multi-organ Nuclei Segmentation and Classification Challenge 2020. (February):1–3, 2020.
[201] M. Peikari et al. Automatic cellularity assessment from post-treated breast surgical specimens. Cytometry Part A, 91(11):1078–1087, 2017.
[202] A. L. Martel et al. Assessment of Residual Breast Cancer Cellularity after Neoadjuvant Chemotherapy using Digital Pathology [Data set], 2019.
[203] P. Leavey et al. Osteosarcoma data from UT Southwestern/UT Dallas for Viable and Necrotic Tumor Assessment, 2019.
[204] R. Mishra et al. Histopathological diagnosis for viable and non-viable tumor prediction for osteosarcoma using convolutional neural network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 10330 LNBI, pp. 12–23, 2017.
[205] P. Leavey et al. American Society of Pediatric Hematology/Oncology (ASPHO), Palais des congrès de Montréal, Montréal, Canada, April 26-29, 2017. Pediatric Blood & Cancer, 64, 2017.
[206] H. B. Arunachalam et al. Computer aided image segmentation and classification for viable and non-viable tumor identification in osteosarcoma. In Pacific Symposium on Biocomputing, volume 0, pp. 195–206, 2017.
[207] R. Mishra et al. Convolutional neural network for histopathological analysis of osteosarcoma. In Journal of Computational Biology, volume 25, pp. 313–325, 2018.
[208] J. Li et al. Signet Ring Cell Detection with a Semi-supervised Learning Framework. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11492 LNCS:842–854, jul 2019.
[209] Z. Swiderska-Chadaj et al. Learning to detect lymphocytes in immunohistochemistry with deep learning. Medical Image Analysis, 58, 2019.
[210] J. Borovec et al. ANHIR: Automatic Non-Rigid Histological Image Registration Challenge. IEEE Transactions on Medical Imaging, 39(10):3042–3052, 2020.
[211] J. Borovec. BIRL: Benchmark on Image Registration Methods With Landmark Validation. arXiv, dec 2019.
[212] G. Campanella et al. Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nature Medicine, 25(8):1301–1309, 2019.
[213] G. Campanella et al. Breast Metastases to Axillary Lymph Nodes, 2019.
[214] Z. Li et al. Computer-aided diagnosis of lung carcinoma using deep learning – a pilot study. arXiv, mar 2018.
[215] N. Kumar et al. A Multi-Organ Nucleus Segmentation Challenge. IEEE Transactions on Medical Imaging, 39(5):1380–1391, may 2020.
[216] B. S. Veeling et al. Rotation equivariant CNNs for digital pathology. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11071 LNCS:210–218, jun 2018.
[217] B. E. Bejnordi et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA - Journal of the American Medical Association, 318(22):2199–2210, dec 2017.
[218] G. Aresta et al. BACH: Grand challenge on breast cancer histology images. Medical Image Analysis, 56:122–139, aug 2019.
[219] G. Litjens et al. 1399 H&E-stained sentinel lymph node sections of breast cancer patients: The CAMELYON dataset. GigaScience, 7(6), jun 2018.
[220] P. Bándi et al. From Detection of Individual Metastases to Classification of Lymph Node Status at the Patient Level: The CAMELYON17 Challenge. IEEE Transactions on Medical Imaging, 38(2):550–560, feb 2019.
[221] V. Ulman et al. An objective comparison of cell-tracking algorithms. Nature Methods, 14(12):1141–1152, dec 2017.
[222] D. Wei et al. MitoEM Dataset: Large-Scale 3D Mitochondria Instance Segmentation from EM Images. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 12265 LNCS, pp. 66–76, 2020.
[223] C. Matek et al. Human-level recognition of blast cells in acute myeloid leukaemia with convolutional neural networks. Nature Machine Intelligence, 1(11):538–544, nov 2019.
[224] C. Matek et al. A Single-cell Morphological Dataset of Leukocytes from AML Patients and Non-malignant Controls [Data set], 2019.
[225] Y. Pan et al. Neighborhood-Correction Algorithm for Cells. Lecture Notes in Bioengineering. Springer Singapore, Singapore, 2019.
[226] S. Gehlot et al. SDCT-AuxNetθ: DCT augmented stain deconvolutional CNN with auxiliary classifier for cancer diagnosis. Medical Image Analysis, 61:101661, apr 2020.
[227] S. Goswami et al. Heterogeneity loss to handle intersubject and intrasubject variability in cancer. arXiv, mar 2020.
[228] A. Gupta et al. PCSEG: Color model driven probabilistic multiphase level set based tool for plasma cell segmentation in multiple myeloma. PLoS ONE, 13(12), 2018.
[229] R. Gupta et al. Stain Color Normalization and Segmentation of Plasma Cells in Microscopic Images as a Prelude to Development of Computer Assisted Automated Disease Diagnostic Tool in Multiple Myeloma. Clinical Lymphoma Myeloma and Leukemia, 17(1):e99, feb 2017.
[230] R. Gupta and A. Gupta. MiMM_SBILab Dataset: Microscopic Images of Multiple Myeloma, 2019.
[231] R. Duggal et al. SD-Layer: Stain deconvolutional layer for CNNs in medical microscopic imaging. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10435 LNCS:435–443, 2017.
[232] R. Duggal et al. Overlapping cell nuclei segmentation in microscopic images using deep belief networks. In ACM International Conference Proceeding Series, 2016.
[233] N. Honomichl. SN-AM Dataset: White Blood cancer dataset of B-ALL and MM for stain normalization, 2019.
[234] L. Jin et al. Deep-learning-assisted detection and segmentation of rib fractures from CT scans: Development and validation of FracNet. EBioMedicine, 62:103106, dec 2020.
[235] M. T. Löffler et al. A Vertebral Segmentation Dataset with Fracture Grading. Radiology: Artificial Intelligence, 2(4):e190138, 2020.
[236] A. Sekuboyina et al. VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images. arXiv, jan 2020.
[237] A. A. Yorke et al. Pelvic Reference Data, 2019.
[238] N. Bien et al. Deep-learning-assisted diagnosis for knee magnetic resonance imaging: Development and retrospective validation of MRNet. PLoS Medicine, 15(11):e1002699, nov 2018.
[239] P. Rajpurkar et al. MURA: Large dataset for abnormality detection in musculoskeletal radiographs. arXiv, dec 2017.
[240] Y. Song et al. Bone texture characterization with fisher encoding of local descriptors. In Proceedings - International Symposium on Biomedical Imaging, volume 2015-July, pp. 5–8. IEEE, apr 2015.
[241] A. Sekuboyina et al. Labeling Vertebrae with Two-dimensional Reformations of Multidetector CT Images: An Adversarial Approach for Incorporating Prior Knowledge of Spine Anatomy. Radiology: Artificial Intelligence, 2(2):e190074, 2020.
[242] B. Cassidy et al. DFUC 2020: Analysis towards diabetic foot ulcer detection. arXiv, apr 2020.
[243] International Skin Imaging Collaboration. SIIM-ISIC 2020 Challenge Dataset, 2020.
[244] P. Tschandl et al. Data descriptor: The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data, 5(1):180161, dec 2018.
[245] N. Codella et al. Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the international skin imaging collaboration (ISIC). arXiv, feb 2019.
[246] N. C. Codella et al. Skin lesion analysis toward melanoma detection: A challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). Proceedings - International Symposium on Biomedical Imaging, 2018-April:168–172, oct 2018.
[247] R. B. Ger et al. Synthetic head and neck and phantom images for determining deformable image registration accuracy in magnetic resonance imaging. Medical Physics, 45(9):4315–4321, sep 2018.
[248] R. Ger et al. Data from Synthetic and Phantom MR Images for Determining Deformable Image Registration Accuracy (MRI-DIR), 2018.
[249] R. Ger et al. Data from CT Phantom Scans for Head, Chest, and Controlled Protocols on 100 Scanners (CC-Radiomics-Phantom-3), 2019.
[250] R. B. Ger et al. Comprehensive Investigation on Controlling for CT Imaging Variabilities in Radiomics Studies. Scientific Reports, 8(1):13047, dec 2018.
[251] M. Shafiq-ul-Hassan et al. Credence Cartridge Radiomics Phantom CT Scans with Controlled Scanning Approach (CC-Radiomics-Phantom-2), 2018.
[252] D. Mackin et al. Data From Credence Cartridge Radiomics Phantom CT Scans, 2017.
[253] P. Muzi et al. Data From RIDER_PHANTOM_PET-CT, 2015.
[254] N. Takata et al. Mouse_awake_rest, 2020.
[255] N. A. Nadkarni et al. MouseLemurAtlas_MRIraw, 2019.
[256] N. A. Nadkarni et al. A 3D population-based brain atlas of the mouse lemur primate with examples of applications in aging studies and comparative anatomy. NeuroImage, 185:85–95, jan 2019.
[257] N. C. I. C. P. T. A. Consortium. Radiology Data from theClinical Proteomic Tumor Analysis Consortium Glioblas-toma Multiforme [CPTAC-GBM] collection, 2018.
[258] G. Muñoz-Gil et al. AnDi: The anomalous diffusion chal-lenge, 2020.
[259] A. Dalca et al. Learn2Reg - The Challenge, 2020.[260] L. Zhang et al. MeDaS: An open-source platform as ser-
vice to help break the walls between medicine and infor-matics. arXiv, jul 2020.
[261] A. Demircioğlu et al. Detecting the pulmonary trunk inCT scout views using deep learning. Scientific Reports,11(1):10215, dec 2021.
[262] J. K. Kim et al. Prediction of ambulatory outcome in pa-tients with corona radiata infarction using deep learning.Scientific Reports, 11(1):7989, dec 2021.
[263] J. Yim et al. Predicting conversion to wet age-relatedmacular degeneration using deep learning. NatureMedicine, 26(6):892–899, jun 2020.
[264] L. Zhang et al. Block Level Skip Connections across Cas-caded V-Net for Multi-Organ Segmentation. IEEE Trans-actions on Medical Imaging, 39(9):2782–2793, 2020.
[265] J. Deng et al. ImageNet: A large-scale hierarchical imagedatabase. In 2009 IEEE Conference on Computer Visionand Pattern Recognition, pp. 248–255. IEEE, jun 2010.
[266] Y. LeCun et al. Gradient-based learning applied to docu-ment recognition. Proceedings of the IEEE, 86(11):2278–2323, 1998.
[267] T. Y. Lin et al. Microsoft COCO: Common objects incontext. Lecture Notes in Computer Science (includingsubseries Lecture Notes in Artificial Intelligence and Lec-ture Notes in Bioinformatics), 8693 LNCS(PART 5):740–755, may 2014.
[268] P. Li et al. A Dataset of Pulmonary Lesions WithMultiple-Level Attributes and Fine Contours. Frontiersin Digital Health, 2, feb 2021.
[269] A. K. Mondal et al. Few-shot 3D multi-modal medicalimage segmentation using generative adversarial learning.arXiv, oct 2018.
[270] M. Rezaei and M. Shahidi. Zero-Shot Learning and ItsApplications From Autonomous Vehicles to Covid-19 Di-agnosis: A Review. arXiv, apr 2020.
[271] M. J. Sheller et al. Multi-institutional deep learning mod-eling without sharing patient data: A feasibility study onbrain tumor segmentation. In Lecture Notes in Com-puter Science (including subseries Lecture Notes in Arti-ficial Intelligence and Lecture Notes in Bioinformatics),volume 11383 LNCS, pp. 92–104. 2019.
[272] H. Liang et al. Evaluation and accurate diagnoses ofpediatric diseases using artificial intelligence. NatureMedicine, 25(3):433–438, mar 2019.
[273] N. Viani et al. A natural language processing approachfor identifying temporal disease onset information frommental healthcare text. Scientific Reports, 11(1):757, dec2021.
[274] Y. Kim et al. Validation of deep learning natural lan-guage processing algorithm for keyword extraction frompathology reports in electronic health records. ScientificReports, 10(1):20265, dec 2020.
[275] W. Shao et al. Deep active learning for nucleus classifica-tion in pathology images. In Proceedings - InternationalSymposium on Biomedical Imaging, volume 2018-April,pp. 199–202. IEEE, apr 2018.
[276] J. Carse and S. McKenna. Active Learning for Patch-Based Digital Pathology Using Convolutional Neural Net-works to Reduce Annotation Costs. In Lecture Notes inComputer Science (including subseries Lecture Notes inArtificial Intelligence and Lecture Notes in Bioinformat-ics), volume 11435 LNCS, pp. 20–27. 2019.