Methodology and tools for creating training samples for artificial intelligence systems for recognizing lung cancer on CT images
- Authors: Kulberg N.S.1,2, Gusev M.A.1,3, Reshetnikov R.V.1,4, Elizarov A.B.1, Novik V.P.1, Prokudaylo S.B.1, Philippovich Y.N.3, Gobmolevsky V.A.1, Vladzymyrskyy A.V.1, Kamynina N.N.5, Morozov S.P.1
- Affiliations:
- Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
- Federal Research Center «Computer Science and Control» of Russian Academy of Sciences
- Moscow Polytechnic University
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University
- Research Institute for Healthcare Organization and Medical Management of Moscow Healthcare Department
- Issue: Vol 64, No 6 (2020)
- Pages: 343-350
- Section: PROBLEMS OF SOCIALLY SIGNIFICANT DISEASES
- Submitted: 25.10.2024
- URL: https://rjonco.com/0044-197X/article/view/637940
- DOI: https://doi.org/10.46563/0044-197X-2020-64-6-343-350
- ID: 637940
Abstract
Introduction. Medical imaging can detect many diseases at early stages of their development, improving patient survival. Artificial intelligence (AI) systems are a promising means of improving the quality of diagnostics, but they require high-quality annotated and marked-up sets of medical images.
The purpose of the study was to develop a methodology and software for creating training sets for AI systems.
Material and methods. To develop an optimal approach, we compared the performance and accuracy of the main annotation methods and based the information system on the method that was most efficient on both criteria. To mark up objects of interest, we used the cluster model of lesion localization previously developed by the authors. The software was developed in C++ and Kotlin.
Results. A structured annotation template with an accompanying glossary of terms became the basis of the information system. The system consists of three interacting modules: two run on a remote server, and one runs on the end user’s personal computer or mobile device. The first module is a web service responsible for the workflow logic. The second module, a web server, is responsible for interacting with client applications: it identifies users and manages the database and Picture Archiving and Communication System (PACS) connections. The front-end module is a web application with a graphical interface that assists the end user in image markup and annotation.
Conclusions. An algorithmic basis and a software package have been created for annotation and markup of CT images. The resulting information system was used in a large-scale lung cancer screening project for the creation of medical imaging datasets.
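Illustrative sketch. The abstract names a structured annotation template backed by a glossary of terms as the basis of the system but does not give its layout. The Python fragment below is only a minimal sketch of that general idea, assuming a template is a fixed set of fields and the glossary lists the terms allowed for each field. The names `GLOSSARY` and `Annotation`, the field names, and the terms are all hypothetical and are not taken from the article or its software, which was written in C++ and Kotlin.

```python
from dataclasses import dataclass, field

# Hypothetical glossary: template field -> allowed terms.
# Field names and terms are illustrative assumptions, not the article's glossary.
GLOSSARY = {
    "lesion_type": {"solid", "part-solid", "ground-glass"},
    "location": {"left lung", "right lung"},
}


@dataclass
class Annotation:
    """One structured annotation for a single study (hypothetical layout)."""
    study_id: str
    values: dict = field(default_factory=dict)

    def set(self, name: str, term: str) -> None:
        """Record a term for a template field, rejecting terms outside the glossary."""
        if name not in GLOSSARY:
            raise KeyError(f"unknown template field: {name}")
        if term not in GLOSSARY[name]:
            raise ValueError(f"term {term!r} is not in the glossary for {name}")
        self.values[name] = term

    def missing_fields(self) -> list:
        """Return the template fields not yet filled in, sorted by name."""
        return sorted(set(GLOSSARY) - set(self.values))


if __name__ == "__main__":
    ann = Annotation("study-001")
    ann.set("lesion_type", "solid")
    print(ann.missing_fields())
```

Restricting each field to glossary terms is one way to keep annotations from different readers comparable; whether the published system enforces the glossary this way is not stated in the abstract.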
About the authors
Nikolay S. Kulberg
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department; Federal Research Center «Computer Science and Control» of Russian Academy of Sciences
Author for correspondence.
Email: kulberg@npcmr.ru
ORCID iD: 0000-0001-7046-7157
MD, Ph.D., head of the Department, Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies, Moscow, 109029, Russia.
Maxim A. Gusev
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department; Moscow Polytechnic University
Email: noemail@neicon.ru
ORCID iD: 0000-0001-8864-8722
Russian Federation
Roman V. Reshetnikov
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department; Institute of Molecular Medicine, Sechenov First Moscow State Medical University
Email: noemail@neicon.ru
ORCID iD: 0000-0002-9661-0254
Russian Federation
Alexey B. Elizarov
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Email: noemail@neicon.ru
ORCID iD: 0000-0003-3786-4171
Russian Federation
Vladimir P. Novik
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Email: noemail@neicon.ru
ORCID iD: 0000-0002-6752-1375
Russian Federation
Sergey B. Prokudaylo
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Email: noemail@neicon.ru
ORCID iD: 0000-0003-0970-3645
Russian Federation
Yuriy N. Philippovich
Moscow Polytechnic University
Email: noemail@neicon.ru
ORCID iD: 0000-0001-9419-2282
Russian Federation
Victor A. Gobmolevsky
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Email: noemail@neicon.ru
ORCID iD: 0000-0003-1816-1315
Russian Federation
Anton V. Vladzymyrskyy
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Email: noemail@neicon.ru
ORCID iD: 0000-0002-2990-7736
Russian Federation
Natalya N. Kamynina
Research Institute for Healthcare Organization and Medical Management of Moscow Healthcare Department
Email: noemail@neicon.ru
ORCID iD: 0000-0002-0925-5822
Russian Federation
Sergey P. Morozov
Research and Practical Clinical Center for Diagnostics and Telemedicine Technologies of the Moscow Health Care Department
Email: noemail@neicon.ru
ORCID iD: 0000-0001-6545-6170
Russian Federation
