|
1. INTRODUCTION
In chest imaging diagnosis, chest CT images are the most commonly used medical imaging data; they display and localise important information such as the site and extent of lesions [1-2]. In a chest examination, doctors review a patient's chest CT images to determine his or her condition. For patients with diseases such as lung cancer, doctors usually need to observe the growth and extent of the tumour, together with lung function and other imaging indicators, to assess the condition. In CT images, the asymmetry or irregularity of lung lesions makes their location vary from region to region, and the shape of lung lesions also changes with the flow of gas in the lungs, so information overlaps between different regions. In computer-aided diagnosis, deep learning (DL) methods are mainly used for computer vision tasks such as image classification and recognition. DL algorithms have also been widely used in medical imaging, for example by combining DL techniques with medical image processing algorithms for tumour localisation, determination of whether a tumour is benign or malignant, and the diagnosis of other diseases [3-4]. In a related study, Ilyas et al. developed a new patient-specific anatomical context and shape prior (PACS)-aware 3D recurrent registration-segmentation network for the segmentation of longitudinal thoracic CBCT [5]. The segmentation and registration networks were trained simultaneously in an end-to-end framework and implemented with convolutional long short-term memory (LSTM) models. The registration network was trained in an unsupervised manner using pairs of planning CT (pCT) and CBCT images and produces a progressively deformed image sequence. The segmentation network was optimised in a one-shot setup by combining the progressively deformed pCT (anatomical context) and the pCT delineations (shape context) with the CBCT images. John et al.
proposed a new DL framework for multi-label classification of chest diseases in chest X-ray images, which explores discriminative information in the lung and heart regions [6]. A feature extractor equipped with a multi-scale attention module was designed to learn global attention maps from global images, so that the network efficiently exploits the pathological regions that contain the main clues in chest radiographs. Their comprehensive experiments show that the approach achieves superior performance compared with state-of-the-art methods, and the network has been used in clinical screening to assist radiologists. Chest X-rays represent a large proportion of radiological examinations, so there is value in exploring additional ways to improve performance. Lung cancer is one of the most common malignant tumours in China, accounting for about 40% of all malignant tumours. Because early-stage lung cancer lacks obvious symptoms and the tumour is small, many patients are already in the middle to late stages when their disease is detected. X-ray examination alone cannot determine the extent of the lesion or whether further treatment is needed. On this basis, this paper develops and clinically validates a DL-based system to assist in the detection of lung cancer. A data-driven artificial intelligence model is constructed using a convolutional neural network (CNN) to assist doctors in diagnosing lung cancer and other malignant diseases. The results are then used as input to the model to predict the patient's condition.
2. DESIGN STUDIES
2.1 Deep Neural Network (DNN) application problems
The development of big data has driven the rise of DL. Unlike natural image datasets, however, medical data is difficult to collect in a standardised way, and data annotation requires a high degree of specialisation and is expensive, which makes large-scale annotated data very difficult to obtain.
In addition, feature learning in deep networks is closely tied to the class distribution of the dataset, and very limited, unbalanced medical data is prone to problems such as biased feature selection and overfitting. The application of DNNs to fine-grained, high-standard chest X-ray aided diagnosis therefore still faces the following two critical problems [7-8].
2.1.1 Inadequate sample of labelled data
The first problem is that the network struggles to fit features when labelled samples are insufficient. Although ChestX-Ray14 is a relatively large medical dataset with about 110,000 images, it is small compared with the natural image dataset ImageNet, which has about 10 million images, so networks trained on medical image datasets still suffer from poor generalisation and overfitting. If the target dataset is too small, fine-tuning all layers may overfit the network; fine-tuning only the fully connected layer, on the other hand, may prevent the network from accurately extracting the semantic features of chest lesions, because the source and target domains are not very similar, and so leads to poor expressiveness.
2.1.2 Uneven distribution of case data
Normal X-ray images generally outnumber those containing lesions. In addition, owing to factors such as the complexity and diversity of disease pathogenesis, the sample distribution for certain diseases may be significantly biased, with large disparities in the number of samples between disease categories. The problem of data imbalance is currently addressed at two main levels [9-10].
2.2 Thoracic medical imaging assisted diagnosis system
The thoracic medical imaging assisted diagnosis system in this paper consists of the following main areas.
2.3 Cross-Entropy loss function
Cross entropy is often used as a loss function in DL. The loss function reflects the distance between the output value y' and the label value y; a smaller gap indicates a better fit. In the logistic regression task, let the input be x and the output be y'. With weights w and bias b, the linear regression model can be abbreviated as
$$z = w^{T}x + b \tag{1}$$
and a non-linear mapping function from the input space to the output space is defined as in equation (2).
$$y' = \sigma(z) = \frac{1}{1 + e^{-z}} \tag{2}$$
For a two-class problem, where the label value y takes the value 0 or 1, the posterior probability at y = 1 can first be defined as
$$P(y = 1 \mid x) = y' \tag{3}$$
The posterior probability at y = 0 is then defined as
$$P(y = 0 \mid x) = 1 - y' \tag{4}$$
In summary, the posterior probability P(y|x) can be defined as follows.
$$P(y \mid x) = (y')^{y}\,(1 - y')^{1 - y} \tag{5}$$
According to maximum likelihood estimation, if all samples are independent and identically distributed, a set of parameters can be found that maximises the value of equation (5). Since the logarithmic function is monotonically increasing, maximising P(y|x) is equivalent to maximising log(P(y|x)), so taking the logarithm of equation (5) yields
$$\log P(y \mid x) = y \log y' + (1 - y)\log(1 - y') \tag{6}$$
The objective of logistic regression is to maximise this function, which reflects the model's performance error. The loss function can therefore be taken as the negative of the above function, and for m samples the loss function is the cross-entropy function.
$$L = -\frac{1}{m}\sum_{i=1}^{m}\left[\,y_i \log y'_i + (1 - y_i)\log(1 - y'_i)\,\right] \tag{7}$$
3. EXPERIMENTAL RESEARCH
3.1 Experimental environment and parameter settings
PyTorch is a commonly used DL framework in which users can design their own network models and have gradients computed automatically by back propagation. The reimplementation of the U-Net network framework and the training of the DB-U-Net network are both based on PyTorch. A pre-configured runtime environment and library files are used to run the lung CT image pre-processing algorithm and the graph cut algorithm.
The basic configuration of the experimental environment in this paper is an Intel i5-8400 processor, two NVIDIA GeForce GTX 1070 Ti graphics cards (8 GB of video memory each), 32 GB of RAM, and Ubuntu 16.04 Server Edition. GPU parameters such as compute capability and speed affect network training in the experiments, and using the GPUs reduces the training time of the segmentation network in this paper. The experimental parameters are shown in Table 1. Because the lung CT images used in this paper are relatively large, selecting too many samples per training batch overflows video memory, which affects both the results and the speed of computation. The effectiveness of training depends heavily on the learning rate, which was set in turn to 0.01, 0.001, 0.0001 and 0.00001.
Table 1. Experimental parameter settings.
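To make the effect of the learning rate concrete, the following is a minimal sketch in plain Python rather than the paper's PyTorch code. It trains a toy one-dimensional logistic regression with mini-batch stochastic gradient descent and the cross-entropy loss, once for each of the four learning rates listed above. The synthetic data, the helper names (`make_toy_data`, `train_sgd`) and the epoch count are assumptions made for illustration only; the batch size of 16 follows the setting reported in this section.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cross_entropy(w, b, data, eps=1e-12):
    """Mean binary cross-entropy of the model y' = sigmoid(w*x + b)."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(data)


def make_toy_data(n=256, seed=1):
    """Hypothetical 1-D data: class 1 centred at +2, class 0 at -2."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(2.0 if y == 1 else -2.0, 1.0)
        data.append((x, y))
    return data


def train_sgd(data, lr, batch_size=16, epochs=20, seed=0):
    """Mini-batch SGD from w = b = 0; returns the final mean loss."""
    rng = random.Random(seed)
    samples = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            # Gradient of the cross-entropy w.r.t. z is (y' - y).
            errors = [(sigmoid(w * x + b) - y, x) for x, y in batch]
            grad_w = sum(e * x for e, x in errors) / len(batch)
            grad_b = sum(e for e, _ in errors) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return cross_entropy(w, b, samples)


LEARNING_RATES = [0.01, 0.001, 0.0001, 0.00001]
toy = make_toy_data()
final_losses = {lr: train_sgd(toy, lr) for lr in LEARNING_RATES}

for lr in LEARNING_RATES:
    print(f"lr={lr:<8} final loss={final_losses[lr]:.4f}")
```

On this toy problem the smallest learning rates barely move the loss away from its starting value of log 2 within the fixed number of epochs, which illustrates why the learning rate has to be tuned together with the training budget. The behaviour on a deep segmentation network may differ.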
Stochastic gradient descent is used to train the model in this paper: the model weights are updated in each iteration until training gradually converges to a minimum. In a single iteration, the weights are updated using one batch of samples. The advantages of stochastic gradient descent over other optimisation algorithms are as follows: (1) the network can be trained without loading all the data into memory at once, so less memory and video memory is needed; (2) stochastic gradient descent trains the network faster; (3) within reason, increasing the batch size makes the direction of gradient descent more accurate and convergence faster, since smaller batches can cause the model to fall into other locally optimal solutions. When training the neural network, the batch size was set to 16.
3.2 General flow of the experiment
In this paper, a multi-scale integrated CNN guided by feature interpretability is proposed. It reuses abstract features on the basis of an understanding of deep CNN features, and accurately and automatically classifies true and false recurrence of glioma in DTI images. First, three single classification models for true and false recurrence of glioma were trained separately, using the classical CNN framework as the base model. All feature maps were then visualised layer by layer, and heatmap mapping was used to capture visually the respective "focus" of the different layers of the single classification models. For example, some layers focus more on edge information, while others highlight differences between pixels. On this basis, the imaging practitioner inspects the three single classification models empirically to locate the layer most relevant to each glioma lesion area, which completes the feature selection process visually.
Finally, the selected multi-scale features from the different networks were fused to construct an integrated model of true and false recurrence of glioma. The experimental flow chart is shown in Figure 1.
3.3 Performance evaluation indicators
To verify the accuracy of the algorithm based on DL and feature mixing, its performance was measured by comparing the category to which each sample label belongs with the category predicted by the algorithm, using the evaluation criteria shown in Table 2. Using several criteria overcomes the one-sidedness of evaluating a medical aided-diagnosis algorithm in terms of accuracy alone.
Table 2. Evaluation criteria for detection of lung nodules.
The meanings of the indicators in Table 2 are as follows.
In the lung nodule detection algorithm, the ultimate aim is to extract all the lung nodules in the lung CT image sequence. Since interference from other tissues within the lung often causes a degree of false positives, the experiment uses the accuracy (ACC), sensitivity (Sen) and specificity (Spe) defined by the four criteria mentioned above as a comprehensive measure of the overall discriminatory ability of the algorithm. In the following, TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives. Accuracy measures the overall classification capability of the algorithm; the higher the accuracy, the better the overall classification, as defined in equation (8).
$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$
Sensitivity, also known as true positive rate or recall, measures the ability of the algorithm to identify lung nodule regions, i.e. the proportion of correctly predicted samples among all nodule samples. Higher sensitivity means a lower rate of missed detections, as defined in equation (9).
$$Sen = \frac{TP}{TP + FN} \tag{9}$$
Specificity measures the ability of the algorithm to identify non-nodular regions, i.e. the proportion of correctly predicted samples among all non-nodular samples. Higher specificity means better determination of non-nodular regions, as defined in equation (10).
$$Spe = \frac{TN}{TN + FP} \tag{10}$$
In this paper, we use the FROC (Free-Response Receiver Operating Characteristic) criterion to measure the detection effectiveness of the algorithm by calculating the CPM (competition performance metric) value. The horizontal axis of the FROC curve is the average number of false positives per set of CT images (FPs per scan), and the CPM is the average detection rate at the horizontal coordinates 1/8, 1/4, 1/2, 1, 2, 4 and 8. The following experiments use CPM values as the assessment criterion.
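The indicators above can be sketched in a few lines of Python. This is an illustrative sketch, not the evaluation code used in the paper: the function names are hypothetical, the confusion-matrix counts and FROC operating points are made-up numbers, and linear interpolation between operating points is an assumption about how sensitivities at the seven false-positive rates are read off the curve.

```python
# Seven false-positive rates (FPs per scan) at which the CPM averages sensitivity.
CPM_FP_RATES = (0.125, 0.25, 0.5, 1, 2, 4, 8)


def confusion_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    return acc, sen, spe


def sensitivity_at(fp_rate, froc_points):
    """Linearly interpolate sensitivity at fp_rate.

    froc_points is a list of (fps_per_scan, sensitivity) pairs. Values
    outside the covered range are clamped to the nearest end point.
    """
    pts = sorted(froc_points)
    if fp_rate <= pts[0][0]:
        return pts[0][1]
    if fp_rate >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= fp_rate <= x1:
            if x1 == x0:
                return max(y0, y1)
            return y0 + (y1 - y0) * (fp_rate - x0) / (x1 - x0)
    raise ValueError("fp_rate not covered by froc_points")


def cpm(froc_points):
    """Average sensitivity over the seven predefined false-positive rates."""
    values = [sensitivity_at(rate, froc_points) for rate in CPM_FP_RATES]
    return sum(values) / len(values)


# Hypothetical counts and operating points, for illustration only.
acc, sen, spe = confusion_metrics(tp=90, fp=20, tn=80, fn=10)
example_froc = [(0.125, 0.5), (0.25, 0.6), (0.5, 0.7), (1, 0.8),
                (2, 0.9), (4, 0.9), (8, 1.0)]
example_cpm = cpm(example_froc)

print(f"ACC={acc:.3f} Sen={sen:.3f} Spe={spe:.3f} CPM={example_cpm:.4f}")
```

Averaging over several false-positive rates rewards a detector that keeps its sensitivity when few false positives are allowed, which a single operating point such as 8 FPs per scan does not capture.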
4. EXPERIMENTAL ANALYSIS
4.1 Experimental data sources
The dataset was derived from lung CT image segmentation and comprises 7,500 CT images, of which 6,000 were used for training and validation: 4,800 in the training set and 1,200 in the validation set. The training and validation data were all collected from the National Centre for Biological Information and consist of 2,000 normal CT images, 2,000 CT images of viral pneumonia and 2,000 CT images of COVID-19-positive pneumonia. Details of the dataset are shown in Table 3: of the 6,000 lung CT images, 80% form the training set and 20% form the validation set.
Table 3. Lung CT image dataset
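The 80/20 division described above can be reproduced with a short sketch. The paper does not state whether the split was stratified by class; the version below assumes it was, so that each of the three classes contributes 1,600 training and 400 validation images. The label strings and the function name `stratified_split` are hypothetical.

```python
import random
from collections import Counter


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train/validation sets, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, val = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = int(round(len(indices) * train_fraction))
        train.extend(indices[:cut])
        val.extend(indices[cut:])
    return train, val


# 2,000 images per class, matching the class counts reported in the text.
labels = ["normal"] * 2000 + ["viral_pneumonia"] * 2000 + ["covid19"] * 2000
train_idx, val_idx = stratified_split(labels)

train_counts = Counter(labels[i] for i in train_idx)
val_counts = Counter(labels[i] for i in val_idx)
print(len(train_idx), len(val_idx), dict(train_counts), dict(val_counts))
```

Splitting per class keeps the class ratio identical in both subsets, which matters less here because the three classes are balanced, but becomes important for the imbalanced case discussed in Section 2.1.2.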
4.2 Analysis of experimental results
To validate the effectiveness of the proposed lung nodule detection model based on DL and feature mixing, CT image sequences were fed to the network for detection. Patient image data from the test set were selected for the following self-comparison experiments, and the prediction results were quantified statistically by calculating CPM values to measure the performance of the algorithm: (1) Lung nodules were detected using the traditional R-FCN network, and the results were recorded as model 1. Table 4 shows the difference in performance between the three network structures on the LUNA16 dataset during the improvement process, and Figure 2 shows the FROC curves of the three models.
Table 4. Algorithm improvement process detection performance comparison
The change in the average detection rate of the algorithm during the improvement can be seen from Table 4 together with Figure 2. Upgrading ResNet to DenseNet with FPN for feature extraction improved the detection sensitivity of the algorithm and enhanced the recognition performance of the network. Next, the settings of the RPN were changed according to the results of K-means clustering, the feature pyramid structure was introduced, and the network was trained with the focal loss function instead of the traditional cross-entropy cost function. These changes further improve recognition performance and give higher detection accuracy for small-scale nodules, so the network achieves better results. The above experiments show that the detection sensitivity of the algorithm increases throughout the improvement process, and the average detection rate reaches 96.6% when the average number of false positives per set of CT images is 8. The recognition performance of the proposed algorithm is better than that of the original method in the detection of lung nodules. In summary, the following conclusion can be drawn from the above experiments: although lung nodules have complex characteristics such as varying sizes, complex shapes and uncertain locations, the proposed lung nodule detection algorithm based on DL and feature mixing still detects nodules of different scales well and discriminates well in complex lesion areas. The network assigns a high prediction probability not only to larger-diameter lung nodules, reaching a correct judgment, but also to smaller-diameter micro-nodules. Table 5 compares the performance of the algorithm in this paper with Faster R-CNN, the multi-view convolutional neural network model and the multi-view deep belief network for lung nodule detection.
The FROC curves for the four models are shown in Figure 3.
Table 5. Comparison of lung nodule detection performance by model
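Section 4.2 states that the focal loss replaced the traditional cross-entropy cost function during the improvement process. A minimal sketch of the binary focal loss follows, to show how it relates to the cross-entropy of Section 2.3. The focusing parameter `gamma` and the optional class weight `alpha` are not reported in the paper; the default `gamma=2.0` is a commonly used value and is an assumption here, as are the function names and the example probabilities.

```python
import math


def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy of one prediction p = P(y=1|x) against label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def binary_focal_loss(p, y, gamma=2.0, alpha=None, eps=1e-12):
    """Focal loss: cross-entropy scaled by (1 - p_t) ** gamma.

    p_t is the probability assigned to the true class, so confidently
    correct predictions are down-weighted. alpha, if given, weights the
    positive class by alpha and the negative class by 1 - alpha.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    weight = (1.0 - p_t) ** gamma
    if alpha is not None:
        weight *= alpha if y == 1 else 1.0 - alpha
    return -weight * math.log(p_t)


# Hypothetical predictions for two positive samples: one easy, one hard.
easy_ce, hard_ce = binary_cross_entropy(0.95, 1), binary_cross_entropy(0.30, 1)
easy_fl, hard_fl = binary_focal_loss(0.95, 1), binary_focal_loss(0.30, 1)

print(f"easy: CE={easy_ce:.4f} FL={easy_fl:.6f}")
print(f"hard: CE={hard_ce:.4f} FL={hard_fl:.4f}")
```

With `gamma=0` and no `alpha` the focal loss reduces to the cross-entropy. With `gamma>0` the easy sample contributes far less to the total loss than the hard one, which is the usual motivation for using it when small nodules are rare relative to background regions.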
Analysis of Table 5 and Figure 3 shows that the proposed algorithm performs well in the task of lung nodule detection, and its overall sensitivity is better than that of the other algorithms, with a sensitivity of 96.6% at an average of 8 false positives per set of CT images. The above analysis indicates that the improved algorithm outperforms the traditional algorithm in both feature extraction and discrimination, which lays a foundation for the imaging physician's subsequent diagnosis and serves the purpose of assisting the physician.
5. CONCLUSIONS
In this paper, a DL algorithm is designed to segment and classify tumours in lung CT images. The test results show that the algorithm can effectively identify and classify lung tumours and determine whether they are benign or malignant. DL is now widely used in image recognition, medical image analysis and medical artificial intelligence. The lung CT image segmentation and lung disease diagnosis method studied in this paper is a new artificial intelligence technique that applies computer-aided diagnosis (CAD) technology to medical imaging diagnosis; by fusing medical imaging data with clinical experience, it can classify and diagnose lung tumours more accurately and provide more guidance. Compared with traditional image processing algorithms, the DL algorithm presents clinical information more comprehensively and in more detail.
REFERENCES
Patel, V., Li, C. H., Rye, V., Liu, C. S. J., Lerner, A., Acharya, J., Rajamohan, A. G.,
“A Comparison of WebRTC and Conventional Videoconferencing for Synchronized Remote Medical Image Presentation,” J. Digit. Imaging, 35(1), 68–76 (2022). https://doi.org/10.1007/s10278-021-00544-0
Soomro, T. A., Zheng, L., Afifi, A. J., Ali, A., Yin, M., Gao, J., “Artificial intelligence (AI) for medical imaging to combat coronavirus disease (COVID-19): a detailed review with direction for future research,” Artif. Intell. Rev., 55(2), 1409–1439 (2022). https://doi.org/10.1007/s10462-021-09985-z
Ghasemi, M., Kelarestaghi, M., Eshghi, F., Sharifi, A., “D3FC: deep feature-extractor discriminative dictionary-learning fuzzy classifier for medical imaging,” Appl. Intell., 52(7), 7201–7217 (2022). https://doi.org/10.1007/s10489-021-02781-w
Avola, D., Cinque, L., Fagioli, A., Foresti, G., Mecca, A., “Ultrasound Medical Imaging Techniques: A Survey,” ACM Comput. Surv., 54(3), 67:1–67:38 (2022). https://doi.org/10.1145/3447243
Sirazitdinov, I., Schulz, H., Saalbach, A., Renisch, S., Dylov, D. V., “Tubular shape aware data generation for segmentation in medical imaging,” Int. J. Comput. Assist. Radiol. Surg., 17(6), 1091–1099 (2022). https://doi.org/10.1007/s11548-022-02621-3
Valen, J., Balki, I., Mendez, M., Qu, W., Levman, J., Bilbily, A., Tyrrell, P. N., “Quantifying uncertainty in machine learning classifiers for medical imaging,” Int. J. Comput. Assist. Radiol. Surg., 17(4), 711–718 (2022). https://doi.org/10.1007/s11548-022-02578-3
Ravi, P., Chepelev, L. L., Stichweh, G. V., Jones, B. S., Rybicki, F. J., “Medical 3D Printing Dimensional Accuracy for Multi-pathological Anatomical Models 3D Printed Using Material Extrusion,” J. Digit. Imaging, 35(3), 613–622 (2022). https://doi.org/10.1007/s10278-022-00614-x
Kidav, J., Pillai, P. M., Deepak, V., Sreejeesh S. G., “Design of a 128-channel transceiver hardware for medical ultrasound imaging systems,” IET Circuits Devices Syst., 16(1), 92–104 (2022). https://doi.org/10.1049/cds2.v16.1
Benazzouz, M., Benomar, M. L., Moualek, Y., “Modified U-Net for cytological medical image segmentation,” Int. J. Imaging Syst. Technol., 32(5), 1761–1773 (2022). https://doi.org/10.1002/ima.v32.5
Kollem, S., Ramalinga Reddy, K., Srinivasa Rao, D., Rajendra Prasad, C., Malathy, V., Ajayan, J., Muchahary, D., “Image denoising for magnetic resonance imaging medical images using improved generalized cross-validation based on the diffusivity function,” Int. J. Imaging Syst. Technol., 32(4), 1263–1285 (2022). https://doi.org/10.1002/ima.v32.4
Kumar, A., Goodrum, H., Kim, A., Stender, C., Roberts, K., Bernstam, E. V., “Closing the loop: automatically identifying abnormal imaging results in scanned documents,” J. Am. Medical Informatics Assoc., 29(5), 831–840 (2022). https://doi.org/10.1093/jamia/ocac007
Valtchinov, V. I., Murphy, S. N., Lacson, R., Ikonomov, N., Zhai, B. K., Andriole, K., et al., “Analytics to monitor the local impact of the Protecting Access to Medicare Act’s imaging clinical decision support requirements,” J. Am. Medical Informatics Assoc., 29(11), 1870–1878 (2022). https://doi.org/10.1093/jamia/ocac132
|