Accurately segmenting complete teeth from CBCT images is crucial not only in orthodontics but also in forensic science. However, fully automated tooth segmentation is challenging due to the tight interconnection of teeth and the complexity of their arrangement, as well as the difficulty of distinguishing them from the surrounding alveolar bone, which has a similar density. U-Net-based approaches have demonstrated remarkable success across a spectrum of medical image processing tasks, particularly segmentation. This work compares several U-Net-based segmentation methods (U-Net, U-Net++, U2-Net, nnU-Net, and TransUNet) on clinical tooth segmentation. We assess the enhancements these networks introduce over the original U-Net and validate their performance on the same dataset. Experimental results, both qualitative and quantitative, reveal that all methods perform well, with TransUNet achieving the best performance with a Dice coefficient of 0.9364. Notably, U-Net, serving as the foundational model, outperforms U-Net++ and U2-Net, highlighting its robust generalization capability.
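The Dice coefficient used as the evaluation metric above can be computed as follows. This is a generic sketch of the standard metric on binary masks, not code from the paper; the toy masks are illustrative only:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

# Toy 4x4 prediction and ground-truth masks (hypothetical values).
pred = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
gt   = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
score = dice_coefficient(pred, gt)  # 2*3 / (4 + 3) ≈ 0.857
```

A perfect segmentation gives a score of 1.0, which is why values such as 0.9364 indicate very close agreement with the annotation.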
The segmentation of pulmonary arteries and veins in computed tomography scans is crucial for the diagnosis and assessment of pulmonary diseases. This paper discusses the challenges in segmenting these vascular structures, such as the classification of terminal pulmonary vessels relying on information from distant root vessels, and the complex branches and crossings of arteriovenous vessels. To address these difficulties, we introduce a fully automatic segmentation method that utilizes multiple 3D residual U-block modules, a semantic embedding module, and a semantic perception module (SPM). The 3D residual U-block modules extract multi-scale features under a high receptive field, the semantic embedding module embeds semantic information to help the network exploit the anatomical parallelism of pulmonary arteries and bronchi, and the SPM perceives semantic information and decodes it into classification results for pulmonary arteries and veins. Our approach was evaluated on a dataset of 57 lung CT scans and demonstrated competitive performance compared to existing medical image segmentation models.
Monocular depth estimation is a widely studied task. Because ground-truth depth labels are difficult to obtain for the bronchus, and bronchial images are characterized by scarce texture, smooth surfaces, and many holes, bronchial depth estimation faces many challenges. Hence, we propose to use a ray tracing algorithm to generate virtual images along with their corresponding depth maps, and use them to train an asymmetric encoder-decoder transformer network for bronchial depth estimation. We propose an edge-aware unit to enhance awareness of the internal bronchial structure, considering that the bronchus has few texture features but many edges and holes, and an asymmetric encoder-decoder for multi-layer feature fusion. Experimental results on virtual bronchial images demonstrate that our method achieves the best results on several metrics, including an MAE of 0.915 ± 0.596 and an RMSE of 1.471 ± 1.097.
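The MAE and RMSE metrics reported above are the standard per-pixel depth errors; a minimal sketch (the toy depth maps are hypothetical, not the paper's data):

```python
import numpy as np

def depth_errors(pred, gt):
    """Mean absolute error and root-mean-square error between a
    predicted depth map and its ground-truth counterpart."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    mae = np.mean(np.abs(pred - gt))
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    return mae, rmse

# Illustrative 2x2 depth maps (arbitrary units).
pred = np.array([[1.0, 2.5], [3.0, 4.0]])
gt   = np.array([[1.2, 2.0], [3.5, 4.0]])
mae, rmse = depth_errors(pred, gt)  # mae = 0.3, rmse = sqrt(0.135)
```

RMSE penalizes large outlier errors more heavily than MAE, which is why both are typically reported together.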
Optical coherence tomography (OCT) is a non-invasive imaging modality well suited to assessing retinal diseases. Since the thickness and shape of the retinal layers are diagnostic indicators for many ophthalmic diseases, segmentation of the retinal layers in OCT images is a critical step. Many efforts have been made toward automated segmentation of OCT images, but challenges remain, such as a lack of context information, ambiguous boundaries, and inconsistent predictions in retinal lesion regions. In this work, we propose a new framework, Densely Encoded Attention Networks (DEAN), that combines dense encoders with position attention in a U-shaped architecture for retinal layer segmentation. Since the spatial position of each layer in an OCT image is relatively fixed, we use convolution within dense connections to obtain diverse feature maps in the encoder and employ position attention to improve the spatial information of the learning targets. Moreover, up-sampling and skip connections in the decoder restore resolution using the position indices saved during down-sampling, while supplementing the corresponding pixels guides the network to capture global context information. The method is evaluated on two public datasets, and the results demonstrate that it is an effective strategy for improving retinal layer segmentation performance.
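The idea of restoring resolution from position indices saved during down-sampling can be sketched with a max-pool/unpool pair. This is a generic illustration of the mechanism (as in SegNet-style index unpooling), not the paper's implementation:

```python
import numpy as np

def max_pool_with_indices(x, k=2):
    """k x k max-pool that also returns the flat index of each maximum,
    mimicking the position indices saved during down-sampling."""
    h, w = x.shape
    out = np.zeros((h // k, w // k))
    idx = np.zeros((h // k, w // k), dtype=int)
    for i in range(0, h, k):
        for j in range(0, w, k):
            window = x[i:i + k, j:j + k]
            r, c = np.unravel_index(np.argmax(window), window.shape)
            out[i // k, j // k] = window[r, c]
            idx[i // k, j // k] = (i + r) * w + (j + c)
    return out, idx

def max_unpool(pooled, idx, shape):
    """Restore resolution by scattering pooled values back to the
    positions recorded in idx; all other positions stay zero."""
    out = np.zeros(shape).ravel()
    out[idx.ravel()] = pooled.ravel()
    return out.reshape(shape)

x = np.array([[1., 2., 0., 1.],
              [3., 0., 4., 0.],
              [0., 5., 1., 0.],
              [6., 0., 0., 2.]])
pooled, idx = max_pool_with_indices(x)      # [[3, 4], [6, 2]]
restored = max_unpool(pooled, idx, x.shape)  # maxima back in place
```

Unlike learned transposed convolution, index unpooling puts each activation back exactly where it came from, preserving boundary locations, which matters for thin retinal layers.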
Pulmonary vessel segmentation from CT images is essential to the diagnosis and treatment of lung diseases, particularly in treatment planning and clinical outcome evaluation. The main challenges for pulmonary vessel segmentation are the complicated structure of the vascular trees and their similar intensity values to other tissues such as the tracheal wall and lung nodules. This paper presents a novel relation-extractor U-shaped network combining convolution and a self-attention mechanism in an encoder-decoder mode. In particular, we employ convolution in the shallow layers to extract short-range local information about vessels and apply self-attention in the deep layers to capture long-range contextual relationships between ancestors and descendants of the vascular tree. We evaluate our proposed method on 50 computed tomography volumes, with the experimental results showing that our method improves the average Dice coefficient and recall to 85.60 and 86.04, respectively.
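The long-range modeling that self-attention adds in the deep layers can be illustrated with a single-head scaled dot-product attention over a small set of feature vectors. This is a generic sketch of the standard mechanism with made-up dimensions, not the paper's network:

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention: every token
    (here, a deep-layer feature vector) attends to every other token,
    giving the long-range context that plain convolution lacks."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over each row of attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n_tokens, d = 16, 8  # e.g. 16 flattened deep-feature positions
x = rng.standard_normal((n_tokens, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)  # same shape as x
```

Because every position attends to every other, a terminal-vessel feature can draw on a distant root-vessel feature in a single layer, whereas convolution would need many stacked layers to cover that distance.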
To handle multitask segmentation, detection, and classification of colon polyps, and to address the clinical problems of small polyps blending into similar backgrounds, missed detections, and difficult classification, we implement a computer-aided method that supports early diagnosis and correct treatment in gastrointestinal endoscopy. We apply a residual U-structure network with image processing to segment polyps, and a Dynamic Attention Deconvolutional Single Shot Detector (DAD-SSD) to classify various polyps on colonic narrow-band images. The residual U-structure network is a two-level nested U-structure able to capture more contextual information, and the image processing alleviates segmentation problems. DAD-SSD consists of an Attention Deconvolutional Module (ADM) and a Dynamic Convolutional Prediction Module (DCPM) that extract and fuse context features. We evaluated the method on narrow-band images, and the experimental results validate its effectiveness for such multi-task detection and classification. In particular, the mean average precision (mAP) and accuracy, at 76.55% and 74.4% respectively, are superior to the other methods in our experiment.
The large variation in shape and location of the pancreas and the complex background of many neighboring tissues hinder pancreas segmentation and thus the early detection and diagnosis of pancreatic diseases. The U-Net family has achieved great success in various medical image processing tasks such as segmentation and classification. This work comparatively evaluates 2D U-Net, 2D U-Net++, and 2D U-Net3+ for CT pancreas segmentation. Moreover, we modify the U-Net series by replacing standard convolution with depthwise separable convolution (DWC). Without DWC, U-Net3+ works better than the other two networks and achieves an average Dice similarity coefficient of 0.7555. Notably, we find that U-Net plus the simple DWC module outperforms both U-Net++ with its redesigned dense skip connections and U-Net3+ with its full-scale skip connections and deep supervision, obtaining an average Dice similarity coefficient of 0.7613. More interestingly, the U-Net series plus DWC significantly reduces the number of training parameters from (39.4M, 47.2M, 27.0M) to (14.3M, 18.4M, 3.15M), respectively, while also improving the Dice similarity compared to standard convolution.
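The parameter savings from DWC follow from simple counting: a standard convolution needs one k×k kernel per input/output channel pair, while DWC factorizes this into a per-channel depthwise convolution followed by a 1×1 pointwise convolution. A sketch with illustrative layer sizes (not the papers' exact architectures):

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: one k x k kernel per (in, out) channel pair.
    return c_in * c_out * k * k

def dwc_params(c_in, c_out, k):
    # Depthwise separable: one k x k kernel per input channel (depthwise)
    # plus a 1x1 pointwise convolution that mixes channels.
    return c_in * k * k + c_in * c_out

# Hypothetical deep-layer sizes, chosen only to show the ratio.
c_in, c_out, k = 256, 512, 3
standard = conv_params(c_in, c_out, k)  # 1,179,648
separable = dwc_params(c_in, c_out, k)  # 133,376
```

For 3×3 kernels the reduction approaches a factor of 9 as the channel counts grow, which is consistent with the order-of-magnitude drops in total parameters reported above.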
Kidney segmentation is fundamental for accurate diagnosis and treatment of kidney diseases. Computed tomography urography imaging is commonly used for radiologic diagnosis of patients with urologic disease. Recently, 2D and 3D fully convolutional networks have been widely employed for medical image segmentation. However, most 2D fully convolutional networks do not take inter-slice spatial information into consideration, resulting in incomplete and inaccurate segmentation of targets in 3D volumes, even though this spatial information is truly important for 3D volume segmentation. To tackle these problems, we propose a computed tomography urography kidney segmentation method based on spatiotemporal fully convolutional networks that employ a convolutional long short-term memory (ConvLSTM) network to model inter-slice features of computed tomography urography images. We trained and tested the proposed method on kidney computed tomography urography data. The experimental results demonstrate that it can effectively leverage inter-slice spatial information to achieve better (or comparable) results than current 2D and 3D fully convolutional networks.
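The core of a ConvLSTM is the ordinary LSTM gating, with every matrix multiplication replaced by a convolution so that the hidden state keeps its spatial layout while the recurrence runs over consecutive slices. A minimal single-channel sketch (naive convolution, toy 8×8 feature maps, random weights; not the paper's network):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, w):
    """Naive 'same'-padded 2D convolution for a single channel."""
    k = w.shape[0]
    p = k // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * w)
    return out

def convlstm_step(x, h_prev, c_prev, weights):
    """One ConvLSTM step: standard LSTM gates, but each transform is a
    convolution, so inter-slice memory stays spatially aligned."""
    wxi, whi, wxf, whf, wxo, who, wxg, whg = weights
    i = sigmoid(conv2d_same(x, wxi) + conv2d_same(h_prev, whi))  # input gate
    f = sigmoid(conv2d_same(x, wxf) + conv2d_same(h_prev, whf))  # forget gate
    o = sigmoid(conv2d_same(x, wxo) + conv2d_same(h_prev, who))  # output gate
    g = np.tanh(conv2d_same(x, wxg) + conv2d_same(h_prev, whg))  # candidate
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 3)) * 0.1 for _ in range(8)]
h = c = np.zeros((8, 8))
for _ in range(4):  # iterate over consecutive CT slices
    x = rng.standard_normal((8, 8))
    h, c = convlstm_step(x, h, c, weights)
```

The hidden state `h` after the loop summarizes the slice sequence while remaining an 8×8 map, which is what lets a 2D decoder use inter-slice context.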
Abdominal kidney segmentation plays an essential role in the diagnosis and treatment of kidney diseases, particularly in surgical planning and clinical outcome analysis before and after kidney surgery. It remains challenging to precisely segment the kidneys from CT images: current segmentation approaches still suffer from CT image noise and variations across different CT scans, kidney location discrepancies, pathological morphological diversity among patients, and partial volume artifacts. This paper proposes a fully automatic kidney segmentation method that employs a volumetric convolution driven cascaded V-Net architecture and false positive reduction to precisely extract the kidney regions. We evaluate our method on publicly available kidney CT data. The experimental results demonstrate that our proposed method is promising for accurate kidney segmentation, providing a Dice coefficient of 0.95, better than other approaches, while requiring less computational time.
Endoscopic video sequences provide surgeons with much structural information (e.g., vessels and neurovascular bundles) that guides them to accurately manipulate various surgical tools and avoid surgical risks. Unfortunately, it is difficult for surgeons to intuitively perceive these small structures with tiny pulsation motion on endoscopic images. This work proposes a new endoscopic video motion magnification method that accurately generates amplified pulsation motion that surgeons can intuitively and easily visualize. The proposed method introduces a new temporal filter for the Eulerian motion magnification framework that precisely magnifies the tiny pulsation motion while simultaneously suppressing noise and artifacts in endoscopic videos. We evaluate our approach on surgical endoscopic videos acquired in robotic prostatectomy. The experimental results demonstrate that our proposed temporal filtering method outperforms the filters used in current video motion magnification approaches, providing better visual quality and quantitative assessment than other methods.
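The Eulerian idea underlying such methods is to band-pass filter each pixel's intensity over time around the pulsation frequency and add the amplified band back to the video. A minimal sketch using an ideal FFT band-pass on a synthetic one-pixel "video" (the specific filter in the paper is its contribution and is not reproduced here):

```python
import numpy as np

def temporal_bandpass_magnify(frames, fs, f_lo, f_hi, alpha):
    """Ideal temporal band-pass filter applied per pixel over time,
    then amplification of the filtered band (the core Eulerian step)."""
    frames = np.asarray(frames, dtype=float)
    spec = np.fft.fft(frames, axis=0)                    # per-pixel time FFT
    freqs = np.fft.fftfreq(frames.shape[0], d=1.0 / fs)
    band = (np.abs(freqs) >= f_lo) & (np.abs(freqs) <= f_hi)
    spec[~band] = 0.0                                    # keep pulsation band
    filtered = np.real(np.fft.ifft(spec, axis=0))
    return frames + alpha * filtered                     # amplify and add back

# Synthetic pixel trace: a 1.2 Hz pulsation plus slow illumination drift.
fs = 30.0
t = np.arange(90) / fs
signal = 0.05 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * t
frames = signal.reshape(-1, 1, 1)                        # (time, H, W)
magnified = temporal_bandpass_magnify(frames, fs, 0.8, 2.0, alpha=10.0)
```

An ideal FFT band-pass like this suffers from leakage and ringing on real videos, which is precisely the kind of noise and artifact problem a carefully designed temporal filter aims to reduce.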