Surgical image and video applications built on endoscopic datasets have been actively investigated as a basis for advanced surgical assistant systems. These applications are particularly important for understanding the surgical scene during a procedure. Specifically, segmentation techniques identify anatomical structures and surgical instruments, quality control methods help refine surgical technique, and action recognition helps discern surgical steps. Advances in deep neural networks and the availability of large training datasets have significantly improved performance across these downstream tasks. However, surgical action recognition remains comparatively underexplored. Existing methods struggle in real-world settings, mainly because they adapt poorly to a dynamic imaging environment. In this study, we present a framework for surgical action recognition in endoscopic datasets that leverages video masked autoencoders (VideoMAE), which have shown promise in video analysis with limited training data. Additionally, we incorporate a temporal data augmentation technique to represent diverse imaging conditions and to address the limitations of low-quality, single-source data. For our experiments, we use VideoMAE v2 pre-trained on the Unlabeled Hybrid dataset and fine-tune the model on the CholecT45 dataset for validation. Our results demonstrate the effectiveness of combining the VideoMAE architecture with focal loss, particularly for action recognition tasks in surgical scenarios.
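The abstract does not state which form of focal loss is used, so the following is only a minimal sketch of the standard per-label binary focal loss, under the assumption that action labels are treated as independent binary targets. The function name `binary_focal_loss` and the default values of `gamma` and `alpha` are illustrative assumptions, not settings taken from the study.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean focal loss over independent binary labels.

    probs   -- predicted probabilities in [0, 1], one per label
    targets -- ground-truth labels (0 or 1), same length as probs
    gamma   -- focusing parameter (assumed default; 0 disables focusing)
    alpha   -- weight for the positive class (assumed default)
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability assigned to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the contribution of well-classified labels.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# Hypothetical predictions for three action labels of one clip.
example_loss = binary_focal_loss([0.9, 0.2, 0.6], [1, 0, 1])
```

With `gamma=0` the expression reduces to an alpha-weighted binary cross-entropy; larger `gamma` values concentrate the loss on hard, misclassified labels, which is the usual motivation for focal loss when classes are imbalanced.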