Defects on steel surfaces vary widely in shape and training samples are scarce, which makes it difficult to transfer a high-performance pretrained vision language model to steel surface defect detection. This paper therefore proposes MLS-VLM, a Multi-level Supervised Vision Language Model based steel surface defect detection method. MLS-VLM extracts deep features from limited samples through three levels of training: supervised contrastive training on labeled areas, supervised contrastive training on the entire image, and self-supervised contrastive learning on region proposals. MLS-VLM can be rapidly transferred to two-stage object detectors. Experimental results show that, compared with traditional object detection methods, MLS-VLM improves mAP by 5.68 to 8.37 on three benchmark object detectors.
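The abstract does not give the loss used at each training level. As a rough illustration of what supervised contrastive training on labeled features can look like, the sketch below implements one common formulation of a supervised contrastive loss in plain Python. It is not taken from the paper: the function name, the `temperature` value, the toy feature vectors, and the defect class names are all assumptions made for this example.

```python
import math


def _normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Illustrative supervised contrastive loss (an assumed formulation, not the paper's).

    Each sample is an anchor; samples sharing its label are positives and all
    other samples form the softmax denominator. Anchors without any positive
    are skipped. Returns the mean loss over the remaining anchors.
    """
    z = [_normalize(f) for f in features]
    n = len(z)
    losses = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity between the anchor and every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature for j in others}
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability assigned to the positives.
        losses.append(-sum(sims[j] - log_denom for j in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy 2-D features standing in for region embeddings (hypothetical values).
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Labels that agree with the feature clusters (class names are illustrative).
matched = supervised_contrastive_loss(feats, ["scratch", "scratch", "inclusion", "inclusion"])
# Labels that cut across the clusters.
mismatched = supervised_contrastive_loss(feats, ["scratch", "inclusion", "scratch", "inclusion"])
```

In this toy setup the loss is lower when the labels agree with the feature clusters than when they cut across them, which is the signal such a loss uses to pull same-class features together and push different-class features apart.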