Paper
Binary function similarity detection based on text semantics
28 April 2023
Shuai Lu, Chao Mu, Guannan Zhang, Shanshan Liu, Xuewei Zhao
Proceedings Volume 12610, Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022); 126103N (2023) https://doi.org/10.1117/12.2671299
Event: Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022), 2022, Wuhan, China
Abstract
Binary code similarity refers to the fact that different binaries compiled from the same source code under different compiler configurations remain similar. Binary code similarity detection is commonly used to decide whether functions in two binaries are similar. The technique has critical applications in intellectual property protection and IoT security, including code plagiarism detection, malware detection, and vulnerability detection. In this paper, we propose SBFS, a binary function similarity detection model based on text semantics. SBFS first transforms binary functions into function texts by preprocessing their assembly instructions, then learns semantic embedding vectors from the function texts using a natural language processing model. Finally, the similarity between two functions is measured by calculating the cosine distance between their embedding vectors. Experimental results show that the SBFS model achieves cross-architecture detection and higher accuracy, reaching 98.2% on the binary function similarity detection task.
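The three stages described in the abstract (preprocess assembly into function text, embed the text, compare embeddings by cosine) can be illustrated with a minimal sketch. This is not the SBFS implementation: the normalization rules, the helper names (`normalize_instruction`, `function_to_text`, `embed`), and the token-count embedding are all assumptions chosen so the example runs with the Python standard library alone. A real system would replace `embed` with a trained natural language processing model, and this toy version does not handle cross-architecture input.

```python
import math
import re
from collections import Counter

# Hypothetical normalization rule: the abstract does not specify the
# preprocessing, so here every hex or decimal literal becomes "imm".
LITERAL = re.compile(r"\b(?:0x[0-9a-fA-F]+|\d+)\b")


def normalize_instruction(instruction: str) -> str:
    """Turn one assembly instruction into a single text token."""
    text = LITERAL.sub("imm", instruction.strip().lower())
    # Join mnemonic and operands so each instruction is one "word".
    return "_".join(text.replace(",", " ").split())


def function_to_text(instructions):
    """Convert a function (list of instructions) into a list of tokens."""
    return [normalize_instruction(i) for i in instructions if i.strip()]


def embed(tokens):
    """Stand-in embedding: sparse token counts, NOT a learned NLP model."""
    return Counter(tokens)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, taken here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


# Two variants of the same function that differ only in constants,
# plus an unrelated function.
func_a = ["mov eax, 0x10", "add eax, 4", "ret"]
func_b = ["mov eax, 0x20", "add eax, 8", "ret"]
func_c = ["push rbx", "call 0x400", "pop rbx"]

vec_a = embed(function_to_text(func_a))
vec_b = embed(function_to_text(func_b))
vec_c = embed(function_to_text(func_c))

print(cosine_similarity(vec_a, vec_b))  # close to 1: same normalized text
print(cosine_similarity(vec_a, vec_c))  # no shared tokens
```

Replacing literals with a placeholder is one common way to make function text robust to constants that vary between compilations; whether SBFS does exactly this is not stated in the abstract.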
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Shuai Lu, Chao Mu, Guannan Zhang, Shanshan Liu, and Xuewei Zhao "Binary function similarity detection based on text semantics", Proc. SPIE 12610, Third International Conference on Artificial Intelligence and Computer Engineering (ICAICE 2022), 126103N (28 April 2023); https://doi.org/10.1117/12.2671299
KEYWORDS
Binary data
Semantics
Feature extraction
Education and training
Data modeling
Transform theory
Neural networks
