A Bytecode-based Approach for Smart Contract Classification

Shi, Chaochen; Xiang, Yong; Doss, Robin Ram Mohan; Yu, Jiangshan; Sood, Keshav; Gao, Longxiang

arXiv:2106.15497 (cs)

[Submitted on 31 May 2021]

Abstract: With the development of blockchain technologies, the number of smart contracts deployed on blockchain platforms is growing exponentially, which makes it difficult for users to find desired services by manual screening. The automatic classification of smart contracts can provide blockchain users with keyword-based contract searching and help manage smart contracts effectively. Current research on smart contract classification focuses on Natural Language Processing (NLP) solutions that are based on contract source code. However, more than 94% of smart contracts are not open-source, so the application scenarios of NLP methods are very limited. Meanwhile, NLP models are vulnerable to adversarial attacks. To solve these problems, this paper proposes a classification model based on features from contract bytecode instead of source code. We also use feature selection and ensemble learning to optimize the model. Our experimental studies on over 3,300 real-world Ethereum smart contracts show that our model can classify smart contracts without source code and performs better than baseline models. Our model also has good resistance to adversarial attacks compared with NLP-based models. In addition, our analysis reveals that account features used in many smart contract classification models have little effect on classification and can be excluded.

Comments: 10 pages, 6 figures
Subjects: Information Retrieval (cs.IR); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as: arXiv:2106.15497 [cs.IR] (or arXiv:2106.15497v1 [cs.IR] for this version)
DOI: https://doi.org/10.48550/arXiv.2106.15497

Submission history

From: Chaochen Shi
[v1] Mon, 31 May 2021 03:00:29 UTC (10,739 KB)
