Abstract
Grammar-based fuzzing is effective at finding vulnerabilities in input-parsing programs, which take as input complex data conforming to a certain grammar. Traditional grammar-based fuzzing techniques require a manually written grammar to generate valid test inputs. However, writing an input grammar by hand has two major drawbacks: (1) it is costly and error-prone, and (2) it offers no means of generating interesting inputs that induce high test coverage (and thereby expose many vulnerabilities). To address these problems, a state-of-the-art technique, Learn&Fuzz, automatically generates an input grammar via deep neural network-based statistical learning. Even Learn&Fuzz, however, has significant limitations; in particular, it cannot successfully generate (long) sequences of instructions (each consisting of an opcode plus zero or more operands), which contribute to high test coverage of instruction-interpreting code. In this paper, we focus on and quantify the limitations of current learning-assisted grammar-based fuzzing, i.e., how ineffective it is at generating instruction sequences that trigger high test coverage. Through experiments using a re-implementation of Learn&Fuzz and real instruction-interpreting code, we measure the test coverage of the target code when it is tested by Learn&Fuzz. Our experimental results show that the coverage is surprisingly low, and our analysis of the results opens up new research directions for enhancing learning-assisted grammar-based fuzzing.
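As a rough illustration of the idea of generating instruction sequences (an opcode plus zero or more operands) from a grammar, the following minimal Python sketch may help. It is not the paper's method or Learn&Fuzz itself: the toy grammar `GRAMMAR`, the functions `generate_sequence` and `opcode_coverage`, and the use of "fraction of opcodes exercised" as a stand-in for test coverage are all assumptions made for illustration only.

```python
import random

# Hypothetical toy grammar (an assumption, not from the paper):
# each opcode maps to the number of integer operands it takes.
GRAMMAR = {"push": 1, "pop": 0, "add": 0, "jmp": 1, "nop": 0}


def generate_sequence(rng, length):
    """Derive `length` instructions, each an opcode plus its operands."""
    seq = []
    for _ in range(length):
        opcode = rng.choice(sorted(GRAMMAR))
        operands = [rng.randint(0, 9) for _ in range(GRAMMAR[opcode])]
        seq.append((opcode, operands))
    return seq


def opcode_coverage(sequences):
    """Fraction of grammar opcodes that appear in at least one sequence.

    This is only a crude proxy for the code coverage of an interpreter
    that has one handler per opcode.
    """
    seen = {op for seq in sequences for op, _ in seq}
    return len(seen) / len(GRAMMAR)


rng = random.Random(0)  # fixed seed so runs are repeatable
inputs = [generate_sequence(rng, 4) for _ in range(10)]
print(opcode_coverage(inputs))
```

A real measurement of the kind described in the abstract would instrument the target interpreter and record which of its code was executed, rather than counting opcodes in the generated inputs.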
References
Chen, J., Diao, W., Zhao, Q., Zuo, C., Lin, Z., Wang, X.F., Lau, W.C., Sun, M., Yang, R., Zhang, K.: IoTFuzzer: discovering memory corruptions in IoT through app-based fuzzing. In: NDSS (2018)
Cummins, C., Petoumenos, P., Wang, Z., Leather, H.: End-to-end deep learning of optimization heuristics. In: PACT 2017, pp. 219–232 (2017)
Godefroid, P., Peleg, H., Singh, R.: Learn & fuzz: machine learning for input fuzzing. In: ASE 2017, pp. 50–59 (2017)
Godefroid, P., Levin, M.Y., Molnar, D.: Automated whitebox fuzz testing. In: NDSS 2008, pp. 151–166 (2008)
Google. https://www.google.com/chrome/
Google. https://www.pdfium.org/
Graves, A.: Generating sequences with recurrent neural networks. CoRR 2013, abs/1308.0850 (2013)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Li, J., Zhao, B., Zhang, C.: Fuzzing: a survey. Cybersecurity 1(1), 6 (2018)
LLVM. https://clang.llvm.org/
LLVM. https://llvm.org/
Purdom, P.: A sentence generator for testing parsers. BIT Numer. Math. 12(3), 366–375 (1972)
Sutton, M., Greene, A., Amini, P.: Fuzzing: Brute Force Vulnerability Discovery
Burget, L., Cernocky, J., Mikolov, T., Karafiat, M., Khudanpur, S.: Recurrent neural network based language model. In: INTERSPEECH 2008, pp. 1045–1048 (2008)
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jitsunari, Y., Arahori, Y., Gondow, K. (2020). Quantifying the Limitations of Learning-Assisted Grammar-Based Fuzzing. In: Barolli, L., Takizawa, M., Xhafa, F., Enokido, T. (eds) Advanced Information Networking and Applications. AINA 2019. Advances in Intelligent Systems and Computing, vol 926. Springer, Cham. https://doi.org/10.1007/978-3-030-15032-7_40
DOI: https://doi.org/10.1007/978-3-030-15032-7_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15031-0
Online ISBN: 978-3-030-15032-7
eBook Packages: Intelligent Technologies and Robotics (R0)