A FPGA-Oriented Quantization Scheme for MobileNet-SSD

  • Yuxuan Xie
  • Bing Liu
  • Lei Feng
  • Xipeng Li
  • Danyin Zou
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 157)


The rising popularity of mobile devices that are expected to perform object detection well calls for methods to run detection algorithms efficiently on them. Deep learning achieves state-of-the-art results, but it demands substantial computation and memory, while mobile devices are resource-limited because of their small size. FPGAs are well known for their parallelism, and many researchers have recently tried to implement deep learning networks on them. After investigation, we chose to implement MobileNet-SSD on an FPGA because this network is designed for mobile devices and its size and computational cost are relatively small. Implementing the network on an FPGA still poses challenges, such as its large resource demand and the need for low latency, both of which matter for mobile devices. In this paper, we present a quantization scheme for object detection networks on FPGAs, together with a process that simulates the FPGA on a PC to help predict how networks will perform on the FPGA. We also propose an integer-only inference scheme for FPGAs, which greatly reduces resource cost. We adopt the Dynamic Fixed Point method and improve it for object detection networks in order to quantize MobileNet-SSD, an object detection network well suited to embedded systems. With these improvements, the quantized network performs better than it does with Ristretto.
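
The abstract does not spell out the quantization arithmetic, so the following is only a minimal Python sketch of the general Dynamic Fixed Point idea, not the scheme proposed in the paper. It assumes 8-bit signed codes and one fractional length per tensor, chosen from the largest magnitude in that tensor. The function names (`choose_frac_bits`, `quantize`, `dequantize`, `int_dot`) and the sample values are hypothetical.

```python
import math

def choose_frac_bits(values, bit_width=8):
    """Pick a fractional length so the largest magnitude fits in bit_width bits."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bit_width - 1
    # Bits for the integer part of the magnitude, plus one sign bit (assumed rule).
    int_bits = math.floor(math.log2(max_abs)) + 2
    return bit_width - int_bits

def quantize(values, frac_bits, bit_width=8):
    """Scale by 2**frac_bits, round, and saturate to the signed integer range."""
    lo, hi = -(1 << (bit_width - 1)), (1 << (bit_width - 1)) - 1
    return [min(hi, max(lo, round(v * 2 ** frac_bits))) for v in values]

def dequantize(codes, frac_bits):
    """Map integer codes back to real values."""
    return [c * 2.0 ** -frac_bits for c in codes]

def int_dot(w_codes, x_codes):
    """Integer-only multiply-accumulate; the result carries fw + fx fractional bits."""
    return sum(w * x for w, x in zip(w_codes, x_codes))

# Hypothetical weights and activations, each with its own fractional length.
weights = [0.5, -0.25, 0.125]
acts = [1.5, 2.0, -3.0]
fw = choose_frac_bits(weights)   # 7
fx = choose_frac_bits(acts)      # 5
acc = int_dot(quantize(weights, fw), quantize(acts, fx))
print(acc, acc * 2.0 ** -(fw + fx))  # -512 -0.125
```

Because each scale factor is a power of two, rescaling the accumulator to another fractional length reduces to a bit shift, which is what makes this representation attractive for integer-only hardware.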


Keywords: Quantization, FPGA, MobileNet-SSD


  1. Lee, A.: Comparing Deep Neural Networks and Traditional Vision Algorithms in Mobile Robotics. Swarthmore University (2015)
  2. Chen, X., Peng, X., Li, J.-B., Peng, Yu.: Overview of deep kernel learning based techniques and applications. J. Netw. Intell. 1(3), 83–98 (2016)
  3. Howard, A.G., Zhu, M., Chen, B., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications (2017). arXiv:1704.04861
  4. Iandola, F.N., Han, S., Moskewicz, M.W., et al.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size (2016). arXiv:1602.07360
  5. Yin, P., Zhang, S., Xin, J., et al.: Training ternary neural networks with exact proximal operator (2016). arXiv:1612.06052
  6. Rastegari, M., Ordonez, V., Redmon, J., et al.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: European Conference on Computer Vision, pp. 525–542. Springer, Cham (2016)
  7. Chen, Y., Du, Z., Sun, N., Wang, J.: DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. In: ASPLOS, vol. 49, no. 4, pp. 269–284. ACM (2014)
  8. Kuang, F.-J., Zhang, S.-Y.: A novel network intrusion detection based on support vector machine and tent chaos artificial bee colony algorithm. J. Netw. Intell. 2(2), 195–204 (2017)
  9. Fan, C., Ding, Q.: ARM-embedded implementation of H.264 selective encryption based on chaotic stream cipher. J. Netw. Intell. 3(1), 9–15 (2018)
  10. Gysel, P.: Ristretto: hardware-oriented approximation of convolutional neural networks (2016). arXiv:1605.06402
  11. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift (2015). arXiv:1502.03167
  12. Liu, B., Zou, D., Feng, L., Feng, S., Fu, P., Li, J.: An FPGA-based CNN accelerator integrating depthwise separable convolution. Electronics 8, 281 (2019)

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Harbin Institute of Technology, Harbin, China
