Indoor Place Recognition and Localization Using Histogram of Oriented Gradient with Deep Learning
DOI:
https://doi.org/10.21271/ZJPAS.32.1.3
Keywords:
Place recognition, Localization, CNN, AlexNet, SIFT, HOG
Abstract
Indoor place recognition is a crucial and challenging problem in computer science, widely used in robotics and computer vision. The challenge comes from the fact that recognizing localized places such as offices and corridors is affected by environmental conditions such as weather and illumination. In this paper, an indoor place recognition and localization system is proposed. The system exploits the recognition capabilities of a Convolutional Neural Network (CNN) and AlexNet, trained on feature images. The feature images are constructed using the Histogram of Oriented Gradients (HOG). The main contribution of this work is the use of a 2D feature image constructed from HOG, instead of the scene image, as the input to the CNN. Compared with previous systems, the proposed system achieved better recognition accuracy when tested on the COLD and IDOL standard indoor image datasets.
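As a rough illustration of what a 2D HOG feature image could look like, the following Python sketch computes per-cell orientation histograms with NumPy and lays them out as a 2D array. The abstract does not state the HOG parameters or how the feature image is arranged, so the function name `hog_feature_image`, the 8-pixel cell size, the 9 orientation bins, the per-cell L2 normalization, and the side-by-side layout are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def hog_feature_image(gray, cell=8, bins=9):
    """Build a 2D array of per-cell gradient-orientation histograms.

    Hypothetical helper: the cell size, bin count, and layout are assumed,
    and the overlapping block normalization of the original HOG descriptor
    is replaced by a simpler per-cell L2 normalization.
    """
    gray = np.asarray(gray, dtype=np.float64)

    # Image gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)

    # Unsigned orientation in degrees, folded into [0, 180).
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_width = 180.0 / bins
    bin_index = np.minimum((angle / bin_width).astype(int), bins - 1)

    n_y, n_x = gray.shape[0] // cell, gray.shape[1] // cell
    hist = np.zeros((n_y, n_x, bins))
    for i in range(n_y):
        for j in range(n_x):
            rows = slice(i * cell, (i + 1) * cell)
            cols = slice(j * cell, (j + 1) * cell)
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[i, j] = np.bincount(
                bin_index[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=bins,
            )

    # L2-normalize each cell histogram; epsilon avoids division by zero.
    hist /= np.linalg.norm(hist, axis=2, keepdims=True) + 1e-6

    # Place the bins of each cell side by side to obtain a 2D feature image.
    return hist.reshape(n_y, n_x * bins)


# Usage: a synthetic 64x64 horizontal intensity ramp stands in for a scene image.
image = np.tile(np.arange(64.0), (64, 1))
features = hog_feature_image(image)
print(features.shape)
```

An array like this could then be resized to the input size a CNN expects and used in place of the raw scene image, which is the idea the abstract describes; the resizing and training steps are left out here because the abstract gives no details about them.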
License
Copyright (c) 2020 Zina Khaleel Jalal, Moayad Y. Potrus, Abbas M. Ali
This work is licensed under a Creative Commons Attribution 4.0 International License.