Document Type: Full Research Paper


1 Ph.D. Student, Electronic Department, Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran

2 Assistant Professor, Electronic Department, Faculty of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Babol, Iran

3 Professor, Computer Department, Faculty of Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran



Object recognition is one of the main cognitive abilities of humans and animals. The human visual system, being fast and accurate, can serve as a source of inspiration for computational models of object recognition. Studies of the human visual system have emphasized that its processing unfolds over time, yet conventional computational models of object recognition do not take this into account. In this paper, we present a time-based multilevel model for object recognition. In the first layer of the model, the input image information is passed to the next layer in a temporal representation. In the middle layer, a deep neural network is used as a feature extractor. Finally, in contrast to popular computational models of object recognition, the last layer uses a decision-making model, such as the drift-diffusion model, based on the neuronal decision-making mechanisms of the brain. In other words, correspondence with the human visual system is considered in all three layers. Several experiments were conducted to evaluate the performance of the proposed model in object recognition. The experimental results show that as the input image becomes more complicated, noise increases, or occlusion occurs, the performance of the model decreases and its reaction time increases, which is consistent with the behavior of the human visual system. The performance of the model in object recognition and base-level categorization is also investigated for both original and inverted images. The results show a difference between the processes of object recognition and base-level categorization, which is consistent with the behavior of the human visual system reported in the literature.
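The decision stage described above can be illustrated with a minimal sketch of a two-choice drift-diffusion process. This is not the authors' implementation: the function names, the parameter values, and the idea of representing a harder image (more clutter, noise, or occlusion) as a lower drift rate are all assumptions made here for illustration only.

```python
import random


def simulate_ddm(drift, noise_sd=1.0, threshold=1.0, dt=0.001,
                 max_time=5.0, rng=None):
    """Simulate one trial of a two-choice drift-diffusion process.

    Evidence starts at 0 and accumulates in steps of size dt until it
    crosses +threshold or -threshold. Returns (choice, reaction_time),
    where choice is +1 (upper bound), -1 (lower bound), or 0 if no
    bound was reached within max_time. All defaults are hypothetical.
    """
    rng = rng or random.Random()
    evidence = 0.0
    elapsed = 0.0
    step_sd = noise_sd * dt ** 0.5  # diffusion noise scales with sqrt(dt)
    while elapsed < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
        if evidence >= threshold:
            return 1, elapsed
        if evidence <= -threshold:
            return -1, elapsed
    return 0, max_time


def summarize(drift, n_trials=300, seed=0):
    """Return (accuracy, mean reaction time) over n_trials simulated trials.

    With a positive drift, the upper bound is treated as the correct answer.
    """
    rng = random.Random(seed)
    trials = [simulate_ddm(drift, rng=rng) for _ in range(n_trials)]
    accuracy = sum(1 for choice, _ in trials if choice == 1) / n_trials
    mean_rt = sum(rt for _, rt in trials) / n_trials
    return accuracy, mean_rt


if __name__ == "__main__":
    # Assumption: a clear image yields strong evidence (high drift rate),
    # a degraded image yields weak evidence (low drift rate).
    for label, drift in [("clear image", 3.0), ("degraded image", 0.5)]:
        accuracy, mean_rt = summarize(drift)
        print(f"{label}: accuracy={accuracy:.2f}, mean RT={mean_rt:.2f} s")
```

Under these assumptions, lowering the drift rate produces both lower accuracy and longer reaction times, which is the qualitative pattern the abstract reports for complicated, noisy, or occluded images.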

