Feature map upscaling to improve scale invariance in convolutional neural networks

Research output: A Conference proceeding or a Chapter in Book; Conference contribution; peer-reviewed

Abstract

Efforts by computer scientists to model the visual system have produced various techniques, of which the most notable is the Convolutional Neural Network (CNN). Whilst recognising an object at various scales is a trivial task for the human visual system, it remains a challenge for CNNs to achieve the same behaviour. Recent physiological studies reveal that the visual system uses a global-first response strategy in its recognition function; that is, it processes a wider area of a scene when recognising objects. This theory provides the potential for using global features to solve transformation invariance problems in CNNs. In this paper, we use this theory to propose a global-first feature extraction model called Stacked Filter CNN (SFCNN) to improve scale-invariant classification of images. In SFCNN, to extract features from spatially larger areas of the target image, we develop a trainable feature extraction layer called Stacked Filter Convolutions (SFC). We achieve this by creating a convolution layer with a pyramid of stacked filters of different sizes. When these filters are convolved with an input image, the outputs are feature maps of different scales, which are then upsampled and used as global features. Our results show that, by integrating the SFC layer within a CNN structure, the network outperforms a traditional CNN on classification of scaled color images. Experiments using benchmark datasets indicate the potential effectiveness of our model for improving scale invariance in CNNs.
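
The forward pass described in the abstract (filters of several sizes applied to the same image, with the resulting feature maps upsampled to a common size) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the filter sizes, padding, upsampling method, or how the maps are combined, so the sizes (3, 5, 7), valid convolution, nearest-neighbour upsampling, and fixed averaging filters (standing in for trained weights) are all hypothetical choices, as are the function names.

```python
def conv2d_valid(image, kernel):
    """Valid (no padding) 2-D cross-correlation of a square kernel over an image."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def upsample_nearest(fmap, out_h, out_w):
    """Nearest-neighbour upsampling (an assumed choice of upsampling method)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def stacked_filter_forward(image, kernels):
    """Apply every filter in the stack, then upsample all maps to one size.

    The common size is that of the largest map, i.e. the one produced by
    the smallest filter (an assumption made for this sketch).
    """
    smallest = min(len(k) for k in kernels)
    out_h = len(image) - smallest + 1
    out_w = len(image[0]) - smallest + 1
    return [
        upsample_nearest(conv2d_valid(image, k), out_h, out_w)
        for k in kernels
    ]


def mean_kernel(size):
    """Fixed averaging filter; a real layer would learn these weights."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


# Hypothetical 8x8 single-channel input and a stack of 3x3, 5x5 and 7x7 filters.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
maps = stacked_filter_forward(image, [mean_kernel(s) for s in (3, 5, 7)])
shapes = [(len(m), len(m[0])) for m in maps]
```

In this sketch the larger filters see a wider area of the image and yield smaller maps, which are stretched back to the size of the 3x3 map so that all three can be stacked as channels for later layers.
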
Original language: English
Title of host publication: VISIGRAPP 2021 - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Editors: Giovanni Maria Farinella, Petia Radeva, Jose Braz, Kadi Bouatouch
Place of Publication: Portugal
Publisher: Scitepress
Pages: 113-122
Number of pages: 10
Volume: 5
ISBN (Print): 9789897584886
Publication status: Published - Feb 2021
Event: 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2021 - Virtual, Online
Duration: 8 Feb 2021 to 10 Feb 2021

Publication series

Name: VISIGRAPP 2021 - Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications
Publisher: Scitepress
Volume: 5
ISSN (Print): 2184-4321

Conference

Conference: 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, VISIGRAPP 2021
City: Virtual, Online
Period: 8/02/21 to 10/02/21
