Cross-domain heterogeneous residual network for single image super-resolution
Introduction
Over the past decades, digital cameras have made great progress in image resolution and acquisition speed. They are widely used in many areas and are built into mobile devices such as smartphones and tablets. In practical applications, however, often only low-resolution (LR) images are captured because of limitations of the imaging environment and of the cameras themselves, whereas high-resolution (HR) images are usually needed to show clear details for subsequent image analysis and understanding (Bangyal et al., 2021, Pervaiz et al., 2021, Xiao et al., 2018, Zhang et al., 2015). As a common software solution, image super-resolution (ISR) reconstructs an HR image from an LR observation without changing the front-end image acquisition process. Although ISR technologies have made significant progress in the past decades (Wang et al., 2021, Zhang et al., 2020, Zhang, Shi, et al., 2019, Zhang, Yap, et al., 2019, Zhang, Yap, Chen, et al., 2019), there is still a pressing need for fast, high-performance and lightweight ISR algorithms in practical applications.
Residual learning (Ahn et al., 2018, Bi et al., 2020, He et al., 2016, Lan, Sun, Liu, Lu, Pang, et al., 2021, Lan, Sun, Liu, Lu, Su, et al., 2021, Rad et al., 2019, Wang et al., 2017) has achieved excellent super-resolution performance, but it does not consider the heterogeneous attributes of residuals between different feature layers. In previous work, Zhang et al. (Qu et al., 2020, Zhang, Cheng, et al., 2018, Zhang, Yap, Chen, et al., 2019) pointed out that frequency-domain features, which are complementary to spatial-domain features, help improve the performance of image synthesis networks. Inspired by these works, in this paper we further explore the complementarity between the spatial and frequency domains and use it to develop a high-performance cross-domain heterogeneous residual network (HRN) for ISR. We implement HRN and evaluate it against baseline methods on several public datasets. We show that the proposed method, with a lightweight model, can recover high-quality super-resolved images and often outperforms the competing methods both qualitatively and quantitatively.
Our main contributions are summarized as follows:
- We propose a cross-domain heterogeneous residual learning framework to learn the relationship between LR and HR spaces;
- We introduce frequency-domain features as complementary information to assist deep learning;
- We design wide-activated residual attention blocks, wide-activated residual-in-residual dense blocks and dual-domain enhancement modules to develop hierarchical residual learning;
- We carry out extensive experiments to corroborate the effectiveness of the proposed method for fast and accurate ISR.
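To make the idea of frequency-domain features complementing spatial-domain ones more concrete, the sketch below converts a small grayscale image into four sub-bands and back. It is only an illustration under stated assumptions: this preview does not say which transform HRN uses, so a single-level 2D Haar wavelet transform is assumed here, and `haar2d` and `ihaar2d` are hypothetical helper names written in plain Python rather than taken from any library.

```python
def haar2d(img):
    """Single-level 2D Haar transform of a grayscale image (list of rows).

    Assumption for illustration only: height and width are even.
    Returns four half-size sub-bands (ll, lh, hl, hh).
    """
    h, w = len(img), len(img[0])
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2.0)  # local average (low frequency)
            r_lh.append((a - b + c - d) / 2.0)  # difference between columns
            r_hl.append((a + b - c - d) / 2.0)  # difference between rows
            r_hh.append((a - b - c + d) / 2.0)  # diagonal detail
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def ihaar2d(ll, lh, hl, hh):
    """Invert haar2d, rebuilding the spatial-domain image."""
    out = []
    for r_ll, r_lh, r_hl, r_hh in zip(ll, lh, hl, hh):
        top, bottom = [], []
        for s, x, y, z in zip(r_ll, r_lh, r_hl, r_hh):
            top += [(s + x + y + z) / 2.0, (s - x + y - z) / 2.0]
            bottom += [(s + x - y - z) / 2.0, (s - x - y + z) / 2.0]
        out += [top, bottom]
    return out


if __name__ == "__main__":
    image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    bands = haar2d(image)
    print(bands[0])          # low-frequency sub-band
    print(ihaar2d(*bands))   # reconstruction of the input
```

The transform is invertible, so the sub-bands carry the same information as the pixels but separate smooth content from edge-like detail, which is one way a network could be given complementary inputs.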
The rest of this paper is organized as follows. Section 2 gives an overview of related work. Section 3 elaborates the proposed method. Section 4 presents extensive experiments and analysis. Section 5 provides the discussion. Section 6 draws the conclusions.
Related work
Deep learning has produced impressive results in image processing and computer vision over the past decade. A number of deep learning-based ISR methods (Bi et al., 2020, Wang et al., 2021, Wang et al., 2017) have been proposed that increase network depth or share model parameters. Many of them involve residual learning, dense connections or attention mechanisms.
Method
In this section, we describe a novel HRN framework that learns a mapping function to recover detailed structures and construct a super-resolved image from a degraded input image. To improve the network performance and accelerate training convergence, we design the proposed model using three levels of residual learning in a hierarchical manner. We describe domain conversion, network structure and other details.
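The three levels of residual learning can be pictured as nested skip connections. The following is a minimal sketch of that nesting only, not the HRN architecture: `residual`, `chain` and `build_hierarchy` are hypothetical helper names, feature maps are simplified to flat lists of floats, and a single placeholder `transform` stands in for every learned block.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(x: Sequence[float], y: Sequence[float]) -> Vector:
    """Element-wise sum of two equally sized feature vectors."""
    return [a + b for a, b in zip(x, y)]


def residual(body: Layer) -> Layer:
    """Wrap `body` so that the resulting layer computes x + body(x)."""
    def wrapped(x: Vector) -> Vector:
        return add(x, body(x))
    return wrapped


def chain(layers: Sequence[Layer]) -> Layer:
    """Apply the given layers one after another."""
    def run(x: Vector) -> Vector:
        for layer in layers:
            x = layer(x)
        return x
    return run


def build_hierarchy(transform: Layer, blocks_per_group: int, groups: int) -> Layer:
    """Nest three levels of skip connections around a placeholder transform.

    Assumption: the same `transform` is reused everywhere to keep the sketch
    short; a real network would learn separate weights per block.
    """
    inner = residual(transform)                           # level 1: skip around one block
    middle = residual(chain([inner] * blocks_per_group))  # level 2: skip around a group of blocks
    outer = residual(chain([middle] * groups))            # level 3: skip around all groups
    return outer


if __name__ == "__main__":
    halve = lambda x: [0.5 * v for v in x]  # stand-in for a learned mapping
    net = build_hierarchy(halve, blocks_per_group=2, groups=1)
    print(net([4.0]))
```

Because every level adds its input back to its output, each learned branch only has to model a correction to the signal it receives, which is the usual motivation for residual designs.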
Experimental data
We chose the widely used dataset ‘DIV2K’ (Agustsson & Timofte, 2017) for model training in an end-to-end manner. Four other datasets (Arbelaez et al., 2011, Bevilacqua et al., 2012, Huang et al., 2015, Zeyde et al., 2010) were used for testing. The average pixel value of images in the DIV2K dataset was subtracted from all images in data preprocessing. To guarantee good generalization performance, we adopted random 90-degree rotation, vertical flipping and horizontal flipping for data
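The preprocessing steps named above (mean subtraction, random 90-degree rotation, vertical and horizontal flipping) can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: images are simplified to single-channel lists of rows, the helper names are hypothetical, the mean is passed in by the caller, and the 0.5 probability for each transform is an assumption that the text does not state.

```python
import random


def subtract_mean(img, mean):
    """Subtract a dataset-level mean pixel value from every pixel."""
    return [[p - mean for p in row] for row in img]


def rot90(img):
    """Rotate an image by 90 degrees counter-clockwise."""
    return [list(col) for col in zip(*img)][::-1]


def hflip(img):
    """Mirror an image left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror an image top to bottom."""
    return [list(row) for row in img[::-1]]


def augment_pair(lr, hr, rng):
    """Apply the same random rotation and flips to an LR patch and its HR target.

    Assumption: each transform is applied independently with probability 0.5.
    Using one random draw per transform keeps the LR and HR patches aligned.
    """
    if rng.random() < 0.5:
        lr, hr = rot90(lr), rot90(hr)
    if rng.random() < 0.5:
        lr, hr = hflip(lr), hflip(hr)
    if rng.random() < 0.5:
        lr, hr = vflip(lr), vflip(hr)
    return lr, hr


if __name__ == "__main__":
    lr_patch = [[1, 2], [3, 4]]
    hr_patch = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
    aug_lr, aug_hr = augment_pair(lr_patch, hr_patch, random.Random(0))
    print(aug_lr)
    print(aug_hr)
```

The key design point is that the LR input and the HR target must receive identical geometric transforms, otherwise the training pair no longer corresponds pixel to pixel.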
Discussion
To estimate the desired HR images, we develop an accurate, fast and lightweight ISR network by introducing cross-domain heterogeneous residual learning and the dual-domain strategy, which distinguish our HRN from the existing methods. The existing ISR methods most similar to our HRN are EDSR (Lim et al., 2017), RDN (Zhang, Tian, et al., 2018), SAN (Dai et al., 2019), HAN (Niu et al., 2020), MADNet (Lan, Sun, Liu, Lu, Pang, et al., 2021), ERN (Lan, Sun, Liu, Lu, Su, et al., 2021) and MDCN (Li et
Conclusions
In this paper, we have presented a novel cross-domain heterogeneous residual network for fast and high-performance image super-resolution. This network is designed with three levels of residual learning in a hierarchical way. The wide nonlinear activation combined with coordinate attention is used in inner residual learning to correct feature mapping. The densely connected wide-activated residual blocks are introduced in middle residual learning to further improve feature mapping. The cascaded
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the Natural Science Basic Research Program of Shaanxi, China (Program No. 2019JM-103), the New Star of Youth Science and Technology of Shaanxi Province, China (Grant No. 2020KJXX-007), the Social Science Foundation of Shaanxi Province, China (Grant No. 2019H010), the Open Research Fund of CAS Key Laboratory of Spectral Imaging Technology, China (Grant No. LSIT201920W), and the National Natural Science Foundation of China (Grant No. 62173270).
References (45)
- et al. (2020). RADC-Net: A residual attention based convolution network for aerial scene classification. Neurocomputing.
- et al. (2020). Synthesized 7T MRI from 3T MRI via deep learning in spatial and wavelet domains. Medical Image Analysis.
- et al. (2018). Blind video denoising via texture-aware noise estimation. Computer Vision and Image Understanding.
- et al. (2019). Super-resolution reconstruction of neonatal brain magnetic resonance images via residual structured sparse representation. Medical Image Analysis.
- et al. (2019). Dual-domain convolutional neural networks for improving structural information in 3 T MRI. Magnetic Resonance Imaging.
- Agustsson, E., & Timofte, R. (2017). NTIRE 2017 challenge on single image super-resolution: Dataset and study. In IEEE...
- Ahn, N., Kang, B., & Sohn, K. (2018). Fast, accurate, and lightweight super-resolution with cascading residual network....
- et al. (2011). Contour detection and hierarchical image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- et al. (2021). Comparative analysis of low discrepancy sequence-based initialization approaches using population-based algorithms for solving the global optimization problems. Applied Sciences.
- Bevilacqua, M., Roumy, A., Guillemot, C., & Alberi-Morel, M.-L. (2012). Low-complexity single-image super-resolution...
- Attention mechanisms in computer vision: A survey.
- MADNet: A fast and lightweight network for single-image super resolution. IEEE Transactions on Cybernetics.
1 Both Li Ji and Qinghui Zhu contributed equally to this paper.