Learning Non-Local Spatial-Angular Correlation
for Light Field Image Super-Resolution

1National University of Defense Technology, 2Aviation University of Air Force

Abstract

Exploiting spatial-angular correlation is crucial to light field (LF) image super-resolution (SR), but is highly challenging because the disparities among LF images make this correlation non-local. Although many deep neural networks (DNNs) have been developed for LF image SR and have steadily improved performance, existing methods cannot fully exploit the long-range spatial-angular correlation and thus suffer a significant performance drop when handling scenes with large disparity variations.

In this paper, we propose a simple yet effective method to learn the non-local spatial-angular correlation for LF image SR. In our method, we adopt the epipolar plane image (EPI) representation to project the 4D spatial-angular correlation onto multiple 2D EPI planes, and then develop a Transformer network with repetitive self-attention operations to learn the spatial-angular correlation by modeling the dependencies between each pair of EPI pixels. Our method can fully incorporate the information from all angular views while achieving a global receptive field along the epipolar line.

We conduct extensive experiments with insightful visualizations to validate the effectiveness of our method. Comparative results on five public datasets show that our method not only achieves state-of-the-art SR performance but is also robust to disparity variations.


Method

Figure 3. Overview of EPIT.

In this paper, we propose a simple yet effective method to learn the non-local spatial-angular correlation for LF image SR. In our method, we re-organize 4D LFs as multiple 2D epipolar plane images (EPIs), so that the spatial-angular correlation appears as line patterns with different slopes. Then, we develop a Transformer-based network called EPIT to learn the spatial-angular correlation by modeling the dependencies between each pair of pixels on EPIs. Specifically, we design a basic Transformer block that alternately processes horizontal and vertical EPIs, and thus progressively incorporates the complementary information from all angular views. Compared to existing LF image SR methods, our method achieves a global receptive field along the epipolar line, and is therefore robust to disparity variations.
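The EPI re-organization and the pairwise attention described above can be illustrated with a minimal NumPy sketch. It is not the EPIT implementation: the array layout `(U, V, H, W, C)`, the function names, the use of one token per EPI pixel, and the identity query/key/value projections are all assumptions made for illustration, and a real network would use learned projections, multiple heads, and feed-forward layers.

```python
import numpy as np

def horizontal_epis(lf):
    """(U, V, H, W, C) -> (U*H, V*W, C): one token sequence per horizontal EPI."""
    U, V, H, W, C = lf.shape
    # Fix (u, h); each EPI spans the angular axis V and the spatial axis W.
    return lf.transpose(0, 2, 1, 3, 4).reshape(U * H, V * W, C)

def vertical_epis(lf):
    """(U, V, H, W, C) -> (V*W, U*H, C): one token sequence per vertical EPI."""
    U, V, H, W, C = lf.shape
    # Fix (v, w); each EPI spans the angular axis U and the spatial axis H.
    return lf.transpose(1, 3, 0, 2, 4).reshape(V * W, U * H, C)

def self_attention(tokens):
    """Single-head scaled dot-product attention.

    Assumption: identity projections, so queries, keys and values are the tokens.
    """
    d = tokens.shape[-1]
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ tokens, weights

def epi_block(lf):
    """Hypothetical block: attend on horizontal EPIs, then on vertical EPIs."""
    U, V, H, W, C = lf.shape
    hor, _ = self_attention(horizontal_epis(lf))
    # Undo the horizontal re-organization and add a residual connection.
    lf = lf + hor.reshape(U, H, V, W, C).transpose(0, 2, 1, 3, 4)
    ver, _ = self_attention(vertical_epis(lf))
    # Undo the vertical re-organization and add a residual connection.
    lf = lf + ver.reshape(V, W, U, H, C).transpose(2, 0, 3, 1, 4)
    return lf

rng = np.random.default_rng(0)
lf = rng.standard_normal((3, 3, 4, 5, 2))  # toy 3x3 views, 4x5 pixels, 2 channels
print(epi_block(lf).shape)
```

Because every pixel on an EPI attends to every other pixel on the same EPI, the receptive field in this sketch covers the whole epipolar line regardless of the line's slope. Running the horizontal pass before the vertical pass lets information from views in other rows and columns reach each pixel after one block.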


BasicLFSR Benchmark


Robustness to Large Disparity Variations

Figure 8. Performance comparison and local attribution maps (LAM) of different SR methods.

(1) Except for the single image SR method RCAN, all LF image SR methods suffer a performance drop as the absolute sheared value of LF images increases. This is because large sheared values cause more significant misalignment among LF images, which makes it harder to incorporate complementary information;

(2) As the absolute sheared value increases, the performance of existing LF image SR methods even falls below that of RCAN. A possible reason is that these methods do not make full use of local spatial information, but rather rely on local angular information from adjacent views. When the sheared value exceeds their receptive fields, the large disparities make the spatial-angular correlation non-local and thus hinder the incorporation of complementary information;

(3) Our EPIT is much more robust to disparity variations and achieves the highest PSNR scores under all sheared values. More quantitative comparisons on the whole datasets are provided in the supplemental material.
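To make the notion of a sheared value concrete, the following sketch shifts each view in proportion to its angular offset from the center view, which changes the disparity of every scene point by the same amount. The procedure actually used to produce the sheared LFs is not described here, so the function name `shear_light_field`, the `(U, V, H, W)` layout, the integer-only shear, and the cyclic wrap-around at the image borders are all assumptions made for illustration.

```python
import numpy as np

def shear_light_field(lf, shear):
    """Shift view (u, v) by shear * (u - uc) rows and shear * (v - vc) columns.

    Assumptions: lf has shape (U, V, H, W), shear is an integer, and pixels
    that leave the image wrap around (np.roll) instead of being cropped.
    """
    U, V, H, W = lf.shape
    uc, vc = U // 2, V // 2  # center view stays unchanged
    out = np.empty_like(lf)
    for u in range(U):
        for v in range(V):
            shift = (shear * (u - uc), shear * (v - vc))
            out[u, v] = np.roll(lf[u, v], shift=shift, axis=(0, 1))
    return out

# A point with zero disparity: same pixel in all 5x5 views.
lf = np.zeros((5, 5, 9, 9))
lf[:, :, 4, 4] = 1.0
sheared = shear_light_field(lf, 2)
# On the horizontal EPI through row 4 of the center view row, the point now
# moves 2 pixels per view, i.e. it traces a slanted line instead of a vertical one.
print([int(np.argmax(sheared[2, v, 4])) for v in range(5)])
```

In this toy case the point lands at columns 0, 2, 4, 6, 8 across the five views, so a method whose receptive field covers only a few pixels around the center column would no longer see the corresponding pixels in the outer views.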


Quantitative Comparison on Large Disparity Variations

Figure II. Quantitative comparison on five datasets with different shearing values for 2× SR.
Figure III. Quantitative comparison on five datasets with different shearing values for 4× SR.

Visual Results under Sheared Value=4

Related Links

The following works by our group on LF image processing may also be of interest.

OACCnet (CVPR 2022) proposes a fast cost constructor composed of a series of convolutions with specifically designed dilation rates to construct the matching cost for LF depth estimation.

LF-Distg (TPAMI 2022) proposes a generic mechanism to disentangle the spatial and angular information for LF image processing.

BibTeX

@InProceedings{Liang2023EPIT,
      author    = {Liang, Zhengyu and Wang, Yingqian and Wang, Longguang and Yang, Jungang and Zhou, Shilin and Guo, Yulan},
      title     = {Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution},
      booktitle = {ICCV},
      year      = {2023},
}