ProSR

A Fully Progressive Approach to Single-Image Super-Resolution

ProSR is a Single Image Super-Resolution (SISR) method built on the principle of multi-scale progressiveness. The architecture is an asymmetric pyramid with more layers in the upper levels, which enables high upsampling ratios while remaining efficient. The training procedure follows the curriculum learning paradigm by gradually increasing the difficulty of the task.

Installation

Follow the instructions below to get ProSR up and running on your machine, both for development and testing purposes.

ProSR is developed under Ubuntu 16.04 with CUDA 9.1, cuDNN v7.0 and pytorch-0.4.0. We tested the program on Nvidia Titan X and Tesla K40c GPUs. Parallel processing on multiple GPUs is supported during training.

Search Path

Run export PYTHONPATH=$PROJECT_ROOT/lib:$PYTHONPATH to add ProSR to the Python search path.

Getting the Data

We provide a script data/get_data.sh to download the pretrained models and datasets that we used in this project. This is a large download of approximately 10GB that might take a while to complete. Individual links to the models and datasets are available in the next sections.

Datasets

The results reported in the paper were obtained with models trained on DIV2K. Improved performance, at the expense of longer training time, can be obtained by adding Flickr2K to the training set. The pretrained models released in this repository have been trained on both DIV2K and Flickr2K.

We evaluated the performance of ProSR on the following benchmark datasets:

The above models perform well across the different upscaling ratios (2, 4 and 8). However, the best performance is achieved with scale-specific models. These models are available in the same folder and carry the suffix _xSCALE (e.g. proSR_x8.pth) to indicate the scale at which they perform best.

Results

Following the widespread protocol, the quantitative results are obtained by converting RGB images to YCbCr and evaluating PSNR and SSIM on the Y channel only.
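As an illustration of this protocol, the following is a minimal sketch of PSNR computed on the Y channel, assuming NumPy and the ITU-R BT.601 luma conversion commonly used in super-resolution benchmarks. The function names and the optional `border` crop are hypothetical and are not taken from this repository; the evaluation code shipped with ProSR may differ in details such as rounding or border handling.

```python
import numpy as np


def rgb_to_y(img):
    """Return the Y (luma) channel of an H x W x 3 uint8 RGB image.

    Uses ITU-R BT.601 coefficients, giving values in [16, 235].
    """
    img = img.astype(np.float64)
    return 16.0 + (65.481 * img[..., 0]
                   + 128.553 * img[..., 1]
                   + 24.966 * img[..., 2]) / 255.0


def psnr_y(sr, hr, border=0):
    """PSNR in dB between two RGB images, measured on the Y channel only.

    `border` (hypothetical parameter) crops that many pixels from each side
    before comparison.
    """
    y_sr, y_hr = rgb_to_y(sr), rgb_to_y(hr)
    if border > 0:
        y_sr = y_sr[border:-border, border:-border]
        y_hr = y_hr[border:-border, border:-border]
    mse = np.mean((y_sr - y_hr) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(255.0 ** 2 / mse)
```

SSIM would be evaluated on the same Y channel; it is omitted here to keep the sketch short.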

The results differ slightly from those reported in the paper for several reasons: this is an independent re-implementation; unlike in the paper, the models were trained on both DIV2K and Flickr2K; and the best-performing models were selected by validating on Set14 instead of DIV2K.

Training

MODEL is one of prosr or prosrs. The configuration is defined in prosr/config.py. Checkpoints and log files are stored in DIR. Alternatively, the --config flag reads configuration files in yaml format. In PROJECT_ROOT/options we provide config files corresponding to the architectures proposed in the paper.

Loading the dataset

Set the path to the dataset in configs.py:prosr_params.train.path{source,target}. To train on multiple datasets, create a new folder containing soft links to the datasets you want to use for training. For example: ensemble/{DIV2K_train_HR,Flickr2K}.
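The ensemble folder described above could be created as in the following sketch, which uses only the Python standard library. The helper name `build_ensemble` is hypothetical and not part of ProSR; it assumes a filesystem that supports symbolic links, as on Linux.

```python
from pathlib import Path


def build_ensemble(ensemble_dir, dataset_dirs):
    """Create `ensemble_dir` and fill it with soft links to each dataset folder.

    Existing links are left untouched, so the function can be re-run safely.
    Returns the list of link paths.
    """
    ensemble = Path(ensemble_dir)
    ensemble.mkdir(parents=True, exist_ok=True)
    links = []
    for dataset in dataset_dirs:
        source = Path(dataset).resolve()
        link = ensemble / source.name
        if not link.exists() and not link.is_symlink():
            link.symlink_to(source, target_is_directory=True)
        links.append(link)
    return links
```

For example, `build_ensemble("ensemble", ["DIV2K_train_HR", "Flickr2K"])` would produce the layout shown above, after which the target path in the configuration can point to the ensemble folder.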

train.path.source is optional. If left empty, the dataloader will downsample the target images found in train.path.target to the predefined lower resolution.
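The fallback behaviour can be pictured with the sketch below. It assumes NumPy and uses simple box averaging as a stand-in for whatever interpolation the actual dataloader applies, which is not specified here. The names `downsample_box` and `make_pair` are hypothetical.

```python
import numpy as np


def downsample_box(hr, scale):
    """Downsample an H x W x C uint8 array by an integer factor.

    Box averaging is used purely for illustration. The image is first
    cropped so that both sides are multiples of `scale`.
    """
    h = hr.shape[0] // scale * scale
    w = hr.shape[1] // scale * scale
    hr = hr[:h, :w].astype(np.float64)
    blocks = hr.reshape(h // scale, scale, w // scale, scale, -1)
    lr = blocks.mean(axis=(1, 3))
    return np.clip(np.rint(lr), 0, 255).astype(np.uint8)


def make_pair(target, source=None, scale=4):
    """Return a (source, target) training pair.

    If no low-resolution source is supplied, derive it from the target.
    """
    if source is None:
        source = downsample_box(target, scale)
    return source, target
```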

Resume Training

To resume training from a checkpoint, e.g. data/checkpoints/PRETRAINED_net_G.pth:

python train.py --checkpoint data/checkpoints/PRETRAINED

MultiGPU Training

By default, all available GPUs are used. To restrict training to specific GPUs, set CUDA_VISIBLE_DEVICES, e.g. export CUDA_VISIBLE_DEVICES=0,1
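The same restriction can be applied from within Python, as in this minimal sketch. The helper name `restrict_gpus` is hypothetical. The variable must be set before CUDA is initialized, so in practice before the first CUDA call made through PyTorch.

```python
import os


def restrict_gpus(gpu_ids):
    """Expose only the given GPU indices to CUDA for this process.

    Call this before any CUDA initialization, otherwise it has no effect.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value
```

Calling `restrict_gpus([0, 1])` at the top of a script is equivalent to the export command above.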

Visualization

To visualize intermediate results (optional), run visdom.server in a separate terminal and enable visualization by passing the command line arguments --visdom True --visdom-port PORT-NUMBER.

# Run the server in a separate terminal
python -m visdom.server -port 8067

Testing

LR_INPUT is the low-resolution input and can be a folder, a single image, or a list of images. If high-resolution images are also provided (HR_INPUT), the script computes the PSNR and SSIM of the results. Alternatively, if only high-resolution images are given, the script downscales HR_INPUT by the inverse of the upscale factor NUMBER and uses the result as LR_INPUT.
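The three input combinations can be summarized with the following sketch. It is not the logic of the test script itself: the function name, the returned keys, and the assumption that metrics are computed when only HR_INPUT is given are all hypothetical.

```python
def plan_evaluation(lr_input=None, hr_input=None, scale=4):
    """Describe how the inputs would be used for a given upscale factor.

    Returns a dict with the origin of the LR images, the resize factor
    applied to HR_INPUT (if any), and whether PSNR/SSIM can be computed.
    """
    if lr_input is None and hr_input is None:
        raise ValueError("provide LR_INPUT, HR_INPUT, or both")
    if lr_input is not None:
        # LR images are used as given; metrics need a reference HR image.
        return {
            "lr_source": "given",
            "resize_factor": None,
            "compute_metrics": hr_input is not None,
        }
    # Only HR images: derive the LR input by scaling with 1 / scale.
    return {
        "lr_source": "downscaled_hr",
        "resize_factor": 1.0 / scale,
        "compute_metrics": True,  # assumption: HR_INPUT serves as reference
    }
```

With `scale=8` and only HR images, the plan reports a resize factor of 0.125, matching the "inverse of the upscale factor" rule described above.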