Code for paper "Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context"

Abstract: Text removal has attracted increasing attention due to its various applications in privacy protection, document restoration, and text editing. It has shown significant progress with deep neural networks. However, most existing methods often generate inconsistent results for complex backgrounds. To address this issue, we propose a Contextual-guided Text Removal Network, termed CTRNet. CTRNet explores both low-level structure and high-level discriminative context features as prior knowledge to guide the process of background restoration. We further propose a Local-global Content Modeling (LGCM) block with CNNs and a Transformer-Encoder to capture local features and establish long-range relationships among pixels globally. Finally, we incorporate LGCM with context guidance for feature modeling and decoding. Experiments on the benchmark datasets SCUT-EnsText and SCUT-Syn show that CTRNet significantly outperforms the existing state-of-the-art methods. Furthermore, a qualitative experiment on examination papers also demonstrates the generalization ability of our method. The code and supplementary materials are publicly available.


This repository is the implementation of "Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context". See also the paper and the supplementary material.

The inference code is available.

We updated our retrained model weights on Jul 27. You can download them here.

For any questions, please email me. Thank you for your interest.


My environment is as follows:

  • Python 3.6.9
  • PyTorch 1.3 (versions above 1.3 also work)
  • Inplace_Abn
  • torchlight
  • Polygon
  • shapely
  • skimage
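
Before running inference, it can help to confirm that the packages listed above are importable. The snippet below is a minimal sketch, not part of this repository: the helper `find_missing` is hypothetical, and the import names (`inplace_abn`, `Polygon`, and so on) are assumptions inferred from the list, so adjust them if your installation uses different module names.

```python
import importlib.util

# Import names are assumptions inferred from the environment list,
# not taken from the repository itself.
REQUIRED_MODULES = ["torch", "inplace_abn", "torchlight",
                    "Polygon", "shapely", "skimage"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

`find_spec` only checks that a module can be located, so a package that is found but fails at import time (for example, a compiled extension built against another PyTorch version) would still need to be tested by importing it.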

Install torchlight

cd ./torchlight
python install


We use SCUT-EnsText and SCUT-Syn.

After downloading, run the following to generate the data lists:

mkdir datasets
python --path path_to_enstext_test_set --output ./datasets/enstext_test.flist
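
If you prefer to build a data list yourself, the sketch below shows one way to do it. It assumes that an `.flist` file is a plain text file with one image path per line, which is a common convention but is not confirmed by this README; the function name `write_flist` and the extension set are hypothetical and are not the repository's own script.

```python
import os

# Assumed set of image extensions; extend it if your data differs.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def write_flist(image_dir, output_path):
    """Write every image path under image_dir to output_path, one per line."""
    paths = []
    for root, _dirs, files in os.walk(image_dir):
        for name in files:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                paths.append(os.path.join(root, name))
    paths.sort()  # sorted order keeps the list reproducible

    out_dir = os.path.dirname(output_path)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    with open(output_path, "w") as f:
        for path in paths:
            f.write(path + "\n")
    return paths
```

For example, `write_flist("path_to_enstext_test_set", "./datasets/enstext_test.flist")` would create the `datasets` directory if needed and write the sorted image paths into it.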

All images are resized to 512 × 512. The structure images for the LCG block are generated with the official code of the RTV method. You can generate the data yourself, and we will also provide the test data here.


To generate the text removal results, the command is as follows:

        --bs 1 --gpus 1 --prefix CTRNet \
        --img_flist your/test/flist/ \
        --model your/model/weights --save_path ./results --save \

The PSNR is calculated with skimage.metrics.peak_signal_noise_ratio.
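
For reference, PSNR is defined as 10 · log10(R² / MSE), where R is the data range of the image and MSE is the mean squared error between the ground truth and the output. The sketch below writes this definition out in NumPy under the assumption of 8-bit images (R = 255). The helper `psnr` is hypothetical and is only meant to illustrate what the metric computes; the reported numbers come from `skimage.metrics.peak_signal_noise_ratio`.

```python
import numpy as np


def psnr(image_true, image_test, data_range=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    true = np.asarray(image_true, dtype=np.float64)
    test = np.asarray(image_test, dtype=np.float64)
    if true.shape != test.shape:
        raise ValueError("Input images must have the same shape.")

    mse = np.mean((true - test) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

Note that the value depends on the chosen data range, so results are only comparable when the same range (and the same image resolution) is used throughout.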


This repository benefits greatly from SPL and DETR. Thanks a lot for their excellent work.


If you find our method or dataset useful for your research, please cite:

  author     ={Liu, Chongyu and Jin, Lianwen and Liu, Yuliang and Luo, Canjie and Chen, Bangdong and Guo, Fengjun and Ding, Kai},
  journal    ={ECCV},
  title      ={Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context},
  year       ={2022},}


Suggestions and opinions on our work (both positive and negative) are welcome. Please contact the authors by sending an email to Chongyu Liu ([email protected]). For commercial usage, please contact Prof. Lianwen Jin ([email protected]).


Aug 16, 2022