Deep Image Matting (CVPR2017)

May 16, 2021

Introduction

The paper trains an encoder-decoder network and a refinement network that take an image and a trimap as input and output an alpha matte. It also introduces the Adobe Matting Dataset.

Dataset

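The alphamatting.com dataset is too small for deep learning, with only 27 training images and 8 testing images, so the authors propose the Adobe Matting Dataset.
It is built by compositing foregrounds from sources such as alphamatting.com and "Temporally coherent and spatially accurate video matting"
onto backgrounds from datasets such as MS COCO and Pascal VOC.
This yields 493 × 100 = 49,300 training images and 50 × 20 = 1,000 testing images, where 493 and 50 are the numbers of foreground images and 100 and 20 are the numbers of backgrounds composited with each foreground.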
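To make the construction concrete, here is a minimal NumPy sketch of how one foreground can be composited onto several backgrounds with the standard matting equation I = αF + (1 − α)B. The helper names (`composite`, `build_dataset`) and the way backgrounds are assigned to foregrounds are my own assumptions for illustration, not the authors' pipeline.

```python
import numpy as np

def composite(fg, bg, alpha):
    """Blend a foreground onto a background: I = alpha * F + (1 - alpha) * B."""
    a = alpha[..., None]  # (H, W) -> (H, W, 1) so it broadcasts over RGB
    return a * fg + (1.0 - a) * bg

def build_dataset(foregrounds, alphas, backgrounds, n_per_fg):
    """Pair every foreground with n_per_fg backgrounds (hypothetical helper)."""
    samples = []
    for i, (fg, alpha) in enumerate(zip(foregrounds, alphas)):
        for j in range(n_per_fg):
            # Assumed assignment rule: walk through the background pool in order.
            bg = backgrounds[(i * n_per_fg + j) % len(backgrounds)]
            samples.append((composite(fg, bg, alpha), alpha))
    return samples

# Toy data: 3 foregrounds, 2 backgrounds each, tiny 4x4 images.
rng = np.random.default_rng(0)
fgs = [rng.random((4, 4, 3)) for _ in range(3)]
alphas = [rng.random((4, 4)) for _ in range(3)]
bgs = [rng.random((4, 4, 3)) for _ in range(6)]
data = build_dataset(fgs, alphas, bgs, n_per_fg=2)
print(len(data))  # number of foregrounds * n_per_fg
```

With 493 foregrounds and `n_per_fg=100` the same loop would produce the 49,300 training composites, and 50 foregrounds with `n_per_fg=20` would give the 1,000 test composites.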

Method

Stage 1 trains only the encoder-decoder, and stage 2 trains only the refinement network. Once both have converged, the whole network is fine-tuned jointly.

In addition, since only the alpha values inside the unknown regions of trimaps need to be inferred, we therefore set additional weights on the two types of losses according to the pixel locations, which can help our network pay more attention on the important areas.
Specifically, w_i = 1 if pixel i is inside the unknown region of the trimap while w_i = 0 otherwise
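The weighting described above can be sketched in NumPy as follows. Several details here are assumptions on my part rather than something stated in this post: the square-root form of the per-pixel losses with a small epsilon, the equal 0.5/0.5 mix of the two losses, the normalisation by the number of unknown pixels, and the encoding of the unknown region as trimap value 128. The function names are made up for illustration.

```python
import numpy as np

EPS = 1e-6  # assumed small constant that keeps the square root differentiable

def unknown_weights(trimap):
    """w_i = 1 inside the unknown region (assumed to be encoded as 128), else 0."""
    return (trimap == 128).astype(np.float64)

def alpha_loss(alpha_pred, alpha_gt, w):
    """Alpha-prediction loss, averaged over unknown pixels only."""
    diff = np.sqrt((alpha_pred - alpha_gt) ** 2 + EPS ** 2)
    return (w * diff).sum() / (w.sum() + EPS)

def compositional_loss(alpha_pred, alpha_gt, fg, bg, w):
    """Loss between composites built from the predicted and ground-truth alpha."""
    a_p, a_g = alpha_pred[..., None], alpha_gt[..., None]
    c_pred = a_p * fg + (1.0 - a_p) * bg
    c_gt = a_g * fg + (1.0 - a_g) * bg
    diff = np.sqrt((c_pred - c_gt) ** 2 + EPS ** 2)
    n_channels = fg.shape[-1]
    return (w[..., None] * diff).sum() / (n_channels * w.sum() + EPS)

def total_loss(alpha_pred, alpha_gt, fg, bg, trimap, w_l=0.5):
    """Weighted sum of the two losses; w_l = 0.5 is an assumed default."""
    w = unknown_weights(trimap)
    return (w_l * alpha_loss(alpha_pred, alpha_gt, w)
            + (1.0 - w_l) * compositional_loss(alpha_pred, alpha_gt, fg, bg, w))

# Toy 2x2 example: only the two pixels marked 128 contribute to the loss.
trimap = np.array([[0, 128], [128, 255]])
alpha_gt = np.array([[0.0, 0.5], [0.3, 1.0]])
alpha_pred = np.array([[0.9, 0.5], [0.5, 0.0]])
fg = np.ones((2, 2, 3))
bg = np.zeros((2, 2, 3))
print(total_loss(alpha_pred, alpha_gt, fg, bg, trimap))
```

Note that the large errors at the two known pixels (0.9 vs 0.0 and 0.0 vs 1.0) do not affect the result, which is the point of the weighting.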

Augmentation

randomly crop 320×320 (image, trimap) pairs centered on pixels in the unknown regions.
Crop training pairs with different sizes (e.g. 480×480, 640×640) and resize them to 320×320
flipping
trimaps are randomly dilated from their ground truth alpha mattes (not used in stage 2)
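The augmentation steps listed above could look roughly like this in NumPy. This is a sketch under my own assumptions: the trimap uses 0 / 128 / 255 for background / unknown / foreground, the unknown band is obtained by dilating the fractional-alpha pixels with a square kernel of random radius, resizing is done by nearest-neighbour sampling for brevity, and the input is at least as large as the biggest crop. None of the function names come from the paper.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a (2r+1) x (2r+1) square, using NumPy only."""
    h, w = mask.shape
    padded = np.pad(mask, radius)  # pads with False
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def make_trimap(alpha, rng, max_radius=5):
    """Randomly dilate the fractional-alpha band to get the unknown region."""
    unknown = (alpha > 0) & (alpha < 1)
    unknown = dilate(unknown, int(rng.integers(1, max_radius + 1)))
    trimap = np.where(alpha >= 1.0, 255, 0).astype(np.uint8)
    trimap[unknown] = 128
    return trimap

def augment(image, alpha, rng, crop_sizes=(320, 480, 640), out_size=320):
    """Crop around a random unknown pixel, resize to out_size, maybe flip."""
    trimap = make_trimap(alpha, rng)
    crop = int(rng.choice(crop_sizes))
    h, w = alpha.shape
    ys, xs = np.nonzero(trimap == 128)  # assumes at least one unknown pixel
    k = int(rng.integers(len(ys)))
    # Centre the crop on the chosen pixel, clamped to stay inside the image.
    top = int(np.clip(ys[k] - crop // 2, 0, h - crop))
    left = int(np.clip(xs[k] - crop // 2, 0, w - crop))
    # Nearest-neighbour resize: pick out_size evenly spaced rows and columns.
    idx = np.arange(out_size) * crop // out_size
    sel = np.ix_(top + idx, left + idx)
    image, alpha, trimap = image[sel], alpha[sel], trimap[sel]
    if rng.random() < 0.5:  # random horizontal flip
        image, alpha, trimap = image[:, ::-1], alpha[:, ::-1], trimap[:, ::-1]
    return image, alpha, trimap

# Synthetic example: a soft-edged disk on a 640x640 canvas.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[:640, :640]
alpha = np.clip((200 - np.hypot(yy - 320, xx - 320)) / 20 + 0.5, 0.0, 1.0)
image = rng.random((640, 640, 3))
img_c, alpha_c, tri_c = augment(image, alpha, rng)
print(img_c.shape, alpha_c.shape, tri_c.shape)
```

A real pipeline would use a proper interpolating resize (bilinear for the image and alpha, nearest for the trimap); nearest-neighbour is used here only to keep the example dependency-free.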

Experiments

Composition-1k test set. Our composition-based dataset includes 1000 images and 50 unique foregrounds. This dataset has a wider range of object types and background scenes.

The authors also dilate the input trimap to test how robust the method is to the trimap.
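One way to set up such a robustness test is to widen the unknown band of a test trimap by increasing amounts and re-run the model on each version. The sketch below only shows the widening step; the 0 / 128 / 255 encoding and the square dilation kernel are assumptions of mine, and `widen_unknown` is a hypothetical helper name.

```python
import numpy as np

def widen_unknown(trimap, radius):
    """Grow the unknown band (value 128) by `radius` pixels in every direction."""
    unknown = trimap == 128
    h, w = unknown.shape
    padded = np.pad(unknown, radius)  # pads with False
    grown = np.zeros_like(unknown)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            grown |= padded[dy:dy + h, dx:dx + w]
    out = trimap.copy()
    out[grown] = 128  # known pixels swallowed by the band become unknown
    return out

# Toy trimap: background on the left, foreground on the right,
# and a one-pixel-wide unknown column in between.
trimap = np.zeros((9, 9), dtype=np.uint8)
trimap[:, 5:] = 255
trimap[:, 4] = 128

for r in (0, 1, 2):
    frac = (widen_unknown(trimap, r) == 128).mean()
    print(f"radius={r}: unknown fraction={frac:.3f}")
```

Each widened trimap gives the network less certain information, so comparing the error across radii indicates how much the result depends on a tight trimap.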

Some qualitative comparisons

User Study

Reference

[arxiv]



Written by Balin

NTUST CSIE
