Momentum Contrast for Unsupervised Visual Representation Learning (CVPR 2020)

Balin

Sep 5, 2021

Introduction

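This paper proposes Momentum Contrast (MoCo), a way to build a very large dictionary for contrastive learning. The dictionary is stored as a queue of samples: the current batch is enqueued, the oldest entries are dequeued, and contrastive learning is performed against the contents of the queue.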

Method

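Let the current sample be q (the query). It has one corresponding positive example (k_+) and a set of negative examples (k_i). The loss is L_q = -log( exp(q·k_+ / τ) / Σ_i exp(q·k_i / τ) ), where the sum runs over the positive and all negatives. Minimizing it pulls q as close as possible to k_+ and pushes it as far as possible from every k_i. τ is a temperature hyper-parameter, set to 0.07 in the experiments.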
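As a rough illustration of the loss described above, here is a minimal sketch in plain Python, kept dependency-free on purpose. The function names `dot` and `info_nce_loss` and the toy 2-D vectors are assumptions made for this example, and the vectors are assumed to be L2-normalized; a real implementation would use a tensor library and a batched cross-entropy.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def info_nce_loss(q, k_pos, k_negs, tau=0.07):
    """Contrastive loss for one query: -log softmax of the positive logit."""
    # Logit 0 is the positive pair; the rest are the negatives.
    logits = [dot(q, k_pos) / tau] + [dot(q, k) / tau for k in k_negs]
    # Log-sum-exp with max subtraction for numerical stability.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[0]

# Toy unit vectors (assumed already L2-normalized).
q = [1.0, 0.0]
k_negs = [[0.0, 1.0], [-1.0, 0.0]]

# Positive key identical to the query: loss is close to zero.
loss_aligned = info_nce_loss(q, [1.0, 0.0], k_negs)
# Positive key orthogonal to the query: it is no better than a negative.
loss_misaligned = info_nce_loss(q, [0.0, 1.0], k_negs)
```

With a small τ such as 0.07 the logits are scaled up sharply, so a well-aligned positive dominates the softmax and the loss drops quickly toward zero.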

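The key encoder is not updated by back-propagation. Instead it is a moving average of the query encoder, θ_k ← m·θ_k + (1 - m)·θ_q, the same kind of momentum update that BYOL uses. In practice m is set to 0.999, which works better than m = 0.9.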
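The moving-average update can be sketched in a few lines of plain Python. The helper name `momentum_update` and the two-element weight lists are hypothetical stand-ins for real encoder parameters; the query weights are held fixed here only to make the effect of m easy to see.

```python
def momentum_update(theta_k, theta_q, m=0.999):
    """Return updated key weights: m * theta_k + (1 - m) * theta_q."""
    return [m * pk + (1.0 - m) * pq for pk, pq in zip(theta_k, theta_q)]

theta_q = [1.0, -2.0]  # query-encoder weights (toy values, held fixed here)
theta_k = [0.0, 0.0]   # key-encoder weights, changed only by the rule above

slow = theta_k
fast = theta_k
for _ in range(100):
    slow = momentum_update(slow, theta_q, m=0.999)  # slowly evolving keys
    fast = momentum_update(fast, theta_q, m=0.9)    # tracks the query closely
```

After 100 steps the m = 0.999 copy has moved less than 10% of the way toward the query weights, while the m = 0.9 copy has almost caught up. The slow drift is what keeps keys that were encoded at different steps in the queue comparable with each other.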

The paper's pseudocode follows this order: each input produces a positive sample, the negative samples are taken from the queue, and the step ends with an enqueue and a dequeue.
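The queue part of that procedure might look like the sketch below. This is not the paper's pseudocode: `KeyQueue` is a hypothetical helper, strings stand in for encoded key vectors, and the capacity of 4 is chosen only to make the dequeue visible.

```python
from collections import deque

class KeyQueue:
    """Fixed-size FIFO dictionary of keys (hypothetical helper)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest items once it is full,
        # which gives enqueue and dequeue in a single operation.
        self.keys = deque(maxlen=capacity)

    def negatives(self):
        """Keys currently in the queue, oldest first."""
        return list(self.keys)

    def enqueue(self, batch_keys):
        """Add the current batch; the oldest keys fall out if needed."""
        self.keys.extend(batch_keys)

queue = KeyQueue(capacity=4)
queue.enqueue(["k0", "k1"])            # batch 1
queue.enqueue(["k2", "k3"])            # batch 2, queue is now full

# Training step for batch 3: read negatives first, then update the queue.
negatives_for_batch3 = queue.negatives()
queue.enqueue(["k4", "k5"])            # batch 1 is dequeued
after_batch3 = queue.negatives()
```

Reading the negatives before enqueueing means the current batch's own keys are not used as negatives for itself. The dictionary size is also decoupled from the batch size, which is what lets it be very large.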

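In addition, to address intra-batch communication (BN lets samples in the same batch leak information to each other), the paper adopts Shuffling BN: the sample order for the key encoder is shuffled before the batch is split across GPUs and restored afterwards, so that BN is applied to a query and its positive key on different GPUs.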
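The index bookkeeping behind Shuffling BN can be sketched as follows. This only simulates the shuffle, split, and unshuffle steps in plain Python; the helper names, the 4-GPU split, and the string placeholders for images are all assumptions, and no actual BN or multi-GPU communication is performed.

```python
import random

def shuffle_batch(batch, rng):
    """Return the batch in random order plus the permutation used."""
    perm = list(range(len(batch)))
    rng.shuffle(perm)
    return [batch[i] for i in perm], perm

def unshuffle_batch(shuffled, perm):
    """Undo shuffle_batch so outputs line up with the queries again."""
    restored = [None] * len(shuffled)
    for pos, src in enumerate(perm):
        restored[src] = shuffled[pos]
    return restored

def split_across_gpus(batch, num_gpus):
    """Equal contiguous chunks, one per (simulated) GPU."""
    per_gpu = len(batch) // num_gpus
    return [batch[i * per_gpu:(i + 1) * per_gpu] for i in range(num_gpus)]

rng = random.Random(0)
key_batch = ["img%d" % i for i in range(8)]

shuffled, perm = shuffle_batch(key_batch, rng)
gpu_chunks = split_across_gpus(shuffled, num_gpus=4)
# Each chunk would go through the key encoder here, with BN statistics
# computed per GPU. This sketch passes the samples through unchanged.
encoded = [x for chunk in gpu_chunks for x in chunk]
restored = unshuffle_batch(encoded, perm)
```

Only the key side is shuffled; the query batch keeps its original order, so a query and its positive key usually end up normalized with statistics from different subsets of the batch.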

Experiment

The results reported here are for linear evaluation. The paper also contains many transfer results on other datasets; see the paper if you are interested.

Reference

[arxiv]
