CoTracker: A Powerful Model for Tracking Any Point on a Video

Introduction

CoTracker, developed by Meta AI Research and the University of Oxford, is a model for tracking any point (pixel) in a video. It is transformer-based and brings some of the benefits of optical flow to point tracking.

Key features of CoTracker include:

Tracking any individual pixel in a video.
Tracking a quasi-dense set of pixels simultaneously.
Flexibility to select points manually or sample them on a grid in any video frame.
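
To make the two selection modes concrete, here is a minimal sketch in plain Python of what a set of query points could look like. It is not CoTracker's own sampling code: the helper name make_grid_queries, the cell-centre placement, and the 720 x 1280 frame size are assumptions for illustration, and the (t, x, y) row layout (frame index, then pixel coordinates) is assumed from how the project describes query points.

```python
def make_grid_queries(grid_size, height, width, frame_index=0):
    """Return grid_size * grid_size query rows of the form (t, x, y)."""
    queries = []
    for row in range(grid_size):
        for col in range(grid_size):
            # Place each point at the centre of its grid cell.
            x = (col + 0.5) * width / grid_size
            y = (row + 0.5) * height / grid_size
            queries.append((float(frame_index), x, y))
    return queries


# Manual selection: list the points yourself in the same (t, x, y) layout.
manual_queries = [(0.0, 400.0, 350.0), (10.0, 600.0, 500.0)]

# Grid selection: a regular 10 x 10 grid on frame 0 of a hypothetical 720 x 1280 video.
grid_queries = make_grid_queries(grid_size=10, height=720, width=1280)
print(len(grid_queries))  # 100
print(grid_queries[0])    # (0.0, 64.0, 36.0)
```

In practice the grid case is handled by the model itself through the grid_size argument shown in the examples further down, so a helper like this is only needed to understand or customise the point layout.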

The project is under continuous development. Significant updates include CoTracker3, which offers simpler and better point tracking by pseudo-labelling real videos, and the release of the Kubric dataset for training. CoTracker has also been used in other projects, such as VGGSfM, a fully differentiable SfM (structure-from-motion) framework.

Installation

CoTracker can be easily integrated into your projects. A GPU is strongly recommended for optimal performance.

Via PyTorch Hub

The easiest way to get started is by loading a pretrained model from torch.hub.

Offline mode:

pip install "imageio[ffmpeg]"

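import torch
import imageio.v3 as iio

# Download the sample clip and decode it into a (T, H, W, C) array
url = 'https://github.com/facebookresearch/co-tracker/raw/refs/heads/main/assets/apple.mp4'
frames = iio.imread(url, plugin="FFMPEG")

# Use the GPU when available; the CPU works but is much slower
device = 'cuda' if torch.cuda.is_available() else 'cpu'
grid_size = 10
# Reorder to (T, C, H, W), add a batch dimension, and convert to float: (B, T, C, H, W)
video = torch.tensor(frames).permute(0, 3, 1, 2)[None].float().to(device)

# Load the pretrained offline model and track a grid_size x grid_size grid of points
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_offline").to(device)
# pred_tracks: (B, T, N, 2) point coordinates; pred_visibility: per-frame visibility of each point
pred_tracks, pred_visibility = cotracker(video, grid_size=grid_size)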
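
The tensor layout in the offline example is easy to get wrong, so the following sketch walks through the shape bookkeeping in plain Python without running any model. The helper expected_shapes and the 48-frame, 720 x 1280 clip are hypothetical. The (B, T, N, 2) track layout and N = grid_size * grid_size are assumptions based on the project's documented conventions, so check them against the tensors you actually get back.

```python
def expected_shapes(frames_shape, grid_size):
    """Shape bookkeeping for the offline example (no model is run)."""
    t, h, w, c = frames_shape            # decoded video: (T, H, W, C)
    permuted = (t, c, h, w)              # after .permute(0, 3, 1, 2)
    video = (1,) + permuted              # after [None]: (B, T, C, H, W)
    num_points = grid_size * grid_size   # assumed: one track per grid point
    tracks = (1, t, num_points, 2)       # assumed: (B, T, N, 2), an (x, y) pair per frame
    return {"video": video, "pred_tracks": tracks}


# Hypothetical clip: 48 RGB frames of 720 x 1280 pixels.
shapes = expected_shapes((48, 720, 1280, 3), grid_size=10)
print(shapes["video"])        # (1, 48, 3, 720, 1280)
print(shapes["pred_tracks"])  # (1, 48, 100, 2)
```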

Online mode:

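import torch
import imageio.v3 as iio

# Download the sample clip and decode it into a (T, H, W, C) array
url = 'https://github.com/facebookresearch/co-tracker/raw/refs/heads/main/assets/apple.mp4'
frames = iio.imread(url, plugin="FFMPEG")

# Use the GPU when available; the CPU works but is much slower
device = 'cuda' if torch.cuda.is_available() else 'cpu'
grid_size = 10
# Reorder to (T, C, H, W), add a batch dimension, and convert to float: (B, T, C, H, W)
video = torch.tensor(frames).permute(0, 3, 1, 2)[None].float().to(device)

# Same model family as the offline example, exposed through a streaming API
cotracker = torch.hub.load("facebookresearch/co-tracker", "cotracker3_online").to(device)
# Initialize online processing and register the grid of query points
cotracker(video_chunk=video, is_first_step=True, grid_size=grid_size)

# Feed overlapping chunks of 2 * cotracker.step frames, advancing by cotracker.step each time
for ind in range(0, video.shape[1] - cotracker.step, cotracker.step):
    pred_tracks, pred_visibility = cotracker(
        video_chunk=video[:, ind : ind + cotracker.step * 2]
    )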
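
The indexing in the online loop is compact, so here is a small sketch that lists which frames each chunk covers. It only reproduces the loop's arithmetic in plain Python and does not call the model. The helper chunk_ranges is hypothetical, and the values num_frames=20 and step=4 are picked for readability; in real use the step comes from cotracker.step.

```python
def chunk_ranges(num_frames, step):
    """Return the (start, end) frame range of each chunk fed to the online model.

    Mirrors `range(0, num_frames - step, step)` with slices of length
    `2 * step`; slicing clamps the end of the last chunk at `num_frames`.
    """
    ranges = []
    for start in range(0, num_frames - step, step):
        end = min(start + 2 * step, num_frames)
        ranges.append((start, end))
    return ranges


# Hypothetical values; read the real step from cotracker.step.
for start, end in chunk_ranges(num_frames=20, step=4):
    print(start, end)
# 0 8
# 4 12
# 8 16
# 12 20
```

Each chunk starts step frames after the previous one and spans up to 2 * step frames, so consecutive chunks overlap by step frames.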

Development Version

To install CoTracker from the GitHub repository, which is ideal for running local demos or for evaluation and training:

git clone https://github.com/facebookresearch/co-tracker
cd co-tracker
pip install -e .
pip install matplotlib flow_vis tqdm tensorboard

Examples

There are several ways to try CoTracker, from a short visualization script to interactive and command-line demos.

Visualize Predicted Tracks

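After installing the development version, you can render the predicted tracks with a short script. It reuses video, pred_tracks, and pred_visibility from the PyTorch Hub examples above: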

from cotracker.utils.visualizer import Visualizer

vis = Visualizer(save_dir="./saved_videos", pad_value=120, linewidth=3)
vis.visualize(video, pred_tracks, pred_visibility)

Interactive Demos

Hugging Face Space: Explore the interactive demo on the facebook/cotracker Hugging Face Space.
Google Colab: Run the notebook directly in Google Colab.
Local Gradio Demo: Run the Gradio demo locally by installing requirements (pip install -r gradio_demo/requirements.txt) and executing python -m gradio_demo.app.

Command Line Demos

Offline Demo: python demo.py --grid_size 10 (results saved to ./saved_videos/demo.mp4).
Online Demo: python online_demo.py.

Why Use CoTracker?

CoTracker is a strong choice for point tracking for several reasons:

State-of-the-Art Performance: CoTracker3 is reported to outperform earlier CoTracker versions and other point trackers on benchmarks such as TAP-Vid.
Versatility: It can track any pixel, a quasi-dense set of pixels, and supports both manual and grid-based point selection.
Efficiency: The transformer-based architecture is designed for fast inference, and the online mode processes the video in short overlapping chunks, which keeps memory use manageable on longer videos.
Continuous Improvement: The project is actively maintained with regular updates, including new models, datasets, and features.
Developer-Friendly: With PyTorch Hub integration, clear installation instructions, and various demo options, CoTracker is accessible for both quick experimentation and in-depth development.
