PyTorch Image Models (timm): The Ultimate Collection of Image Encoders

Summary

PyTorch Image Models (timm) is a library that describes itself as the largest collection of PyTorch image encoders and backbones. It provides a wide range of state-of-the-art models with pretrained weights, along with scripts for training, evaluation, and inference. This makes it a valuable resource for researchers and developers working on computer vision tasks in PyTorch.

Repository Info

Updated on May 5, 2026

Introduction

PyTorch Image Models, commonly known as timm, is a comprehensive computer vision library for PyTorch. Alongside its collection of image encoders and backbones, it aims to provide state-of-the-art models with reproducible ImageNet training results. It also bundles the layers, utilities, optimizers, schedulers, and data augmentations used to train those models, which makes it a go-to resource for deep learning practitioners.

Installation

Getting timm up and running is straightforward. You can install it using pip:

pip install timm

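To get features that have not been released yet, or to work on the library itself, clone the repository and install it locally from the checkout (for example with pip install -e . run from the repository root).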

Examples

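Loading a pretrained model with timm takes a single call. The example below loads a ResNet-50 and runs a forward pass on a dummy input; with pretrained=True, the weights are downloaded the first time the model is created: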

import torch
import timm

# Load a pretrained ResNet50 model
model = timm.create_model('resnet50', pretrained=True)
model.eval()

# Create a dummy input tensor (batch size 1, 3 channels, 224x224 image)
input_tensor = torch.randn(1, 3, 224, 224)

# Perform a forward pass
with torch.no_grad():
    output = model(input_tensor)

# For this ImageNet-1k classifier the output holds 1000 class logits per image
print(f"Output shape: {output.shape}")

Why Use timm?

Unparalleled Model Collection: timm offers an extensive range of models, including ResNet, EfficientNet, Vision Transformers (ViT), ConvNeXt, and many more, often with multiple variants and pretrained weights. This makes it easy to experiment with different architectures.

Reproducible Results: The library focuses on reproducing ImageNet training results, providing reliable baselines for research and development.

Rich Ecosystem: Beyond models, timm includes a robust set of optimizers, learning rate schedulers, data augmentations (like AutoAugment, RandAugment, Mixup, CutMix), and regularization techniques (DropPath, DropBlock), streamlining the entire deep learning pipeline.

Flexible API: All models share a common API for accessing classifiers, performing feature extraction, and supporting multi-scale feature maps, ensuring consistency and ease of integration into various projects.

Links

For more detailed information, documentation, and community resources, refer to the official links: