torchmetrics: a scalable PyTorch metrics library for distributed training and evaluation

What it solves

TorchMetrics provides a standardized way to calculate machine learning metrics for PyTorch applications. It eliminates the boilerplate code typically required to accumulate metrics across batches and synchronize them across distributed devices (such as multiple GPUs or nodes), so that metric computation stays reproducible and scales with the training setup.

How it works

The library offers two primary ways to compute metrics:

Module-based metrics: These behave like PyTorch modules and keep an internal state that accumulates data across batches. They synchronize that state across devices automatically, so the same code runs on CPU, a single GPU, or multiple GPUs.
Functional metrics: These are plain Python functions that take tensors as input and return a metric value immediately, without maintaining state.

Users can also create custom metrics by subclassing torchmetrics.Metric and defining how the metric should update its state and compute the final result.

Who it’s for

It is designed for PyTorch developers and machine learning engineers who need to track model performance across diverse domains (audio, text, image, and others), and for those running distributed training at scale.

Highlights

Extensive Library: Includes over 100 built-in metrics covering classification, regression, segmentation, audio, text, and multimodal data.
Distributed Support: Built-in automatic synchronization and accumulation for multi-device training.
Customizable: Custom metrics are implemented by subclassing torchmetrics.Metric and defining its update and compute steps.
Visualization: Integrated plotting support to visualize metric progress over time.
