onnxruntime: a cross-platform inference and training accelerator for machine learning models

What it solves

ONNX Runtime provides a high-performance, cross-platform engine for both running (inference) and training machine learning models. It solves the problem of model portability and performance optimization across different hardware, drivers, and operating systems, allowing developers to move models from training frameworks like PyTorch or TensorFlow to production environments efficiently.

How it works

ONNX Runtime speeds up model execution by combining hardware accelerators, where available, with graph optimizations and transforms. It accepts models from deep learning frameworks (PyTorch, TensorFlow/Keras) and from classical machine learning libraries (scikit-learn, LightGBM, XGBoost).
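
As a rough illustration of how hardware acceleration and graph optimizations are requested through the Python API, the sketch below builds an inference session. It assumes the `onnxruntime` package is installed and that a model has already been exported to ONNX. The helper names (`pick_providers`, `make_session`, `run_model`) and the preference order of execution providers are illustrative assumptions, not part of the library.

```python
from typing import List, Sequence


def pick_providers(available: Sequence[str], preferred: Sequence[str]) -> List[str]:
    """Keep the preferred execution providers that are actually available.

    The CPU provider is always appended as a fallback (hypothetical policy
    chosen for this sketch).
    """
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def make_session(model_path: str):
    """Create an inference session with all graph optimizations enabled."""
    # Imported lazily so pick_providers can be used without onnxruntime.
    import onnxruntime as ort

    options = ort.SessionOptions()
    options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

    providers = pick_providers(
        ort.get_available_providers(),
        ["CUDAExecutionProvider", "CPUExecutionProvider"],  # assumed preference
    )
    return ort.InferenceSession(model_path, sess_options=options, providers=providers)


def run_model(model_path: str, batch):
    """Run one batch (a NumPy array) through the model's first input."""
    session = make_session(model_path)
    input_name = session.get_inputs()[0].name
    # Passing None as the output list returns every model output.
    return session.run(None, {input_name: batch})
```

With a hypothetical file `model.onnx`, a call such as `run_model("model.onnx", batch)` would return a list of output arrays. Falling back to the CPU provider keeps the same script usable on machines without a GPU.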

Who it’s for

Developers and machine learning engineers who need to deploy models across various platforms and ensure they are fast and cost-effective, as well as those looking to accelerate training for transformer models on multi-node NVIDIA GPUs.

Highlights

Cross-platform support: Works across different hardware, drivers, and operating systems.
Broad framework compatibility: Supports models from PyTorch, TensorFlow/Keras, scikit-learn, LightGBM, and XGBoost.
Inference acceleration: Enables faster user experiences and lower costs through hardware acceleration and graph optimizations.
Training acceleration: Accelerates training time for transformer models on multi-node NVIDIA GPUs with minimal code changes to PyTorch scripts.
