onnxruntime: a cross-platform inference and training accelerator for machine learning models
What it solves
ONNX Runtime provides a high-performance, cross-platform engine for both running (inference) and training machine learning models. It solves the problem of model portability and performance optimization across different hardware, drivers, and operating systems, allowing developers to move models from training frameworks like PyTorch or TensorFlow to production environments efficiently.
How it works
ONNX Runtime speeds up models by using hardware accelerators where they are available and by applying graph optimizations and transforms. It supports models from deep learning frameworks (PyTorch, TensorFlow/Keras) and classical machine learning libraries (scikit-learn, LightGBM, XGBoost).
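As a rough illustration of the two levers described above (hardware acceleration and graph optimization), the Python sketch below opens an inference session with all graph optimizations enabled and a preferred list of execution providers. The helper names `pick_providers` and `run_model`, the preference order, and the model path passed in by the caller are assumptions made for this example, not part of the project description. `onnxruntime` is imported lazily so the provider-selection helper can be used without it.

```python
from typing import Dict, List, Sequence

# Real ONNX Runtime provider names; the preference order is an assumption.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def pick_providers(available: Sequence[str],
                   preferred: Sequence[str] = PREFERRED) -> List[str]:
    """Keep the preferred providers that are available, in preference order.

    Falls back to everything available if none of the preferred ones are.
    """
    chosen = [p for p in preferred if p in available]
    return chosen or list(available)


def run_model(model_path: str, feeds: Dict[str, object]):
    """Run one inference pass over an ONNX model file (hypothetical helper).

    `feeds` maps the model's input names to NumPy arrays.
    """
    import onnxruntime as ort  # lazy import: only needed when actually running

    options = ort.SessionOptions()
    # Ask the runtime to apply every graph optimization level it supports.
    options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

    providers = pick_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, sess_options=options,
                                   providers=providers)
    # Passing None as the first argument returns all model outputs.
    return session.run(None, feeds)
```

Listing the CPU provider last means the same script can still run on a machine without a GPU, which is one way to read the portability claim above.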
Who it’s for
Developers and machine learning engineers who need to deploy fast, cost-effective models across platforms, and teams that want to accelerate transformer training on multi-node NVIDIA GPUs.
Highlights
- Cross-platform support: Works across different hardware, drivers, and operating systems.
- Broad framework compatibility: Supports models from PyTorch, TensorFlow/Keras, scikit-learn, LightGBM, and XGBoost.
- Inference acceleration: Enables faster user experiences and lower costs through hardware acceleration and graph optimizations.
- Training acceleration: Accelerates training time for transformer models on multi-node NVIDIA GPUs with minimal code changes to PyTorch scripts.
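The "minimal code changes" in the training highlight can be sketched as a one-line wrap of an existing PyTorch module. The example below assumes the separately installed `torch-ort` package, which exposes `ORTModule`. The helper name `wrap_for_training` and its `use_ort` flag are hypothetical, and the fallback to the unwrapped model is a choice made for this sketch rather than documented project behavior.

```python
def wrap_for_training(model, use_ort: bool = True):
    """Wrap a torch.nn.Module in ORTModule when possible (hypothetical helper).

    Returns the model unchanged if `use_ort` is False or torch-ort is missing.
    """
    if not use_ort:
        return model
    try:
        from torch_ort import ORTModule  # provided by the torch-ort package
    except ImportError:
        # Assumption for this sketch: fall back silently to plain PyTorch.
        return model
    return ORTModule(model)


# Usage inside an otherwise unchanged PyTorch training script:
#   model = build_model()              # hypothetical: any torch.nn.Module
#   model = wrap_for_training(model)   # the only added line
#   ...existing optimizer, forward, backward, and step code stays as-is...
```

Keeping the wrap behind a flag makes it easy to compare runs with and without ONNX Runtime on the same script.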
Sources
- microsoft/onnxruntime