This repository contains links to pre-trained models, sample scripts, best practices, and step-by-step tutorials for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs.
Containers for running the workloads can be found at Intel® AI Containers.
Intel® AI Reference Models in a Jupyter Notebook is also available for the listed workloads.
Intel optimizes popular deep learning frameworks such as TensorFlow* and PyTorch* by contributing to the upstream projects. Additional optimizations are built into plugins/extensions such as the Intel Extension for PyTorch* and the Intel Extension for TensorFlow*. Popular neural network models running against common datasets are the target workloads that drive these optimizations.
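As a rough illustration of how an extension-level optimization is typically applied, the sketch below wraps a PyTorch model with `ipex.optimize` from the Intel Extension for PyTorch when that package is installed, and returns the model unchanged otherwise. The helper name `optimize_if_available` is hypothetical and is not part of this repository; the model scripts in this repository may apply the extension differently.

```python
def optimize_if_available(model, dtype=None):
    """Apply Intel Extension for PyTorch to an inference model if it is installed.

    Hypothetical helper for illustration only.
    Returns a (model, was_optimized) tuple.
    """
    try:
        # Assumption: the extension is installed under its usual package name.
        import intel_extension_for_pytorch as ipex
    except ImportError:
        # Extension not available: fall back to the unmodified model.
        return model, False

    # Put the model in evaluation mode before optimizing it for inference.
    model.eval()
    kwargs = {} if dtype is None else {"dtype": dtype}
    return ipex.optimize(model, **kwargs), True
```

With PyTorch and the extension installed, a call such as `optimize_if_available(my_model, dtype=torch.bfloat16)` would request BFloat16 optimization, where `my_model` stands for any `torch.nn.Module`.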
The purpose of the Intel® AI Reference Models repository (and associated containers) is to quickly replicate the complete software environment that demonstrates the best-known performance of each of these target model/dataset combinations. When executed in optimally-configured hardware environments, these software environments showcase the AI capabilities of Intel platforms.
DISCLAIMER: These scripts are not intended for benchmarking Intel platforms. For any performance and/or benchmarking information on specific Intel platforms, visit https://www.intel.ai/blog.
Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel’s Global Human Rights Principles. Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.
Intel® AI Reference Models is licensed under the Apache License, Version 2.0.
To the extent that any public datasets are referenced by Intel or accessed using tools or code on this site, those datasets are provided by the third party indicated as the data source. Intel does not create the data, or datasets, and does not warrant their accuracy or quality. By accessing the public dataset(s) you agree to the terms associated with those datasets and that your use complies with the applicable license.
Please check the list of datasets used in Intel® AI Reference Models in the datasets directory.
Intel expressly disclaims the accuracy, adequacy, or completeness of any public datasets, and is not liable for any errors, omissions, or defects in the data, or for any reliance on the data. Intel is not liable for any liability or damages relating to your use of public datasets.
The model documentation in the tables below lists the prerequisites for running each model. The model scripts run on Linux. Certain models can also run on bare metal on Windows. For more information and a list of models that are supported on Windows, see the documentation here.
Instructions are available for running on Sapphire Rapids.
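Before following the Sapphire Rapids instructions, it can help to confirm that the host CPU actually exposes the relevant instruction sets. The sketch below is not part of this repository: it reads the feature flags that the Linux kernel reports in `/proc/cpuinfo` and checks for a few AVX-512 and AMX flags. The function names and the selection of flags are illustrative assumptions, and the list is not a complete statement of what any model requires.

```python
from pathlib import Path

# Flag names as reported by the Linux kernel in /proc/cpuinfo.
# The selection is illustrative, not an official requirements list.
FLAGS_OF_INTEREST = (
    "avx512f",
    "avx512_vnni",
    "avx512_bf16",
    "amx_tile",
    "amx_bf16",
    "amx_int8",
)


def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report_features(cpuinfo_text):
    """Map each flag of interest to True or False for the given cpuinfo text."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: name in flags for name in FLAGS_OF_INTEREST}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        for name, present in report_features(cpuinfo.read_text()).items():
            print(f"{name:12s} {'yes' if present else 'no'}")
    else:
        print("/proc/cpuinfo not found; this check only applies to Linux.")
```

A missing AMX flag does not necessarily mean a model cannot run; it only suggests that the BFloat16 and Int8 paths may not use AMX on that machine.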
For best performance on Intel® Data Center GPU Flex and Max Series, please check the list of supported workloads. It provides instructions to run inference and training using Intel® Extension for PyTorch or Intel® Extension for TensorFlow.
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
ResNet 50v1.5 Sapphire Rapids | TensorFlow | Inference | Int8 FP32 BFloat16 BFloat32 | ImageNet 2012 |
ResNet 50v1.5 Sapphire Rapids | TensorFlow | Training | FP32 BFloat16 BFloat32 | ImageNet 2012 |
ResNet 50 | PyTorch | Inference | Int8 FP32 BFloat16 BFloat32 | ImageNet 2012 |
ResNet 50 | PyTorch | Training | FP32 BFloat16 BFloat32 | ImageNet 2012 |
Vision Transformer | PyTorch | Inference | FP32 BFloat16 BFloat32 FP16 INT8 | ImageNet 2012 |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
3D U-Net | TensorFlow | Inference | FP32 BFloat16 Int8 | BRATS 2018 |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
BERT large Sapphire Rapids | TensorFlow | Inference | FP32 BFloat16 Int8 BFloat32 | SQuAD |
BERT large Sapphire Rapids | TensorFlow | Training | FP32 BFloat16 BFloat32 | SQuAD |
BERT large (Hugging Face) | TensorFlow | Inference | FP32 FP16 BFloat16 BFloat32 | SQuAD |
BERT large | PyTorch | Inference | FP32 Int8 BFloat16 BFloat32 | BERT Large SQuAD1.1 |
BERT large | PyTorch | Training | FP32 BFloat16 BFloat32 | preprocessed text dataset |
DistilBERT base | PyTorch | Inference | FP32 BF32 BF16 Int8-FP32 Int8-BFloat16 BFloat32 | DistilBERT Base SQuAD1.1 |
RNN-T | PyTorch | Inference | FP32 BFloat16 BFloat32 | RNN-T dataset |
RNN-T | PyTorch | Training | FP32 BFloat16 BFloat32 | RNN-T dataset |
GPTJ 6B | PyTorch | Inference | FP32 FP16 BFloat16 BF32 INT8 | |
GPTJ 6B MLPerf | PyTorch | Inference | INT4 | CNN-Daily Mail dataset |
LLAMA2 7B | PyTorch | Inference | FP32 FP16 BFloat16 BF32 INT8 | |
LLAMA2 7B | PyTorch | Training | FP32 FP16 BFloat16 BF32 | |
LLAMA2 13B | PyTorch | Inference | FP32 FP16 BFloat16 BF32 INT8 | |
ChatGLMv3 6B | PyTorch | Inference | FP32 FP16 BFloat16 BF32 INT8 | |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
BERT | TensorFlow | Inference | FP32 | MRPC |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
Mask R-CNN | PyTorch | Inference | FP32 BFloat16 BFloat32 | COCO 2017 |
Mask R-CNN | PyTorch | Training | FP32 BFloat16 BFloat32 | COCO 2017 |
SSD-ResNet34 | PyTorch | Inference | FP32 Int8 BFloat16 BFloat32 | COCO 2017 |
SSD-ResNet34 | PyTorch | Training | FP32 BFloat16 BFloat32 | COCO 2017 |
Yolo V7 | PyTorch | Inference | Int8 FP32 FP16 BFloat16 BFloat32 | [COCO 2017](/models_v2/pytorch/yolov7/inference/cpu/README.md#prepare-dataset) |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
Wide & Deep | TensorFlow | Inference | FP32 | Census Income dataset |
DLRM | PyTorch | Inference | FP32 Int8 BFloat16 BFloat32 | Criteo Terabyte |
DLRM | PyTorch | Training | FP32 BFloat16 BFloat32 | Criteo Terabyte |
DLRM v2 | PyTorch | Inference | FP32 FP16 BFloat16 BFloat32 Int8 | Criteo 1TB Click Logs dataset |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
Stable Diffusion | TensorFlow | Inference | FP32 BFloat16 FP16 | COCO 2017 validation dataset |
Stable Diffusion | PyTorch | Inference | FP32 BFloat16 FP16 BFloat32 Int8-FP32 Int8-BFloat16 | COCO 2017 validation dataset |
Stable Diffusion | PyTorch | Training | FP32 BFloat16 FP16 BFloat32 | cat images |
Latent Consistency Models (LCM) | PyTorch | Inference | FP32 BFloat16 FP16 BFloat32 Int8-FP32 Int8-BFloat16 | COCO 2017 validation dataset |
Model | Framework | Mode | Model Documentation | Benchmark/Test Dataset |
---|---|---|---|---|
GraphSAGE | TensorFlow | Inference | FP32 BFloat16 FP16 Int8 BFloat32 | Protein Protein Interaction |
*Indicates that the model is one of the MLPerf models and will be supported long-term.
Model | Framework | Mode | GPU Type | Model Documentation |
---|---|---|---|---|
ResNet 50 v1.5 | TensorFlow | Inference | Flex Series | Float32 TF32 Float16 BFloat16 Int8 |
ResNet 50 v1.5 | TensorFlow | Training | Max Series | BFloat16 FP32 |
ResNet 50 v1.5 | PyTorch | Inference | Flex Series, Max Series, Arc Series | Int8 FP32 FP16 TF32 |
ResNet 50 v1.5 | PyTorch | Training | Max Series, Arc Series | BFloat16 TF32 FP32 |
DistilBERT | PyTorch | Inference | Flex Series, Max Series | FP32 FP16 BF16 TF32 |
DLRM v1 | PyTorch | Inference | Flex Series | FP16 FP32 |
SSD-MobileNet* | PyTorch | Inference | Arc Series | INT8 FP16 FP32 |
EfficientNet | PyTorch | Inference | Flex Series | FP16 BF16 FP32 |
EfficientNet | TensorFlow | Inference | Flex Series | FP16 |
FBNet | PyTorch | Inference | Flex Series | FP16 BF16 FP32 |
Wide Deep Large Dataset | TensorFlow | Inference | Flex Series | FP16 |
YOLO V5 | PyTorch | Inference | Flex Series | FP16 |
BERT large | PyTorch | Inference | Max Series, Arc Series | BFloat16 FP32 FP16 |
BERT large | PyTorch | Training | Max Series, Arc Series | BFloat16 FP32 TF32 |
BERT large | TensorFlow | Training | Max Series | BFloat16 TF32 FP32 |
DLRM v2 | PyTorch | Inference | Max Series | FP32 BF16 |
DLRM v2 | PyTorch | Training | Max Series | FP32 TF32 BF16 |
3D-Unet | PyTorch | Inference | Max Series | FP16 INT8 FP32 |
3D-Unet | TensorFlow | Training | Max Series | BFloat16 FP32 |
Stable Diffusion | PyTorch | Inference | Flex Series, Max Series, Arc Series | FP16 FP32 |
Stable Diffusion | TensorFlow | Inference | Flex Series | FP16 FP32 |
Mask R-CNN | TensorFlow | Inference | Flex Series | FP32 Float16 |
Mask R-CNN | TensorFlow | Training | Max Series | FP32 BFloat16 |
Swin Transformer | PyTorch | Inference | Flex Series | FP16 |
FastPitch | PyTorch | Inference | Flex Series | FP16 |
UNet++ | PyTorch | Inference | Flex Series | FP16 |
RNN-T | PyTorch | Inference | Max Series | FP16 BF16 FP32 |
RNN-T | PyTorch | Training | Max Series | FP32 BF16 TF32 |
IFRNet | PyTorch | Inference | Flex Series | FP16 |
RIFE | PyTorch | Inference | Flex Series | FP16 |
If you would like to add a new benchmarking script, please use this guide.