Unlocking the Power of ONNX: Model Interoperability and Boosting Performance

Photo by drmakete lab on Unsplash

What is ONNX?

Open Neural Network Exchange, or ONNX, is a free and open-source ecosystem for deep learning model representation. Facebook and Microsoft created it in 2017 to make it simpler for researchers and engineers to migrate models between various deep-learning frameworks and hardware platforms.

One of ONNX’s key benefits is that it makes it simple to export models from one framework, like PyTorch, and import them into another framework, like TensorFlow. This is especially helpful for engineers who need to deploy models on several hardware platforms and for researchers who wish to try out different frameworks for training and deploying their models.

Benefits of Using ONNX

  • Interoperability: ONNX allows developers to switch between frameworks like PyTorch, TensorFlow, and Caffe2 without trouble. This compatibility makes it easier for businesses to integrate AI models developed with different tools into their platforms.
  • Portability or Platform Independence: ONNX supports a wide range of hardware, making it easier for developers to deploy their models on various devices without worrying about hardware optimizations. This allows for faster development and deployment.
  • Performance: ONNX models can be executed by runtimes that are optimized for both GPU and CPU, which can improve performance. In particular, ONNX Runtime provides efficient execution across multiple platforms.
  • Flexibility: ONNX standardizes deep learning operations, enabling extensive customization for specific use cases.
  • Community Support: As an open-source project, ONNX has a vibrant community of researchers and developers dedicated to improving and supporting the ecosystem.

Now that we have briefly introduced ONNX, let’s look at how it works and how the above benefits apply in a code example.

In the example below, I will demonstrate how to create a simple neural network using PyTorch, convert it to ONNX format, and use ONNX Runtime for evaluation.

Step 1:

pip install torch torchvision onnx

The command above installs the PyTorch framework, TorchVision (a library that provides datasets and models), and the ONNX library using the Python pip package manager.

Step 2:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 100)
        self.fc2 = nn.Linear(100, 10)
        
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.softmax(self.fc2(x), dim=1)
        return x

model = SimpleModel()

Here, we create a simple, fully connected feed-forward neural network with a single hidden layer. In the __init__() method, I have initialized two fully connected layers, where the first layer has an input size of 28*28 and 100 output units, and the second layer has 100 input units and 10 output units.

The forward() method contains the forward pass of the neural network. It accepts an input tensor x, passes it through the first layer, and applies the ReLU activation function. It then passes the result through the second layer, applies the Softmax activation function, and returns the x tensor as the output.
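As a quick sanity check of the forward pass described above, the sketch below feeds a random batch through the model and confirms that each output row behaves like a probability distribution. The batch size of 4 and the variable names batch and probs are assumptions made for illustration; the model definition is repeated only so the snippet runs on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleModel(nn.Module):
    """Same architecture as in Step 2, repeated so this snippet is self-contained."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 100)
        self.fc2 = nn.Linear(100, 10)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return F.softmax(self.fc2(x), dim=1)


model = SimpleModel()
model.eval()

# Hypothetical batch of 4 flattened 28x28 inputs.
batch = torch.randn(4, 28 * 28)

with torch.no_grad():
    probs = model(batch)

print(probs.shape)       # torch.Size([4, 10])
print(probs.sum(dim=1))  # each value should be very close to 1.0
```

Because Softmax is applied along dim=1, every row of the output sums to approximately 1, one row per input in the batch.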

Step 3:

import torch.onnx

dummy_input = torch.randn(1, 28*28)
onnx_filename = "simple_test_model.onnx"

torch.onnx.export(model, dummy_input, onnx_filename, verbose=True, input_names=['input'], output_names=['output'])

In this step, we import the torch.onnx package for ONNX conversion. We create a dummy_input, which is a random tensor that the exporter uses to trace the model. The input tensor is expected to be of size (1, 28*28), where 1 represents the batch size and 28*28 is the input dimension.

We then define the name of the ONNX model file as simple_test_model.onnx. The torch.onnx.export() function is used to convert the PyTorch model to ONNX format and save it in the file.

The input_names and output_names arguments are optional but help identify the input and output tensors when using the ONNX model later.

Step 4:

pip install onnxruntime

The above command installs ONNX Runtime, a high-performance, cross-platform library for running ONNX models on various devices and platforms, with bindings for several programming languages.

import onnxruntime as ort
import numpy as np

ORT_session = ort.InferenceSession(onnx_filename)

def run_model(input_data):
    input_data = input_data.astype(np.float32)
    input_name = ORT_session.get_inputs()[0].name
    output_name = ORT_session.get_outputs()[0].name
    result = ORT_session.run([output_name], {input_name: input_data})
    return result[0]

# Define your input data, for example, a random tensor
input_data = np.random.randn(1, 28*28)

# Get output
output = run_model(input_data)
print(output)

Here, I created an inference session from the saved ONNX model file. The run_model() function first casts the input data to float32, because the exported model expects float32 inputs while np.random.randn() returns float64 values. It then looks up the input and output names from the session, runs the model using ORT_session.run(), and returns the result.

Comparative Analysis

By examining the above example, we can better understand some of the benefits ONNX provides.

It provides interoperability by supporting various deep-learning frameworks, allowing models to be converted from frameworks such as PyTorch and TensorFlow. In the above code, we converted the PyTorch model to ONNX format.

ONNX enables easy deployment of models across various platforms and programming languages. In the above code, we ran the ONNX model using the ONNX Runtime library, which is available for different platforms. In my personal experience, I have converted a TensorFlow model to an ONNX model and deployed it on a Node.js server using the ONNX Runtime library.

ONNX Runtime is designed to provide optimized execution of models, using both static and dynamic optimizations. Running inference with ONNX Runtime, as in the code above, can take advantage of these optimizations and often gives lower latency than running the model directly in the training framework, although the gain depends on the model and the hardware. This can lead to better resource utilization and more efficient model deployments.

Conclusion

ONNX is a valuable asset for AI developers: it offers flexibility in selecting tools based on individual requirements while supporting compatibility, portability, and performance. This article walked through building a simple neural network in PyTorch, converting it to ONNX format, and running inference on it with ONNX Runtime.

Ravindu Senaratne, Heartbeat author
