AI on the Edge: Bridging the gap between research and production
Penned by our Machine Learning (Argo) Team: Renz Iver Baliber, Jebb Matthew Luza, Neil Ragadio & Christian Ray Suello
One of the challenges in the AI industry is turning research innovations into working tools that are useful in production. Researchers in this field constantly look for new approaches that deliver state-of-the-art performance on various tasks, which, more often than not, is measured only through model accuracy.
Where is the gap?
In a real-world setting, there is a constant search for the right balance between a model's effectiveness, efficiency, reliability, and interpretability. The same is true of what we do at meldCX, where we are constantly challenged to bring lightweight yet effective models to edge devices.
Understanding the differences
Working in a fast-paced industry means we must always be on the lookout for new methods and innovations that could improve our products and solutions. It is not always easy to make these implementations work on edge deployments, for the following reasons:
- Slow inference speed - Many new model architectures produce highly accurate results but require heavy compute power to crunch data, which is not always available on edge devices.
- Data - As with any machine learning model, it is challenging to ensure the model is robust enough to handle production data, which is constantly changing; this is a problem we are trying to solve with synthetic data.
- Portability - A model must also be easy to embed in, or implement on, our existing pipeline. Many new models depend heavily on specific software or tools that are hard to incorporate into our pipeline.
To keep up with the breakneck speed at which the AI world is innovating, we have built continuous research into our pipeline. It is always exciting to see new trends, which give us a chance to experiment and improve our model performance.
Scaling down to scale up
One of our challenges is sustaining model performance at scale. There is a huge difference between running a single setup and scaling it to hundreds or thousands of instances.
The size and complexity of a model are crucial factors in bringing AI to the edge. These factors affect the model's accuracy, inference speed, and power consumption, forcing a tradeoff in performance. Collaborative research by academic and industry leaders on productionizing intelligent systems shows the diversity of machine learning models and how each one trades off accuracy against computational requirements.
With all these challenges and limitations, it is important to note that a number of existing tools make intelligent edge computing possible.
Tools like Intel's OpenVINO and TensorFlow Lite help us scale down our models through model quantization, while also providing specialized inference infrastructure that delivers results fast. ONNX, on the other hand, helps us port our models from one framework to another.
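To give a feel for what quantization does, here is a minimal, self-contained sketch of affine int8 quantization, the core idea behind the post-training quantization those toolchains apply. This is plain illustrative Python, not the actual TensorFlow Lite or OpenVINO API; real toolchains quantize per tensor or per channel and use calibration data to pick the ranges.

```python
# Sketch of affine (asymmetric) 8-bit quantization: floats are mapped onto
# an integer grid, shrinking each weight from 4 bytes to 1 at the cost of
# a small rounding error. Illustrative only - not a production toolchain.

def quantize(weights, num_bits=8):
    """Map float weights onto the integer grid [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against a constant tensor
    zero_point = round(qmin - lo / scale)     # integer that represents 0.0
    q = [max(qmin, min(qmax, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the integer representation."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.5, -0.2, 0.0, 0.7, 2.3]
q, scale, zp = quantize(weights)
recovered = dequantize(q, scale, zp)
# Each recovered value differs from the original by at most about one
# quantization step (`scale`), while storage drops by roughly 4x.
```

The tradeoff mentioned above is visible here: a smaller integer grid (fewer bits) means a smaller, faster model but a larger rounding error per weight, which is why quantized models are typically re-validated against the original before deployment.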
Being at the forefront of Edge AI means that we have to innovate at speed and scale. It is essential to understand the challenges and be aware of the existing resources that could help us address them.
The AI industry has a very open and collaborative community, and we’re glad to take part in the ongoing conversation. Through constant research, we are always inspired to learn how our peers have solved the epic problems that we are also solving.