Custom Vision (Image Recognition) with on-device processing

Last year, I wrote about how you can enhance your business using Custom Vision with bots (read: Azure Bot Service). At the time, the Custom Vision service was in early preview and you could only use your model via the cloud. There was no way to consume it locally, which meant relatively slower performance and higher cost.

This year, Microsoft has finally delivered one of the most-needed machine learning capabilities: the ability to export your model. More interestingly, the export is not limited to Microsoft’s own platform; CoreML (iOS), TensorFlow (Android and many others), and ONNX are supported as well. So if you’re a startup running a ticketing service, you can now offer ticket scanning and validation simply by integrating this.

This export feature multiplies the reach of your model. You create one Custom Vision model (which takes five minutes at most, provided you have all the images with you) and integrate that same model seamlessly across your apps.

I always try my best to put everything together whenever I write a blog post. Therefore, I have plugged this model into a UWP app for demo purposes. However, if you’re a passionate iOS or Android developer who wants to build a demo using the same model and share it with the community, you can now fork the repository and contribute.

Custom Vision

Custom Vision is not a new service from Microsoft. That said, it has been improving steadily, and this time you get an on-device model for offline execution.

So the use case of this post is identifying the type of National Identity Card (NIC) issued by the Islamic Republic of Pakistan. There are four types of identity cards in Pakistan, including the classic card, the smart card (with a chip), and the NIC for overseas residents. I have created a Custom Vision model to classify them so that anyone standing at a government office’s reception can scan a card with this app and validate the person accordingly.
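On the app side, the classifier’s output boils down to a probability per tag, and picking the card type is just a thresholded arg-max. Here is a minimal sketch of that post-processing step; the tag names and the threshold value are illustrative only, not the actual tags from my project:

```python
def classify_nic(predictions, threshold=0.5):
    """Pick the most probable tag, or None if nothing is confident enough.

    `predictions` maps a tag name to the probability the classifier
    assigned to it for the submitted image.
    """
    tag, prob = max(predictions.items(), key=lambda kv: kv[1])
    return tag if prob >= threshold else None

# A confident prediction wins outright...
print(classify_nic({"classic": 0.07, "smart-card": 0.91, "overseas": 0.02}))
# -> smart-card

# ...while an ambiguous one is rejected, so the receptionist can rescan.
print(classify_nic({"classic": 0.30, "smart-card": 0.40, "overseas": 0.30}))
# -> None
```

Rejecting low-confidence results matters here: it is better to ask for a rescan than to validate a person against the wrong card type.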

The steps to create a basic model are simple:

  1. Upload all your images.
  2. Tag them one by one according to their categories. Each tag requires at least 5 images (I struggled to collect them, as I am not in Pakistan).
  3. Once your data is in place, train the model.
  4. To evaluate it, upload another image and it will give you the results.
  5. As soon as you are satisfied with the results, either use the online API or export and download the model for your desired platform.
  6. If you want to consume the API, grab the sample code and try it yourself.
  7. If you want to consume the model for offline execution, the UWP sample is uploaded. All you need are the real NICs.
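The online-API path from step 6 boils down to a single authenticated POST of the raw image bytes. Below is a minimal sketch: it assumes the “Prediction URL” and “Prediction-Key” values shown in the portal for your trained iteration (both are placeholders here), and it only builds the request rather than sending it:

```python
import urllib.request

def build_prediction_request(prediction_url, prediction_key, image_bytes):
    """Build the HTTP request for the Custom Vision prediction endpoint.

    The portal shows the exact prediction URL and key for your project;
    the image is posted as a raw binary body.
    """
    return urllib.request.Request(
        prediction_url,
        data=image_bytes,
        headers={
            "Prediction-Key": prediction_key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )

# Placeholder URL and key; with real values, urllib.request.urlopen(req)
# would return a JSON body listing each tag with its probability.
req = build_prediction_request(
    "https://example.invalid/customvision/Prediction/image",
    "your-prediction-key",
    b"<image bytes>",
)
print(req.get_method())  # -> POST
```

The response you get back is what feeds the tag-probability post-processing shown earlier, so the two snippets together cover the round trip.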

It took me 5 minutes to create the model from the portal and another 30 minutes to prepare this demo. This may not be an ideal solution for validating NIC cards, but it should give you an idea of how you can integrate such capabilities into your apps.

Happy coding!
