How To Prepare for the Future of Machine Vision

Over the past decade, machine vision has undeniably taken off. From self-driving cars to facial recognition doorbells, the applications have captured the public's imagination. Getting these solutions right and reaching where we are today required a massive amount of work on the embedded infrastructure. As a product developer, I find it exciting to see the industry continue to learn and evolve in a scalable way as demand for machine vision has grown. I would like to share three current trends in this space:

1: Levels of Recognition. Often, we get requests to “recognize” an object or people. Recognition has a wide span of meanings. First, there is the deep learning/machine learning level of recognition, driven by the real-time needs of self-driving, facial-based identity, and instant awareness of a large number of objects. Figure 1’s upper-right quadrant represents this well, and it is an area that many are calling ubiquitous in commercial and industrial applications.

In most cases, however, we have found that lower levels of recognition are good enough for the application. There is feature-level recognition (see Figure 2), which looks for the existence of features in an image, and there is also basic object detection. In both cases, there has been growth in the options available to developers.

Referring to Figure 1 again, these interesting options are represented by the upper-left and lower-right quadrants.

• The upper-left quadrant represents the new options available using powerful edge compute devices, but with lower-end optics and sensors. An example of this is the optical scanner used for fingerprint identification on smartphones. Accurate results can be obtained from an optical sensor with a resolution as low as 500 dpi.

• The lower-right quadrant represents the options where more embedded, autonomous compute is paired with higher-end optics and sensors. The earliest Ring doorbell was powered by an i.MX RT from NXP.

2: Independence at the Edge. The growth in lower-level recognition options has created an architectural shift in how dependent deep-learning systems are on the Cloud.

There are now application-specific AI accelerators that blur the lines between all three quadrants. Google’s Edge TPU, Intel’s Movidius VPUs, NVIDIA Jetson Nano, and others use parallel computing optimized for machine learning to add inference to the system, learning locally without a connection to the Cloud.

Not all applications need continuous learning. Low-end hardware that only intermittently connects to the Cloud can still participate in learning by uploading privacy-safe metrics. Those metrics contribute to Big Data that can be used to discover new patterns and to improve a model, which then comes down later in a firmware update.
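One way to picture this intermittent participation is a device that keeps only aggregate counts and hands them over when a connection happens to be available. The sketch below is a minimal illustration, not a description of any particular product: the `EdgeMetrics` class, the label strings, and the payload fields are all hypothetical, and what counts as "privacy-safe" is assumed here to mean that no images or identities leave the device.

```python
import json
from collections import Counter


class EdgeMetrics:
    """Hypothetical on-device store of aggregate, image-free statistics."""

    def __init__(self, model_version: str, bins: int = 5):
        self.model_version = model_version
        self.bins = bins
        self.label_counts = Counter()
        self.confidence_hist = [0] * bins

    def record(self, label: str, confidence: float) -> None:
        """Keep only the predicted label and a coarse confidence bucket."""
        self.label_counts[label] += 1
        # Clamp so that confidence values of 0.0 and 1.0 land in valid buckets.
        bucket = max(0, min(int(confidence * self.bins), self.bins - 1))
        self.confidence_hist[bucket] += 1

    def flush(self) -> str:
        """Build the upload payload (assumed JSON) and reset local counters."""
        payload = json.dumps(
            {
                "model_version": self.model_version,
                "label_counts": dict(self.label_counts),
                "confidence_hist": self.confidence_hist,
            },
            sort_keys=True,
        )
        self.label_counts.clear()
        self.confidence_hist = [0] * self.bins
        return payload
```

A device would call `record` after each local inference and `flush` whenever it next connects. A histogram that drifts toward the low-confidence buckets is one signal that could prompt an improved model in a later firmware update.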

3: Imaging is Everywhere. The recent need to work from home has rapidly changed the average person’s exposure to video and imaging. This has made it clearer than ever that imaging truly is everywhere. A few proof points are as follows.

• Identity using 1-to-1 Match. Earlier this year, taxpayers looking to get a federal Identity Protection (IP) PIN from the IRS website found themselves transferring the process of biometric identification to their smartphone. The app, run by ID.me, does a 1-to-1 match that compares the photo on a government-issued ID to a selfie. Only after a match occurs are users sent back to the website for the issuance of the IP PIN.

• Layered Biometrics and Encrypted Coding. My first encounter with the CLEAR app was at CES 2022, which required a similar matching process to confirm vaccination status against a government-issued ID. CLEAR, which is more prominently used as identification for airport security, layers both eye and face detection to create an encrypted code used for identification as well as for the safekeeping of biometric data.

• Classic Image Processing and UX. The entire imaging chain is only as good as what is coming in. In both ID.me and CLEAR, users may notice guide marks and suggestions on the positioning of the government-issued ID. This is an example of how the marriage of user experience and classic image processing (setting the region of interest) is used to increase the success of the more complex algorithms running above it. Ultimately, it is about making users more successful in the new experience.
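The two ideas in these proof points, setting a region of interest and then doing a 1-to-1 match, can be sketched in a few lines. This is a minimal illustration under stated assumptions and is not how ID.me or CLEAR work internally: the function names are hypothetical, the images are plain nested lists, the embeddings are assumed to come from some face model not shown here, and the 0.8 threshold is an arbitrary placeholder.

```python
import math


def crop_roi(image, top, left, height, width):
    """Return the region of interest from a row-major 2-D list of pixels."""
    if not image or not image[0]:
        raise ValueError("image is empty")
    if top < 0 or left < 0 or top + height > len(image) or left + width > len(image[0]):
        raise ValueError("region of interest falls outside the image")
    return [row[left:left + width] for row in image[top:top + height]]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def one_to_one_match(id_embedding, selfie_embedding, threshold=0.8):
    """1-to-1 verification: compare exactly two embeddings against a threshold.

    The threshold is an assumed placeholder; a real system would tune it.
    """
    return cosine_similarity(id_embedding, selfie_embedding) >= threshold


# The guide marks on screen correspond to a fixed rectangle in the frame.
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
id_region = crop_roi(frame, top=1, left=2, height=2, width=3)
```

Cropping first means the heavier model only sees the pixels inside the guide marks, which is the sense in which the simple step supports the complex one. A 1-to-1 match then answers only "are these two the same person?", which is a narrower question than searching a database of many identities.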

In closing, we, as product developers, need to determine the real level of recognition that is needed. What makes machine vision exciting is the ever-expanding range of applications in which it can be used, and the breadth of options available to developers to make the experience successful.
