Over the past decade, machine vision has undeniably taken off. From self-driving cars to facial recognition doorbells, its applications have captured the public's imagination. Getting these solutions right, and reaching where we are today, required a massive amount of work on the embedded infrastructure. As a product developer, I find it exciting to watch the industry continue to learn and evolve in a scalable way as demand for machine vision has grown. I would like to share three current trends in this space:
1: Levels of recognition. We often get requests to “recognize” an object or person, but recognition spans a wide range of meanings. At one end is the deep learning/machine learning level of recognition, driven by the real-time needs of self-driving, facial-based identity, and instant awareness of a large number of objects. Figure 1’s upper-right quadrant represents this well, and it is an area that many expect to become ubiquitous in commercial and industrial applications.
In most cases, however, we have found that lower levels of recognition are good enough for the application at hand. There is feature-level recognition (see Figure 2), which looks for the existence of specific features in an image, and there is basic object detection. In both cases, the options available to developers have grown.
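As a rough illustration of what feature-level recognition can look like in practice, here is a minimal sketch that uses OpenCV's ORB detector and a brute-force matcher to decide whether a reference feature appears in a scene. The image filenames and the match threshold are illustrative assumptions, not values from any particular product.

```python
# Minimal feature-level recognition sketch (illustrative only).
# "template.png" holds the feature of interest and "scene.png" is the image
# to check; both filenames and the threshold below are assumed placeholders.
import cv2

template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)              # lightweight binary descriptors
_, des_template = orb.detectAndCompute(template, None)
_, des_scene = orb.detectAndCompute(scene, None)

# Brute-force Hamming matcher plus Lowe's ratio test to keep only good matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des_template, des_scene, k=2)
good = []
for pair in matches:
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])

# Declare the feature "present" if enough descriptors match (threshold assumed).
MIN_GOOD_MATCHES = 15
print("feature found" if len(good) >= MIN_GOOD_MATCHES else "feature not found")
```

A check like this runs comfortably on modest edge hardware, which is exactly why this level of recognition is often sufficient.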
Referring to Figure 1 again, these interesting options are represented by the upper-left and lower-right quadrants.
• The upper-left quadrant represents the new options that pair powerful edge compute devices with lower-end optics and sensors. An example is the optical scanner used for fingerprint identification on smartphones, where accurate results can be obtained from an optical sensor of as little as 500 dpi.
• The lower-right quadrant represents options where a more embedded, autonomous compute device is paired with higher-end optics and sensors. The earliest Ring doorbell, for example, was powered by an i.MX RT from NXP.
2: Independence at the Edge. The growth in lower-level recognition options has created an architectural shift in how dependent deep-learning systems are on the Cloud.
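To make that shift concrete, the sketch below shows the kind of edge-resident inference that reduces Cloud dependence: a small quantized model is loaded and run entirely on the device, with no network call in the loop. The model filename and the dummy input frame are assumptions for illustration, not a reference to any specific product.

```python
# Edge-resident inference sketch (illustrative): run a small quantized model
# locally so recognition keeps working without a Cloud round trip.
# "detector_int8.tflite" is an assumed placeholder model file.
import numpy as np
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="detector_int8.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# A dummy frame stands in for whatever the image sensor actually delivers.
frame = np.zeros(inp["shape"], dtype=inp["dtype"])

interpreter.set_tensor(inp["index"], frame)
interpreter.invoke()                      # inference happens on-device
scores = interpreter.get_tensor(out["index"])
print("top class:", int(np.argmax(scores)))
```

With inference resident on the device, the Cloud can be reserved for what it does best, such as model updates and aggregate analytics, rather than sitting in the critical path of every frame.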