Seriously: What you can see is far more than what you see!
The title might be a mouthful, but it reflects a fact: In addition to our own eyes, there is a growing number of "mechanical eyes" that help us observe, document, and analyze our surroundings in the world today. This greatly expands and enhances our ability to see. At present, China's public security departments have installed over 20 million surveillance cameras, which generate up to 7,500PB of data a month. If we consider surveillance cameras installed by families and individuals, the rear cameras on our cars for parking, dashboard cameras, and cool consumer electronics products such as GoPro, we are already afloat in a sea of machine vision without even knowing it.
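To get a feel for these numbers, the quoted totals can be broken down per camera. The following is only a back-of-the-envelope sketch: it assumes decimal units (1 PB = 10^15 bytes), a 30-day month, and data spread evenly across exactly 20 million cameras, none of which is stated in the figures above.

```python
# Back-of-the-envelope estimate from the figures quoted above.
# Assumptions (not from the source): decimal units (1 PB = 10**15 bytes),
# a 30-day month, and data spread evenly across all cameras.

CAMERAS = 20_000_000            # "over 20 million" cameras
MONTHLY_DATA_PB = 7_500         # "up to 7,500PB" per month
BYTES_PER_PB = 10**15
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

total_bytes = MONTHLY_DATA_PB * BYTES_PER_PB
bytes_per_camera = total_bytes / CAMERAS

# Convert to friendlier units.
gb_per_camera = bytes_per_camera / 10**9
mbps_per_camera = bytes_per_camera * 8 / SECONDS_PER_MONTH / 10**6

print(f"{gb_per_camera:.0f} GB per camera per month")
print(f"{mbps_per_camera:.2f} Mbit/s average per camera")
```

Under these assumptions, each camera produces about 375 GB a month, or an average stream of roughly 1.16 Mbit/s. Because the source gives "over 20 million" cameras and "up to" 7,500PB, this is best read as a rough upper bound per camera rather than a measured value.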
As the name suggests, the core value of machine vision is sensing, analyzing, and making decisions based on images and video on behalf of the human eye and human brain. Even though machine vision is not really a new concept, it has developed rapidly over the past decade and is expected to keep developing at a considerable pace for the foreseeable future. Three factors are driving this growth.
The first factor is Moore's law. In terms of hardware, a machine vision system comprises two core devices: one is the CMOS image sensor (the camera), and the other is the processor. Both can be manufactured using a standard CMOS semiconductor process. They are thus "cursed" by Moore's law (that is, the endless pursuit of lower power consumption, lower cost, smaller size, and higher performance), which keeps driving down the cost of machine vision. Think about the resolution of the camera in an entry-level phone today. It might equal that of the image sensor in a mid-range to high-end single-lens reflex camera from a few years ago. This might be the most direct way users experience Moore's law.
Moore's law is also driving improvements in processor performance, making processors capable of complex image processing. When selecting a processor hardware architecture, machine vision developers today have multiple options: a DSP optimized for image processing; ARM+GPU or other graphics processing platforms; or a heterogeneous programmable-logic architecture based on ARM+FPGA (such as Xilinx Zynq 7000). Even a mainstream ARM general-purpose processor running an optimized algorithm can be used for many machine vision applications. For users, obtaining a machine vision processor with a better price-performance ratio is only a matter of time.
The second factor driving the rapid development of machine vision is the growing abundance of algorithms and software resources. You can say Moore's law "kidnapped" the hardware and lowered the threshold for using machine vision, but for a machine to truly operate like the "human eye + human brain," or even to outperform it in speed and efficiency, software is also required. In the last century, developing machine vision algorithms and software was definitely brain-intensive work, and companies did not dare get into the business without a few PhDs on staff. The situation changed in 2000, the year Intel announced OpenCV, a cross-platform machine vision library released under the open-source BSD license. Developers can conveniently implement a variety of general-purpose algorithms for image and vision processing using a series of C/C++ functions. Since then, functions and algorithms have been optimized for different machine vision applications on top of the constantly updated OpenCV, making it even easier to port them to embedded processors and run them there. This has led to the gradual formation of a complete machine vision software ecosystem. At the same time, many commercial software development tools also began offering image processing functions, making the development of machine vision applications even more accessible.
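To make "general-purpose algorithm" concrete, here is a minimal pure-Python sketch of one classic example, Sobel edge detection. It is only an illustration written for this article: the function names and the tiny test image are hypothetical, and a library such as OpenCV provides an optimized equivalent (for instance `cv::Sobel`), so developers rarely write this loop by hand.

```python
# Pure-Python sketch of Sobel edge detection on a tiny grayscale image.
# Illustration only: helper names and the test image are hypothetical.

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_kernel(image, kernel, row, col):
    """Correlate a 3x3 kernel with the neighbourhood centred on (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += kernel[dr + 1][dc + 1] * image[row + dr][col + dc]
    return total


def sobel_magnitude(image):
    """Return |Gx| + |Gy| for every interior pixel; border pixels stay 0."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = apply_kernel(image, SOBEL_X, r, c)
            gy = apply_kernel(image, SOBEL_Y, r, c)
            edges[r][c] = abs(gx) + abs(gy)
    return edges


# Hypothetical 5x6 test image: dark left half, bright right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
for row in edges:
    print(row)
```

The output is non-zero only in the two columns that straddle the dark-to-bright boundary, which is exactly where a person would say the "edge" is. An optimized library routine does the same job far faster, which is what makes it practical on embedded processors.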
Figure 1: The Blackfin embedded vision starter kit provided by Avnet includes complete hardware and software resources to help machine vision developers get started quickly.
It can be said that the maturity of the hardware and software ecosystem is what enabled the rapid expansion of machine vision applications over the last decade. Next, making machine vision "smarter" will be the main demand for future development. The third factor, artificial intelligence, plays a crucial role in this process. Using core technologies of artificial intelligence, such as deep learning, machine vision will gain the ability to learn by itself and evolve, constantly enhancing its capabilities to make itself even "smarter."
A classic approach to combining artificial intelligence with machine vision is to upload collected data to the cloud and train a powerful "brain" there, one capable of evolving through data analysis and autonomous learning. Meanwhile, others are considering training directly on machine vision terminals, taking advantage of their increasingly powerful hardware. This allows deep learning algorithms to run directly in end products, giving those products real-time response with better accuracy and reliability. It also avoids the potential privacy and security issues of sending data to the cloud. Whichever approach prevails, its successful application will undoubtedly have a profound effect on the future of machine vision.
To sum up, in the world of machines, the work of the human eye, or at least part of it, has become a simple, repetitive, laborious task. With these factors working together, the process of machine vision replacing and even surpassing human vision can no longer be stopped.