The electronics industry over the past several years has made tremendous strides in creating artificial intelligence in a manner imagined by Alan Turing in the 1940s. The convergence of algorithmic advances in multilayer neural networks, the evolution of PC graphics processing units as massively parallel processing accelerators, and the availability of massive data sets fueled by the Internet and widely deployed sensors — big data — has enabled a renaissance in software neural network modeling techniques commonly referred to as “deep learning,” or “DL.”
In addition, the evolution of 3D graphics shader pipelines into general-purpose compute accelerators drastically reduced the time required to train DL models. Training time for applications as diverse as image recognition and natural language processing has been reduced from months to days — and in some cases, hours or even minutes.
These solutions have enabled new AI applications ranging from computational science to voice-based digital assistants like Alexa and Siri. However, as far as we have come in such a short period of time, we still have much further to go to realize the true benefits of AI.
The Eyes Have It
AI often is compared to the human brain, because our brain is one of the most complex neural networks on our planet. However, we don’t completely understand how the human brain functions, and medical researchers are still studying what many of the major structures in our brains actually do and how they do it.
AI researchers started out by modeling the neural networks in human eyes. They were early adopters of GPUs to accelerate DL, so it is no surprise that many of the early applications of DL are in vision systems.
Even with this knowledge, our industry is still in the early phases of the DL revolution. Every research iteration yields more sophisticated functions per neuron, more neurons per network layer, deeper layers of networks per model, and different model choices for different portions of learning tasks.
As we learn more about how our brains work, that new knowledge will drive even more DL model complexity. For example, DL researchers are still exploring the impact of numerical precision on training and inference tasks and have arrived at widely divergent views, ranging from 64- to 128-bit training precision at the high end to 8-, 4-, 2- and even 1-bit precision in some low-end inference cases.
“Good enough” precision turns out to be context-driven and is therefore highly application dependent. This rapid advancement in knowledge and technology has no end in sight.
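The low-precision end of that spectrum can be illustrated with a simple quantization sketch. This is a hypothetical example, not any vendor's scheme: it maps 32-bit floating-point weights onto 8-bit integers with a single linear scale factor and measures the error that the reduced precision introduces.

```python
import numpy as np

# Hypothetical illustration: quantize float32 weights to 8-bit integers,
# the kind of reduced precision used in some low-end inference cases.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=1000).astype(np.float32)

# Symmetric linear quantization: map [-max, +max] onto int8 [-127, 127].
scale = np.abs(weights).max() / 127.0
q = np.round(weights / scale).astype(np.int8)

# Dequantize and measure the worst-case error introduced by rounding.
dequant = q.astype(np.float32) * scale
err = np.abs(weights - dequant).max()
print(f"max quantization error: {err:.6f}")
```

Whether that error is “good enough” depends entirely on the model and the task, which is exactly the context dependence described above.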
Within 20 years, most digital systems will use some form of AI for specific tasks or applications, Tirias Research has predicted. This may be as simple as using AI for the user interface, similar to Alexa, or to determine the most efficient and safest method of performing a specific task. As a result, AI will be disruptive to hardware, software, and service delivery chains.
There are many wild forecasts of the value of AI to various segments of the market. While there will be changes in certain business models around it, AI is really an underlying technology differentiator for a very wide range of products and services, rather than a new source of revenue.
It is very much like adding a GPU to a PC or smartphone: If you don’t have it, you won’t be competitive. Those that are not part of the AI revolution will be left by the wayside to wither and die, while those that are part of the revolution will grow and thrive. Hence, the entire high-tech ecosystem’s rush to embrace and enable it.
AI is a powerful new tool. Perhaps it is in the same category with society-changing inventions such as agriculture, the printing press, the steam engine, the internal combustion engine, and the computer itself. In the end, advancements in AI technology are going to restructure much of human work, leading to new advancements in areas ranging from farming to medical research. Just don’t expect it to happen overnight.
What Makes Sense
DL network training typically is done in data center environments, using high-performance computing style clusters outfitted with compute offload accelerators, such as GPUs, digital signal processors, field-programmable gate arrays (FPGAs), or more specialized custom logic. The choice of accelerator is based on the same rubric that the industry has been using for decades:
- If algorithms, model complexity, etc., are still undergoing rapid evolution, then using high-volume, general-purpose products such as GPUs makes sense.
- If a device or service has a cost structure and quality-of-service requirements that support higher component costs and software development investment, then FPGAs may make sense.
- If algorithms become stable or standardized, or if stable processing “inner loops” can be isolated for acceleration, then custom logic, such as Google’s Tensor Processing Unit (TPU) coprocessor, may make sense. Google’s TPU is essentially a bare-naked matrix-multiply coprocessor.
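The stable “inner loop” that such custom logic targets is matrix multiplication: the forward pass of a fully connected DL layer, for instance, reduces to a matrix multiply plus a bias and a nonlinearity. A minimal sketch, with purely illustrative shapes and values:

```python
import numpy as np

# A dense (fully connected) layer's forward pass is a matrix multiply
# plus bias and activation -- the kind of stable inner loop that
# matrix-multiply hardware accelerates. All values here are illustrative.
def dense_forward(x, w, b):
    """x: (batch, in), w: (in, out), b: (out,) -> (batch, out)."""
    return np.maximum(x @ w + b, 0.0)  # ReLU activation

x = np.ones((4, 8))          # a batch of 4 inputs, 8 features each
w = np.full((8, 16), 0.1)    # weight matrix
b = np.zeros(16)             # bias vector
y = dense_forward(x, w, b)
print(y.shape)  # (4, 16)
```

Because nearly all of the arithmetic lives in the `x @ w` step, hardware that does nothing but multiply matrices quickly can accelerate a wide range of otherwise different models.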
In addition to the compute resources, training demands large data sets and high system throughput.
After a DL network is trained, the resulting network model typically is transferred to simpler systems and appliances and run as an inference engine. An inference engine processes individual artifacts one at a time, such as a single photograph or sentence, as opposed to training on millions of photos or sentences.
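The one-artifact-at-a-time nature of inference can be sketched with a toy stand-in for a trained model; the weights and feature values below are invented for illustration and do not represent any production system.

```python
import numpy as np

# Toy stand-in for a trained model: fixed weights for a one-layer
# classifier mapping 3 input features to 2 classes. Training would
# process millions of examples; inference handles one at a time.
W = np.array([[ 0.5, -0.2],
              [-0.3,  0.8],
              [ 0.1,  0.4]])

def infer(sample):
    """Run one artifact (a single feature vector) through the model."""
    logits = sample @ W
    return int(np.argmax(logits))  # predicted class index

photo_features = np.array([0.9, 0.1, 0.5])  # one input, e.g. one photo
print(infer(photo_features))
```

The inference step needs no access to the training data, which is why the resulting model can be shipped to much simpler systems and appliances.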
However, inference services may be latency sensitive — people can be impatient. As a result, some inference engines run in the cloud as a service, such as Apple’s Siri; as a mix of dedicated appliances and cloud services, such as Amazon’s Alexa; or completely locally, such as Facebook’s collaboration with Qualcomm to enable certain AI functions on a smartphone.