Computing has seesawed between centralized and distributed models, from mainframes to enterprise servers and then to cloud computing. Each cycle has not completely replaced the previous architecture; rather, once the hype fades, it adds to it to serve differing needs. With edge sensors and AI inferencing, we are now moving past the hype cycle for data centers and toward putting more intelligence at the edge, processing the resulting data right where it is generated. With billions of edge devices, the demand for low latency has grown. It is incredibly inefficient to funnel data from billions of edge devices into a central data center for computation when there is no inter-relationship among the data, and doing so also requires large, inefficient data pipes and adds latency. This has opened the door for low-power, low-latency, distributed edge-computing devices.
Latest Processor Trends
Enterprise servers drove remarkable innovation and growth for Intel and AMD processors, riding process-node and CPU architecture enhancements. Just as IBM/Unisys mainframes and Cray/HP supercomputers pair CPUs with accelerators, data centers and cloud services now deploy high-performance processors alongside hardware accelerators, which has led to dramatic growth for GPUs (Nvidia) and, more recently, FPGAs (Xilinx and Intel). Cloud service providers such as AWS, Google, Microsoft, Adobe, VMware, and IBM, and data center providers like Equinix, Digital Realty Trust, CenturyLink, Verizon, Level 3 Communications, and others use both mainstream processors and a variety of hardware accelerators. The biggest benefit is the ability to correlate different types of related data and analytics, along with efficient data storage. Check out the latest accelerators and performance benchmarks at the Open Compute Project. When I worked at Xilinx 10 years ago, FPGAs never found a place in servers even though they were used in supercomputers for high-performance computing. The shift from paying for hardware to paying for services, however, has changed the business model drastically, hiding the cost of the FPGAs from the end customer. Data center and cloud service providers are now even getting attention for the acceleration benefits they bring to cloud computing.
New Devices for Artificial Intelligence and Machine Learning
Enter artificial intelligence (AI) learning and machine learning, workloads for which GPU/FPGA hardware acceleration is inefficient. Running them on traditional cloud computing makes poor use of the processors and hardware accelerators, which has spawned new AI-specific processors and ASSPs from both startups and established players trying to solve the challenge. Examples include Wave Computing, Cerebras, Achronix, Intel Nervana, and Xilinx, among others. To contain die size, 3D IC and 3D SiP package architectures are being explored. The problem remains that each device can draw hundreds of amperes of current; put several together and you are looking at kilowatts of power and high dissipation. Clearly these new devices are challenging the current partitioning between traditional Intel/AMD CPUs and GPUs/FPGAs. That is a good thing, but AI inferencing on data from edge sensors is not the same as AI learning (which is what most people consider AI processing).
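To put those current figures in perspective, here is a rough back-of-the-envelope sketch. The core voltage, current draw, and device count below are illustrative assumptions, not any vendor's specifications:

```python
# Rough back-of-the-envelope power estimate for large AI learning accelerators.
# All figures are assumed, illustrative values, not vendor specs.

CORE_VOLTAGE_V = 0.8          # a sub-1 V core rail, typical assumption for a large AI ASIC
CURRENT_PER_DEVICE_A = 400    # "hundreds of amperes" per device, assumed value
NUM_DEVICES = 8               # a modest multi-accelerator training node

power_per_device_w = CORE_VOLTAGE_V * CURRENT_PER_DEVICE_A   # P = V * I
total_power_kw = power_per_device_w * NUM_DEVICES / 1000.0

print(f"~{power_per_device_w:.0f} W per device, ~{total_power_kw:.1f} kW per node")
# -> ~320 W per device, ~2.6 kW per node, before conversion losses and cooling
```

Even with these conservative assumptions, a handful of such devices lands in the kilowatt range, which is exactly the dissipation problem described above.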
Machine learning applications such as object identification benefit from edge processors, which, in turn, need companion signal-chain and power management ICs.
Why Inferencing is Not Learning
Initially the idea was to collect data from IoT sensors and push it through high-bandwidth optical data pipes to a data center for deep learning and inferencing. That approach is not working. It is important to distinguish between learning and inferencing: learning in a centralized data center is useful, but inferencing there is not. Think about it: it is rather old school to collect all the data from multiple locations, build very large pipes to send the disparate, unrelated traffic to a central location for processing, and then send the results back to those same locations. While correlating multiple data streams centrally does have benefits in some niche applications, using the same system architecture for every need is inefficient. This is where low-power edge-computing processors come in: they collect localized data and run inference on it at the same location. Edge computing is also being catalyzed by IoT standards and 5G.
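As a minimal sketch of what local inferencing looks like in practice, the snippet below loads a model that was trained elsewhere (in the cloud or a data center) and runs it directly on the edge device using TensorFlow Lite. The model file name and input handling are placeholders, not a reference to any specific product:

```python
# Minimal sketch of edge inferencing: a pre-trained model stored on the device
# classifies local sensor data with no round trip to a data center.
# "model.tflite" and the float32 input assumption are placeholders.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="model.tflite")  # model trained offline, deployed to the device
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Stand-in for a frame captured by a local camera or sensor on the edge device.
frame = np.random.rand(*input_details[0]["shape"]).astype(np.float32)

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()                                    # inference happens locally
scores = interpreter.get_tensor(output_details[0]["index"])
print("Top class:", int(np.argmax(scores)))             # only this small result needs to leave the device
```

The point is that only the compact inference result, not the raw sensor stream, ever has to travel upstream.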
What Kinds of Devices Are Needed for Edge Computing?
New devices running at 3 W to 10 W are being developed to deliver highly efficient localized computing with low latency. Examples include Google Tensor, NVIDIA Jetson, Qualcomm, and Lattice FPGAs, among others. Maxim is also addressing this space with novel edge microcontrollers. These processors are used in always-on edge devices and need to stay in a low-power state most of the time, waking only to run AI inference or to transmit. Of course, this also means companion signal-chain and power-management devices are essential. Such devices are being developed to support the processor so that the entire system can perform edge computing quickly, with low latency and the highest energy efficiency possible. This is a happening market with more action to come. Stay tuned.
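In the meantime, here is a minimal sketch of the always-on duty cycle described above: sleep most of the time, wake for a short burst of local inference, transmit only the result, and go back to sleep. The helper functions are hypothetical placeholders, not any vendor's API:

```python
# Sketch of an always-on edge processor's duty cycle.
# The helpers below are hypothetical placeholders, not a specific device API.
import time

def wake_on_sensor_event():
    """Remain in a low-power sleep state until a sensor interrupt arrives."""
    time.sleep(1.0)                  # stand-in for a hardware wake interrupt
    return {"frame": b"..."}         # stand-in for captured sensor data

def run_inference(sample):
    """Run the AI model locally; this is the brief high-power burst."""
    return {"label": "person", "confidence": 0.93}   # illustrative result

def transmit_result(result):
    """Send only the compact inference result upstream, not the raw data."""
    print("TX:", result)

while True:
    sample = wake_on_sensor_event()  # the device spends most of its time here, in low power
    result = run_inference(sample)   # short burst of compute
    transmit_result(result)          # radio on briefly, then back to sleep
```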