There’s no question about it: Machine Learning (ML) is driving significant changes in how compute fabric and memory are used. Indeed, the days of the von Neumann architecture are behind us, particularly as the conventional semiconductor products designed to support it are no longer up to the task.
For lack of a better term, “traditional” ML applications such as object detection and image classification just scratch the surface of where ML can be applied. Today, Machine Learning is being applied in health, life science, and business intelligence applications. It is driving discoveries in areas such as cancer research, particle physics, and predictive behavioral analytics, to name a few. It is also driving the need for a different compute fabric, one tightly coupled with high-performance memory and a programming model that allows software developers to take advantage of this new architecture.
These three elements—efficient and flexible compute platforms, high-performance memory, and a sophisticated but easy-to-use ML stack—represent the key to achieving breakthrough results. All three elements must be present if the tremendous amount of data streamed via IoT devices is to become useful, whether processed locally, in real time, or in the cloud.
To all these ends, Micron teamed with FWDNXT to integrate its innovative Deep Learning architecture into Micron’s advanced acceleration solutions, embodied in the AC-511 module and the SB-852 board. These low-power, high-performance accelerators are enabled by advanced programmable logic (FPGAs) and Micron’s high-performance memory that, working together, deliver efficient, highly scalable solutions that can be deployed from the edge to the datacenter. Moreover, the trifecta of technologies (compute, memory & ML stack) enables machine learning systems to operate at near-peak hardware utilization, delivering compelling performance per watt.
This solution was designed to allow developers to go from any framework—e.g., TensorFlow, Torch, Caffe—directly to hardware, accelerating any neural network with ease. This flow eliminates any dependency on HDL coding: the FWDNXT ML compiler abstracts the hardware away completely, automatically instantiating trained networks on the hardware accelerators. As such, the performance and power advantages of FPGAs are now available with none of the programming grief, making it easy to deploy deep learning solutions.