Omnitek DPU (Deep Learning Processing Unit)
Omnitek’s Deep Learning Processing Unit (DPU) is a highly optimised, configurable deep neural network (DNN) accelerator for machine learning applications, processing over 5,300 ips for GoogLeNet inference. Delivered with a TensorFlow software programming interface and configurable as a CNN, RNN or MLP inference engine on an FPGA or MPSoC, the Omnitek DPU achieves the highest performance of any DNN accelerator on an FPGA. For applications that do not require the ultimate in performance, this performance advantage can be traded for reductions in system cost and power. Implementation on an FPGA ensures the DPU can be optimised for the application and updated in the field at any time as new research yields improvements in DNN architectures or optimisation techniques.
Omnitek has established itself as a world-leading provider of embedded vision and video FPGA IP and Design Services. It has assisted many of Xilinx’s leading clients by providing optimised IP and algorithm designs. Omnitek has an excellent understanding of Xilinx FPGA and MPSoC devices, and is able to provide highly efficient bespoke turnkey solutions.
Delivered as an IP Core with a software framework for programming it, the Omnitek DPU addresses the following needs:
- Fully software programmable via TensorFlow in C/C++ or Python
- Highly efficient use of FPGA resources for optimum performance, cost and power
- Highly flexible to support:
  - Architecture optimisation for the application workload
  - Adoption of novel topologies and optimisation techniques as they emerge from the increasingly productive research in industry and academia
- Suitable for either Data Centre (FPGA) or Embedded (FPGA SoC) applications
AI on FPGAs
The highly flexible nature of the Omnitek DPU results from the choice of FPGAs as the delivery platform. FPGAs have many benefits over GPUs, ASICs or ASSPs for machine learning applications, including:
- High performance per watt and low latency
- Optimisation of the network for the workload
- Time to market
- Integration with video/vision functions to create a complete system on a chip
Working with Omnitek for AI
The highly efficient use of FPGA resources reflects Omnitek’s knowledge and experience in Machine Learning and FPGA architectures. This is enhanced by the research Omnitek is doing with the University of Oxford and by its ecosystem of expert partners.
In addition to machine learning, Omnitek has a wide portfolio of vision and video IP and can assist with complete AI system design for edge or cloud.
The Omnitek Oxford University Research Scholarship
AI algorithms and the hardware they run on are the subject of significant collaborative research between academia and industry.
Omnitek has chosen to enhance its world-renowned research and development team by working with Oxford University.
The Omnitek Oxford University Research Scholarship funds DPhil (doctoral) students conducting research into optimum FPGA acceleration of AI algorithms.
Machine Learning for Data Centre / Cloud Applications
Data Centres in the Cloud require maximum throughput per watt from high-performance PCIe acceleration cards. They also benefit from being reprogrammable to handle different workloads. The Omnitek DPU was designed to address all these needs and can be programmed immediately onto FPGAs that already exist in the cloud, supporting accelerated “software as a service” applications.
Machine Learning for Embedded / Edge Applications
Embedded technology applications at the Edge typically require smaller, lower cost, lower power devices with additional IP for connectivity and video / vision processing functions. The Omnitek DPU is extremely efficient and highly configurable, enabling the optimum balance of performance/cost/power to be achieved for each specific application. Omnitek’s extensive library of optimised IP for video and vision applications facilitates the design of complete intelligent video/vision systems on a single chip.