Rising demand for AI, especially generative AI (i.e. AI that generates images, text and so on), is driving the AI inference chip market. Inference chips speed up AI inference, the stage at which AI systems generate outputs (e.g. text, images, audio) based on what they learned during training on a specific set of data. AI inference chips can be – and have been – used to yield faster generations from systems like Stable Diffusion, which translates text prompts into artwork, and OpenAI’s GPT-3, which extends a few lines of prose into full-length poems, essays and more.
A number of vendors – both startups and established players – are actively developing and selling access to AI inference chips. There’s Hailo, Mythic and Flex Logix, to name a few. And on the incumbent side, Google is competing for dominance with its tensor processing units (TPUs) while Amazon is betting on Inferentia. But the competition, while fierce, hasn’t scared off companies like NeuReality, which operate in the AI inference chip market but aim to differentiate themselves by offering a suite of software and services to support their hardware.
To that end, NeuReality announced today that it has raised $35 million in a Series A funding round led by Samsung Ventures, Cardumen Capital, Varana Capital, OurCrowd and XT Hi-Tech with participation from SK Hynix, Cleveland Avenue, Korean Investment Partners, StoneBridge, and Glory Ventures. Co-founder and CEO Moshe Tanach told TechCrunch that the tranche will be put toward finalizing the design of NeuReality’s flagship AI inference chip in early 2023 and shipping it to customers.
“NeuReality was founded with the vision of creating a new generation of AI inference solutions that are unleashed from traditional CPU-centric architectures and deliver high performance and low latency, with the best possible efficiency in terms of cost and power consumption,” Tanach told TechCrunch via email. “Most of the companies that can take advantage of AI don’t have the funds or the huge R&D that Amazon, Meta and other big companies that invest in AI have. NeuReality will bring the AI technology to all those who wish to deploy easily and at a lower cost.”
NeuReality was co-founded in 2019 by Tzvika Shmueli, Yossi Kasus and Tanach, who previously served as chief engineering officers at Marvell and Intel. Shmueli was previously vice president of back-end infrastructure at Mellanox Technologies and vice president of engineering at Habana Labs. As for Kasus, he held a position as senior director of engineering at Mellanox and was responsible for integrations at semiconductor company EZchip.
From the start, NeuReality focused on bringing AI hardware to market for cloud data centers and “edge” computers, or machines that run on-premises and do most of their data processing offline. Tanach says the startup’s current-generation product line, the Network Addressable Processing Unit (NAPU), is optimized for AI inference applications, including computer vision (think algorithms that recognize objects in photos), natural language processing (text generation and classification systems) and recommendation engines (such as those that suggest products on e-commerce sites).
NeuReality’s NAPU is essentially a hybrid of multiple processor types. It can perform functions like AI inference load balancing, task scheduling, and queue management, which have traditionally been done in software but not necessarily very efficiently.
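For readers unfamiliar with what handling these functions "in software" looks like, here is a minimal Python sketch of a host-side dispatcher that queues inference requests and assigns each one to the least-loaded accelerator. It is purely illustrative and is not NeuReality's design or API: the `SoftwareDispatcher` class, the `Worker` type, the worker names and the `cost` parameter are all hypothetical.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass(order=True)
class Worker:
    # Hypothetical accelerator; compared by pending work, then by name.
    pending: int
    name: str


class SoftwareDispatcher:
    """Toy CPU-side queue manager and load balancer (illustration only)."""

    def __init__(self, worker_names):
        self.queue = deque()  # FIFO queue of waiting requests
        self.workers = [Worker(0, n) for n in worker_names]
        heapq.heapify(self.workers)  # min-heap keyed on pending work

    def submit(self, request_id, cost=1):
        # Queue management: requests wait here until dispatched.
        self.queue.append((request_id, cost))

    def dispatch_all(self):
        # Scheduling and load balancing: drain the queue in arrival order,
        # always handing the next request to the least-loaded worker.
        assignments = []
        while self.queue:
            request_id, cost = self.queue.popleft()
            worker = heapq.heappop(self.workers)
            worker.pending += cost
            assignments.append((request_id, worker.name))
            heapq.heappush(self.workers, worker)
        return assignments


dispatcher = SoftwareDispatcher(["acc0", "acc1"])
dispatcher.submit("r1", cost=3)
dispatcher.submit("r2", cost=1)
dispatcher.submit("r3", cost=1)
print(dispatcher.dispatch_all())
```

Every request in this sketch passes through a host CPU loop before it reaches an accelerator, which is the kind of overhead the NAPU approach is described as moving into hardware.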
NeuReality’s NR1, an FPGA-based SKU within the NAPU family, is a network-attached “server-on-a-chip” with an integrated AI inference accelerator as well as networking and virtualization capabilities. NeuReality also offers the NR1-M module, a PCIe card containing an NR1 and a network-attached inference server, and a separate module – the NR1-S – that pairs several NR1-Ms with the NR1.
On the software side, NeuReality provides a set of tools, including a software development kit for cloud and on-premises workloads, a deployment manager to troubleshoot runtime issues, and a monitoring dashboard.
“The AI inference software [and] heterogeneous compute and automated build and deploy flow tools … is the magic that underpins our innovative hardware approach,” Tanach said. “The primary beneficiaries of NAPU technology are enterprises and cloud solution providers who need an infrastructure to support their chatbots, voice bots, automatic transcriptions and sentiment analysis as well as computer vision use cases for document scanning, defect detection, etc. … While the world focused on deep learning processor improvements, NeuReality focused on optimizing the system around it and the software layers above it to deliver greater efficiency and a much easier flow for deploying inference.”
NeuReality, it should be noted, has yet to back up some of its performance claims with empirical evidence. It told ZDNet in a recent article that it estimates its hardware will offer a 15x performance-per-dollar improvement over available GPUs and ASICs offered by deep learning accelerator vendors, but NeuReality has not published any benchmark data validating that figure. The startup also hasn’t detailed its proprietary network protocol, which it says performs better than existing solutions.
Those things aside, delivering hardware at scale isn’t easy, especially when it comes to custom AI inference chips. But Tanach says NeuReality has laid the necessary groundwork, partnering with AMD-owned semiconductor maker Xilinx for production and teaming up with IBM to work on the hardware requirements of the NR1. (IBM, which is also a design partner of NeuReality, previously said it was “evaluating” the startup’s products for use in the IBM Cloud.) NeuReality has been shipping prototypes to partners since May 2021, Tanach says.
According to Tanach, beyond IBM, NeuReality is working with Lenovo, AMD, and unnamed cloud solution providers, systems integrators, deep learning accelerator vendors, and “inference-consuming” companies on deployments. Tanach, however, declined to reveal how many customers the startup currently has or what it projects in terms of revenue.
“We are seeing the pandemic slowing down businesses and pushing consolidation across the many deep learning vendors. Deployment of inference is set to explode – and our technology is exactly the enabler and driver of that growth,” Tanach said. “NAPU will bring AI to a broader set of less technical businesses. It’s also designed to enable large-scale users such as hyperscalers and next-generation data center customers to support their growing use of AI.”
Ori Kirshner, head of Samsung Ventures in Israel, added in an emailed statement: “We see a substantial and immediate need for more efficient and easy-to-deploy inference solutions for data centers and on-premises use cases, and that’s why we’re investing in NeuReality. The company’s innovative data disaggregation, movement, and processing technologies improve compute flows, compute-storage flows, and compute-in-storage, which are all critical to the ability to adopt and scale AI solutions.”
NeuReality, which currently has 40 employees, plans to hire 20 more over the next two fiscal quarters. To date, it has raised $38 million in venture capital.