Artificial intelligence computer maker Cerebras Systems, which builds chips, computers, and now supercomputers dedicated to accelerating deep learning, announced a service on Tuesday to speed up the use of the very large language models that are becoming increasingly popular not only in research but in commercial use as well.
“We believe that large language models are underhyped, not overhyped,” Cerebras co-founder and CEO Andrew Feldman said in a press briefing. “We are only just beginning to see the impact of this; there will be winners and new entrants in each of the three layers of the ecosystem: the hardware layer, the infrastructure layer and the application layer.”
Feldman predicted, “Next year you will see a huge increase in the impact of large language models in various sectors of the economy.”
In partnership with cloud service provider Cirrascale, Cerebras is offering what it calls “pay-per-model” compute time: a flat rate to train a large language model such as OpenAI’s GPT-3 to convergence on clusters of its CS2 computers, which are purpose-built for deep learning.
The service is branded Cerebras AI Model Studio.
Prices, ranging from $2,500 to train a 1.3 billion-parameter version of GPT-3 in 10 hours to $2.5 million to train the 70 billion-parameter version in 85 days, average half of what users would pay to rent cloud capacity or lease machines for years to do the equivalent work. And CS2 clusters can be eight times faster to train on than clusters of Nvidia A100 machines in the cloud.
Cirrascale uses a mix of its own and Cerebras-owned CS2 clusters, including the Andromeda supercomputer, which is located at Colovore’s colocation facility in Santa Clara, Calif., where Cirrascale has also installed equipment.
The Studio offering follows a partnership between Cerebras and Cirrascale announced a year ago to rent CS2 machines in the cloud on a weekly basis.
The service will automatically scale cluster size to match the size of the language model, Feldman said. The company notes that training performance improves linearly as machines are added.
Scaling to larger clusters carries a price premium, Feldman said. For example, Andromeda’s 16-machine cluster is four times larger than a four-way CS2 cluster, but would likely cost a customer five times as much to operate because it achieves a higher level of performance.
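The scaling arithmetic described above can be sketched in a few lines. This is only an illustration of the ratios Feldman cited (4x size, 5x cost at 16 machines); the baseline cost is an arbitrary placeholder, not a real Cerebras price.

```python
# Hypothetical sketch of the cluster-pricing arithmetic described by Feldman:
# a 16-machine Andromeda cluster is 4x the size of a four-way CS2 cluster,
# but would cost roughly 5x as much because it achieves higher performance.
# BASELINE_4WAY_COST is a made-up unit, not an actual price.

BASELINE_4WAY_COST = 1.0  # arbitrary unit cost for a four-machine CS2 cluster

def cluster_cost(machines: int) -> float:
    """Estimate cost relative to the four-way baseline.

    Below or at four machines, cost scales with size alone; above it,
    a 25% premium per unit of size is applied, so 4x the size costs 5x
    as much, matching the quoted example.
    """
    size_ratio = machines / 4
    if machines <= 4:
        return BASELINE_4WAY_COST * size_ratio
    premium = 5 / 4  # 5x cost for 4x size, per Feldman's example
    return BASELINE_4WAY_COST * size_ratio * premium

print(cluster_cost(4))   # 1.0  (baseline four-way cluster)
print(cluster_cost(16))  # 5.0  (4x larger, 5x the cost)
```

The premium factor here is reverse-engineered from the single 16-machine data point, so it should be read as a rough model of the pricing pattern, not a published rate card.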
The most important immediate benefit of reducing the cost of training large models could be opening up large-model development to parties that previously could not afford the huge rental costs typically required, Feldman said.
“We have seen time and time again that knowing prices in advance and how long it will take are real issues for a whole class of customers, and we hope to overcome these issues,” he said.
The alternative, Feldman said, is for companies to spend big on leasing equipment for years at a time.
“If you think about how the biggest models are trained today, they’re all on dedicated clusters that are on multi-year leases,” Feldman said. “There are companies right now that have raised huge funds and have huge valuations that never, in their wildest dreams, owned any hardware.”
Also on Tuesday, Cerebras announced that Andromeda, the 16-machine CS2 cluster it unveiled earlier this month, will be used by Jasper, a venture-backed startup that runs large language models as a service for commercial applications such as generating press releases and blog posts.
Jasper, which has nearly a hundred thousand paying customers for its generative-text offering, serves businesses that need large language models trained on a customer’s own data, such as a particular knowledge base, product catalog, and corporate “voice.”
“They want custom models, and they really do,” Jasper CEO Dave Rogenmoser said at the same press briefing. The idea, he said, is to have the marketing department “all speak with the same voice” and to have new hires “get up to speed and all speak with the same voice” as the rest of the company. That includes things like a model that generates Facebook ads in the client’s usual language.
The ability to reduce the cost of training and dramatically speed up the time to train large language models “is a huge draw for us” to work with Cerebras, Rogenmoser said.
Jasper recently closed a Series A round valuing the company at $1.5 billion, Rogenmoser said.
Using dedicated clusters can be not only faster and cheaper, but also more nuanced, Cerebras product manager Andy Hock said at the same press briefing.
“One of the things we’re seeing more broadly in the marketplace is that many companies would love to be able to quickly research and develop these models at scale, but the infrastructure that exists in the traditional cloud just doesn’t allow for that kind of large-scale research and development easily,” Hock said.
“Being able to ask questions like, should I train [a large language model] from scratch, or should I fine-tune an open-source public checkpoint; which gives the best result; what is the most efficient use of compute to reduce the cost of goods and provide the best service to my customers? Asking these questions is costly and impractical in many cases on traditional infrastructure.”
Cerebras clusters allow Jasper and others to ask these questions, he said.
Both announcements were made at the 36th annual Conference on Neural Information Processing Systems, or NeurIPS, the premier conference in the field of AI, taking place this week in New Orleans.