Since Amazon launched its cloud computing arm in 2006 – widely known as AWS (Amazon Web Services) – the company has been on a mission not only to convert the world to its vision of how computing resources can be purchased and deployed, but also to make those resources as ubiquitous as possible. That strategy was on full display at this year’s re:Invent.
AWS launched several new computing options, some based on its own new custom silicon designs, along with an impressive array of tools and services for organizing, analyzing, and connecting data. The sheer number and complexity of the new features and services unveiled make it difficult to keep track of all the choices now available to customers. Rather than being the result of uncontrolled development, however, this abundance of options is by design.
New AWS CEO Adam Selipsky was keen to stress during his keynote and other appearances that the organization is “obsessed” with the customer, and that most of its product decisions and strategies are driven by customer demands. It turns out that when you have many different types of customers with different types of workloads and requirements, you end up with a complex array of choices.
Realistically, this kind of approach will hit a logical limit at some point, but in the meantime it means that the vast array of AWS products and services likely mirrors the range (and complexity) of the entire current business IT landscape. In fact, there is a wealth of insight into enterprise IT trends waiting to be gleaned from an analysis of which services get used, to what degree, and how that usage has changed over time, but that’s a topic for another time.
On the computing options front, the company acknowledged that it now offers more than 600 different EC2 (Elastic Compute Cloud) instance types, each consisting of a different combination of CPU and other acceleration silicon, memory, network connectivity, and more. While that’s a difficult number to fully appreciate, it once again indicates just how diverse today’s computing demands have become. From cloud-native, AI- or ML-based, containerized applications that require the latest AI accelerators or dedicated GPUs, to “lifted and shifted” legacy enterprise applications that run only on older x86 processors, cloud services like AWS now need to be able to handle all of the above.
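To get a sense of how customers navigate a catalog that large, here is a minimal sketch, assuming the boto3 Python SDK and configured AWS credentials, that filters the EC2 instance-type list down to just the Arm-based (Graviton) options in one region:

```python
# A minimal sketch, assuming boto3 is installed and AWS credentials
# are configured: list only arm64 (Graviton) EC2 instance types.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
paginator = ec2.get_paginator("describe_instance_types")

# Filter the catalog down to instance types whose processors
# support the arm64 architecture (i.e., Graviton-based types).
pages = paginator.paginate(
    Filters=[{"Name": "processor-info.supported-architecture",
              "Values": ["arm64"]}]
)

for page in pages:
    for itype in page["InstanceTypes"]:
        vcpus = itype["VCpuInfo"]["DefaultVCpus"]
        mem_gib = itype["MemoryInfo"]["SizeInMiB"] / 1024
        print(f'{itype["InstanceType"]}: {vcpus} vCPUs, {mem_gib:.0f} GiB')
```

Even this filtered slice of the catalog returns dozens of entries, which underscores the point: the variety exists because the workloads do.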
New entries announced this year include several based on Intel’s 3rd Gen Xeon Scalable processors. What received the most attention, however, were instances based on three of Amazon’s new silicon designs. The Hpc7g instance is based on an updated version of the Arm-based Graviton3 processor, called the Graviton3E, which the company claims offers 2x the floating-point performance of the previous Hpc6g instance and a 20% overall performance improvement over the current Hpc6a.
As with many of its instances, AWS is targeting Hpc7g at a specific set of workloads: high-performance computing (HPC) such as weather forecasting, genomics processing, fluid dynamics, and the like. Specifically, it’s designed for larger ML models that often end up running across thousands of cores. What’s interesting about this is that it shows both how far Arm-based processors have come in terms of the workloads they’re being used for and the degree of refinement that AWS brings to its various EC2 instances.
Also Read: Why Does Amazon Build CPUs?
Separately, in several other sessions, AWS highlighted the momentum toward using Graviton for many other types of workloads as well, especially cloud-native containerized applications from AWS customers such as DirecTV and Stripe.
An intriguing insight that emerged from these sessions is that, thanks to the nature of the tools used to develop these types of applications, the challenges of porting code from x86 to native Arm instructions (once considered a huge impediment to the adoption of Arm-based servers) have largely disappeared.
Instead, all that’s required is a simple change of a few options before the code is compiled and deployed to the instance. That makes the future growth of Arm-based cloud computing much more likely, especially for new applications.
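As a trivial illustration (not tied to any specific customer’s toolchain): interpreted, containerized code is largely architecture-neutral, so the “few options” in question usually amount to a build target rather than a code change.

```python
# Pure-Python code is architecture-neutral: this script runs
# unmodified on an x86 instance and on an Arm-based Graviton one.
import platform

arch = platform.machine()  # 'x86_64' on Intel/AMD, 'aarch64' on Graviton
print(f"Running on {arch}")

# The main per-architecture concern is native dependencies, and most
# popular packages now ship prebuilt aarch64 binaries, so a plain
# `pip install numpy` resolves correctly on either architecture.
```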
Of course, some of these organizations are striving to create completely instruction-set-independent applications in the future, which would ostensibly make the choice of instruction set irrelevant. Even in that scenario, however, compute instances that offer better price/performance or performance-per-watt ratios, as Arm-based processors often do, become a more attractive option.
For ML workloads, Amazon unveiled its second-generation Inferentia processor as part of the new Inf2 instance. Inferentia2 is designed to support ML inference on models with billions of parameters, such as the many new large language models currently in development for applications like real-time speech recognition.
The new architecture is designed to scale across thousands of cores, which these huge new models, such as GPT-3, require. Additionally, Inferentia2 includes support for a mathematical technique known as stochastic rounding, which AWS describes as “a way of rounding probabilistically that enables high performance and greater accuracy over legacy rounding modes” (a brief sketch of the idea appears below). To get the most out of distributed computing, the Inf2 instance also supports a next-generation version of the company’s NeuronLink ring network architecture, which is claimed to deliver 4x the performance and 1/10 the latency of existing Inf1 instances. The bottom-line translation is that it can deliver 45% higher performance per watt for inference than any other option, including GPU-powered ones. Given that, according to AWS, inference power requirements are often 9 times higher than what’s needed for model training, that’s a big deal.
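AWS hasn’t detailed Inferentia2’s implementation, but the concept behind stochastic rounding is easy to demonstrate. Below is a minimal Python sketch, simulating low precision with a coarse rounding grid rather than using any actual Inferentia API, that shows why unbiased, probabilistic rounding preserves accuracy when accumulating many small values:

```python
# A minimal sketch (not AWS's implementation) of stochastic rounding,
# simulated by quantizing values to a fixed step size.
import random

def round_nearest(x, step):
    """Conventional round-to-nearest on a grid of the given step."""
    return round(x / step) * step

def round_stochastic(x, step):
    """Round down or up probabilistically: the closer x is to a grid
    point, the likelier it rounds to that point, so the rounding
    error is zero on average (unbiased)."""
    lower = (x // step) * step
    frac = (x - lower) / step          # fractional position in [0, 1)
    return lower + step if random.random() < frac else lower

# Why this matters for ML: accumulating many small updates in low
# precision. Round-to-nearest silently discards updates smaller than
# half a step; stochastic rounding preserves them on average.
random.seed(0)
step, update, n = 1.0, 0.1, 1000       # each update is < step/2
acc_nearest = acc_stochastic = 0.0
for _ in range(n):
    acc_nearest = round_nearest(acc_nearest + update, step)
    acc_stochastic = round_stochastic(acc_stochastic + update, step)

print(acc_nearest)     # stays at 0.0: every update is rounded away
print(acc_stochastic)  # lands close to the true sum of 100.0
```

Because the rounding error is zero on average, errors tend to cancel across millions of operations rather than compounding in one direction, which is where the accuracy gains on large models come from.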
The third new custom silicon-driven instance is called C7gn, and it features new AWS Nitro network cards equipped with fifth-generation Nitro chips. Designed specifically for workloads that demand extremely high throughput, such as firewalls, virtual networking, and real-time data encryption/decryption, C7gn is claimed to offer 2x the network bandwidth and 50% higher packet processing per second than previous instances. Importantly, the new Nitro cards achieve those levels with a 40% improvement in performance per watt over their predecessors.
Altogether, Amazon’s focus on custom silicon and an increasingly diverse range of computing options represents a comprehensive set of tools for companies looking to move more of their workloads to the cloud. As with many other aspects of its AWS offerings, the company continues to refine and improve what has clearly become a very sophisticated and mature toolset. Collectively, these offerings present a remarkable and promising vision of the future of computing and the new types of applications it can enable.
Bob O’Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and the professional financial community. You can follow him on Twitter @bobodtech.