
Why it matters: Watching the evolution of the computing industry over the past few years has been a fascinating exercise. After decades of focusing almost exclusively on one type of chip, the CPU, and measuring improvements through refinements to its internal architecture, there has been a dramatic shift to multiple chip types, notably GPUs (Graphics Processing Units), with performance improvements being enabled by high-speed connectivity between components.
Never has this been made clearer than at Nvidia's latest GPU Technology Conference (GTC). During the event's keynote, company CEO Jensen Huang unveiled a number of new developments, including the latest GPU architecture (named Hopper after computing pioneer Grace Hopper) and numerous forms of high-speed chip-to-chip and device-to-device connectivity options.
Collectively, the company used these key technology advances to introduce everything from the massive Eos supercomputer down to the H100 CNX Converged Accelerator, a PCIe card designed for existing servers, with a number of other options in between.
Nvidia's focus is being driven by the industry's relentless pursuit of advancements in AI and machine learning. In fact, most of the company's many chip, hardware, and software announcements from the show tie into these critical developments, whether it be supercomputing applications, autonomous driving systems, or embedded robotics applications.
Nvidia also strongly reinforced that it is more than a chip company, offering software updates for its existing tools and platforms, notably the Omniverse 3D collaboration and simulation suite. To encourage wider use of the tool, Nvidia announced Omniverse Cloud, which lets anyone try Omniverse with nothing more than a browser.
For hyperscalers and large enterprises looking to deploy advanced AI applications, the company also debuted new or updated versions of several cloud-native application services, including Merlin 1.0 for recommender systems, version 2.0 of both its Riva speech recognition and text-to-speech service (Riva, sound familiar?), as well as AI Enterprise, for a variety of data science and analytics applications.
New to AI Enterprise 2.0 is support for virtualization and the ability to use containers across several platforms, including VMware and Red Hat. Taken as a whole, these offerings reflect the company's ongoing evolution into a software provider. It's moving from a tools-focused approach to one that provides SaaS-style applications that can be deployed across all the major public clouds, as well as via on-premises server hardware from the likes of Dell Technologies, HP Enterprise, and Lenovo.
Never forgetting its roots, however, the star of Nvidia's latest GTC was the new Hopper GPU architecture and the H100 datacenter GPU.
Boasting a whopping 80 billion transistors, the 4nm H100 supports several important architectural advancements. First, to speed up the performance of new Transformer-based AI models (such as the one driving the GPT-3 natural language engine), the H100 includes a Transformer Engine that the company claims provides a 6x improvement over the previous Ampere architecture.
It also includes a new set of instructions called DPX that are designed to accelerate dynamic programming, a technique leveraged by applications such as genomics and proteomics that previously ran on CPUs or FPGAs.
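For readers unfamiliar with the technique, dynamic programming solves a problem by building up and reusing answers to overlapping subproblems. Below is a minimal Python sketch of classic global sequence alignment scoring, the kind of genomics recurrence DPX-style instructions target; the scoring values are illustrative assumptions, not anything Nvidia specifies.

```python
# Minimal dynamic-programming example: global sequence alignment scoring
# (Needleman-Wunsch style). Each cell reuses previously computed subproblem
# results -- the max-plus recurrence pattern DPX aims to accelerate.
# Scoring values below are illustrative, not from Nvidia.

def alignment_score(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    rows, cols = len(a) + 1, len(b) + 1
    # dp[i][j] = best score aligning the prefix a[:i] with the prefix b[:j]
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap          # a[:i] against an empty prefix of b
    for j in range(1, cols):
        dp[0][j] = j * gap          # b[:j] against an empty prefix of a
    for i in range(1, rows):
        for j in range(1, cols):
            diag = dp[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            dp[i][j] = max(diag,                # match or mismatch
                           dp[i - 1][j] + gap,  # gap inserted in b
                           dp[i][j - 1] + gap)  # gap inserted in a
    return dp[rows - 1][cols - 1]

print(alignment_score("GATTACA", "GCATGCU"))
```

Nvidia has pointed to recurrences of exactly this shape, such as Smith-Waterman alignment and Floyd-Warshall route optimization, as the workloads DPX is meant to speed up.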
For privacy-focused applications, the H100 is also the first accelerator to support confidential computing (previous implementations only worked with CPUs), allowing models and data to be encrypted and protected via a virtualized trusted execution environment.
The architecture also allows for federated learning while in confidential computing mode, meaning that multiple companies with private data sets can all train the same model by essentially passing it around among different secure environments. In addition, thanks to a second-generation implementation of Multi-Instance GPU, or MIG, a single physical GPU can be split into seven separate isolated workloads, improving the efficiency of the chip in shared environments.
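To make the federated idea concrete, here is a hedged Python sketch of the pattern: a shared model visits each party in turn, and only weights move between environments, never the raw data. The toy model and update scheme are illustrative assumptions, not Nvidia's confidential computing API, and a real system would add encryption and attestation around each step.

```python
# Conceptual sketch of federated learning: the model is "passed around"
# among parties; each updates it on data only it can see.
import random

def local_update(weights, private_data, lr=0.01):
    """One pass of gradient-style updates on one party's private data
    for a toy linear model y ~ w0 + w1 * x."""
    w0, w1 = weights
    for x, y in private_data:
        err = (w0 + w1 * x) - y
        w0 -= lr * err
        w1 -= lr * err * x
    return (w0, w1)

# Three parties, each holding a private dataset drawn from y = 2x + 1 + noise.
random.seed(0)
parties = [
    [(x, 2 * x + 1 + random.gauss(0, 0.1))
     for x in (random.uniform(0, 1) for _ in range(50))]
    for _ in range(3)
]

weights = (0.0, 0.0)
for round_ in range(50):             # each round, the model makes the circuit
    for private_data in parties:     # conceptually, inside a secure enclave
        weights = local_update(weights, private_data)

print(f"learned w0={weights[0]:.2f}, w1={weights[1]:.2f}")  # approximately (1, 2)
```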
Hopper also supports the fourth-generation version of Nvidia's NVLink, a major leap that provides a huge 9x increase in bandwidth versus previous technologies, supports connections of up to 256 GPUs, and enables use of NVLink Switch. The latter provides the ability to maintain high-speed connections not only within a single system, but to external systems as well. This, in turn, enabled a new range of DGX Pods and DGX SuperPods, Nvidia's own branded supercomputer hardware, as well as the aforementioned Eos supercomputer.
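To put bandwidth figures like these in perspective, here is a quick back-of-the-envelope Python calculation. The 900 GB/s NVLink and roughly 64 GB/s PCIe Gen 5 x16 numbers are published peak rates that ignore protocol overhead, and the 80 GB payload is simply an assumed model size chosen to match the H100's memory capacity.

```python
# Back-of-the-envelope: time to move an 80 GB set of model weights between
# GPUs over fourth-gen NVLink vs. a PCIe Gen 5 x16 link (peak rates only).
model_gb = 80.0
for name, gb_per_s in [("NVLink 4", 900.0), ("PCIe Gen 5 x16", 64.0)]:
    print(f"{name:>15}: {model_gb / gb_per_s * 1000:7.1f} ms")
```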
Speaking of NVLink and physical connectivity, the company also announced support for a new technology called Nvidia NVLink-C2C, which is designed for chip-to-chip and die-to-die connections, with speeds of up to 900 GB/s between Nvidia components.
The company is opening up the previously proprietary NVLink standard to work with other chip vendors, and notably announced it would also be supporting the newly unveiled UCIe standard (see "The Future of Semiconductors is UCIe" for more).
This gives the company more flexibility in terms of how it can potentially work with others to create heterogeneous parts, as others in the semiconductor industry have started to do as well.
Nvidia chose to leverage its own NVLink-C2C for a new Grace Superchip, which combines two of the company's Arm-based CPUs, and revealed that the Grace Hopper Superchip previewed last year uses the same interconnect technology to provide a high-speed connection between its single Grace CPU and Hopper GPU.
Both "superchips" are targeted at datacenter applications, but their architectures and underlying technologies provide a good sense of where we can likely expect to see PC and other mainstream applications headed.
The NVLink-C2C standard, which supports industry connectivity standards such as Arm's AMBA CHI protocol and CXL, will also be used to interconnect DPUs (data processing units) to help speed up critical data transfers within and across systems.
In addition to all these datacenter-focused announcements, Nvidia unveiled updates and additional real-world customers for its Drive Orin platform for assisted and autonomous driving, as well as its Jetson and Isaac Orin platforms for robotics.
All told, it was an impressive launch of numerous technologies, chips, systems, and platforms. What was clear is that the future of demanding AI applications, along with other difficult computing challenges, is going to require multiple different components working in concert to complete a given task.
As a result, growing the diversity of chip types and the mechanisms for allowing them to communicate with one another is going to be as important, if not more important, than advancements within individual categories. To put it more succinctly, we're clearly headed into a connected, multi-chip world.
Bob O'Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.