The NVIDIA GTC 2020 keynote is finally done, and just as CEO Jensen Huang promised, it did not disappoint. In a series of nine pre-recorded episodes, Huang unveiled the future of NVIDIA’s GPU – the Ampere – from his kitchen. With it, the company set out a vision for the future of computing – a shift from individual servers to powerful, flexible data centres.
Alongside the Ampere GPU, Huang also discussed NVIDIA’s recent Mellanox acquisition, Ampere-based products and important new software technologies. To some people’s surprise, NVIDIA did not mention gaming at all, instead announcing that the NVIDIA A100 (the first GPU based on Ampere technology) would first come to enterprise customers.
There is some hope for gamers though, as Huang said in a media briefing ahead of the keynote that the 7nm Ampere architecture would make its way to the next generation of graphics cards. The Ampere is part of NVIDIA’s long-term plan to merge its Turing and Volta chips into a single platform. When that happens, it is very likely the power of Ampere will come to NVIDIA’s GeForce GPUs. According to TechRadar, “the long-rumoured Nvidia GeForce RTX 3080 is widely expected to be Team Green’s first consumer GPU to be based on the 7nm Ampere platform.” The new cards are expected to be launched in the third quarter of 2020.
Data Centre Scaling With Ampere
The new Ampere technology places its emphasis on data centre scaling, NVIDIA said. Ampere GPUs unify AI training and inference, which the company says enables its greatest generational performance leap. Eighteen of the world’s leading service providers are already incorporating Ampere, NVIDIA announced. Amongst them are big names like Alibaba Cloud, Amazon Web Services, Cisco, Google Cloud, Microsoft Azure and Oracle.
The NVIDIA A100 GPU. Source: NVIDIA
The A100 boosts performance by up to 20x over its predecessors, resulting in a 6x higher performance than NVIDIA’s previous generation Volta architecture for training and 7x higher performance for inference. Huang detailed five key features of the A100:
- More than 54 billion transistors, making it the world’s largest 7-nanometer processor.
- Third-generation Tensor Cores with TF32, a new math format that accelerates single-precision AI training out of the box.
- Structural sparsity acceleration, a new efficiency technique harnessing the inherently sparse nature of AI math for higher performance.
- Multi-instance GPU, or MIG, allowing a single A100 to be partitioned into as many as seven independent GPUs, each with its own resources.
- Third-generation NVLink technology, doubling high-speed connectivity between GPUs, allowing A100 servers to act as one giant GPU.
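To make the TF32 item in the list above more concrete, here is a minimal Python sketch of what the reduced precision means. It relies on NVIDIA's published description of TF32 (the 8-bit exponent of FP32 with a 10-bit mantissa), which is not spelled out in the keynote coverage itself. The function name `to_tf32` is hypothetical, and the sketch simply truncates the extra mantissa bits, which may differ from the rounding the hardware actually performs.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 precision by keeping 10 of float32's 23 mantissa bits.

    Assumption: plain truncation; real Tensor Cores may round differently.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Clear the 13 lowest mantissa bits; sign and 8-bit exponent are untouched.
    bits &= 0xFFFFE000
    # Convert the masked bit pattern back into a Python float.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# A step of 2**-10 next to 1.0 survives, a finer step of 2**-11 is dropped.
kept = to_tf32(1.0 + 2**-10)
dropped = to_tf32(1.0 + 2**-11)
```

The range of representable values stays the same as FP32 because the exponent is unchanged; only the fine detail of each number is reduced, which is the trade-off that lets the format speed up training.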
Based on the A100, NVIDIA also announced the third-generation DGX AI system – the world’s first 5-petaflops server. The server can handle up to 56 independently run applications, Huang said. As per the company’s press release, the DGX A100 “consolidates the power and capabilities of an entire data center into a single flexible platform for the first time.” Huang described the server as “the first AI system built for the end-to-end machine learning workflow.” He added that it would also be available to cloud and partner server makers as the HGX A100.
For more power, Huang also unveiled the NVIDIA DGX SuperPOD. Powered by 140 DGX A100 systems and Mellanox networking technology, it offers 700 petaflops of AI performance. Combining the DGX SuperPOD and DGX A100, NVIDIA said that it was making AI scaling easier with a pay-as-you-go model. SuperPOD will now be available in a modular format, as a group of 20 DGX A100 systems.
The company announced that it would be adding four DGX SuperPODs to its SATURN V internal supercomputer. With the added 2.8 exaflops of AI computing power, the SATURN V is now the world’s fastest AI supercomputer with a total capacity of 4.6 exaflops.
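The performance figures quoted in this section can be cross-checked with some simple arithmetic. The short Python sketch below uses only numbers from the article, with one labelled exception: the count of eight A100 GPUs per DGX A100 comes from NVIDIA's product specification rather than from the keynote coverage, and is included here as an assumption to explain the figure of 56 applications.

```python
# Figures quoted in the article.
PETAFLOPS_PER_DGX_A100 = 5
DGX_A100_PER_SUPERPOD = 140
SUPERPODS_ADDED_TO_SATURN_V = 4
SATURN_V_TOTAL_EXAFLOPS = 4.6
MIG_INSTANCES_PER_A100 = 7

# Assumption (not stated in the article): a DGX A100 holds eight A100 GPUs.
A100_PER_DGX_A100 = 8

# One SuperPOD: 140 systems at 5 petaflops each.
superpod_petaflops = DGX_A100_PER_SUPERPOD * PETAFLOPS_PER_DGX_A100

# Four SuperPODs, converted from petaflops to exaflops (1 EF = 1000 PF).
added_exaflops = SUPERPODS_ADDED_TO_SATURN_V * superpod_petaflops / 1000

# Implied SATURN V capacity before the upgrade.
prior_exaflops = round(SATURN_V_TOTAL_EXAFLOPS - added_exaflops, 1)

# Independent MIG partitions available across one DGX A100.
mig_instances_per_dgx = A100_PER_DGX_A100 * MIG_INSTANCES_PER_A100
```

Under these assumptions the numbers line up: 140 systems give the 700 petaflops of one SuperPOD, four SuperPODs give the added 2.8 exaflops, and eight GPUs split seven ways give the 56 independent applications Huang mentioned. The implied capacity of SATURN V before the upgrade would be about 1.8 exaflops.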
Focus on COVID-19
A key area of focus for the company is COVID-19, Huang said, adding that NVIDIA was working closely with researchers and scientists. Through AI, NVIDIA hopes to contribute to tracking and treating the pandemic. In a press release, the company highlighted four key projects it was working on:
- Working with Oxford Nanopore Technologies to sequence the virus genome
- Real-time tracing of the infection rate with Plotly
- Screening possible drug combinations with Oak Ridge National Laboratory and the Scripps Research Institute
- 3D reconstruction of the virus’ spike protein with Structura Biotechnology, the University of Texas at Austin and the National Institutes of Health
“Researchers and scientists applying NVIDIA accelerated computing to save lives is the perfect example of our company’s purpose — we build computers to solve problems normal computers cannot,” Huang said. The company also announced a major expansion of NVIDIA Clara, its healthcare platform, and released a set of new AI models to help researchers study infected patients through chest CT scans.
NVIDIA has also partnered with Mass General Brigham to create a multinational federated learning initiative. The programme will allow local partners to adapt COVID-19 models to X-ray imaging without compromising patient privacy. The company added that its NVIDIA Clara application framework has been deployed in 50 hospitals worldwide for medical imaging.
“Never before has there been such a critical need to apply the best AI technology and accelerated computing to every facet of healthcare, and its effects will be felt widely beyond this pandemic and across healthcare going forward”
Kimberly Powell, vice president of Healthcare at NVIDIA
NVIDIA GPUs will also power software applications for three critical uses – managing big data, creating recommender systems and building real-time, conversational AI. The first of these would be achieved by enabling native GPU support for Apache Spark 3.0. NVIDIA will also be collaborating with the open-source community to bring end-to-end GPU acceleration to Spark. Huang described Apache Spark 3.0, which is scheduled to launch in late spring, as “one of the most important applications in the world today.”
Using Spark 3.0, data scientists and machine learning engineers will for the first time be able to apply revolutionary GPU acceleration to the ETL (extract, transform and load) data processing workloads widely conducted using SQL database operations, the company said.
Alongside Spark, Huang said that NVIDIA was also launching two new software systems – an end-to-end framework for building next-generation recommender systems (NVIDIA Merlin) and an end-to-end platform for creating real-time, multimodal conversational AI (NVIDIA Jarvis). According to the company, Merlin cuts the time needed to create a recommender system from a 100-terabyte dataset from four days to 20 minutes.
NVIDIA is also making Omniverse, its real-time simulation and collaboration platform for 3D production, available to early access customers. Based on Pixar’s Universal Scene Description and NVIDIA RTX, the Omniverse Platform is designed to act as a hub, enabling new capabilities to be exposed as micro-services to any connected clients and applications. For now, Omniverse will be available only to customers in architecture, engineering and construction.
Autonomous Vehicles and Robotics Platforms
NVIDIA DRIVE, the company’s autonomous vehicle platform, will use the new Orin SoC with an embedded NVIDIA Ampere GPU to achieve greater energy efficiency and performance. The NVIDIA DRIVE AGX Orin consists of 17 billion transistors and delivers 200 TOPS of performance. The result of four years of R&D, the Orin is designed to handle the large number of applications and deep neural networks that run simultaneously in autonomous vehicles and robots, while achieving systematic safety standards such as ISO 26262 ASIL-D, NVIDIA said.
Orin can be scaled anywhere from Level 2 autonomy to Level 5. Orin is designed for the future of autonomy, with fleet operation in mind. It allows automakers to have a single computing architecture and software stack to build and manage AI in every vehicle. “It’s now possible for a carmaker to develop an entire fleet of cars with one architecture, leveraging the software development across their whole fleet,” Huang said. Autonomous vehicle fleets can be managed using the new NVIDIA DRIVE RC system.
NVIDIA also said it was collaborating with BMW to bring its robotics platform, NVIDIA Isaac, to 30 BMW plants around the world. The collaboration will allow BMW to deploy end-to-end systems to enhance factory logistics. It uses NVIDIA DGX AI systems and Isaac simulation technology to train and test the robots; NVIDIA Quadro ray-tracing GPUs to render synthetic machine parts to enhance the training; and a new lineup of multiple AI-enabled robots built on the Isaac software development kit. “BMW Group is leading the way to the era of robotic factories, harnessing breakthroughs in AI and robotics technologies to create the next level of highly customizable, just-in-time, just-in-sequence manufacturing,” Huang said of the collaboration.