
NVIDIA A100 GPU System Specifications Datasheet


Published November 1, 2024 (author: 荆玉石)

NVIDIA A100 TENSOR CORE GPU

UNPRECEDENTED ACCELERATION AT EVERY SCALE

The Most Powerful Compute Platform for Every Workload

The NVIDIA® A100 Tensor Core GPU delivers unprecedented acceleration, at every scale, to power the world's highest-performing elastic data centers for AI, data analytics, and high-performance computing (HPC) applications. As the engine of the NVIDIA data center platform, A100 provides up to 20X higher performance over the prior NVIDIA Volta generation. A100 can efficiently scale up or be partitioned into seven isolated GPU instances, with Multi-Instance GPU (MIG) providing a unified platform that enables elastic data centers to dynamically adjust to shifting workload demands.
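The "seven isolated GPU instances" figure follows from how MIG carves up the GPU's memory. A minimal sketch of that arithmetic, assuming (as publicly documented for A100) that memory is divided into eight slices of which up to seven back user-visible instances:

```python
# Hypothetical sketch of MIG instance sizing, not an NVIDIA API.
# Assumes A100's layout: 8 memory slices, up to 7 usable GPU instances.
MEM_SLICES = 8
MAX_INSTANCES = 7

def mig_slice_gb(total_mem_gb: int) -> int:
    """Memory backing the smallest MIG instance (one slice)."""
    return total_mem_gb // MEM_SLICES

for total in (40, 80):
    print(f"A100 {total}GB: up to {MAX_INSTANCES} MIGs @ {mig_slice_gb(total)} GB")
```

This reproduces the "7 MIGs @ 5 GB" (40GB card) and "7 MIGs @ 10 GB" (80GB card) entries in the specification table.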

NVIDIA A100 Tensor Core technology supports a broad range of math precisions, providing a single accelerator for every workload. The latest-generation A100 80GB doubles GPU memory and debuts the world's fastest memory bandwidth at 2 terabytes per second (TB/s), speeding time to solution for the largest models and most massive datasets.
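A quick back-of-envelope for what that bandwidth means in practice, using the peak figures from the specification table (real kernels achieve somewhat less than peak):

```python
# Time to stream the GPU's entire memory once at peak bandwidth.
# Figures are the datasheet's peak numbers; sustained rates are lower.
def full_sweep_seconds(mem_gb: float, bw_gb_s: float) -> float:
    return mem_gb / bw_gb_s

print(f"{full_sweep_seconds(80, 2039) * 1e3:.1f} ms")  # 80GB card: ~39.2 ms
print(f"{full_sweep_seconds(40, 1555) * 1e3:.1f} ms")  # 40GB card: ~25.7 ms
```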

A100 is part of the complete NVIDIA data center solution that incorporates building blocks across hardware, networking, software, libraries, and optimized AI models and applications from NGC. Representing the most powerful end-to-end AI and HPC platform for data centers, it allows researchers to deliver real-world results and deploy solutions into production at scale.

SYSTEM SPECIFICATIONS

                            NVIDIA A100 for NVLink                        NVIDIA A100 for PCIe
Peak FP64                   9.7 TF                                        9.7 TF
Peak FP64 Tensor Core       19.5 TF                                       19.5 TF
Peak FP32                   19.5 TF                                       19.5 TF
Peak TF32 Tensor Core       156 TF | 312 TF*                              156 TF | 312 TF*
Peak BFLOAT16 Tensor Core   312 TF | 624 TF*                              312 TF | 624 TF*
Peak FP16 Tensor Core       312 TF | 624 TF*                              312 TF | 624 TF*
Peak INT8 Tensor Core       624 TOPS | 1,248 TOPS*                        624 TOPS | 1,248 TOPS*
Peak INT4 Tensor Core       1,248 TOPS | 2,496 TOPS*                      1,248 TOPS | 2,496 TOPS*
GPU Memory                  40GB or 80GB                                  40GB
GPU Memory Bandwidth        1,555 GB/s (40GB) / 2,039 GB/s (80GB)         1,555 GB/s
Interconnect                NVIDIA NVLink 600 GB/s**;                     NVIDIA NVLink 600 GB/s**;
                            PCIe Gen4 64 GB/s                             PCIe Gen4 64 GB/s
Multi-Instance GPU          Various instance sizes with up to             Various instance sizes with up to
                            7 MIGs @ 5 GB (40GB) or @ 10 GB (80GB)        7 MIGs @ 5 GB
Form Factor                 4/8 SXM on NVIDIA HGX A100                    PCIe
Max TDP Power               400 W                                         250 W

* With sparsity
** SXM GPUs via HGX A100 server boards; PCIe GPUs via NVLink Bridge for up to 2 GPUs
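The 19.5 TF peak FP32 figure in the table can be sanity-checked from first principles. The SM count, cores per SM, and boost clock below are publicly documented A100 values but are assumptions relative to this datasheet, which does not state them:

```python
# Sanity check of the 19.5 TF peak FP32 figure.
# Assumed A100 parameters (not stated in this datasheet):
#   108 SMs, 64 FP32 cores per SM, ~1.41 GHz boost clock.
# One FMA counts as 2 floating-point operations.
SMS, FP32_PER_SM, BOOST_HZ = 108, 64, 1.41e9
peak_fp32_tflops = SMS * FP32_PER_SM * 2 * BOOST_HZ / 1e12
print(f"{peak_fp32_tflops:.1f} TFLOPS")  # ~19.5 TFLOPS
```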

A100 | DATASHEET | JAN21
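The asterisked "with sparsity" figures in the table above double the dense rates via Ampere's 2:4 fine-grained structured sparsity, in which at most two of every four contiguous weights are nonzero, letting Tensor Cores skip half the math. A minimal pattern check (a hypothetical helper for illustration, not an NVIDIA API):

```python
# 2:4 structured sparsity: in each contiguous group of 4 weights,
# at most 2 may be nonzero. This checks whether a row of weights
# already satisfies that pattern.
def is_2_to_4_sparse(row):
    return all(sum(1 for w in row[i:i + 4] if w != 0) <= 2
               for i in range(0, len(row), 4))

print(is_2_to_4_sparse([0.5, 0, -1.2, 0, 0, 0.3, 0, 0.7]))  # True
print(is_2_to_4_sparse([0.5, 0.1, -1.2, 0, 0, 0, 0, 1.0]))  # False
```

In practice, networks are pruned to this pattern (and typically fine-tuned) before the sparse Tensor Core path can be used.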

Incredible Performance Across Workloads

- Up to 3X higher AI training on the largest models (DLRM training)
- Up to 249X higher AI inference performance over CPUs (BERT-Large inference)
- Up to 1.25X higher AI inference performance over A100 40GB (RNN-T inference, single stream)
- Up to 1.8X higher performance for HPC applications (Quantum Espresso)

