
NVIDIA A800 40GB Active

Ampere GPU Architecture
40GB HBM2 memory
Powerful double precision computing
Supports Multi-Instance GPU (MIG)
Max. Power Consumption: 240W
Graphics Interface: PCIe 4.0 x16
Active Thermal Solution
Supports 2-way NVLink
NVLink interconnect: 400GB/s bidirectional
Includes a 3-year NVIDIA AI Enterprise subscription
Supports NVIDIA vGPU software

Powerful Performance and Versatility
for Data Science and HPC Workloads

The explosion of new and emerging analytics, artificial intelligence (AI), and high-performance computing (HPC) applications is fueling the next industrial revolution and transforming how creative and knowledge workers tackle their greatest challenges. Solving these challenges requires powerful software and hardware platforms that can accelerate the development and deployment of next-generation applications and simplify the process of deploying at scale.

The NVIDIA A800 40GB Active GPU, powered by the NVIDIA Ampere architecture, brings compute acceleration to workstations, speeding the development and deployment of next-generation data science, data analytics, AI, and HPC applications. By using common frameworks to develop once and deploy anywhere, IT and creative professionals can use high-performance workstation platforms powered by A800 40GB Active to derive insights from large datasets; build, iterate, and refine AI-augmented applications and models; tackle demanding computational problems; and simplify deployment at scale. Running these workloads on workstations with A800 40GB Active, combined with NVIDIA’s GPU-optimized data science, AI, and HPC software platforms, also preserves scarce data center resources.

Performance Features

Third-Generation Tensor Cores

Performance and Versatility for HPC and AI

First introduced in the NVIDIA Volta™ architecture, NVIDIA Tensor Core technology has brought dramatic speedups to AI training and inference operations, bringing down training times from weeks to hours and providing massive acceleration to inference.

The NVIDIA Ampere architecture builds upon these innovations by providing up to 20X higher FLOPS for AI. It does so by improving the performance of existing precisions and bringing new precisions—TF32, INT8, and FP64—that accelerate and simplify AI adoption and extend the power of NVIDIA Tensor Cores to HPC.

Multi-Instance GPU (MIG)

Secure, Isolated Multi-Tenancy

Each AI and HPC application can benefit from acceleration, but not every application needs the performance of a full A800 40GB Active GPU. Multi-Instance GPU (MIG) maximizes the utilization of GPU-accelerated infrastructure by allowing an A800 40GB Active GPU to be partitioned into as many as seven independent instances, fully isolated at the hardware level. This provides multiple users access to GPU acceleration with their own high-bandwidth memory, cache, and compute cores. Now, developers can access breakthrough acceleration for all their applications, big and small, and get guaranteed quality of service. And IT administrators can offer right-sized GPU acceleration for optimal utilization and expand access to every user and application.
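
As a rough illustration of the partitioning arithmetic, the sketch below plans equal-sized instances using the "up to 7 instances at 5GB each" figure from the specification table. The function name `plan_mig_instances` and the even-split model are assumptions made for illustration only; the actual instance profiles are defined by the NVIDIA driver, not by this code.

```python
# Illustrative arithmetic only; real MIG profiles are defined by the NVIDIA driver.
TOTAL_MEMORY_GB = 40        # A800 40GB Active memory (from the spec table)
MAX_INSTANCES = 7           # "Up to 7 MIGs @5GB" (from the spec table)
MEMORY_PER_INSTANCE_GB = 5  # size of the smallest instance (from the spec table)


def plan_mig_instances(requested: int) -> dict:
    """Return a hypothetical memory plan for `requested` smallest-size instances."""
    if not 1 <= requested <= MAX_INSTANCES:
        raise ValueError(f"requested must be between 1 and {MAX_INSTANCES}")
    allocated = requested * MEMORY_PER_INSTANCE_GB
    return {
        "instances": requested,
        "memory_per_instance_gb": MEMORY_PER_INSTANCE_GB,
        "allocated_gb": allocated,
        "unallocated_gb": TOTAL_MEMORY_GB - allocated,
    }


plan = plan_mig_instances(7)
print(plan)
```

With seven instances the plan assigns 35GB in total, so a single user who needs more than 5GB would be better served by a larger instance or the full GPU.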

Third-Generation NVIDIA NVLink

Scale to Multiple GPUs With 80GB of Memory

Connect a pair of NVIDIA A800 40GB Active cards with NVLink to increase the effective memory footprint to 80GB and scale application performance. Scaling applications across multiple GPUs requires extremely fast movement of data, and the third generation of NVLink in A800 40GB Active provides up to 400GB/s (bidirectional) of direct GPU-to-GPU bandwidth.
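
To put the NVLink figures in perspective, here is a back-of-the-envelope sketch of the combined memory and an ideal one-way copy time. It assumes the 400GB/s bidirectional figure from the specification table splits evenly between the two directions; that split, and the helper name `one_way_transfer_seconds`, are assumptions for illustration, and real transfers add latency and protocol overhead.

```python
# Back-of-the-envelope estimate; real transfers add latency and protocol overhead.
NVLINK_BIDIRECTIONAL_GB_PER_S = 400.0  # from the spec table
GPU_MEMORY_GB = 40                     # per card
NUM_GPUS = 2                           # 2-way NVLink


def one_way_transfer_seconds(size_gb: float) -> float:
    """Ideal time to move `size_gb` in one direction over the bridge.

    Assumes the bidirectional figure splits evenly between the two directions.
    """
    per_direction = NVLINK_BIDIRECTIONAL_GB_PER_S / 2
    return size_gb / per_direction


combined_memory_gb = GPU_MEMORY_GB * NUM_GPUS
print(f"Combined memory: {combined_memory_gb} GB")
print(f"Ideal one-way copy of 10 GB: {one_way_transfer_seconds(10):.3f} s")
```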

Ultra-Fast HBM2 Memory

To feed its massive computational throughput, the NVIDIA A800 40GB Active GPU has 40GB of high-speed HBM2 memory with 1,555GB/s of memory bandwidth, a 79 percent increase compared to NVIDIA Quadro® GV100. In addition to 40GB of HBM2 memory, A800 40GB Active has significantly more on-chip memory, including a 40 megabyte (MB) level 2 cache, which is nearly 7X larger than the previous generation. This combination of extreme-bandwidth on-chip cache and large on-package high-bandwidth memory accelerates the most compute-intensive AI models.
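
The bandwidth figure can be related to the memory interface width listed in the specification table. The sketch below derives the per-pin data rate from those two published numbers and works backwards from the "79 percent increase" claim to the baseline it implies; the variable names are illustrative, and the baseline is a derived value, not a quoted specification.

```python
# Derived from the published figures; variable names are illustrative.
MEMORY_BANDWIDTH_GB_PER_S = 1555.2  # from the spec table
BUS_WIDTH_BITS = 5120               # memory interface width, from the spec table

# Each transfer moves one bit per pin, so the whole bus moves bus_width / 8 bytes.
bytes_per_transfer = BUS_WIDTH_BITS / 8

# Bandwidth = bytes per transfer x transfers per second, so solve for the rate.
per_pin_rate_gbit_per_s = MEMORY_BANDWIDTH_GB_PER_S / bytes_per_transfer

# Baseline bandwidth implied by the "79 percent increase" statement in the text.
implied_baseline_gb_per_s = MEMORY_BANDWIDTH_GB_PER_S / 1.79

print(f"Bytes per transfer: {bytes_per_transfer:.0f}")
print(f"Per-pin data rate: {per_pin_rate_gbit_per_s:.2f} Gbit/s")
print(f"Implied baseline bandwidth: {implied_baseline_gb_per_s:.0f} GB/s")
```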

Benefits

High-Performance Data Science & AI Platform

The growth in workload complexity, data size, and the proliferation of emerging workloads like generative AI are ushering in a new era of computing, accelerating scientific discovery, improving productivity, and revolutionizing content creation. As models continue to explode in size and complexity to take on next-level challenges, an increasing number of workloads will need to run on local devices. Next-generation workstation platforms will need to deliver high-performance computing capabilities to support these complex workloads.

The NVIDIA A800 40GB Active GPU accelerates data science, AI, and HPC workflows with 432 third-generation Tensor Cores that maximize AI performance and deliver fast, efficient inference. With third-generation NVIDIA® NVLink® technology, A800 40GB Active offers scalable performance for heavy AI workloads, doubling the effective memory footprint and enabling GPU-to-GPU data transfers at up to 400 gigabytes per second (GB/s) of bidirectional bandwidth.

AI-Ready Development Platform with NVIDIA AI Enterprise

Enterprise adoption of AI is now mainstream and leading to an increased demand for skilled AI developers and data scientists. Organizations require a flexible, high-performance platform consisting of optimized hardware and software to maximize productivity and accelerate AI development.

NVIDIA A800 40GB Active GPUs for developer workstations include NVIDIA AI Enterprise software, supercharging AI development with best-in-class AI tools and enterprise-grade security and support.

NVIDIA AI Enterprise is an end-to-end, enterprise-grade AI software platform that offers 100+ frameworks, pretrained models, and libraries to streamline development and deployment of production AI, including generative AI, computer vision, and speech AI. Optimized and certified for reliable performance whether it’s deployed on workstations or in data centers, NVIDIA AI Enterprise provides a unified platform to develop applications once and deploy anywhere, reducing the risks involved with moving from pilot to production. Together with the NVIDIA A800 40GB Active GPU, NVIDIA AI Enterprise delivers the highest performance in data science, training, and inference out of the box.

Powerful Double-Precision HPC Performance

A800 40GB Active brings the power of Tensor Cores to HPC, the biggest milestone since the introduction of double-precision GPU computing for HPC. The third generation of Tensor Cores in the A800 40GB Active GPU enables matrix operations in full, IEEE-compliant FP64 precision. With peak rates of 9.7 tera floating-point operations per second (TFLOPS) of double precision (FP64), 19.5 TFLOPS of single precision (FP32), 78 TFLOPS of half precision (FP16), 1,247 tera operations per second (TOPS) of integer precision (INT8), and 624 TFLOPS of tensor operation capability, A800 40GB Active supports a wide range of compute-intensive workloads.
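
To give a sense of what these peak rates mean for a dense double-precision workload, the sketch below estimates a lower bound on the runtime of a square matrix multiply, using the FP64 and FP64 Tensor Core figures from the specification table. It assumes the standard 2n³ operation count and a GPU sustaining its peak rate, which real applications rarely achieve; the function names are illustrative.

```python
# Lower-bound estimate; sustained throughput in practice is below the peak rate.
PEAK_FP64_TFLOPS = 9.7          # standard FP64, from the spec table
PEAK_FP64_TENSOR_TFLOPS = 19.5  # FP64 Tensor Core, from the spec table


def matmul_flops(n: int) -> int:
    """Operations for an n x n by n x n matrix multiply (n^3 multiplies + n^3 adds)."""
    return 2 * n ** 3


def ideal_seconds(flops: int, peak_tflops: float) -> float:
    """Runtime if the GPU sustained `peak_tflops` for the whole computation."""
    return flops / (peak_tflops * 1e12)


n = 16384
flops = matmul_flops(n)
print(f"Work: {flops / 1e12:.2f} TFLOP")
print(f"FP64 lower bound:             {ideal_seconds(flops, PEAK_FP64_TFLOPS):.3f} s")
print(f"FP64 Tensor Core lower bound: {ideal_seconds(flops, PEAK_FP64_TENSOR_TFLOPS):.3f} s")
```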

Amplified Graphics Performance When Paired With the Power of NVIDIA RTX

To support display functionality and deliver high-performance graphics for visual applications, the computing capabilities of NVIDIA A800 40GB Active are designed to be paired with NVIDIA RTX™-accelerated GPUs. Both NVIDIA RTX A4000 and T1000 GPUs are certified to run in tandem with A800 40GB Active, delivering powerful real-time ray tracing and AI-accelerated graphics performance in a single-slot form factor.

* Note: NVIDIA A800 40GB Active doesn’t have display outputs and is intended for use as a computing accelerator only. A companion GPU is required to support display output capability.

Workloads

Data Science, Data Analytics, and AI

Data Science and Data Analytics

Accelerate end-to-end data science and analytics workflows with powerful performance to extract meaningful insights from large-scale datasets quickly. By combining the high-performance computing capabilities of the A800 40GB Active with NVIDIA AI Enterprise, data practitioners can leverage a large collection of libraries, tools, and technologies to accelerate data science workflows, from data prep and training to inference.

Training and Development

With 40GB of HBM2 memory and powerful third-generation Tensor Cores that deliver up to 2X the performance of the previous generation, the A800 40GB Active GPU delivers incredible performance to conquer demanding AI development and training workflows on workstation platforms, including data preparation and processing, model optimization and tuning, and early-stage training.

The NVIDIA AI Enterprise software platform accelerates and simplifies deploying AI at scale, allowing organizations to develop once and deploy anywhere. Coupling this powerful software platform with the A800 40GB Active GPU enables AI developers to build, iterate, and refine AI models on workstations using included frameworks, simplifying the scaling process and reserving costly data center computing resources for more expensive, large-scale computations.

Inference

Inference is where AI delivers results, providing actionable insights by operationalizing trained models. With 432 third-generation Tensor Cores and 6,912 CUDA® Cores, A800 40GB Active delivers 2X the inference operation performance versus the previous generation with support for structural sparsity and a broad range of precisions, including TF32, INT8, and FP64. AI developers can use NVIDIA inference software including NVIDIA TensorRT™, NVIDIA Triton™ Inference Server, and NVIDIA Triton™ Management Service that are part of NVIDIA AI Enterprise to simplify and optimize the deployment of AI models at scale.
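
Precision also determines how much of the 40GB of memory a model's weights occupy. The sketch below is a simplified estimate that counts weights only, treats 40GB as 40 × 10⁹ bytes, and ignores activations, caches, and framework overhead; those simplifications and the helper names are assumptions for illustration.

```python
# Weights-only estimate; activations, caches, and framework overhead are ignored.
BYTES_PER_VALUE = {"FP64": 8, "FP32": 4, "FP16": 2, "INT8": 1}
GPU_MEMORY_BYTES = 40 * 10**9  # treating 40GB as 40e9 bytes (assumption)


def weights_footprint_gb(num_parameters: int, precision: str) -> float:
    """Memory needed to store the model weights at the given precision."""
    return num_parameters * BYTES_PER_VALUE[precision] / 1e9


def max_parameters(precision: str) -> int:
    """Largest weight count that fits in GPU memory at the given precision."""
    return GPU_MEMORY_BYTES // BYTES_PER_VALUE[precision]


# Hypothetical 7-billion-parameter model, used purely as an example size.
example_parameters = 7_000_000_000
for precision in BYTES_PER_VALUE:
    size_gb = weights_footprint_gb(example_parameters, precision)
    print(f"{precision}: {size_gb:.0f} GB of weights, "
          f"up to {max_parameters(precision):,} parameters fit in memory")
```

Halving the bytes per value doubles the number of weights that fit, which is one reason lower precisions are attractive for inference.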

Generative AI

Using neural networks to identify patterns and structures within existing data, generative AI applications enable users to generate new and original content from a wide variety of inputs and outputs, including images, sounds, animation, and 3D models. Leverage the NVIDIA generative AI solution, NeMo™ Framework, included in NVIDIA AI Enterprise along with NVIDIA A800 40GB Active GPU for easy, fast, and customizable generative AI model development.

HPC

Manufacturing, Product Development, and Engineering

The A800 40GB Active GPU delivers incredible performance for GPU-accelerated computer-aided engineering (CAE) applications. Engineering and product development professionals can run large-scale simulations for finite element analysis (FEA), computational fluid dynamics (CFD), construction engineering management (CEM), and other engineering analysis codes in full FP64 precision with incredible speed, shortening development timelines and accelerating time to value. With the addition of RTX-accelerated GPUs providing display capabilities, scientists and engineers can visualize large-scale simulations and models in full design fidelity.

Energy and Geosciences

With 9.7 TFLOPS of FP64 compute performance, the A800 40GB Active GPU enables geoscience professionals to power the latest AI-augmented exploration and production software workflows and accelerate simulation processes to gain faster insight into subsurface data. For large-scale datasets, two A800 40GB Active GPUs can be connected with NVLink to provide 80GB of memory and twice the processing power.

Life Sciences

With the A800 40GB Active, professionals across life science disciplines can accelerate complex data processing tasks, enable faster discovery, and improve decision-making. AI-accelerated life science applications like genomics sequencing, medical imaging, and personalized medicine can benefit from faster training and inference performance to accelerate the analysis of large datasets. For complex simulations and data processing tasks requiring high accuracy, FP64 capabilities allow scientific applications like molecular dynamics, drug discovery, and genomic analysis to run with higher accuracy and precision, yielding more reliable results.

Architecture	NVIDIA Ampere
CUDA Cores	6912
Tensor Cores	432
GPU Memory	40 GB HBM2
Peak Double Precision (FP64) Performance	9.7 TFLOPS
FP64 Tensor Core Performance	19.5 TFLOPS
Peak Single Precision (FP32) Performance	19.5 TFLOPS
Tensor Float 32 (TF32) Tensor Core Performance	311.8 TFLOPS
Peak Half Precision (FP16) Performance	78.0 TFLOPS
BFLOAT16 Tensor Core Performance	623.8 TFLOPS
Peak Integer Operation (INT8) Performance	1247.4 TOPS
Peak Tensor Operation Performance¹	623.8 TFLOPS
Memory Interface	5120-bit
Memory Bandwidth	1555.2 GB/s
Max Power Consumption	240W
Thermal Solution	Blower Active Fan
Multi-Instance GPU	Up to 7 MIGs @5GB
Graphics Bus	PCIe 4.0 x16
Display Connectors	No Display Output Supported
Form Factor	4.4” H x 10.5” L Dual Slot
Product Weight	1181.9g
vGPU Software Support²	NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS)
vGPU Profiles Supported	4GB, 5GB, 8GB, 10GB, 20GB, 40GB
NVLink	2-way low profile (2-slot and 3-slot bridges); connect 2x A800 40GB Active
NVLink Interconnect	400 GB/s (bidirectional)
Server Options	NVIDIA Certified Systems™ with 1-8 GPUs
NVIDIA AI Enterprise	Included³
Power Connector	1x PCIe CEM5 16-pin
NVDEC	5x decode

¹FP16 matrix multiply with FP16 or FP32 accumulate.
²Virtualization support for the A800 40GB Active will be available in an upcoming NVIDIA virtual GPU (vGPU) release, anticipated in Q3, 2023.
³3-year software subscription and enterprise support for NVIDIA AI Enterprise license. Activation required.


Product specifications and information on this web page may be revised without notice; the printing on the product's color box shows the actual product information.
The product specifications above are for reference only; actual specifications depend on the shipped product, and Leadtek reserves the right to change them. Products may differ by sales region, so please contact your supplier to confirm the actual product information.
The adapters, cables, and software listed on this web page are for reference only; Leadtek reserves the right to change them, and revisions may be made without notice.
The brand names and product names above are trademarks of their respective companies.
NVIDIA RTX PRO, NVIDIA RTX Workstation GPU and NVIDIA Data Center GPU are designed, built and tested by NVIDIA.
