The explosion of new and emerging analytics, artificial intelligence (AI), and high-performance computing (HPC) applications is fueling the next industrial revolution and transforming how creative and knowledge workers conquer their greatest challenges. Solving these challenges requires powerful software and hardware platforms that can accelerate the development and deployment of next-generation applications and simplify deployment at scale.
The NVIDIA A800 40GB Active GPU, powered by the NVIDIA Ampere architecture, delivers unprecedented compute acceleration for workstations, speeding the development and deployment of next-generation data science, data analytics, AI, and HPC applications. By using common frameworks to develop once and deploy anywhere, IT and creative professionals can leverage high-performance workstation platforms powered by the A800 40GB Active to derive insights from large datasets; build, iterate, and refine AI-augmented applications and models; tackle the most demanding computational problems; and simplify deployment at scale. Combined with NVIDIA’s GPU-optimized data science, AI, and HPC software platforms, this supercomputing power on the workstation preserves scarce data center resources.
The growth in workload complexity, data size, and the proliferation of emerging workloads like generative AI are ushering in a new era of computing, accelerating scientific discovery, improving productivity, and revolutionizing content creation. As models continue to explode in size and complexity to take on next-level challenges, an increasing number of workloads will need to run on local devices. Next-generation workstation platforms will need to deliver high-performance computing capabilities to support these complex workloads.
The NVIDIA A800 40GB Active GPU accelerates data science, AI, and HPC workflows with 432 third-generation Tensor Cores that maximize AI performance and deliver fast, efficient inference. With third-generation NVIDIA® NVLink® technology, the A800 40GB Active offers scalable performance for heavy AI workloads, doubling the effective memory footprint and enabling GPU-to-GPU data transfers with up to 400 gigabytes per second (GB/s) of bidirectional bandwidth.
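As an illustration of this GPU-to-GPU path, the minimal sketch below (assuming a workstation with two NVLink-bridged A800 40GB Active GPUs and a CUDA-enabled PyTorch build; not part of the product documentation) checks whether peer-to-peer access is available and copies a tensor directly between the two devices.

```python
import torch

# Assumes two A800 40GB Active GPUs connected with an NVLink bridge
# and a CUDA-enabled PyTorch installation.
assert torch.cuda.device_count() >= 2, "this sketch expects at least two GPUs"

# True when GPU 0 can address GPU 1's memory directly (peer-to-peer).
p2p = torch.cuda.can_device_access_peer(0, 1)
print(f"Peer-to-peer access between GPU 0 and GPU 1: {p2p}")

# Copy a ~67MB tensor from GPU 0 to GPU 1; with NVLink and peer-to-peer enabled,
# the transfer uses the direct GPU-to-GPU link rather than staging through host memory.
x = torch.randn(4096, 4096, device="cuda:0")
y = x.to("cuda:1", non_blocking=True)
torch.cuda.synchronize()
print(f"Copied {x.numel() * x.element_size() / 1e6:.1f} MB to cuda:1")
```

Multi-GPU training libraries rely on this same peer path for collective operations when an NVLink bridge is present.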
Enterprise adoption of AI is now mainstream and leading to an increased demand for skilled AI developers and data scientists. Organizations require a flexible, high-performance platform consisting of optimized hardware and software to maximize productivity and accelerate AI development.
NVIDIA A800 40GB Active GPUs for developer workstations include NVIDIA AI Enterprise software, supercharging AI development with best-in-class AI tools and enterprise-grade security and support.
NVIDIA AI Enterprise is an end-to-end, enterprise-grade AI software platform that offers 100+ frameworks, pretrained models, and libraries to streamline development and deployment of production AI, including generative AI, computer vision, and speech AI. Optimized and certified for reliable performance whether it’s deployed on workstations or in data centers, NVIDIA AI Enterprise provides a unified platform to develop applications once and deploy anywhere, reducing the risks involved with moving from pilot to production. Together with the NVIDIA A800 40GB Active GPU, NVIDIA AI Enterprise delivers the highest performance in data science, training, and inference out of the box.
The A800 40GB Active brings the power of Tensor Cores to HPC, marking the biggest milestone since the introduction of double-precision GPU computing. The third generation of Tensor Cores in the A800 40GB Active enables matrix operations in full, IEEE-compliant FP64 precision. With 9.7 tera floating-point operations per second (TFLOPS) of double-precision (FP64) performance, 19.5 TFLOPS of single precision (FP32), 78 TFLOPS of half precision (FP16), 1,247 tera operations per second (TOPS) of integer (INT8) performance, and 624 TFLOPS of Tensor operation capability, the A800 40GB Active supports a wide range of compute-intensive workloads.
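As a rough illustration of the FP64 path, the sketch below times a double-precision matrix multiply in PyTorch (a generic benchmark, not an NVIDIA-provided one); on Ampere-class GPUs such as the A800 40GB Active, cuBLAS can run FP64 GEMMs on the FP64 Tensor Cores.

```python
import time
import torch

# Minimal FP64 GEMM timing sketch; matrix size is arbitrary.
n = 8192
a = torch.randn(n, n, dtype=torch.float64, device="cuda")
b = torch.randn(n, n, dtype=torch.float64, device="cuda")

torch.cuda.synchronize()
start = time.time()
c = a @ b  # full IEEE-754 FP64 precision
torch.cuda.synchronize()
elapsed = time.time() - start

# A dense n x n matrix multiply costs roughly 2 * n^3 floating-point operations.
tflops = 2 * n**3 / elapsed / 1e12
print(f"FP64 GEMM: {elapsed:.3f} s, ~{tflops:.1f} TFLOPS")
```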
To support display functionality and deliver high-performance graphics for visual applications, the NVIDIA A800 40GB Active is designed to be paired with an NVIDIA RTX™-accelerated GPU. Both the NVIDIA RTX A4000 and T1000 GPUs are certified to run in tandem with the A800 40GB Active, delivering powerful real-time ray tracing and AI-accelerated graphics performance in a single-slot form factor.
Accelerate end-to-end data science and analytics workflows with powerful performance to extract meaningful insights from large-scale datasets quickly. By combining the high-performance computing capabilities of the A800 40GB Active with NVIDIA AI Enterprise, data practitioners can leverage a large collection of libraries, tools, and technologies to accelerate data science workflows, from data prep and training to inference.
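As one hedged example of GPU-accelerated data prep, the sketch below uses RAPIDS cuDF, a pandas-like GPU DataFrame library available in NVIDIA’s data science stack; the file name and column names are placeholders for illustration.

```python
import cudf

# Hypothetical CSV of transaction records; path and column names are illustrative only.
df = cudf.read_csv("transactions.csv")

# Typical GPU-resident data-prep steps: filtering, feature derivation, aggregation.
df = df[df["amount"] > 0]
df["amount_usd"] = df["amount"] / 100.0
summary = df.groupby("customer_id").agg({"amount_usd": "mean", "amount": "sum"})

# The prepared frame can feed a downstream training step (e.g., via .to_pandas() or DLPack).
print(summary.head())
```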
With 40GB of HBM2 memory and powerful third-generation Tensor Cores that deliver up to 2X the performance of the previous generation, the A800 40GB Active GPU delivers incredible performance to conquer demanding AI development and training workflows on workstation platforms, including data preparation and processing, model optimization and tuning, and early-stage training.
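A minimal sketch of such a training loop is shown below, using PyTorch automatic mixed precision (AMP) to route matrix math onto the Tensor Cores; the model and synthetic data are placeholders, not a specific NVIDIA workflow.

```python
import torch
import torch.nn as nn

# Placeholder model and synthetic data; a real workflow would substitute its own.
model = nn.Sequential(nn.Linear(1024, 2048), nn.ReLU(), nn.Linear(2048, 10)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(256, 1024, device="cuda")
    y = torch.randint(0, 10, (256,), device="cuda")

    optimizer.zero_grad(set_to_none=True)
    # autocast runs eligible ops in reduced precision so they execute on the Tensor Cores.
    with torch.cuda.amp.autocast():
        loss = loss_fn(model(x), y)

    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```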
The NVIDIA AI Enterprise software platform accelerates and simplifies deploying AI at scale, allowing organizations to develop once and deploy anywhere. Coupling this powerful software platform with the A800 40GB Active GPU enables AI developers to build, iterate, and refine AI models on workstations using the included frameworks, simplifying the scaling process and reserving costly data center computing resources for large-scale computations.
Inference is where AI delivers results, providing actionable insights by operationalizing trained models. With 432 third-generation Tensor Cores and 6,912 CUDA® Cores, A800 40GB Active delivers 2X the inference operation performance versus the previous generation with support for structural sparsity and a broad range of precisions, including TF32, INT8, and FP64. AI developers can use NVIDIA inference software including NVIDIA TensorRT™, NVIDIA Triton™ Inference Server, and NVIDIA Triton™ Management Service that are part of NVIDIA AI Enterprise to simplify and optimize the deployment of AI models at scale.
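As a simple, hedged illustration of Tensor Core inference, the sketch below runs a placeholder classifier in FP16 with PyTorch; production deployments would typically export the model to TensorRT and serve it with Triton Inference Server, as noted above.

```python
import torch
import torch.nn as nn

# Placeholder classifier; a production workflow would load a trained model instead.
model = nn.Sequential(nn.Linear(1024, 2048), nn.ReLU(), nn.Linear(2048, 10))
model = model.half().cuda().eval()

batch = torch.randn(64, 1024, device="cuda", dtype=torch.float16)

# inference_mode disables autograd bookkeeping for lower latency and memory use.
with torch.inference_mode():
    logits = model(batch)
    predictions = logits.argmax(dim=1)

print(predictions[:10].cpu())
```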
Using neural networks to identify patterns and structures within existing data, generative AI applications enable users to generate new and original content from a wide variety of inputs and outputs, including images, sounds, animation, and 3D models. Leverage the NVIDIA generative AI solution, the NeMo™ framework, included in NVIDIA AI Enterprise, along with the NVIDIA A800 40GB Active GPU for easy, fast, and customizable generative AI model development.
The A800 40GB Active GPU delivers incredible performance for GPU-accelerated computer-aided engineering (CAE) applications. Engineering and product development professionals can run large-scale simulations for finite element analysis (FEA), computational fluid dynamics (CFD), computational electromagnetics (CEM), and other engineering analysis codes in full FP64 precision with incredible speed, shortening development timelines and accelerating time to value. With the addition of an RTX-accelerated GPU providing display capabilities, scientists and engineers can visualize large-scale simulations and models in full design fidelity.
With 9.7 TFLOPS of FP64 compute performance, the A800 40GB Active GPU enables geoscience professionals to power the latest AI-augmented exploration and production software workflows and accelerate simulation processes to gain faster insight into subsurface data. For large-scale datasets, two A800 40GB Active GPUs can be connected with NVLink to provide 80GB of memory and twice the processing power.
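A minimal sketch of the pooled-memory pattern, assuming two NVLink-connected GPUs and PyTorch: a workload too large for one 40GB GPU can be split across both devices, with intermediate results passing over the bridge. The layer sizes below are placeholders.

```python
import torch
import torch.nn as nn

# Illustrative two-stage pipeline split across both GPUs; sizes are placeholders.
stage1 = nn.Sequential(nn.Linear(8192, 8192), nn.ReLU()).to("cuda:0")
stage2 = nn.Sequential(nn.Linear(8192, 8192), nn.ReLU(), nn.Linear(8192, 1)).to("cuda:1")

x = torch.randn(1024, 8192, device="cuda:0")

# The activation tensor moves GPU-to-GPU; with an NVLink bridge and peer access
# enabled, this hop uses the direct high-bandwidth link instead of the host path.
h = stage1(x).to("cuda:1")
out = stage2(h)
print(out.shape)  # torch.Size([1024, 1])
```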
With the A800 40GB Active, professionals across life science disciplines can accelerate complex data processing tasks, enable faster discovery, and improve decision-making. AI-accelerated life science applications like genomics sequencing, medical imaging, and personalized medicine can benefit from faster training and inference performance to accelerate the analysis of large datasets. For complex simulations and data processing tasks requiring high accuracy, FP64 capabilities allow scientific applications like molecular dynamics, drug discovery, and genomic analysis to run with higher accuracy and precision, yielding more reliable results.
Architecture | NVIDIA Ampere |
CUDA Cores | 6912 |
Tensor Cores | 432 |
GPU Memory | 40 GB HBM2 |
Peak Double Precision (FP64) Performance | 9.7 TFLOPS |
FP64 Tensor Core Performance | 19.5 TFLOPS |
Peak Single Precision (FP32) Performance | 19.5 TFLOPS |
Tensor Float 32 (TF32) Tensor Core Performance | 311.8 TFLOPS |
Peak Half Precision (FP16) Performance | 78.0 TFLOPS |
BFLOAT16 Tensor Core Performance | 623.8 TFLOPS |
Peak Integer Operation (INT8) Performance | 1247.4 TOPS |
Peak Tensor Operation Performance1 | 623.8 TFLOPS |
Memory Interface | 5120-bit |
Memory Bandwidth | 1555.2 GB/s |
Max Power Consumption | 240W |
Thermal Solution | Blower Active Fan |
Multi-Instance GPU | Up to 7 MIG instances @ 5GB
Graphics Bus | PCIe 4.0 x16 |
Display Connectors | No Display Output Supported |
Form Factor | 4.4” H x 10.5” L Dual Slot |
Product Weight | 1181.9g |
vGPU Software Support2 | NVIDIA vPC/vApps, NVIDIA RTX Virtual Workstation (vWS) |
vGPU Profiles Supported | 4GB, 5GB, 8GB, 10GB, 20GB, 40GB
NVLink | 2-way low profile (2-slot and 3-slot bridges); connects 2x A800 40GB Active
NVLink Interconnect | 400 GB/s (bidirectional) |
Server Options | NVIDIA Certified Systems™ with 1-8 GPUs |
NVIDIA AI Enterprise | Included3 |
Power Connector | 1x PCIe CEM5 16-pin |
NVDEC | 5x decode |
1. FP16 matrix multiply with FP16 or FP32 accumulate.
2. Virtualization support for the A800 40GB Active will be available in an upcoming NVIDIA virtual GPU (vGPU) release, anticipated in Q3 2023.
3. Three-year software subscription and enterprise support for the NVIDIA AI Enterprise license. Activation required.