NVIDIA® Quadro RTX™ 8000, powered by the NVIDIA Turing™ architecture and the NVIDIA RTX™ platform, combines unparalleled performance and memory capacity to deliver the world’s most powerful graphics card solution for professional workflows. Creative and technical professionals can now wield the power of hardware-accelerated ray tracing, AI and advanced shading to boost productivity and create amazing content faster than ever before.
The Quadro RTX 8000 features 72 RT Cores for real-time ray tracing and 576 Tensor Cores for AI-enhanced workflows, resulting in over 130 TFLOPS of deep learning performance. With 48 GB of GDDR6 memory, scalable to 96 GB with NVIDIA NVLink technology, the Quadro RTX 8000 is designed for the most memory-intensive workloads, such as creating highly complex models, building massive architectural datasets, visualizing immense data science workloads, working with 8K movie content in real time, and speeding up high-resolution final-frame rendering. VirtualLink® provides connectivity to next-generation, high-resolution VR HMDs to let you view your work in the most compelling virtual environments. The NVIDIA Quadro RTX 8000 redefines what’s possible.
Links two GPUs with a high-speed interconnect to scale memory capacity to 96 GB and drive higher performance with up to 100 GB/s of data transfer.
Equipped with an industry-first 48 GB of ultra-fast GDDR6 memory to hold complex designs, massive architectural datasets, 8K movie content, and more.
Armed with the all-new RT Core for ray tracing, 576 Tensor Cores for AI and 4,608 CUDA cores for parallel computing, NVIDIA Turing is simply the world's most advanced GPU.
Industry-first implementation of VirtualLink to simplify connectivity to current and next-generation high-resolution VR head-mounted displays.
Connect a pair of Quadro RTX 8000 cards with NVLink to double the effective memory footprint and scale application performance by enabling GPU-to-GPU data transfers at rates up to 100 GB/s (total bidirectional bandwidth).
Leverage multiple GPUs to dynamically scale graphics performance, enhance image quality, expand display real estate, and assemble a fully virtualized system.
Dramatically reduce visual aliasing artifacts or "jaggies" with up to 64x FSAA (128x with SLI) for unparalleled image quality and highly realistic scenes.
Texture from and render to 32K x 32K surfaces to support applications that demand the highest resolution and quality image processing.
Synchronize the display and image output of up to 32 displays[iii] from 8 GPUs (connected through two Sync II boards) in a single system, reducing the number of machines needed to create an advanced video visualization environment.
Deep learning frameworks such as Caffe2, MXNet, CNTK, TensorFlow, and others deliver dramatically faster training times and higher multi-node training performance. GPU-accelerated libraries such as cuDNN, cuBLAS, and TensorRT deliver higher performance for both deep learning inference and High-Performance Computing (HPC) applications.
Natively execute standard programming languages like C/C++ and Fortran, and APIs such as OpenCL, OpenACC and DirectCompute, to accelerate techniques such as ray tracing, video and image processing, and computational fluid dynamics.
A single, seamless 49-bit virtual address space allows for the transparent migration of data between the full allocation of CPU and GPU memory.
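For a sense of scale, the size of a 49-bit address space can be worked out directly. The Python sketch below is illustrative arithmetic only; the helper name `addressable_bytes` is hypothetical and is not part of any NVIDIA API.

```python
# Illustrative arithmetic: how much memory a 49-bit virtual address can span.
ADDRESS_BITS = 49  # figure quoted in the text above


def addressable_bytes(bits: int) -> int:
    """Return the number of distinct byte addresses for a given address width."""
    return 1 << bits


total_bytes = addressable_bytes(ADDRESS_BITS)
tib = total_bytes // (1 << 40)  # 1 TiB = 2**40 bytes

print(f"{ADDRESS_BITS}-bit address space spans {tib} TiB")
```

This works out to 512 TiB, far more than the combined CPU and GPU memory of a typical workstation, which is why a single address space can cover both.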
GPUDirect for Video speeds communication between the GPU and video I/O devices by avoiding unnecessary system memory copies and CPU overhead.
Maximize system uptime, seamlessly manage wide-scale deployments and remotely control graphics and display settings for efficient operations.
i This feature requires implementation by software applications and it is not a stand-alone utility. Please contact quadrohelp@nvidia.com for details on availability.
ii Application must be aware of and be optimized for NVLink to take advantage of this capability.
iii Feature supported in future driver release.
GPU Architecture | Turing |
CUDA Parallel Processing cores | 4608 |
NVIDIA Tensor Cores | 576 |
NVIDIA RT Cores | 72 |
Frame Buffer Memory | 48 GB GDDR6 |
RTX-OPS | 84T |
Rays Cast | 10 Giga Rays/Sec |
Peak Single Precision (FP32) Performance | 16.3 TFLOPS |
Peak Half Precision (FP16) Performance | 32.6 TFLOPS |
Peak Integer Operation (INT8) Performance | 65.2 TOPS |
Deep Learning TeraFLOPS[1] | 130.5 Tensor TFLOPS |
Memory Interface | 384-bit |
Memory Bandwidth | 672 GB/s |
Max Power Consumption | 295 W |
Graphics Bus | PCI Express 3.0 x16 |
Display Connectors | DP 1.4 (4) + VirtualLink (1) |
Form Factor | 4.4” H x 10.5” L Dual Slot |
Product Weight | 1.002 kg |
Thermal Solution | Active |
Power Connector | 1x 8-pin & 1x 6-pin |
Frame lock | Compatible (with Quadro Sync II) |
NVLink Interconnect | 100 GB/s |
1 FP16 matrix multiply with FP16 or FP32 accumulate
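
Several figures in the table above can be cross-checked against one another. The Python sketch below is plain arithmetic, not a benchmark. It assumes each CUDA core completes one fused multiply-add (two floating-point operations) per clock, which the table does not state, and all variable names are illustrative.

```python
# Values copied from the specification table above.
CUDA_CORES = 4608
FP32_TFLOPS = 16.3
FP16_TFLOPS = 32.6
INT8_TOPS = 65.2
TENSOR_TFLOPS = 130.5
MEM_BUS_BITS = 384
MEM_BANDWIDTH_GBS = 672

# Assumption: 2 FLOPs per core per clock (one fused multiply-add).
FLOPS_PER_CORE_PER_CLOCK = 2

# Clock speed implied by the peak FP32 figure, in GHz.
implied_clock_ghz = (
    FP32_TFLOPS * 1e12 / (FLOPS_PER_CORE_PER_CLOCK * CUDA_CORES) / 1e9
)

# Per-pin data rate implied by bandwidth and bus width, in Gbit/s.
implied_pin_rate_gbps = MEM_BANDWIDTH_GBS * 8 / MEM_BUS_BITS

# Ratios of the reduced-precision peaks to the FP32 peak.
fp16_ratio = FP16_TFLOPS / FP32_TFLOPS
int8_ratio = INT8_TOPS / FP32_TFLOPS
tensor_ratio = TENSOR_TFLOPS / FP32_TFLOPS

print(f"Implied clock:        {implied_clock_ghz:.2f} GHz")
print(f"Implied pin rate:     {implied_pin_rate_gbps:.1f} Gbit/s")
print(f"FP16 / FP32 ratio:    {fp16_ratio:.2f}")
print(f"INT8 / FP32 ratio:    {int8_ratio:.2f}")
print(f"Tensor / FP32 ratio:  {tensor_ratio:.2f}")
```

Under that assumption, the FP16, INT8, and Tensor figures come out at roughly 2x, 4x, and 8x the FP32 peak, and the 672 GB/s bandwidth matches a 384-bit bus at 14 Gbit/s per pin.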