GPUs are being used ever more widely in traditional graphics visualization. Whether for 3D design, high-resolution multi-screen video walls, or special effects rendering, more and more users are turning to professional graphics cards for their stability. Cutting-edge fields such as deep learning and big data also rely on large numbers of professional GPUs to speed up training. NVIDIA's latest Turing architecture products have been available since last year. So what has changed in the new Quadro RTX 5000 professional graphics card, and what has been improved? Let's find out in this unboxing and review.
RTX 5000 box appearance
Packaging and accessories
Card front
Connectors at the top of the card
Display connectors
Graphics card PCB
Component | Model |
---|---|
Motherboard | Gigabyte Z390 AORUS Master |
CPU | Intel Core i9-9900K |
DISK | NVME SSD 512GB |
MEM | 64GB DDR4 |
Graphic | RTX 5000 |
Power | ATX 1000W |
System | Windows 10 64-bit 1809 / Ubuntu 16.04 |
Driver version | 419.71 |
Test software |
---|
SPECviewperf 13 |
Superposition Benchmark |
V-Ray Benchmark |
CUDA-Z 0.10.251 |
3DMark Port Royal |
OctaneBench 2019 Preview |
NVIDIA TensorFlow example |
Specification | RTX 5000 | P5000 |
---|---|---|
CUDA Cores | 3072 | 2560 |
Tensor Cores | 384 | N/A |
RT Cores | 48 | N/A |
GPU Memory | 16 GB GDDR6 | 16 GB GDDR5X |
Graphics Bus | PCI Express 3.0 x16 | PCI Express 3.0 x16 |
Bridge Mode | NVLink | SLI |
Display connectors | DP 1.4 (4), VirtualLink (1) | DP 1.4 (4), DVI-D (1) |
VR Ready | Yes | Yes |
Sync | Quadro Sync II | Quadro Sync II |
Power consumption | Total board power: 265 W | Total board power: 180 W |
SPECviewperf 13 is a widely used benchmark that measures graphics performance based on professional applications. It tests OpenGL and DirectX performance with workloads from professional graphics software, and version 13 brings 9 new professional graphics test scenes.
The SPECviewperf 13 tests are close to real-world applications. Some test scenes contain more than 60 million vertices, so the results fully reflect the card's professional graphics performance. The default configuration is used in the test.
In the test results, the RTX 5000 outperformed the Quadro P5000 in every viewset, and performance in the snx viewset increased by more than 40%. The Turing architecture evidently adds more than ray tracing and deep learning efficiency; professional application performance also improves substantially.
The Superposition benchmark is more like a complex game environment: it tests the graphics card's DirectX and OpenGL rendering performance and its stability under different lighting effects.
In DirectX, the RTX 5000 is nearly 45% faster than the P5000; in OpenGL, it is about 50% faster. With such large gains in both major graphics APIs, the RTX 5000 should be well suited to professional 3D visualization.
Chaos Group's V-Ray has long been widely recognized in the rendering field. As GPU rendering performance improved, Chaos Group introduced V-Ray GPU Next as part of V-Ray Next, which uses NVIDIA CUDA cores for rendering. As the technology has evolved, the quality of GPU rendering has become almost the same as that of CPU rendering. GPUs are powerful, reduce rendering time and cost, and support multi-card rendering, so many renderers are adding GPU rendering capabilities of their own. This benchmark tests the rendering performance of a single card only. The shorter the rendering time, the better the performance.
Test screenshot
Test result
In the test results, the RTX 5000 finishes the render in 35% less time than the P5000, making it clearly more efficient for V-Ray rendering.
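A time saving and a speedup are easy to conflate. As a rough check, assuming the 35% figure is a reduction in wall-clock render time, the equivalent speedup factor can be worked out as below. The function name is illustrative only and is not part of the V-Ray benchmark.

```python
def speedup_from_time_saved(fraction_saved: float) -> float:
    """Convert a fractional reduction in run time into a speedup factor.

    If the new run takes (1 - fraction_saved) of the old time, the same
    work is done 1 / (1 - fraction_saved) times as fast.
    """
    if not 0.0 <= fraction_saved < 1.0:
        raise ValueError("fraction_saved must be in the range [0, 1)")
    return 1.0 / (1.0 - fraction_saved)


# 35% less render time, as reported for the RTX 5000 versus the P5000.
vray_speedup = speedup_from_time_saved(0.35)
print(f"35% less render time is a {vray_speedup:.2f}x rendering speed")
```

Under that assumption, a 35% time saving corresponds to roughly a 1.5x rendering speed, which is in line with the gains seen in the other graphics tests.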
Like the familiar CPU-Z and GPU-Z, CUDA-Z is a utility that reports basic information about NVIDIA GPUs. It works with GeForce, Quadro, and Tesla cards.
Test screenshot
CUDA performance
In the CUDA-Z test, the most relevant figure is single-precision floating-point performance. For double-precision scientific computing, the GV100 or GP100 GPUs, which offer high double-precision performance, are recommended instead. The RTX 5000's single-precision performance reaches 11.7 TFLOPS, a 36% increase over the previous-generation P5000. No CPU can match this level of single-precision floating-point performance, which is why more and more applications have shifted computation from the CPU to the GPU.
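The single-precision figure can be sanity-checked against the core count from the specification table. The sketch below assumes the usual peak estimate in which each CUDA core completes one fused multiply-add (counted as two floating-point operations) per clock. The helper names are hypothetical, and the clock speed is derived from the measured result rather than taken from a data sheet.

```python
# Assumption: one fused multiply-add per CUDA core per clock = 2 FLOPs.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    """Estimated peak FP32 throughput in TFLOPS for a given core clock."""
    # cores * FLOPs per clock * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return cuda_cores * FLOPS_PER_CORE_PER_CLOCK * clock_ghz / 1000.0


def implied_clock_ghz(cuda_cores: int, measured_tflops: float) -> float:
    """Core clock (GHz) that would be needed to reach a measured FP32 rate."""
    return measured_tflops * 1000.0 / (cuda_cores * FLOPS_PER_CORE_PER_CLOCK)


# Values from this review: 3072 CUDA cores, 11.7 TFLOPS measured in CUDA-Z.
rtx5000_clock = implied_clock_ghz(3072, 11.7)

# A 36% increase over the P5000 implies this P5000 result.
p5000_tflops = 11.7 / 1.36

print(f"Implied RTX 5000 clock during the test: {rtx5000_clock:.2f} GHz")
print(f"Implied P5000 single-precision result: {p5000_tflops:.1f} TFLOPS")
```

If the assumption holds, the measured 11.7 TFLOPS corresponds to the card running at about 1.9 GHz during the test, and the 36% gain puts the P5000 at roughly 8.6 TFLOPS.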
Test scene
Test result
NVIDIA's RTX real-time ray tracing can currently be benchmarked with 3DMark's Port Royal. The Quadro P5000 cannot run this benchmark because it has no RT Cores.
The test scene contains a large amount of metallic material, and the reflections are impressive. The RTX 5000 renders it at around 28 FPS, which is fairly smooth. In games, Battlefield V (BF5) already uses this technology, and it is expected to be widely adopted in industrial manufacturing and in film and television post-production.
OctaneRender (often abbreviated to OC) is a GPU-accelerated renderer for 3D design and animation. It can be used with 3D modeling and special effects software such as 3ds Max, CINEMA 4D, NUKE, and MODO, and it supports out-of-core rendering. The latest OctaneBench 2019 Preview supports RT Cores for accelerating ray-traced rendering, so we can compare rendering speed with RTX on and off.
Test screenshot
The software renders the same scene once with RTX on and once with RTX off. In the test results, rendering with RTX on is nearly three times faster than with RTX off, a clear improvement delivered by the RT Cores.
We chose an NVIDIA TensorFlow example to test the card's deep learning performance. With identical parameter settings, the more images the graphics card trains on per second, the better its deep learning performance.
As you can see in the screenshot above, the RTX 5000 can process up to 441 images per second at full load.
In the screenshot above, the P5000 processes up to 194 images per second.
In images processed per second, the RTX 5000 is about 2.3 times as fast as the P5000 (441 versus 194). The large gain comes from the Tensor Cores being used for the calculations, which shows how much they contribute to accelerating deep learning. All high-end Quadro RTX cards have Tensor Cores, so they suit applications that combine graphics with AI. For example, AI denoising, AI image recognition, or AI inference can run while rendering.
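The ratio of the two reported throughputs can be checked directly. The short sketch below also estimates how long each card would need for a fixed workload. The dataset size of one million images is purely hypothetical, chosen only to make the difference tangible, and the estimate assumes throughput stays constant.

```python
# Peak throughputs reported in this review (images per second).
rtx5000_images_per_sec = 441.0
p5000_images_per_sec = 194.0

# Relative training throughput of the RTX 5000 over the P5000.
throughput_ratio = rtx5000_images_per_sec / p5000_images_per_sec


def minutes_for_dataset(num_images: int, images_per_sec: float) -> float:
    """Minutes needed for one pass over a dataset at a constant throughput."""
    return num_images / images_per_sec / 60.0


# Hypothetical workload, not taken from the benchmark itself.
HYPOTHETICAL_DATASET_SIZE = 1_000_000

rtx5000_minutes = minutes_for_dataset(HYPOTHETICAL_DATASET_SIZE, rtx5000_images_per_sec)
p5000_minutes = minutes_for_dataset(HYPOTHETICAL_DATASET_SIZE, p5000_images_per_sec)

print(f"Throughput ratio: {throughput_ratio:.2f}x")
print(f"RTX 5000: {rtx5000_minutes:.0f} min, P5000: {p5000_minutes:.0f} min per pass")
```

At these rates, one pass over the hypothetical dataset would take well under an hour on the RTX 5000 and well over an hour on the P5000.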
The major features of the RTX 5000 graphics card are: