June 18, 2013 – NVIDIA today announced the public availability of the latest version of the NVIDIA® CUDA® parallel computing platform and programming model, which for the first time delivers support for ARM-based platforms.
Available today as a free download at http://developer.nvidia.com/cuda-toolkit, the CUDA 5.5 release candidate brings the power of GPU-accelerated computing to ARM platforms, the world’s fastest-growing processor ecosystem – approximately 10 times larger than the x86 CPU-based market.
The new CUDA release provides programmers with a robust, easy-to-use platform to develop advanced science, engineering, mobile and high performance computing (HPC) applications on ARM and x86 CPU-based systems.
“Since developers started using CUDA in 2006, successive generations of better, exponentially faster CUDA GPUs have dramatically boosted the performance of applications on x86-based systems,” said Ian Buck, general manager of GPU Computing Software at NVIDIA. “With support for ARM, the new CUDA release gives developers tremendous flexibility to quickly and easily add GPU acceleration to applications on the broadest range of next-generation HPC platforms.”
Combining high-performance CUDA-enabled GPU accelerators with low-power ARM-based SoCs enables ARM-based systems to penetrate new markets that require the highest levels of energy-efficient compute performance. These market segments include defense systems, automotive, energy exploration, mobile computing, robotics, scientific research and HPC, among others.
Robust Parallel Programming Features
In addition to providing native support for ARM platforms, the CUDA 5.5 release delivers a number of new advanced performance and productivity features, including:
- Enhanced Hyper-Q support – Now supported across multiple MPI processes on all Linux systems
- MPI Workload Prioritization – Allows application developers to give higher priority to CUDA streams on the critical path, optimizing overall application run time
- New guided performance analysis – Visual Profiler and Nsight Eclipse Edition now walk developers step-by-step through the process of identifying performance bottlenecks and applying optimizations
- Fast cross-compile on x86 – Reduces development time for large applications by enabling developers to compile ARM code on fast x86 processors and then transfer the compiled application to ARM
In addition, CUDA 5.5 offers a full suite of programming tools, GPU-accelerated math libraries and documentation for both x86- and ARM-based platforms:
- Robust programming tools – Full support for the CUDA compiler, debugger and performance analysis tools
- GPU-accelerated math libraries – FFT, RNG, BLAS, sparse matrix operations, and nearly 5,000 signal- and image-processing primitives in the NVIDIA Performance Primitives (NPP) library
- Documentation/programming guides – Complete documentation, code samples and more to help developers quickly learn how to take advantage of GPU-accelerated computing
CUDA is a parallel computing platform and programming model developed by NVIDIA. It enables dramatic increases in computing performance by harnessing the power of GPUs. With more than 1.8 million downloads and support for more than 200 leading engineering, scientific and commercial applications, the CUDA programming model is taught in more than 640 universities worldwide and is the most popular way for developers to take advantage of GPU-accelerated computing.