cub

C/C++

2.1.0

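Note: The CUB repository has been archived and is now part of the unified nvidia/cccl repository. See the announcement for more information, and visit the new repository for the latest updates.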

About CUB

CUB provides state-of-the-art, reusable software components for every layer of the CUDA programming model:

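Device-wide primitives
- Sort, prefix scan, reduction, histogram, etc.
- Compatible with CUDA dynamic parallelism
Block-wide "collective" primitives
- I/O, sort, prefix scan, reduction, histogram, etc.
- Compatible with arbitrary thread block sizes and types
Warp-wide "collective" primitives
- Warp-wide prefix scan, reduction, etc.
- Safe and architecture-specific
Thread and resource utilities
- PTX intrinsics, device reflection, texture-caching iterators, caching memory allocators, etc.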

CUB is included in the NVIDIA HPC SDK and the CUDA Toolkit.

We recommend the CUB project website for further information and examples.

A Simple Example

#include <cub/cub.cuh>

// Block-sorting CUDA kernel
__global__ void BlockSortKernel(int *d_in, int *d_out)
{
    using namespace cub;

    // Specialize BlockRadixSort, BlockLoad, and BlockStore for 128 threads
    // owning 16 integer items each
    typedef BlockRadixSort<int, 128, 16>                     BlockRadixSort;
    typedef BlockLoad<int, 128, 16, BLOCK_LOAD_TRANSPOSE>   BlockLoad;
    typedef BlockStore<int, 128, 16, BLOCK_STORE_TRANSPOSE> BlockStore;

    // Allocate shared memory
    __shared__ union {
        typename BlockRadixSort::TempStorage  sort;
        typename BlockLoad::TempStorage       load;
        typename BlockStore::TempStorage      store;
    } temp_storage;

    int block_offset = blockIdx.x * (128 * 16);  // Offset for this block's segment

    // Obtain a segment of 2048 consecutive keys that are blocked across threads
    int thread_keys[16];
    BlockLoad(temp_storage.load).Load(d_in + block_offset, thread_keys);
    __syncthreads();

    // Collectively sort the keys
    BlockRadixSort(temp_storage.sort).Sort(thread_keys);
    __syncthreads();

    // Store the sorted segment
    BlockStore(temp_storage.store).Store(d_out + block_offset, thread_keys);
}

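Each thread block uses cub::BlockRadixSort to collectively sort its own input segment. The class is specialized by the data type being sorted, by the number of threads per block, by the number of keys per thread, and implicitly by the targeted compilation architecture.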
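
For checking the kernel's output on the host, a plain C++ reference can be handy. The sketch below is not part of CUB; the function name SortSegmentsReference and its parameters are hypothetical. It assumes that each thread block sorts one segment of 128 * 16 = 2048 keys in ascending order, as in the kernel above, and it leaves any trailing partial segment unsorted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not a CUB API): sorts each full segment
// of `segment_size` keys independently, mirroring one thread block per segment.
// A trailing partial segment, if any, is copied through unsorted.
std::vector<int> SortSegmentsReference(const std::vector<int>& in,
                                       std::size_t segment_size = 128 * 16)
{
    std::vector<int> out(in);
    for (std::size_t offset = 0; offset + segment_size <= out.size();
         offset += segment_size)
    {
        // `offset` plays the role of block_offset in the kernel.
        std::sort(out.begin() + offset, out.begin() + offset + segment_size);
    }
    return out;
}
```

Comparing the device output copied back from d_out against this function's result, element by element, is one simple way to validate a launch with one block per segment.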

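The cub::BlockLoad and cub::BlockStore classes are similarly specialized. Furthermore, to provide coalesced accesses to device memory, these primitives are configured to access memory using a striped access pattern (where consecutive threads simultaneously access consecutive items) and then transpose the keys into a blocked arrangement of elements across threads.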
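
The difference between the striped and blocked arrangements comes down to index arithmetic. The following host-side C++ sketch is a simplified model, not CUB's implementation; all names in it (StripedIndex, BlockedIndex, LoadStripedThenTranspose) are hypothetical, and a plain vector stands in for the shared-memory staging buffer.

```cpp
#include <cstddef>
#include <vector>

// Segment index of item `i` held by thread `t` in a striped arrangement:
// at step i, consecutive threads touch consecutive items.
constexpr std::size_t StripedIndex(std::size_t t, std::size_t i,
                                   std::size_t block_threads)
{
    return i * block_threads + t;
}

// Segment index of item `i` held by thread `t` in a blocked arrangement:
// each thread owns a contiguous run of items.
constexpr std::size_t BlockedIndex(std::size_t t, std::size_t i,
                                   std::size_t items_per_thread)
{
    return t * items_per_thread + i;
}

// Models a striped read followed by a transpose to a blocked arrangement.
// Returns the items each simulated thread holds at the end.
std::vector<std::vector<int>> LoadStripedThenTranspose(
    const std::vector<int>& segment,
    std::size_t block_threads, std::size_t items_per_thread)
{
    // Step 1: every thread reads its items with the striped pattern.
    std::vector<std::vector<int>> striped(block_threads,
                                          std::vector<int>(items_per_thread));
    for (std::size_t t = 0; t < block_threads; ++t)
        for (std::size_t i = 0; i < items_per_thread; ++i)
            striped[t][i] = segment[StripedIndex(t, i, block_threads)];

    // Step 2: exchange through a staging buffer (stand-in for shared memory).
    std::vector<int> staged(segment.size());
    for (std::size_t t = 0; t < block_threads; ++t)
        for (std::size_t i = 0; i < items_per_thread; ++i)
            staged[StripedIndex(t, i, block_threads)] = striped[t][i];

    // Step 3: every thread picks up its contiguous (blocked) run.
    std::vector<std::vector<int>> blocked(block_threads,
                                          std::vector<int>(items_per_thread));
    for (std::size_t t = 0; t < block_threads; ++t)
        for (std::size_t i = 0; i < items_per_thread; ++i)
            blocked[t][i] = staged[BlockedIndex(t, i, items_per_thread)];
    return blocked;
}
```

With 4 threads and 2 items per thread, thread 1 reads segment items 1 and 5 in the striped step and ends up holding items 2 and 3 after the transpose.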

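Once specialized, these classes expose opaque TempStorage member types. The thread block uses these storage types to statically allocate the union of shared memory needed by the thread block. (Alternatively, these storage types could be aliased to global memory allocations.)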
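
The benefit of placing the storage types in a union can be illustrated in plain C++. In the sketch below, SortStorage, LoadStorage, and StoreStorage are hypothetical stand-ins for the opaque TempStorage types, and their byte sizes are made up for illustration; the real sizes depend on the specialization and the target architecture.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-ins for the opaque TempStorage types (sizes are invented).
struct SortStorage  { char bytes[4096]; };
struct LoadStorage  { char bytes[8192]; };
struct StoreStorage { char bytes[8192]; };

// Same shape as the anonymous union in the kernel: all members share one region.
union SharedStorage
{
    SortStorage  sort;
    LoadStorage  load;
    StoreStorage store;
};

// Footprint if each phase had its own allocation.
constexpr std::size_t kSeparateBytes =
    sizeof(SortStorage) + sizeof(LoadStorage) + sizeof(StoreStorage);

// Footprint of the union: only as large as its largest member.
constexpr std::size_t kUnionBytes = sizeof(SharedStorage);

static_assert(kUnionBytes == std::max({sizeof(SortStorage),
                                       sizeof(LoadStorage),
                                       sizeof(StoreStorage)}),
              "a union is sized by its largest member");
static_assert(kUnionBytes < kSeparateBytes,
              "aliasing the phases saves memory");
```

Because the three phases reuse the same bytes, a phase must finish with the storage before the next one starts writing to it, which is why the kernel calls __syncthreads() between the load, sort, and store steps.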

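Supported Compilers

CUB is regularly tested using the specified versions of the following compilers. Unsupported versions may emit deprecation warnings, which can be silenced by defining CUB_IGNORE_DEPRECATED_COMPILER during compilation.

NVCC 11.0+
GCC 5+
Clang 7+
MSVC 2019+ (19.20/16.0/14.20)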

Releases

In addition to GitHub, CUB is distributed with the NVIDIA HPC SDK and the CUDA Toolkit.

See the changelog for details about specific releases.

CUB Release	Included In
2.0.1	CUDA Toolkit 12.0
2.0.0	TBD
1.17.2	TBD
1.17.1	TBD
1.17.0	TBD
1.16.0	TBD
1.15.0	NVIDIA HPC SDK 22.1 and CUDA Toolkit 11.6
1.14.0	NVIDIA HPC SDK 21.9
1.13.1	CUDA Toolkit 11.5
1.13.0	NVIDIA HPC SDK 21.7
1.12.1	CUDA Toolkit 11.4
1.12.0	NVIDIA HPC SDK 21.3
1.11.0	CUDA Toolkit 11.3
1.10.0	NVIDIA HPC SDK 20.9 and CUDA Toolkit 11.2
1.9.10-1	NVIDIA HPC SDK 20.7 and CUDA Toolkit 11.1
1.9.10	NVIDIA HPC SDK 20.5
1.9.9	CUDA Toolkit 11.0
1.9.8-1	NVIDIA HPC SDK 20.3
1.9.8	CUDA Toolkit 11.0 Early Access
1.9.8	CUDA 11.0 Early Access
1.8.0
1.7.5	Thrust 1.9.2
1.7.4	Thrust 1.9.1-2
1.7.3
1.7.2
1.7.1
1.7.0	Thrust 1.9.0-5
1.6.4
1.6.3
1.6.2 (previously 1.5.5)
1.6.1 (previously 1.5.4)
1.6.0 (previously 1.5.3)
1.5.2
1.5.1
1.5.0
1.4.1
1.4.0
1.3.2
1.3.1
1.3.0
1.2.3
1.2.2
1.2.0
1.1.1
1.0.2
1.0.1
0.9.4
0.9.2
0.9.1
0.9.0

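Development Process

CUB and Thrust depend on each other. It is recommended to clone Thrust and build CUB as a component of Thrust.

CUB uses the CMake build system to build unit tests, examples, and header tests. To build CUB as a developer, follow this recipe: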

# Clone Thrust and CUB from Github. CUB is located in Thrust's
# `dependencies/cub` submodule.
git clone --recursive https://github.com/NVIDIA/thrust.git
cd thrust

# Create build directory:
mkdir build
cd build

# Configure -- use one of the following:
cmake -DTHRUST_INCLUDE_CUB_CMAKE=ON ..   # Command line interface.
ccmake -DTHRUST_INCLUDE_CUB_CMAKE=ON ..  # ncurses GUI (Linux only)
cmake-gui  # Graphical UI, set source/build directories and options in the app

# Build:
cmake --build . -j <num jobs>   # invokes make (or ninja, etc)

# Run tests and examples:
ctest

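By default, the C++14 standard is targeted, but this can be changed in CMake. More information on configuring your CUB build and creating a pull request can be found in CONTRIBUTING.md.

Open Source License

CUB is available under the "New BSD" open-source license: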

Copyright (c) 2010-2011, Duane Merrill.  All rights reserved.
Copyright (c) 2011-2018, NVIDIA CORPORATION.  All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
   *  Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
   *  Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
   *  Neither the name of the NVIDIA CORPORATION nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
