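Portable Roaring bitmaps in C (and C++) with full support for your favorite compiler (GNU GCC, LLVM's clang, Visual Studio, Apple Xcode, Intel oneAPI). Included in the Awesome C list of open source C software.
Bitsets, also called bitmaps, are commonly used as fast data structures. Unfortunately, they can use too much memory. To compensate, we often use compressed bitmaps.
Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. They are used by several major systems such as Apache Lucene and derivative systems such as Solr and Elasticsearch, Metamarkets' Druid, LinkedIn Pinot, Netflix Atlas, Apache Spark, OpenSearchServer, Cloud Torrent, Whoosh, InfluxDB, Pilosa, Bleve, Microsoft Visual Studio Team Services (VSTS), and eBay's Apache Kylin. The CRoaring library is used in several systems such as Apache Doris, ClickHouse, Redpanda, and StarRocks. The YouTube SQL engine, Google Procella, uses Roaring bitmaps for indexing.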
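We published a peer-reviewed article on the design and evaluation of this library:
Roaring bitmaps are found to work well in many important applications:
Use Roaring for bitmap compression whenever possible. Do not use other bitmap compression methods (Wang et al., SIGMOD 2017)
There is a serialized format specification for interoperability between implementations. Hence, it is possible to serialize a Roaring Bitmap from C++, read it in Java, modify it, serialize it back, and read it in Go and Python.
The primary goal of CRoaring is to provide a high-performance low-level implementation that takes full advantage of the latest hardware. Roaring bitmaps are already available on a variety of platforms through Java, Go, Rust... implementations. CRoaring is a library that seeks to achieve superior performance by staying close to the latest hardware.
(c) 2016-... The CRoaring authors.
Hardly anyone has access to an actual big-endian system. Nevertheless, we support big-endian systems such as IBM s390x through emulators, except for IO serialization, which is only supported on little-endian systems (see issue 423).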
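The CRoaring library can be amalgamated into a single source file, which makes it easier to integrate into other projects. Moreover, compiling all the critical code into one compilation unit can improve performance. For the rationale, please see the SQLite documentation or the corresponding Wikipedia entry. Users who choose this route do not need to rely on CRoaring's build system (based on CMake).
We provide amalgamated files as part of each release.
Linux or macOS users can follow the instructions below if they have a recent C or C++ compiler installed and a standard utility (wget).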
wget https://github.com/RoaringBitmap/CRoaring/releases/download/v2.1.0/roaring.c
wget https://github.com/RoaringBitmap/CRoaring/releases/download/v2.1.0/roaring.h
wget https://github.com/RoaringBitmap/CRoaring/releases/download/v2.1.0/roaring.hh
Create a new file named demo.c with this content:
#include <stdio.h>
#include <stdlib.h>
#include "roaring.c"
int main() {
    roaring_bitmap_t *r1 = roaring_bitmap_create();
    for (uint32_t i = 100; i < 1000; i++) roaring_bitmap_add(r1, i);
    printf("cardinality = %d\n", (int)roaring_bitmap_get_cardinality(r1));
    roaring_bitmap_free(r1);
    bitset_t *b = bitset_create();
    for (int k = 0; k < 1000; ++k) {
        bitset_set(b, 3 * k);
    }
    printf("%zu \n", bitset_count(b));
    bitset_free(b);
    return EXIT_SUCCESS;
}
Create a new file named demo.cpp with this content:
#include <iostream>
#include "roaring.hh" // the amalgamated roaring.hh includes roaring64map.hh
#include "roaring.c"
int main() {
    roaring::Roaring r1;
    for (uint32_t i = 100; i < 1000; i++) {
        r1.add(i);
    }
    std::cout << "cardinality = " << r1.cardinality() << std::endl;
    roaring::Roaring64Map r2;
    for (uint64_t i = 18000000000000000100ull; i < 18000000000000001000ull; i++) {
        r2.add(i);
    }
    std::cout << "cardinality = " << r2.cardinality() << std::endl;
    return 0;
}
cc -o demo demo.c
c++ -std=c++11 -o demopp demo.cpp
./demo
cardinality = 900
1000
./demopp
cardinality = 900
cardinality = 900
If you like CMake and CPM, you can add just a few lines to your CMakeLists.txt file to grab a CRoaring release. See our CPM demonstration for further details.
cmake_minimum_required ( VERSION 3.10)
project (roaring_demo
LANGUAGES CXX C
)
set (CMAKE_CXX_STANDARD 17)
set (CMAKE_C_STANDARD 11)
add_executable (hello hello.cpp)
# You can add CPM.cmake like so:
# mkdir -p cmake
# wget -O cmake/CPM.cmake https://github.com/cpm-cmake/CPM.cmake/releases/latest/download/get_cpm.cmake
include (cmake/CPM.cmake)
CPMAddPackage(
NAME roaring
GITHUB_REPOSITORY "RoaringBitmap/CRoaring"
GIT_TAG v2.0.4
OPTIONS "BUILD_TESTING OFF"
)
target_link_libraries (hello roaring::roaring)
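If you like CMake, you can add just a few lines to your CMakeLists.txt file to grab a CRoaring release. See our demonstration for further details.
If you installed the CRoaring library locally, you can use it with CMake's find_package function, as in this example: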
cmake_minimum_required ( VERSION 3.15)
project (test_roaring_install VERSION 0.1.0 LANGUAGES CXX C)
set (CMAKE_CXX_STANDARD 11)
set (CMAKE_CXX_STANDARD_REQUIRED ON )
set (CMAKE_C_STANDARD 11)
set (CMAKE_C_STANDARD_REQUIRED ON )
find_package (roaring REQUIRED)
file (WRITE main.cpp "
#include <iostream>
#include \"roaring/roaring.hh\"
int main() {
roaring::Roaring r1;
for (uint32_t i = 100; i < 1000; i++) {
r1.add(i);
}
std::cout << \"cardinality = \" << r1.cardinality() << std::endl;
return 0;
}" )
add_executable (repro main.cpp)
target_link_libraries (repro PUBLIC roaring::roaring)
To generate the amalgamated files yourself, you can invoke a bash script...
./amalgamation.sh
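If you prefer silent output, you can use the following command to redirect stdout:
./amalgamation.sh > /dev/null
(Bash shells are standard under Linux and macOS. Bash shells are available under Windows as part of GitHub Desktop under the name Git Shell. So if you have cloned the CRoaring GitHub repository from within GitHub Desktop, you can right-click on CRoaring, select Git Shell, and then enter the above commands.)
It is not necessary to invoke the script in the CRoaring directory. You can invoke it from any directory where you want the amalgamation files to be written.
It will generate three files for C users: roaring.h, roaring.c and amalgamation_demo.c, as well as some brief instructions. The amalgamation_demo.c file is a short example, whereas roaring.h and roaring.c are "amalgamated" files (including all source and header files for the project). This means that you can simply copy the files roaring.h and roaring.c into your project and be ready to go. No need to produce a library. See the amalgamation_demo.c file.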
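The C interface is declared in the header files roaring.h and roaring64.h.
We also have a C++ interface, declared in roaring.hh (the amalgamated roaring.hh includes roaring64map.hh).
Some users have to deal with large volumes of data. It may be important for these users to be aware of the addMany (C++) and roaring_bitmap_or_many (C) functions, as it is much faster and more economical to add values in batches when possible. Furthermore, periodically calling the runOptimize (C++) or roaring_bitmap_run_optimize (C) function may help.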
We have microbenchmarks built with Google Benchmark. Under Linux or macOS, you can run them as follows:
cmake -B build -D ENABLE_ROARING_MICROBENCHMARKS=ON
cmake --build build
./build/microbenchmarks/bench
By default, the benchmark tool picks one data set (e.g., CRoaring/benchmarks/realdata/census1881). We have several data sets, and you can pick another one:
./build/microbenchmarks/bench benchmarks/realdata/wikileaks-noquotes
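You can disable some functionality for the purpose of benchmarking. For example, assuming you have an x64 processor, you can benchmark the code without AVX-512 even if both your processor and compiler support it: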
cmake -B buildnoavx512 -D ROARING_DISABLE_AVX512=ON -D ENABLE_ROARING_MICROBENCHMARKS=ON
cmake --build buildnoavx512
./buildnoavx512/microbenchmarks/bench
You can also benchmark without AVX or AVX-512:
cmake -B buildnoavx -D ROARING_DISABLE_AVX=ON -D ENABLE_ROARING_MICROBENCHMARKS=ON
cmake --build buildnoavx
./buildnoavx/microbenchmarks/bench
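For general users, CRoaring applies the default allocator without any extra code. A global memory hook is also provided for those who want a custom memory allocator. Here is an example:
#include <roaring/roaring.h>
int main (){
// define with your own memory hook
roaring_memory_t my_hook { my_malloc , my_free ...};
// initialize global memory hook
roaring_init_memory_hook ( my_hook );
// write your code here
...
}
By default we use: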
static roaring_memory_t global_memory_hook = {
. malloc = malloc ,
. realloc = realloc ,
. calloc = calloc ,
. free = free ,
. aligned_malloc = roaring_bitmap_aligned_malloc ,
. aligned_free = roaring_bitmap_aligned_free ,
};
We require that the free/aligned_free functions follow the C convention where free(NULL)/aligned_free(NULL) has no effect.
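This example assumes that CRoaring has been built and that you are linking against the corresponding library. By default, CRoaring installs its header files in a roaring directory. If you are working from the amalgamation script and are not linking against a prebuilt CRoaring library, you can add the line #include "roaring.c" and replace #include <roaring/roaring.h> with #include "roaring.h".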
#include <assert.h>
#include <roaring/roaring.h>
#include <stdio.h>
#include <stdlib.h>
bool roaring_iterator_sumall ( uint32_t value , void * param ) {
* ( uint32_t * ) param += value ;
return true; // iterate till the end
}
int main () {
// create a new empty bitmap
roaring_bitmap_t * r1 = roaring_bitmap_create ();
// then we can add values
for ( uint32_t i = 100 ; i < 1000 ; i ++ ) roaring_bitmap_add ( r1 , i );
// check whether a value is contained
assert ( roaring_bitmap_contains ( r1 , 500 ));
// compute how many bits there are:
uint32_t cardinality = roaring_bitmap_get_cardinality ( r1 );
printf ( "Cardinality = %d \n" , cardinality );
// if your bitmaps have long runs, you can compress them by calling
// run_optimize
uint32_t expectedsizebasic = roaring_bitmap_portable_size_in_bytes ( r1 );
roaring_bitmap_run_optimize ( r1 );
uint32_t expectedsizerun = roaring_bitmap_portable_size_in_bytes ( r1 );
printf ( "size before run optimize %d bytes, and after %d bytes\n" ,
expectedsizebasic , expectedsizerun );
// create a new bitmap containing the values {1,2,3,5,6}
roaring_bitmap_t * r2 = roaring_bitmap_from ( 1 , 2 , 3 , 5 , 6 );
roaring_bitmap_printf ( r2 ); // print it
// we can also create a bitmap from a pointer to 32-bit integers
uint32_t somevalues [] = { 2 , 3 , 4 };
roaring_bitmap_t * r3 = roaring_bitmap_of_ptr ( 3 , somevalues );
// we can also go in reverse and go from arrays to bitmaps
uint64_t card1 = roaring_bitmap_get_cardinality ( r1 );
uint32_t * arr1 = ( uint32_t * ) malloc ( card1 * sizeof ( uint32_t ));
assert ( arr1 != NULL );
roaring_bitmap_to_uint32_array ( r1 , arr1 );
roaring_bitmap_t * r1f = roaring_bitmap_of_ptr ( card1 , arr1 );
free ( arr1 );
assert ( roaring_bitmap_equals ( r1 , r1f )); // what we recover is equal
roaring_bitmap_free ( r1f );
// we can go from arrays to bitmaps from "offset" by "limit"
size_t offset = 100 ;
size_t limit = 1000 ;
uint32_t * arr3 = ( uint32_t * ) malloc ( limit * sizeof ( uint32_t ));
assert ( arr3 != NULL );
roaring_bitmap_range_uint32_array ( r1 , offset , limit , arr3 );
free ( arr3 );
// we can copy and compare bitmaps
roaring_bitmap_t * z = roaring_bitmap_copy ( r3 );
assert ( roaring_bitmap_equals ( r3 , z )); // what we recover is equal
roaring_bitmap_free ( z );
// we can compute union two-by-two
roaring_bitmap_t * r1_2_3 = roaring_bitmap_or ( r1 , r2 );
roaring_bitmap_or_inplace ( r1_2_3 , r3 );
// we can compute a big union
const roaring_bitmap_t * allmybitmaps [] = { r1 , r2 , r3 };
roaring_bitmap_t * bigunion = roaring_bitmap_or_many ( 3 , allmybitmaps );
assert (
roaring_bitmap_equals ( r1_2_3 , bigunion )); // what we recover is equal
// can also do the big union with a heap
roaring_bitmap_t * bigunionheap =
roaring_bitmap_or_many_heap ( 3 , allmybitmaps );
assert ( roaring_bitmap_equals ( r1_2_3 , bigunionheap ));
roaring_bitmap_free ( r1_2_3 );
roaring_bitmap_free ( bigunion );
roaring_bitmap_free ( bigunionheap );
// we can compute intersection two-by-two
roaring_bitmap_t * i1_2 = roaring_bitmap_and ( r1 , r2 );
roaring_bitmap_free ( i1_2 );
// we can write a bitmap to a pointer and recover it later
uint32_t expectedsize = roaring_bitmap_portable_size_in_bytes ( r1 );
char * serializedbytes = malloc ( expectedsize );
// When serializing data to a file, we recommend that you also use
// checksums so that, at deserialization, you can be confident
// that you are recovering the correct data.
roaring_bitmap_portable_serialize ( r1 , serializedbytes );
// Note: it is expected that the input follows the specification
// https://github.com/RoaringBitmap/RoaringFormatSpec
// otherwise the result may be unusable.
// The 'roaring_bitmap_portable_deserialize_safe' function will not read
// beyond expectedsize bytes.
// We also recommend that you use checksums to check that serialized data corresponds
// to the serialized bitmap. The CRoaring library does not provide checksumming.
roaring_bitmap_t * t = roaring_bitmap_portable_deserialize_safe ( serializedbytes , expectedsize );
if ( t == NULL ) { return EXIT_FAILURE ; }
const char * reason = NULL ;
// If your input came from an untrusted source, then you need to validate the
// resulting bitmap. Failing to do so could lead to undefined behavior, crashes and so forth.
if (! roaring_bitmap_internal_validate ( t , & reason )) {
return EXIT_FAILURE ;
}
// At this point, the bitmap is safe.
assert ( roaring_bitmap_equals ( r1 , t )); // what we recover is equal
roaring_bitmap_free ( t );
// we can also check whether there is a bitmap at a memory location without
// reading it
size_t sizeofbitmap =
roaring_bitmap_portable_deserialize_size ( serializedbytes , expectedsize );
assert ( sizeofbitmap ==
expectedsize ); // sizeofbitmap would be zero if no bitmap were found
// We can also read the bitmap "safely" by specifying a byte size limit.
// The 'roaring_bitmap_portable_deserialize_safe' function will not read
// beyond expectedsize bytes.
// We also recommend that you use checksums to check that serialized data corresponds
// to the serialized bitmap. The CRoaring library does not provide checksumming.
t = roaring_bitmap_portable_deserialize_safe ( serializedbytes , expectedsize );
if ( t == NULL ) {
printf ( "Problem during deserialization.\n" );
// We could clear any memory and close any file here.
return EXIT_FAILURE ;
}
// We can validate the bitmap we recovered to make sure it is proper.
// If the data came from an untrusted source, you should call
// roaring_bitmap_internal_validate.
const char * reason_failure = NULL ;
if (! roaring_bitmap_internal_validate ( t , & reason_failure )) {
printf ( "safely deserialized invalid bitmap: %s\n" , reason_failure );
// We could clear any memory and close any file here.
return EXIT_FAILURE ;
}
assert ( roaring_bitmap_equals ( r1 , t )); // what we recover is equal
roaring_bitmap_free ( t );
free ( serializedbytes );
// we can iterate over all values using custom functions
uint32_t counter = 0 ;
roaring_iterate ( r1 , roaring_iterator_sumall , & counter );
// we can also create iterator structs
counter = 0 ;
roaring_uint32_iterator_t * i = roaring_iterator_create ( r1 );
while ( i -> has_value ) {
counter ++ ; // could use i->current_value
roaring_uint32_iterator_advance ( i );
}
// you can skip over values and move the iterator with
// roaring_uint32_iterator_move_equalorlarger(i,someintvalue)
roaring_uint32_iterator_free ( i );
// roaring_bitmap_get_cardinality(r1) == counter
// for greater speed, you can iterate over the data in bulk
i = roaring_iterator_create ( r1 );
uint32_t buffer [ 256 ];
while ( 1 ) {
uint32_t ret = roaring_uint32_iterator_read ( i , buffer , 256 );
for ( uint32_t j = 0 ; j < ret ; j ++ ) {
counter += buffer [ j ];
}
if ( ret < 256 ) {
break ;
}
}
roaring_uint32_iterator_free ( i );
roaring_bitmap_free ( r1 );
roaring_bitmap_free ( r2 );
roaring_bitmap_free ( r3 );
return EXIT_SUCCESS ;
}
We also support efficient 64-bit compressed bitmaps in C:
roaring64_bitmap_t *r2 = roaring64_bitmap_create();
for ( uint64_t i = 100 ; i < 1000 ; i++) roaring64_bitmap_add(r2, i);
printf ( "cardinality (64-bit) = %d\n" , ( int ) roaring64_bitmap_get_cardinality(r2));
roaring64_bitmap_free (r2);
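The API is similar to the conventional 32-bit bitmaps. Please see the header file roaring64.h (compare with roaring.h).
We support conventional bitsets (uncompressed) as part of the library.
Simple example: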
bitset_t * b = bitset_create ();
bitset_set ( b , 10 );
bitset_get ( b , 10 ); // returns true
bitset_free ( b ); // frees memory
More advanced example:
bitset_t * b = bitset_create ();
for ( int k = 0 ; k < 1000 ; ++ k ) {
bitset_set ( b , 3 * k );
}
// We have bitset_count(b) == 1000.
// We have bitset_get(b, 3) is true
// You can iterate through the values:
size_t k = 0 ;
for ( size_t i = 0 ; bitset_next_set_bit ( b , & i ); i ++ ) {
// You will have i == k
k += 3 ;
}
// We support a wide range of operations on two bitsets such as
// bitset_inplace_symmetric_difference(b1,b2);
// bitset_inplace_symmetric_difference(b1,b2);
// bitset_inplace_difference(b1,b2);// should make no difference
// bitset_inplace_union(b1,b2);
// bitset_inplace_intersection(b1,b2);
// bitsets_disjoint
// bitsets_intersect
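In some instances, you may want to convert a Roaring bitmap into a conventional (uncompressed) bitset. Indeed, bitsets have advantages such as higher query performance in some cases. The following code illustrates how you may do so: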
roaring_bitmap_t * r1 = roaring_bitmap_create ();
for ( uint32_t i = 100 ; i < 100000 ; i += 1 + ( i % 5 )) {
roaring_bitmap_add ( r1 , i );
}
for ( uint32_t i = 100000 ; i < 500000 ; i += 100 ) {
roaring_bitmap_add ( r1 , i );
}
roaring_bitmap_add_range ( r1 , 500000 , 600000 );
bitset_t * bitset = bitset_create ();
bool success = roaring_bitmap_to_bitset ( r1 , bitset );
assert ( success ); // could fail due to memory allocation.
assert ( bitset_count ( bitset ) == roaring_bitmap_get_cardinality ( r1 ));
// You can then query the bitset:
for ( uint32_t i = 100 ; i < 100000 ; i += 1 + ( i % 5 )) {
assert ( bitset_get ( bitset , i ));
}
for ( uint32_t i = 100000 ; i < 500000 ; i += 100 ) {
assert ( bitset_get ( bitset , i ));
}
// you must free the memory:
bitset_free ( bitset );
roaring_bitmap_free ( r1 );
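You should be aware that a conventional bitset (bitset_t *) may use much more memory than a Roaring bitmap in some cases. You should run benchmarks to determine whether the conversion to a bitset has performance benefits in your case.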
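This example assumes that CRoaring has been built and that you are linking against the corresponding library. By default, CRoaring installs its header files in a roaring directory, so you may need to replace #include "roaring.hh" with #include <roaring/roaring.hh>. If you are working from the amalgamation script and are not linking against a prebuilt CRoaring library, you can add the line #include "roaring.c".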
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include "roaring.hh"
using namespace roaring;
int main () {
Roaring r1;
for ( uint32_t i = 100 ; i < 1000 ; i++) {
r1. add (i);
}
// check whether a value is contained
assert (r1. contains ( 500 ));
// compute how many bits there are:
uint32_t cardinality = r1. cardinality ();
// if your bitmaps have long runs, you can compress them by calling
// run_optimize
uint32_t size = r1. getSizeInBytes ();
r1. runOptimize ();
// you can enable "copy-on-write" for fast and shallow copies
r1. setCopyOnWrite ( true );
uint32_t compact_size = r1. getSizeInBytes ();
std::cout << "size before run optimize " << size << " bytes, and after "
<< compact_size << " bytes." << std::endl;
// create a new bitmap with varargs
Roaring r2 = Roaring::bitmapOf ( 5 , 1 , 2 , 3 , 5 , 6 );
r2. printf ();
printf ( "\n" );
// create a new bitmap with initializer list
Roaring r2i = Roaring::bitmapOfList ({ 1 , 2 , 3 , 5 , 6 });
assert (r2i == r2);
// we can also create a bitmap from a pointer to 32-bit integers
const uint32_t values[] = { 2 , 3 , 4 };
Roaring r3 ( 3 , values);
// we can also go in reverse and go from arrays to bitmaps
uint64_t card1 = r1. cardinality ();
uint32_t *arr1 = new uint32_t [card1];
r1. toUint32Array (arr1);
Roaring r1f (card1, arr1);
delete[] arr1;
// bitmaps shall be equal
assert (r1 == r1f);
// we can copy and compare bitmaps
Roaring z (r3);
assert (r3 == z);
// we can compute union two-by-two
Roaring r1_2_3 = r1 | r2;
r1_2_3 |= r3;
// we can compute a big union
const Roaring *allmybitmaps[] = {&r1, &r2, &r3};
Roaring bigunion = Roaring::fastunion ( 3 , allmybitmaps);
assert (r1_2_3 == bigunion);
// we can compute intersection two-by-two
Roaring i1_2 = r1 & r2;
// we can write a bitmap to a pointer and recover it later
uint32_t expectedsize = r1. getSizeInBytes ();
char *serializedbytes = new char [expectedsize];
r1. write (serializedbytes);
// readSafe will not overflow, but the resulting bitmap
// is only valid and usable if the input follows the
// Roaring specification: https://github.com/RoaringBitmap/RoaringFormatSpec/
Roaring t = Roaring::readSafe (serializedbytes, expectedsize);
assert (r1 == t);
delete[] serializedbytes;
// we can iterate over all values using custom functions
uint32_t counter = 0 ;
r1.iterate(
[]( uint32_t value, void *param) {
*( uint32_t *)param += value;
return true ;
},
&counter);
// we can also iterate the C++ way
counter = 0 ;
for (Roaring::const_iterator i = t. begin (); i != t. end (); i++) {
++counter;
}
// counter == t.cardinality()
// we can move iterators to skip values
const uint32_t manyvalues[] = { 2 , 3 , 4 , 7 , 8 };
Roaring rogue ( 5 , manyvalues);
Roaring::const_iterator j = rogue. begin ();
j. equalorlarger ( 4 ); // *j == 4
return EXIT_SUCCESS;
}
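CRoaring follows the standard cmake workflow. Starting from the root directory of the project (CRoaring), you can do:
mkdir -p build
cd build
cmake ..
cmake --build .
# follow by 'ctest' if you want to test.
# you can also type 'make install' to install the library on your system
# C header files typically get installed to /usr/local/include/roaring
# whereas C++ header files get installed to /usr/local/include/roaring
(You can replace the build directory with any other directory name.) By default, all tests are built on all platforms. To skip building and running tests, add -DENABLE_ROARING_TESTS=OFF to the command line.
As with all cmake projects, you can specify the compilers you wish to use by adding (for example) -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ to the cmake command line.
If you are using clang or gcc and you know your target architecture, you can set the architecture by specifying -DROARING_ARCH=arch. For example, if you have many servers but the oldest one runs the Intel haswell architecture, you can specify -DROARING_ARCH=haswell. In that case, the produced binary will be optimized for processors having the characteristics of a haswell processor and may not run on older architectures. You can find the list of valid architecture values by typing man gcc.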
mkdir -p build_haswell
cd build_haswell
cmake -DROARING_ARCH=haswell ..
cmake --build .
For a debug build, starting from the root directory of the project (CRoaring), try
mkdir -p debug
cd debug
cmake -DCMAKE_BUILD_TYPE=Debug -DROARING_SANITIZE=ON ..
ctest
To check that your code abides by the style convention (make sure that clang-format is installed):
./tools/clang-format-check.sh
To reformat your code according to the style convention (make sure that clang-format is installed):
./tools/clang-format.sh
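We assume that you have a common Windows PC with at least Visual Studio 2015 and an x64 processor.
To build with at least Visual Studio 2015 from the command line:
Install CMake and make sure that cmake is available from the command line.
Create a subdirectory within CRoaring, such as VisualStudio.
Using a shell, go to this newly created directory. For example, within GitHub Desktop, you can right-click on CRoaring, select Open in Git Shell, then type cd VisualStudio in the newly created shell.
Type cmake -DCMAKE_GENERATOR_PLATFORM=x64 .. in the shell while in the VisualStudio directory. (Alternatively, if you want to build a static library, you can use the command line cmake -DCMAKE_GENERATOR_PLATFORM=x64 -DROARING_BUILD_STATIC=ON ..)
This last command creates a Visual Studio solution file in the newly created directory (e.g., RoaringBitmap.sln). Open this file in Visual Studio. You should now be able to build the project and run the tests. For example, in the Solution Explorer window (available from the View menu), right-click ALL_BUILD and select Build. To test the code, still in the Solution Explorer window, select RUN_TESTS and select Build.
To build with at least Visual Studio 2017 directly in the IDE:
Install the Visual C++ tools for CMake component of Visual Studio.
Within Visual Studio, use File > Open > Folder... to open the CRoaring folder.
Right-click on CMakeLists.txt in the parent directory within Solution Explorer and select Build to build the project.
For testing, open the Select Startup Item... menu and choose one of the tests. Run the test by pressing the button to the left of the dropdown.
We have optimizations specific to AVX2 and AVX-512 in the code, and they are enabled dynamically based on the hardware detected at runtime.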
With conan, you can install pre-built binaries for roaring or build it from source. Use the following command to install the latest version:
conan install --requires="roaring/[*]" --build=missing
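For detailed instructions on how to use Conan, please refer to the Conan documentation.
The roaring Conan recipe is kept up to date by Conan maintainers and community contributors. If the version is out of date, please create an issue or pull request on the ConanCenterIndex repository.
vcpkg users on Windows, Linux and macOS can download and install roaring with a single command from their favorite shell.
On Linux and macOS: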
$ ./vcpkg install roaring
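will build and install roaring as a static library.
On Windows (64-bit):
.\vcpkg.exe install roaring:x64-windows
will build and install roaring as a shared library.
.\vcpkg.exe install roaring:x64-windows-static
will build and install roaring as a static library.
These commands will also print out instructions on how to use the library from MSBuild or CMake-based projects.
If you find that the version of roaring shipped with vcpkg is out of date, feel free to report it to the vcpkg community by submitting an issue or creating a PR.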
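Our AVX2 code does not use floating-point numbers or multiplications, so it is not subject to turbo frequency throttling on many-core Intel processors.
Our AVX-512 code is only enabled on recent hardware (Intel Ice Lake or better and AMD Zen 4) where SIMD-specific frequency throttling is not observed.
Like STL containers, for example, the CRoaring library has no built-in thread support. Thus, whenever you modify a bitmap in one thread, it is unsafe to query it in others. However, you can safely copy a bitmap and use both copies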