grep, ripgrep, ugrep, The Silver Searcher, etc.
The following tests compare the performance of hypergrep against ripgrep v13.0.0, ag (The Silver Searcher) v2.2.0, and ugrep v3.11.2.
Type | Value |
---|---|
Processor | 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz 3.50 GHz |
Instruction Set Extensions | Intel® SSE4.1, Intel® SSE4.2, Intel® AVX2, Intel® AVX-512 |
Installed RAM | 32.0 GB (31.9 GB usable) |
SSD | ADATA SX8200PNP |
OS | Ubuntu 20.04 LTS |
C++ Compiler | g++ (Ubuntu 11.1.0-1ubuntu1-20.04) 11.1.0 |
vcpkg commit: 662dbb5
Library | Version |
---|---|
argparse | 2.9 |
concurrentqueue | 1.0.3 |
fmt | 10.0.0 |
hyperscan | 5.4.2 |
libgit2 | 1.6.4 |
OpenSubtitles.raw.en.txt
The following searches are performed on a single large file cached in memory (~13GB, `OpenSubtitles.raw.en.gz`).
Regex | Line Count | ag | ugrep | ripgrep | hypergrep |
---|---|---|---|---|---|
Count number of times Holmes did something<br>`hgrep -c 'Holmes did \w'` | 27 | n/a | 1.820 | 1.022 | 0.696 |
Literal with Regex Suffix<br>`hgrep -nw 'Sherlock [A-Z]\w+' en.txt` | 7882 | n/a | 1.812 | 1.509 | 0.803 |
Simple Literal<br>`hgrep -nw 'Sherlock Holmes' en.txt` | 7653 | 15.764 | 1.888 | 1.524 | 0.658 |
Simple Literal (case insensitive)<br>`hgrep -inw 'Sherlock Holmes' en.txt` | 7871 | 15.599 | 6.945 | 2.162 | 0.650 |
Alternation of Literals<br>`hgrep -n 'Sherlock Holmes\|John Watson\|Irene Adler\|Inspector Lestrade\|Professor Moriarty' en.txt` | 10078 | n/a | 6.886 | 1.836 | 0.689 |
Alternation of Literals (case insensitive)<br>`hgrep -in 'Sherlock Holmes\|John Watson\|Irene Adler\|Inspector Lestrade\|Professor Moriarty' en.txt` | 10333 | n/a | 7.029 | 3.940 | 0.770 |
Words surrounding a literal string<br>`hgrep -n '\w+[\x20]+Holmes[\x20]+\w+' en.txt` | 5020 | n/a | 6m 11s | 1.523 | 0.638 |
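The timings above depend on the file already being in the page cache. A minimal sketch of how such a measurement could be reproduced follows. `best_of` is a hypothetical helper, not part of hypergrep. It assumes GNU `date` (for `%N` nanoseconds, as found on the Ubuntu system described earlier). It is not necessarily how the published numbers were collected.

```shell
# best_of RUNS CMD [ARGS...]
# Hypothetical helper: runs CMD RUNS times, discards its output, and prints
# the fastest wall-clock time in milliseconds.
# Assumption: GNU date, which supports %N (nanoseconds).
# The body runs in a subshell so its variables do not leak into the caller.
best_of() (
  runs=$1
  shift
  best=
  i=0
  while [ "$i" -lt "$runs" ]; do
    start=$(date +%s%N)                          # nanoseconds since the epoch
    "$@" > /dev/null 2>&1                        # output is not part of the measurement
    end=$(date +%s%N)
    elapsed=$(( ($end - $start) / 1000000 ))     # convert to milliseconds
    if [ -z "$best" ] || [ "$elapsed" -lt "$best" ]; then
      best=$elapsed
    fi
    i=$((i + 1))
  done
  echo "$best"
)
```

For example, `cat en.txt > /dev/null` reads the file once to warm the cache, after which `best_of 5 hgrep -nw 'Sherlock Holmes' en.txt` would report the fastest of five runs. Taking the minimum rather than the mean is a design choice that reduces noise from other processes.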
torvalds/linux
The following searches are performed on the entire Linux kernel source tree (after running `make defconfig && make -j8`). The commit used is f1fcb.
Regex | Line Count | ag | ugrep | ripgrep | hypergrep |
---|---|---|---|---|---|
Simple Literal<br>`hgrep -nw 'PM_RESUME'` | 9 | 2.807 | 0.316 | 0.147 | 0.140 |
Simple Literal (case insensitive)<br>`hgrep -niw 'PM_RESUME'` | 39 | 2.904 | 0.435 | 0.149 | 0.141 |
Regex with Literal Suffix<br>`hgrep -nw '[A-Z]+_SUSPEND'` | 536 | 3.080 | 1.452 | 0.148 | 0.143 |
Alternation of four literals<br>`hgrep -nw '(ERR_SYS\|PME_TURN_OFF\|LINK_REQ_RST\|CFG_BME_EVT)'` | 16 | 3.085 | 0.410 | 0.153 | 0.146 |
Unicode Greek<br>`hgrep -n '\p{Greek}'` | 111 | 3.762 | 0.484 | 0.345 | 0.146 |
apple/swift
The following searches are performed on the entire Apple Swift source tree. The commit used is 3865b.
Regex | Line Count | ag | ugrep | ripgrep | hypergrep |
---|---|---|---|---|---|
Function/Struct/Enum declaration followed by a valid identifier and opening parenthesis<br>`hgrep -n '(func\|struct\|enum)\s+[A-Za-z_][A-Za-z0-9_]*\s*\('` | 59026 | 1.148 | 0.954 | 0.154 | 0.090 |
Words starting with alphabetic characters followed by at least 2 digits<br>`hgrep -nw '[A-Za-z]+\d{2,}'` | 127858 | 1.169 | 1.238 | 0.156 | 0.095 |
Words starting with an uppercase letter, followed by alphanumeric characters and/or underscores<br>`hgrep -nw '[A-Z][a-zA-Z0-9_]*'` | 2012372 | 3.131 | 2.598 | 0.550 | 0.482 |
Guard let statement followed by valid identifier<br>`hgrep -n 'guard\s+let\s+[a-zA-Z_][a-zA-Z0-9_]*\s*=\s*\w+'` | 839 | 0.828 | 0.174 | 0.054 | 0.047 |
/usr
The following searches are performed on the `/usr` directory.
Regex | Line Count | ag | ugrep | ripgrep | hypergrep |
---|---|---|---|---|---|
Any HTTPS or FTP URL<br>`hgrep "(https?\|ftp)://[^\s/$.?#].[^\s]*"` | 13682 | 4.597 | 2.894 | 0.305 | 0.171 |
Any IPv4 IP address<br>`hgrep -w "(?:\d{1,3}\.){3}\d{1,3}"` | 12643 | 4.727 | 2.340 | 0.324 | 0.166 |
Any E-mail address<br>`hgrep -w "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"` | 47509 | 5.477 | 37.209 | 0.494 | 0.220 |
Any valid date MM/DD/YYYY<br>`hgrep "(0[1-9]\|1[0-2])/(0[1-9]\|[12]\d\|3[01])/(19\|20)\d{2}"` | 116 | 4.239 | 1.827 | 0.251 | 0.163 |
Count the number of HEX values<br>`hgrep -cw "(?:0x)?[0-9A-Fa-f]+"` | 68042 | 5.765 | 28.691 | 1.439 | 0.611 |
Search any C/C++ for a literal<br>`hgrep --filter "\.(c\|cpp\|h\|hpp)$" test` | 7355 | n/a | 0.505 | 0.118 | 0.079 |
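To see what the IPv4 pattern in the table above accepts, it can be tried on a few sample lines. The sketch below uses plain `grep -Ew` rather than hgrep. The digit class is written as `[0-9]` and the non-capturing group as an ordinary group, since POSIX extended regular expressions provide neither. `match_ipv4` and the sample lines are made up for illustration.

```shell
# Hypothetical helper: filter stdin with a POSIX ERE translation of the
# IPv4 pattern, matching whole words only (-w), as in the benchmark command.
match_ipv4() {
  grep -Ew '([0-9]{1,3}\.){3}[0-9]{1,3}'
}

printf '%s\n' 'host 192.168.0.1 up' 'version 1.2.3' 'bogus 999.1.1.1' | match_ipv4
# prints:
#   host 192.168.0.1 up
#   bogus 999.1.1.1
```

The pattern checks only the dotted shape, so out-of-range octets such as 999 still count as matches. That is worth keeping in mind when reading the line counts.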
Install dependencies with vcpkg
git clone https://github.com/microsoft/vcpkg
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg install concurrentqueue fmt argparse libgit2 hyperscan
Build hypergrep using cmake and vcpkg
git clone https://github.com/p-ranav/hypergrep
cd hypergrep
If cmake is older than 3.19:
mkdir build
cd build
cmake -DCMAKE_TOOLCHAIN_FILE=<path_to_vcpkg>/scripts/buildsystems/vcpkg.cmake ..
make
If cmake is newer than 3.19:
Use the `release` preset:
export VCPKG_ROOT=<path_to_vcpkg>
cmake -B build -S . --preset release
cmake --build build
To build a portable x86_64 binary, invoke cmake with the `-DBUILD_PORTABLE=on` option. This compiles with `-march=x86-64 -mtune=generic` and links with `-static-libgcc -static-libstdc++`, so the C++ standard library and GCC runtime are linked statically into the binary, reducing dependencies on the target system.
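As a rough sketch, the portable option can be combined with the preset-based build shown earlier. It assumes `VCPKG_ROOT` is already exported and that the current directory is the hypergrep checkout. `portable_build` and the `DRY_RUN` switch are hypothetical names used only in this sketch.

```shell
# Hypothetical wrapper around the configure and build steps.
# Assumptions: VCPKG_ROOT is exported and the working directory is the
# hypergrep checkout. When DRY_RUN is set to a non-empty value, the commands
# are printed instead of executed.
portable_build() {
  run=${DRY_RUN:+echo}
  $run cmake -B build -S . --preset release -DBUILD_PORTABLE=on &&
    $run cmake --build build
}
```

Running `DRY_RUN=1 portable_build` prints the two cmake invocations without executing them, and calling `portable_build` on its own performs the build. With the older, non-preset workflow, the same `-DBUILD_PORTABLE=on` flag would presumably be added to the `cmake -DCMAKE_TOOLCHAIN_FILE=...` configure line.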