Darknet Object Detection Framework and YOLO
Darknet is an open-source neural network framework developed primarily in C and C++, with support for CUDA acceleration.
YOLO (You Only Look Once), a cutting-edge real-time object detection system, is a prominent implementation within the Darknet framework.
Read more about how Hank.ai is contributing to the Darknet/YOLO community:
Announcing Darknet V3 "Jazz": https://darknetcv.ai/blog/announcing-darknet-v3-jazz
Darknet/YOLO Website: https://darknetcv.ai/
Darknet/YOLO FAQ: https://darknetcv.ai/faq/
Darknet/YOLO Discord Server: https://discord.gg/zSq8rtW
Papers
1. YOLOv7: https://arxiv.org/abs/2207.02696
2. Scaled-YOLOv4: https://arxiv.org/abs/2102.12725
3. YOLOv4: https://arxiv.org/abs/2004.10934
4. YOLOv3: https://pjreddie.com/media/files/papers/YOLOv3.pdf
General Information
The Darknet/YOLO framework maintains its position as one of the fastest and most accurate object detection systems.
Key advantages of Darknet/YOLO:
Free and Open Source: Darknet/YOLO is completely open source, allowing for free integration into existing projects, including commercial ones.
High Performance: Darknet V3 ("Jazz"), released in October 2024, demonstrates remarkable performance, achieving up to 1000 FPS on the LEGO dataset with an NVIDIA RTX 3090 GPU.
Versatility: The CPU version of Darknet/YOLO can be deployed on various platforms, including Raspberry Pi, cloud servers, desktops, laptops, and powerful training rigs. The GPU version requires a CUDA-capable NVIDIA GPU.
Cross-Platform Compatibility: Darknet/YOLO is known to operate seamlessly on Linux, Windows, and Mac.
Darknet Versioning
Darknet 0.x: This refers to the original Darknet tool developed by Joseph Redmon between 2013 and 2017. It lacked a formal version number.
Darknet 1.x: This version was maintained by Alexey Bochkovskiy from 2017 to 2021. It also lacked a formal version number.
Darknet 2.x "OAK": This version was sponsored by Hank.ai and maintained by Stéphane Charette starting in 2023. It was the first release to introduce a version command, which reported 2.x until late 2024.
Darknet 3.x "JAZZ": This version, released in October 2024, marked a significant development phase, introducing a new C and C++ API, enhanced performance, and numerous bug fixes.
MSCOCO Pre-trained Weights
Various popular YOLO versions have been pre-trained on the MSCOCO dataset. This dataset consists of 80 classes, which can be found in the cfg/coco.names file.
Pre-trained weights available for download:
1. YOLOv2 (November 2016)
* YOLOv2-tiny
* YOLOv2-full
2. YOLOv3 (May 2018)
* YOLOv3-tiny
* YOLOv3-full
3. YOLOv4 (May 2020)
* YOLOv4-tiny
* YOLOv4-full
4. YOLOv7 (August 2022)
* YOLOv7-tiny
* YOLOv7-full
Example usage:
`
wget --no-clobber https://github.com/hank-ai/darknet/releases/download/v2.0/yolov4-tiny.weights
darknet_02_display_annotated_images coco.names yolov4-tiny.cfg yolov4-tiny.weights image1.jpg
darknet_03_display_videos coco.names yolov4-tiny.cfg yolov4-tiny.weights video1.avi
DarkHelp coco.names yolov4-tiny.cfg yolov4-tiny.weights image1.jpg
DarkHelp coco.names yolov4-tiny.cfg yolov4-tiny.weights video1.avi
`
Note: The MSCOCO pre-trained weights are provided primarily for demonstration purposes. Training custom networks is strongly encouraged, with MSCOCO typically used to verify system functionality.
Building Darknet
Darknet requires C++17 or newer and OpenCV, and uses CMake to generate its build files.
Building process:
1. Google Colab: The Google Colab instructions are the same as the Linux instructions. Refer to the colab subdirectory for Jupyter notebooks showcasing specific tasks.
2. Linux CMake Method:
* Prerequisites:
* Build-essential tools: sudo apt-get install build-essential
* Git: sudo apt-get install git
* OpenCV: sudo apt-get install libopencv-dev
* CMake: sudo apt-get install cmake
* Installation:
* Create a working directory and enter it: mkdir ~/src && cd ~/src
* Clone the repository: git clone https://github.com/hank-ai/darknet
* Navigate to the Darknet directory: cd darknet
* Create a build directory: mkdir build
* Build Darknet:
* cd build
* cmake -DCMAKE_BUILD_TYPE=Release ..
* make -j4
* make package
* Install the package: sudo dpkg -i darknet-VERSION.deb
* Optional: CUDA/cuDNN Installation
* Download and install CUDA from https://developer.nvidia.com/cuda-downloads
* Download and install cuDNN from https://developer.nvidia.com/rdp/cudnn-download or https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#cudnn-package-manager-installation-overview
* Ensure you can run nvcc and nvidia-smi. You may need to modify your PATH variable.
* If you install CUDA or cuDNN later, or upgrade to a newer version, be sure to re-build Darknet after modifying your environment.
3. Windows CMake Method:
* Prerequisites:
* Git: winget install Git.Git
* CMake: winget install Kitware.CMake
* NSIS: winget install nsis.nsis
* Visual Studio 2022 Community Edition: winget install Microsoft.VisualStudio.2022.Community
* Modify Visual Studio installation to include C++ support:
* Open Visual Studio Installer
* Click "Modify"
* Select "Desktop Development with C++"
* Click "Modify" and then "Yes"
* Installation:
* Open Developer Command Prompt for VS 2022 (not PowerShell).
* Install Microsoft VCPKG:
* cd c:\
* mkdir c:\src
* cd c:\src
* git clone https://github.com/microsoft/vcpkg
* cd vcpkg
* bootstrap-vcpkg.bat
* .\vcpkg.exe integrate install
* .\vcpkg.exe integrate powershell
* .\vcpkg.exe install opencv[contrib,dnn,freetype,jpeg,openmp,png,webp,world]:x64-windows
* Clone Darknet and build:
* cd c:\src
* git clone https://github.com/hank-ai/darknet.git
* cd darknet
* mkdir build
* cd build
* cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=C:/src/vcpkg/scripts/buildsystems/vcpkg.cmake ..
* msbuild.exe /property:Platform=x64;Configuration=Release /target:Build -maxCpuCount -verbosity:normal -detailedSummary darknet.sln
* msbuild.exe /property:Platform=x64;Configuration=Release PACKAGE.vcxproj
* Optional: CUDA/cuDNN Installation
* Download and install CUDA from https://developer.nvidia.com/cuda-downloads
* Download and install cuDNN from https://developer.nvidia.com/rdp/cudnn-download or https://docs.nvidia.com/deeplearning/cudnn/install-guide/index.html#download-windows
* Unzip cuDNN and copy the bin, include, and lib directories into C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/[version] (overwriting existing files if necessary).
* Ensure you can run nvcc.exe. You may need to modify your PATH variable.
Using Darknet
CLI (Command Line Interface)
General Usage: Darknet offers a command-line interface (CLI) for interacting with its functionalities. It's not exhaustive; refer to the DarkHelp project CLI for additional features.
Pre-trained Models: For most commands, you'll need the .weights file along with the corresponding .names and .cfg files. You can either train your own network (highly recommended) or utilize pre-trained models available online. Examples include:
* LEGO Gears (object detection in images)
* Rolodex (text detection in images)
* MSCOCO (standard 80-class object detection)
Common Commands:
Help: darknet help
Version: darknet version
Prediction with an Image:
* V2: darknet detector test cars.data cars.cfg cars_best.weights image1.jpg
* V3: darknet_02_display_annotated_images cars.cfg image1.jpg
* DarkHelp: DarkHelp cars.cfg cars.names cars_best.weights image1.jpg
Output Coordinates:
* V2: darknet detector test animals.data animals.cfg animals_best.weights -ext_output dog.jpg
* V3: darknet_01_inference_images animals dog.jpg
* DarkHelp: DarkHelp --json animals.cfg animals.names animals_best.weights dog.jpg
Video Processing:
* V2:
* darknet detector demo animals.data animals.cfg animals_best.weights -ext_output test.mp4 (Video prediction)
* darknet detector demo animals.data animals.cfg animals_best.weights -c 0 (Webcam input)
* darknet detector demo animals.data animals.cfg animals_best.weights test.mp4 -out_filename res.avi (Saving results to video)
* V3:
* darknet_03_display_videos animals.cfg test.mp4 (Video prediction)
* darknet_08_display_webcam animals (Webcam input)
* darknet_05_process_videos_multithreaded animals.cfg animals.names animals_best.weights test.mp4 (Saving results to video)
* DarkHelp:
* DarkHelp animals.cfg animals.names animals_best.weights test.mp4 (Video prediction)
* DarkHelp animals.cfg animals.names animals_best.weights test.mp4 (Saving results to video)
JSON Output:
* V2: darknet detector demo animals.data animals.cfg animals_best.weights test50.mp4 -json_port 8070 -mjpeg_port 8090 -ext_output
* V3: darknet_06_images_to_json animals image1.jpg
* DarkHelp: DarkHelp --json animals.names animals.cfg animals_best.weights image1.jpg
GPU Selection: darknet detector demo animals.data animals.cfg animals_best.weights -i 1 test.mp4
Accuracy Evaluation:
* darknet detector map driving.data driving.cfg driving_best.weights ... (mAP@IoU=50)
* darknet detector map animals.data animals.cfg animals_best.weights -iou_thresh 0.75 (mAP@IoU=75)
Anchor Calculation: (Use DarkMark for optimal anchor recalculation)
* darknet detector calc_anchors animals.data -num_of_clusters 6 -width 320 -height 256
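Conceptually, anchor calculation clusters the width/height of every annotated bounding box (scaled to the network's input size) and reports the cluster centers. Darknet's real implementation uses an IoU-based distance; the toy sketch below substitutes plain Euclidean k-means just to show the idea, and all names in it are illustrative:

```python
import random

def calc_anchors(boxes, num_clusters=6, iterations=50, seed=0):
    """Toy k-means over (width, height) pairs using Euclidean distance."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, num_clusters)  # pick initial centers from the data
    for _ in range(iterations):
        # Assign each box to its nearest center.
        clusters = [[] for _ in centers]
        for w, h in boxes:
            nearest = min(range(len(centers)),
                          key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            clusters[nearest].append((w, h))
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return sorted(centers)
```

For real projects, prefer Darknet's calc_anchors command or DarkMark, which also account for the anchor/IoU interaction this sketch ignores.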
Network Training:
* darknet detector -map -dont_show train animals.data animals.cfg
Training a New Network
DarkMark: The recommended approach for annotation and training is to utilize DarkMark, which automates the process of generating the necessary Darknet files.
Manual Setup:
1. Create a Project Directory: For instance, ~/nn/animals/ to train a network for detecting animals.
2. Copy a Configuration File: Select a template configuration file from cfg/ (e.g., cfg/yolov4-tiny.cfg) and place it in your project directory.
3. Create a .names File: In the same directory, create a text file named animals.names. List the classes you want to detect, one per line, with no blank lines or comments. Example:
`
dog
cat
bird
horse
`
4. Create a .data File: In the same directory, create a text file named animals.data. This file contains information about your training data. Example:
`
classes=4
train=/home/username/nn/animals/animals_train.txt
valid=/home/username/nn/animals/animals_valid.txt
names=/home/username/nn/animals/animals.names
backup=/home/username/nn/animals
`
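The classes value must always agree with the number of lines in the .names file. As a hedged sketch (file names and paths follow the example above; the helper itself is not part of Darknet), the two files can be generated together so they cannot drift apart:

```python
from pathlib import Path

def write_project_files(project_dir, class_names):
    """Write the .names and .data files so the class count always agrees."""
    project = Path(project_dir)
    # One class per line, no blanks or comments.
    (project / "animals.names").write_text("\n".join(class_names) + "\n")
    # classes= is derived from the same list, so it can never be wrong.
    (project / "animals.data").write_text(
        f"classes={len(class_names)}\n"
        f"train={project}/animals_train.txt\n"
        f"valid={project}/animals_valid.txt\n"
        f"names={project}/animals.names\n"
        f"backup={project}\n"
    )
```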
5. Dataset Directory: Create a directory to store your images and corresponding annotations (e.g., ~/nn/animals/dataset). Each image will need an associated .txt file describing the annotations. These .txt files must follow a specific format and are best generated using DarkMark or similar software.
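For reference, each line of an annotation .txt file has the form class_id x_center y_center width height, where the four values are normalized to the 0-1 range relative to the image dimensions. A minimal sketch of converting a pixel-space box (the helper name is illustrative):

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space bounding box into a Darknet annotation line."""
    x_center = (x_min + x_max) / 2.0 / img_w   # box center, normalized 0-1
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w            # box size, normalized 0-1
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x200-pixel box centered in a 640x480 image, labelled as class 0:
print(to_yolo_line(0, 270, 140, 370, 340, 640, 480))
# 0 0.500000 0.500000 0.156250 0.416667
```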
6. Train and Validation Files: Create the "train" and "valid" text files as specified in your .data file. These files list the images to be used for training and validation, respectively.
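A minimal sketch of producing those two lists, assuming the images live in a single directory as above; the 80/20 split and the file names are common conventions from the example, not Darknet requirements:

```python
import random
from pathlib import Path

def write_image_lists(dataset_dir, out_dir, valid_fraction=0.2, seed=42):
    """Split the dataset images into animals_train.txt and animals_valid.txt lists."""
    images = sorted(str(p) for p in Path(dataset_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)        # deterministic shuffle
    n_valid = int(len(images) * valid_fraction)
    Path(out_dir, "animals_valid.txt").write_text("\n".join(images[:n_valid]) + "\n")
    Path(out_dir, "animals_train.txt").write_text("\n".join(images[n_valid:]) + "\n")
```

Each listed image must have its annotation .txt file sitting next to it with the same base name.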
7. Modify the Configuration File:
* Batch Size: Set batch=64.
* Subdivisions: Start with subdivisions=1. Increase as necessary based on your GPU's memory capacity.
* Max Batches: Use max_batches=2000 * number_of_classes. In this case, max_batches=8000.
* Steps: Set steps to 80% and 90% of max_batches. Example: steps=6400,7200.
* Width and Height: Adjust network dimensions (width and height). Refer to the Darknet/YOLO FAQ for guidance on determining optimal sizes.
* Classes: Update classes=... to match the number of classes in your .names file (in this case, classes=4).
* Filters: Adjust filters=... in the [convolutional] section immediately before each [yolo] section. Calculate using filters = (number_of_classes + 5) * 3. In this case, filters=27.
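The configuration arithmetic above can be sketched in one place; the floor of 6000 on max_batches is a commonly quoted upstream rule of thumb, added here as an assumption:

```python
def cfg_values(num_classes: int) -> dict:
    """Derive the .cfg values for a detector with the given class count."""
    # 2000 batches per class, with an assumed floor of 6000 for very small class counts.
    max_batches = max(2000 * num_classes, 6000)
    # Learning-rate steps at 80% and 90% of max_batches.
    steps = (int(max_batches * 0.8), int(max_batches * 0.9))
    # (classes + x, y, w, h, objectness) * 3 anchors per [yolo] layer.
    filters = (num_classes + 5) * 3
    return {"max_batches": max_batches, "steps": steps, "filters": filters}

print(cfg_values(4))
# {'max_batches': 8000, 'steps': (6400, 7200), 'filters': 27}
```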
8. Start Training: Navigate to your project directory and execute the following command:
`
darknet detector -map -dont_show train animals.data animals.cfg
`
* Verbose Output: For more detailed training information, use --verbose.
* Progress: The best weights will be saved as animals_best.weights, and training progress can be monitored through the chart.png file.
Other Tools and Links
DarkMark: For Darknet/YOLO project management, image annotation, annotation verification, and training file generation.
DarkHelp: A robust alternative CLI to Darknet, with features like image tiling, object tracking, and a C++ API for commercial applications.
Darknet/YOLO FAQ: https://darknetcv.ai/faq/
Stéphane Charette's YouTube Channel: Find tutorials and example videos: https://www.youtube.com/channel/UCOQ-nJ8l6kG3153g09XwY8g
Darknet/YOLO Discord Server: https://discord.gg/zSq8rtW
Roadmap
Last updated: 2024-10-30
Completed:
1. Replaced qsort() with std::sort() where applicable during training (some remaining cases).
2. Removed check_mistakes, getchar(), and system().
3. Converted Darknet to use the C++ compiler (g++ on Linux, VisualStudio on Windows).
4. Fixed Windows build.
5. Fixed Python support.
6. Built Darknet library.
7. Re-enabled labels on predictions ("alphabet" code).
8. Re-enabled CUDA/GPU code.
9. Re-enabled CUDNN.
10. Re-enabled CUDNN half.
11. Removed hardcoded CUDA architecture.
12. Improved CUDA version information.
13. Re-enabled AVX.
14. Removed old solutions and Makefile.
15. Made OpenCV non-optional.
16. Removed dependency on the old pthread library.
17. Removed STB.
18. Re-wrote CMakeLists.txt to use the new CUDA detection.
19. Removed old "alphabet" code and deleted 700+ images in data/labels.
20. Implemented out-of-source build.
21. Enhanced version number output.
22. Performance optimizations related to training (ongoing).
23. Performance optimizations related to inference (ongoing).
24. Employed pass-by-reference where possible.
25. Cleaned up .hpp files.
26. Re-wrote darknet.h.
27. Used cv::Mat as a proper C++ object instead of casting to void*.
28. Fixed or standardized how internal image structure is used.
29. Fixed build for ARM-based Jetson devices (new Jetson Orin devices are working).
30. Improved Python API in V3.
Short-Term Goals:
1. Replace printf() with std::cout (in progress).
2. Investigate old ZED camera support.
3. Improve and standardize command line parsing (in progress).
Mid-Term Goals:
1. Remove all char* code and replace with std::string.
2. Eliminate compiler warnings and ensure consistent warning handling (in progress).
3. Utilize cv::Mat more effectively instead of custom image structures in C (in progress).
4. Replace old list functionality with std::vector or std::list.
5. Fix support for 1-channel grayscale images.
6. Add support for N-channel images where N > 3 (e.g., images with an additional depth or thermal channel).
7. Ongoing code cleanup (in progress).
Long-Term Goals:
1. Address CUDA/CUDNN issues with all GPUs.
2. Re-write CUDA+cuDNN code.
3. Explore adding support for non-NVIDIA GPUs.
4. Implement support for rotated bounding boxes or an "angle" attribute.
5. Add support for keypoints/skeletons.
6. Implement heatmaps (in progress).
7. Implement segmentation.