This repo contains a "load many images" node that loads an image set in order of the numbers in the file names, from smallest to largest, so images are no longer loaded in the wrong order. Setting index=0 starts loading from the image with the smallest file-name number, and index=2 starts loading from the second image. Another node, "load images & resize", resizes the images to match the first loaded image.
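The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: the helper names `numeric_key` and `order_images` are hypothetical, the key uses the first run of digits in the file stem, and `index` is treated as a zero-based offset into the sorted list, which is an assumption that may differ from how the node interprets it.

```python
import re
from pathlib import Path


def numeric_key(name: str) -> tuple:
    """Sort key (hypothetical helper): first integer in the file stem, then the name."""
    match = re.search(r"\d+", Path(name).stem)
    # Files without any digits sort after all numbered files.
    number = int(match.group()) if match else float("inf")
    return (number, name)


def order_images(names, index=0):
    """Return file names sorted numerically, starting at a zero-based offset.

    Assumption: `index` simply skips that many entries of the sorted list.
    """
    ordered = sorted(names, key=numeric_key)
    return ordered[index:]


if __name__ == "__main__":
    files = ["10.png", "2.png", "1.png"]
    # A plain string sort would put "10.png" before "2.png"; the numeric key avoids that.
    print(order_images(files))
    print(order_images(files, index=1))
```

Sorting on the parsed integer rather than the raw string is what keeps "10.png" after "2.png".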
Result generated with flux dev:
Caption result screenshot:
Example workflow download: https://github.com/Cyber-BCat/ComfyUI_Auto_Caption/blob/main/workflow/autocaption%20exampleworkflow.json
Note: Follow these three steps to get started
1. Install the dependencies listed in requirements.txt; they are required. Either double-click install_req.bat or install them from the command line. Note: the version of transformers must not be too old, and on Windows you need to install the Windows versions of the relevant dependencies.
2. Download the models, either automatically or manually (manual download is recommended):
(1) Download https://huggingface.co/google/siglip-so400m-patch14-384 and put it in clip/siglip-so400m-patch14-384
(2) Manual download only: https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha/tree/main/wpkklhc6 and put it in Auto_Caption
(3) Recommended download: https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit and put it in LLM/Meta-Llama-3.1-8B-bnb-4bit (if you have an A100, you can consider downloading meta-llama/Meta-Llama-3.1-8B instead)
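After a manual download, a quick check like the one below can confirm that the three folders are in place. It is only a sketch: the function name `missing_model_dirs` and the `models_root` parameter are hypothetical, and the base folder the relative paths belong under is an assumption, since the steps above give only relative paths.

```python
from pathlib import Path

# Relative target folders taken from the download steps above.
EXPECTED_DIRS = [
    "clip/siglip-so400m-patch14-384",
    "Auto_Caption",
    "LLM/Meta-Llama-3.1-8B-bnb-4bit",
]


def missing_model_dirs(models_root):
    """Return the expected folders that do not exist under models_root.

    Assumption: models_root is whatever base folder your setup uses for models.
    """
    root = Path(models_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    # Replace "." with your own models folder.
    for folder in missing_model_dirs("."):
        print(f"missing: {folder}")
```

An empty result only means the folders exist; it does not verify that the downloaded files inside them are complete.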