segment anything

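Latest update: SAM 2 (Segment Anything in Images and Videos)

Please check out our new release on Segment Anything Model 2 (SAM 2).

SAM 2 code: https://github.com/facebookresearch/segment-anything-2
SAM 2 demo: https://sam2.metademolab.com/
SAM 2 paper: https://arxiv.org/abs/2408.00714

SAM 2 architecture

Segment Anything Model 2 (SAM 2) is a foundation model towards solving promptable visual segmentation in images and videos. We extend SAM to video by treating an image as a video with a single frame. The model design is a simple transformer architecture with streaming memory for real-time video processing. We build a model-in-the-loop data engine, which improves both the model and the data through user interaction, to collect our SA-V dataset, the largest video segmentation dataset to date. SAM 2 trained on our data provides strong performance across a wide range of tasks and visual domains.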

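Segment Anything

Meta AI Research, FAIR

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick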

[ Paper ] [ Project ] [ Demo ] [ Dataset ] [ Blog ] [ BibTeX ]

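SAM design

The Segment Anything Model (SAM) produces high-quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image. It was trained on a dataset of 11 million images and 1.1 billion masks, and it has strong zero-shot performance on a variety of segmentation tasks.

Installation

The code requires python>=3.8, as well as pytorch>=1.7 and torchvision>=0.8. Follow the official PyTorch installation instructions to install both the PyTorch and TorchVision dependencies. Installing both with CUDA support is strongly recommended.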
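The version requirements above can be checked before installing the package. The following is a minimal sketch, not part of the project: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the only facts taken from this README are the three minimum versions.

```python
import sys


def parse_version(text):
    """Turn a string such as '2.1.0+cu118' into (2, 1, 0), dropping build suffixes."""
    core = text.split("+", 1)[0]
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.8' and '0.8.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Report whether Python, PyTorch and TorchVision satisfy the stated minimums."""
    report = {"python": sys.version_info[:2] >= (3, 8)}
    for name, minimum in (("torch", "1.7"), ("torchvision", "0.8")):
        try:
            module = __import__(name)
        except ImportError:
            report[name] = False  # package is not installed
        else:
            report[name] = meets_minimum(module.__version__, minimum)
    return report
```

Calling `check_environment()` returns a dictionary with one boolean per requirement, so a failing entry points at the dependency to install or upgrade.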

Install Segment Anything:

 pip install git+https://github.com/facebookresearch/segment-anything.git

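or clone the repository locally and install with

 git clone git@github.com:facebookresearch/segment-anything.git
cd segment-anything; pip install -e .

The following optional dependencies are needed for mask post-processing, saving masks in COCO format, the example notebooks, and exporting the model in ONNX format. jupyter is also required to run the example notebooks.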

 pip install opencv-python pycocotools matplotlib onnxruntime onnx

Getting Started

First download a model checkpoint. The model can then be used in just a few lines to get masks from a given prompt:

 from segment_anything import SamPredictor, sam_model_registry
sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
predictor = SamPredictor(sam)
predictor.set_image(<your_image>)
masks, _, _ = predictor.predict(<input_prompts>)
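To make the `<input_prompts>` placeholder concrete, here is a hedged sketch of a point prompt. It assumes `predictor` is a `SamPredictor` on which `set_image` has already been called with an RGB image array, and it uses the `point_coords`, `point_labels`, and `multimask_output` arguments of `SamPredictor.predict`. The helper names `validate_point_prompt` and `predict_from_points` are hypothetical and not part of the library.

```python
def validate_point_prompt(points, labels):
    """Check a point prompt: one (x, y) pixel pair and one 0/1 label per point."""
    if len(points) == 0:
        raise ValueError("at least one point is required")
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    for point in points:
        if len(point) != 2:
            raise ValueError("each point must be an (x, y) pair")
    for label in labels:
        if label not in (0, 1):
            raise ValueError("labels must be 1 (foreground) or 0 (background)")
    return [[float(x), float(y)] for x, y in points], [int(v) for v in labels]


def predict_from_points(predictor, points, labels):
    """Run SamPredictor.predict for an image already passed to set_image."""
    import numpy as np  # imported lazily so the validation helper stays dependency-free

    coords, flags = validate_point_prompt(points, labels)
    masks, scores, _low_res_logits = predictor.predict(
        point_coords=np.array(coords, dtype=np.float32),  # shape (N, 2), pixel units
        point_labels=np.array(flags, dtype=np.int32),     # shape (N,)
        multimask_output=True,  # ask for several candidate masks
    )
    best = int(np.argmax(scores))  # keep the candidate the model rates highest
    return masks[best], float(scores[best])
```

A call such as `predict_from_points(predictor, [(500, 375)], [1])` would request a mask for the object under one foreground click. The coordinates here are illustrative only.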

or generate masks for an entire image:

 from segment_anything import SamAutomaticMaskGenerator, sam_model_registry
sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
mask_generator = SamAutomaticMaskGenerator(sam)
masks = mask_generator.generate(<your_image>)
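The list returned by `generate` usually needs some filtering before use. The sketch below assumes each entry is a dictionary carrying `area`, `predicted_iou`, and `stability_score` fields, matching the annotation format described in the Dataset section of this README. The function name `select_masks` and the default thresholds are illustrative assumptions, not recommended values.

```python
def select_masks(records, min_iou=0.9, min_stability=0.9, top_k=None):
    """Keep confident masks and order them from largest to smallest area."""
    kept = [
        record
        for record in records
        if record["predicted_iou"] >= min_iou
        and record["stability_score"] >= min_stability
    ]
    kept.sort(key=lambda record: record["area"], reverse=True)
    return kept if top_k is None else kept[:top_k]
```

Sorting by area is convenient when overlaying masks, since drawing large regions first keeps smaller objects visible on top.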

Additionally, masks can be generated for images from the command line:

 python scripts/amg.py --checkpoint <path/to/checkpoint> --model-type <model_type> --input <image_or_folder> --output <path/to/output>

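For more details, see the example notebooks on using SAM with prompts and on automatically generating masks.

ONNX Export

SAM's lightweight mask decoder can be exported to ONNX format so that it can run in any environment that supports ONNX Runtime, such as in the browser as showcased in the demo. Export the model with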

 python scripts/export_onnx_model.py --checkpoint <path/to/checkpoint> --model-type <model_type> --output <path/to/output>

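See the example notebook for details on how to combine image preprocessing via SAM's backbone with mask prediction using the ONNX model. Using the latest stable version of PyTorch is recommended for ONNX export.

Web demo

The demo/ folder has a simple one-page React app that shows how to run mask prediction with the exported ONNX model in a web browser with multithreading. See demo/README.md for more details.

Model Checkpoints

Three versions of the model are available, with different backbone sizes. These models can be instantiated by running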

 from segment_anything import sam_model_registry
sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")

Click the links below to download the checkpoint for the corresponding model type.

default or vit_h: ViT-H SAM model.
vit_l: ViT-L SAM model.
vit_b: ViT-B SAM model.
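Because the checkpoint file must match the chosen model type, it can help to fail early on a bad key or a missing file. This is a small sketch under that assumption: the `MODEL_TYPES` table simply restates the list above, and `load_sam` is a hypothetical wrapper around `sam_model_registry`, not a function the package provides.

```python
import os

# Registry keys listed in this README, mapped to the backbone they select.
MODEL_TYPES = {
    "default": "ViT-H",
    "vit_h": "ViT-H",
    "vit_l": "ViT-L",
    "vit_b": "ViT-B",
}


def load_sam(model_type, checkpoint_path):
    """Validate the arguments, then build the model through sam_model_registry."""
    if model_type not in MODEL_TYPES:
        raise ValueError(
            f"unknown model type {model_type!r}; expected one of {sorted(MODEL_TYPES)}"
        )
    if not os.path.isfile(checkpoint_path):
        raise FileNotFoundError(f"checkpoint not found: {checkpoint_path}")
    # Imported here so the checks above run even without the package installed.
    from segment_anything import sam_model_registry

    return sam_model_registry[model_type](checkpoint=checkpoint_path)
```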

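Dataset

The dataset overview and download are available from the dataset page (see the [ Dataset ] link above). By downloading the dataset you agree that you have read and accepted the terms of the SA-1B Dataset Research License.

We save the masks for each image as a json file. It can be loaded as a dictionary in python in the format below.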

{
    "image"                 : image_info ,
    "annotations"           : [ annotation ],
}

image_info {
    "image_id"              : int ,              # Image id
    "width"                 : int ,              # Image width
    "height"                : int ,              # Image height
    "file_name"             : str ,              # Image filename
}

annotation {
    "id"                    : int ,              # Annotation id
    "segmentation"          : dict ,             # Mask saved in COCO RLE format.
    "bbox"                  : [ x , y , w , h ],     # The box around the mask, in XYWH format
    "area"                  : int ,              # The area in pixels of the mask
    "predicted_iou"         : float ,            # The model's own prediction of the mask's quality
    "stability_score"       : float ,            # A measure of the mask's quality
    "crop_box"              : [ x , y , w , h ],     # The crop of the image used to generate the mask, in XYWH format
    "point_coords"          : [[ x , y ]],         # The point coordinates input to the model to generate the mask
}
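As an illustration of working with this layout, the sketch below reads one per-image file and converts each XYWH box to corner coordinates. It relies only on the field names listed above. The function names are hypothetical, and the path passed to `load_record` is whatever location the downloaded json files were extracted to.

```python
import json


def load_record(path):
    """Read one per-image json file into a Python dictionary."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def xywh_to_xyxy(box):
    """Convert [x, y, w, h] into [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def summarize(record):
    """Collect the image name, mask count, and corner-format boxes of a record."""
    info = record["image"]
    annotations = record["annotations"]
    return {
        "file_name": info["file_name"],
        "size": (info["width"], info["height"]),
        "num_masks": len(annotations),
        "boxes_xyxy": [xywh_to_xyxy(a["bbox"]) for a in annotations],
        "total_area": sum(a["area"] for a in annotations),
    }
```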

Image ids can be found in sa_images_ids.txt, which can also be downloaded using the link above.

To decode a mask in COCO RLE format into a binary mask:

 from pycocotools import mask as mask_utils
mask = mask_utils.decode(annotation["segmentation"])

For more instructions on manipulating masks stored in RLE format, see the pycocotools mask module.
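To show what the decoding step does, here is a dependency-free sketch of run-length decoding. It handles only the uncompressed form, in which `counts` is a list of integers. An RLE whose `counts` is a compressed string is not handled here and should go through `mask_utils.decode` as shown above. The function name `decode_uncompressed_rle` is hypothetical.

```python
def decode_uncompressed_rle(rle):
    """Expand {'size': [height, width], 'counts': [ints]} into rows of 0/1 values."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate between 0 and 1, starting with background
    for count in rle["counts"]:
        flat.extend([value] * count)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not cover the mask exactly")
    # Pixels are stored column by column, so (row, col) sits at col * height + row.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]
```

For a 2x3 mask with `counts` of `[1, 2, 3]`, the runs fill the first column with 0 then 1, the second with 1 then 0, and the third with zeros.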

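License

The model is licensed under the Apache 2.0 license.

Contributing

See the contributing guide and the code of conduct.

Contributors

The Segment Anything project was made possible with the help of many contributors (listed alphabetically):

Aaron Adcock, Vaibhav Aggarwal, Morteza Behrooz, Cheng-Yang Fu, Ashley Gabriel, Ahuva Goldstand, Allen Goodman, Sumanth Gurram, Jiabo Hu, Somya Jain, Devansh Kukreja, Robert Kuo, Joshua Lane, Yanghao Li, Lilian Luong, Jitendra Malik, Mallika Malhotra, William Ngan, Omkar Parkhi, Nikhil Raina, Dirk Rowe, Neil Sejoor, Vanessa Stark, Bala Varadarajan, Bram Wasti, Zachary Winstrom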

Citing Segment Anything

If you use SAM or SA-1B in your research, please use the following BibTeX entry.

 @article{kirillov2023segany,
  title={Segment Anything},
  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
  journal={arXiv:2304.02643},
  year={2023}
}
