This article introduces the EdgeSAM model and its performance optimizations, along with the release of the EfficientSAM model. EdgeSAM runs at 30 frames per second on an iPhone 14, roughly 40 times faster than the original SAM. By adopting a pure CNN architecture and bringing the prompt encoder, mask decoder, and lightweight modules into the distillation process, it improves accuracy and mitigates dataset bias. A dynamic prompt sampling strategy further improves the model's efficiency and accuracy. The release of EfficientSAM also provides valuable experience for research on lightweight segmentation models.
EdgeSAM achieves a roughly 40x speedup over the original model, reaching 30 frames per second on an iPhone 14. It adapts SAM to edge devices by distilling the ViT-based image encoder into a pure CNN architecture. The prompt encoder, mask decoder, and lightweight modules are included in the distillation loop to improve accuracy and address dataset bias, and a dynamic prompt sampling strategy guides the student model to focus on the parts of the image where it still struggles. Separately, EfficientSAM was released to reduce the computational cost of the SAM model, offering valuable experience for lightweight segmentation models.

The emergence of EdgeSAM and EfficientSAM marks significant progress in deploying lightweight segmentation models on mobile devices, opening new possibilities for AI applications in edge computing and giving developers more effective tools and lessons to draw on.
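The dynamic prompt sampling idea described above can be sketched as follows. This is a minimal illustrative sketch, not the actual EdgeSAM implementation: it assumes teacher and student masks are given as 2D 0/1 grids, and simply samples a new point prompt from the region where the two masks disagree, so the next distillation step focuses the student on its errors. The function name `sample_new_prompt` and the mask representation are assumptions for this example.

```python
import random

def sample_new_prompt(teacher_mask, student_mask, rng=None):
    """Sketch of dynamic prompt sampling: pick a new point prompt from
    pixels where the student's mask disagrees with the teacher's.
    Masks are 2D lists of 0/1; returns (row, col), or None if they agree."""
    rng = rng or random.Random(0)
    disagree = [(r, c)
                for r, row in enumerate(teacher_mask)
                for c, (t, s) in enumerate(zip(row, student_mask[r]))
                if t != s]
    if not disagree:
        return None  # student already matches the teacher; no new prompt needed
    return rng.choice(disagree)

# Toy example: the student misses one foreground pixel at (0, 1).
teacher = [[1, 1, 0],
           [1, 0, 0],
           [0, 0, 0]]
student = [[1, 0, 0],
           [1, 0, 0],
           [0, 0, 0]]
print(sample_new_prompt(teacher, student))  # → (0, 1)
```

In a full training loop, the sampled point would be fed back through the prompt encoder and mask decoder, and the distillation loss recomputed, iterating so the student concentrates on its hardest regions.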