Welcome to the project documentation for Object Segmentation and Position Shifting. This repository contains code that segments an object in an image based on a text prompt and then moves that object to a new position within the same scene. It relies on pre-trained generative models, SAM (Segment Anything Model) and Stable Diffusion Inpainting, so no retraining is required.
This project involves two key tasks aimed at post-production editing of product images, particularly for e-commerce purposes.
Task 1: Segments an object in the image based on a text prompt (e.g., "shelf") and highlights it with a red mask.
Task 2: Moves the object within the scene by shifting it in the x and y directions, as specified by the user.
For instance, you may have an image containing a "shelf." You can run the segmentation to highlight all shelves in the image. Then, you can move the identified shelf within the scene using pixel offsets.
Text-based Object Segmentation: Identifies objects using class prompts like "shelf" and highlights them.
Position Shift: Moves the object by pixel values in the x (horizontal) and y (vertical) directions.
Pre-built models: Leverages existing models like SAM (Segment Anything Model) and Stable Diffusion Inpainting for object manipulation without requiring retraining.
To set up the project locally, follow these steps:
Clone the Repository:
git clone https://github.com/your-repo/object-segmentation-shift.git
cd object-segmentation-shift
Create and Activate a Virtual Environment:
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
Install Dependencies:
pip install -r requirements.txt
To segment an object in an image based on a class prompt:
python run.py --image ./example.jpg --class shelf --output ./generated.png
This command will take example.jpg, segment all instances of the object specified in the class prompt (e.g., "shelf"), and output an image with red masks on those objects.
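The red highlighting step can be pictured with a short NumPy sketch. This is an illustration only, not the repository's code: it assumes the segmentation model has already produced a boolean mask with the same height and width as the image, and the function name `overlay_red_mask` and the `alpha` blending parameter are hypothetical.

```python
import numpy as np


def overlay_red_mask(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend pure red into an RGB uint8 image wherever `mask` is True.

    `image` has shape (H, W, 3); `mask` has shape (H, W).
    `alpha` is the opacity of the red layer (hypothetical parameter).
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must have the same height and width as image")

    out = image.astype(np.float32)
    red = np.array([255.0, 0.0, 0.0], dtype=np.float32)
    selected = mask.astype(bool)

    # Alpha-blend only the masked pixels; everything else is left untouched.
    out[selected] = (1.0 - alpha) * out[selected] + alpha * red
    return out.round().astype(np.uint8)
```

With `alpha=1.0` the masked region becomes solid red; lower values let the underlying object show through.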
To shift the segmented object in the image:
python run.py --image ./example.jpg --class shelf --x 80 --y 0
This moves the identified shelf 80 pixels to the right and leaves its vertical position unchanged.
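As a rough illustration of what a pixel-offset shift involves, the sketch below copies the masked pixels to their new location and reports the vacated region that an inpainting model would then need to fill. It is a simplified sketch, not the repository's implementation: the function name `shift_object` is hypothetical, the mask is assumed to be a boolean array from the segmentation step, and treating positive `dy` as "down" is an assumption (the source only states that positive x moves right).

```python
import numpy as np


def shift_object(image: np.ndarray, mask: np.ndarray, dx: int, dy: int):
    """Move masked pixels by (dx, dy) and return (shifted_image, hole_mask).

    Positive dx moves right. Positive dy moves down (assumed convention).
    Pixels that would land outside the image are dropped.
    """
    h, w = mask.shape
    mask = mask.astype(bool)

    # Source coordinates of the object and their shifted targets.
    ys, xs = np.nonzero(mask)
    new_ys, new_xs = ys + dy, xs + dx
    keep = (new_ys >= 0) & (new_ys < h) & (new_xs >= 0) & (new_xs < w)

    # Where the object ends up after the shift.
    moved = np.zeros_like(mask)
    moved[new_ys[keep], new_xs[keep]] = True

    # The hole is the part of the old footprint not covered by the moved object.
    hole = mask & ~moved

    out = image.copy()
    out[hole] = 0  # placeholder; an inpainting model would fill this region
    out[new_ys[keep], new_xs[keep]] = image[ys[keep], xs[keep]]
    return out, hole
```

The returned `hole` mask is what would be handed to the inpainting stage so that the background behind the object's old position can be regenerated.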
Here are the results from applying the algorithm to sample images:
Input Image | Segmentation (Task 1) | Shifted Object (Task 2) |
---|---|---|
Data | res1 | res2 |
Task 1: The object is highlighted in red.
Before segmentation
After segmentation
Task 2: The object is shifted as per the user-defined x and y values.
Complex Object Boundaries: When objects have intricate boundaries or are partially occluded, the segmentation mask may be imperfect.
Shifting Artifacts: When an object is moved across a complex background, regenerating the vacated area so that it looks natural is difficult.
Fine-tuning Models: Experimenting with fine-tuning techniques to improve segmentation precision.
Seamless Background Inpainting: Using advanced inpainting techniques to handle background reconstruction after object movement.
If you'd like to contribute to this project, please follow these steps:
This project is licensed under the MIT License. See the LICENSE file for details.