Marqo Ecommerce Embedding Models

In this work, we introduce two state-of-the-art embedding models for ecommerce products: Marqo-Ecommerce-B and Marqo-Ecommerce-L.

The benchmarking results show that the Marqo-Ecommerce models consistently outperformed all other models across various metrics. Specifically, marqo-ecommerce-L achieved an average improvement of 17.6% in MRR and 20.5% in nDCG@10 over the current best open source model, ViT-SO400M-14-SigLIP, across all three tasks in the marqo-ecommerce-hard dataset. Compared with the best private model, Amazon-Titan-Multimodal, it showed an average improvement of 38.9% in MRR and 45.1% in nDCG@10 across all three tasks, and 35.9% in Recall across the Text-to-Image tasks, in the marqo-ecommerce-hard dataset.

More benchmarking results can be found below.

Released content:

Marqo-Ecommerce-B and Marqo-Ecommerce-L embedding models
GoogleShopping-1m and AmazonProducts-3m for evaluation
Evaluation code

Models

Embedding Model	#Params (m)	Dimension	HuggingFace	Download .pt	Single Batch Text Inference (A10g)	Single Batch Image Inference (A10g)
Marqo-Ecommerce-B	203	768	Marqo/marqo-ecommerce-embeddings-B	link	5.1ms	5.7ms
Marqo-Ecommerce-L	652	1024	Marqo/marqo-ecommerce-embeddings-L	link	10.3ms	11.0ms

Load from HuggingFace with OpenCLIP

To load the models with OpenCLIP, see below. The models are hosted on Hugging Face and loaded using OpenCLIP. This code can also be found in run_models.py.

pip install open_clip_torch

from PIL import Image
import open_clip
import requests
import torch

# Specify model from Hugging Face Hub
model_name = 'hf-hub:Marqo/marqo-ecommerce-embeddings-L'
model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms(model_name)
tokenizer = open_clip.get_tokenizer(model_name)

# Preprocess the image and tokenize text inputs
# Load an example image from a URL
img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-ecommerce-embeddings/refs/heads/main/images/dining-chairs.png', stream=True).raw)
image = preprocess_val(img).unsqueeze(0)
text = tokenizer(["dining chairs", "a laptop", "toothbrushes"])

# Perform inference
with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image, normalize=True)
    text_features = model.encode_text(text, normalize=True)

    # Calculate similarity probabilities
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

# Display the label probabilities
print("Label probs:", text_probs)
# [1.0000e+00, 8.3131e-12, 5.2173e-12]
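
The final step of the example above (scaling cosine similarities by 100 and applying a softmax) can be illustrated without downloading a model. The following is a minimal sketch using only the Python standard library; the three-dimensional vectors are made-up stand-ins for real embeddings, and the helper names are hypothetical, not part of OpenCLIP.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vectors standing in for real image/text embeddings (assumed values).
image_embedding = [0.9, 0.1, 0.0]
text_embeddings = {
    "dining chairs": [0.8, 0.2, 0.1],
    "a laptop": [0.1, 0.9, 0.2],
    "toothbrushes": [0.0, 0.3, 0.9],
}

labels = list(text_embeddings)
sims = [cosine_similarity(image_embedding, text_embeddings[name]) for name in labels]

# Same scaling factor as in the example above: 100 * similarity, then softmax.
probs = softmax([100.0 * s for s in sims])
best_label = labels[probs.index(max(probs))]

for name, sim, prob in zip(labels, sims, probs):
    print(f"{name}: cosine={sim:.3f} prob={prob:.3g}")
print("best match:", best_label)
```

Because the similarities are multiplied by 100 before the softmax, even a moderate gap in cosine similarity pushes almost all of the probability mass onto the best-matching label, which is consistent with the near one-hot output shown in the example.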

Load from HuggingFace with transformers

To load the models with Transformers, see below. The models are hosted on Hugging Face and loaded using Transformers.

from transformers import AutoModel, AutoProcessor
import torch
from PIL import Image
import requests

model_name = 'Marqo/marqo-ecommerce-embeddings-L'
# model_name = 'Marqo/marqo-ecommerce-embeddings-B'

model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)

img = Image.open(requests.get('https://raw.githubusercontent.com/marqo-ai/marqo-ecommerce-embeddings/refs/heads/main/images/dining-chairs.png', stream=True).raw).convert("RGB")
image = [img]
text = ["dining chairs", "a laptop", "toothbrushes"]
processed = processor(text=text, images=image, padding='max_length', return_tensors="pt")
processor.image_processor.do_rescale = False
with torch.no_grad():
    image_features = model.get_image_features(processed['pixel_values'], normalize=True)
    text_features = model.get_text_features(processed['input_ids'], normalize=True)

    text_probs = (100 * image_features @ text_features.T).softmax(dim=-1)

print(text_probs)
# [1.0000e+00, 8.3131e-12, 5.2173e-12]

Evaluation

Generalized Contrastive Learning (GCL) is used for the evaluation. The following code can also be found in scripts.

git clone https://github.com/marqo-ai/GCL

Install the packages required by GCL.

1. GoogleShopping-Text2Image Retrieval.

cd ./GCL
# Hub ID as listed in the model table above
MODEL=hf-hub:Marqo/marqo-ecommerce-embeddings-B
outdir=MarqoModels/GE/marqo-ecommerce-B/gs-title2image
mkdir -p $outdir
hfdataset=Marqo/google-shopping-general-eval
python evals/eval_hf_datasets_v1.py \
      --model_name $MODEL \
      --hf-dataset $hfdataset \
      --output-dir $outdir \
      --batch-size 1024 \
      --num_workers 8 \
      --left-key "['title']" \
      --right-key "['image']" \
      --img-or-txt "[['txt'], ['img']]" \
      --left-weight "[1]" \
      --right-weight "[1]" \
      --run-queries-cpu \
      --top-q 4000 \
      --doc-id-key item_ID \
      --context-length "[[64], [0]]"

2. GoogleShopping-Category2Image Retrieval.

cd ./GCL
# Hub ID as listed in the model table above
MODEL=hf-hub:Marqo/marqo-ecommerce-embeddings-B
outdir=MarqoModels/GE/marqo-ecommerce-B/gs-cat2image
mkdir -p $outdir
hfdataset=Marqo/google-shopping-general-eval
python evals/eval_hf_datasets_v1.py \
      --model_name $MODEL \
      --hf-dataset $hfdataset \
      --output-dir $outdir \
      --batch-size 1024 \
      --num_workers 8 \
      --left-key "['query']" \
      --right-key "['image']" \
      --img-or-txt "[['txt'], ['img']]" \
      --left-weight "[1]" \
      --right-weight "[1]" \
      --run-queries-cpu \
      --top-q 4000 \
      --doc-id-key item_ID \
      --context-length "[[64], [0]]"

3. AmazonProducts-Text2Image Retrieval.

cd ./GCL
# Hub ID as listed in the model table above
MODEL=hf-hub:Marqo/marqo-ecommerce-embeddings-B
outdir=MarqoModels/GE/marqo-ecommerce-B/ap-title2image
mkdir -p $outdir
hfdataset=Marqo/amazon-products-eval
python evals/eval_hf_datasets_v1.py \
      --model_name $MODEL \
      --hf-dataset $hfdataset \
      --output-dir $outdir \
      --batch-size 1024 \
      --num_workers 8 \
      --left-key "['title']" \
      --right-key "['image']" \
      --img-or-txt "[['txt'], ['img']]" \
      --left-weight "[1]" \
      --right-weight "[1]" \
      --run-queries-cpu \
      --top-q 4000 \
      --doc-id-key item_ID \
      --context-length "[[64], [0]]"
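
The three evaluation commands differ only in the dataset, the left key, and the output directory. As a convenience, the shared flags can be factored out. The sketch below only builds and prints the argument lists; the helper name `build_eval_command` and the `TASKS` mapping are hypothetical, while every flag and value is copied from the commands above. It assumes it would be run from inside the GCL checkout.

```python
import shlex

# Flags shared by all three evaluation runs, copied from the commands above.
COMMON_FLAGS = [
    "--batch-size", "1024",
    "--num_workers", "8",
    "--right-key", "['image']",
    "--img-or-txt", "[['txt'], ['img']]",
    "--left-weight", "[1]",
    "--right-weight", "[1]",
    "--run-queries-cpu",
    "--top-q", "4000",
    "--doc-id-key", "item_ID",
    "--context-length", "[[64], [0]]",
]

# Task name -> (Hugging Face dataset, text field used as the query side).
TASKS = {
    "gs-title2image": ("Marqo/google-shopping-general-eval", "title"),
    "gs-cat2image": ("Marqo/google-shopping-general-eval", "query"),
    "ap-title2image": ("Marqo/amazon-products-eval", "title"),
}

def build_eval_command(task, model, out_root):
    """Return the argv list for one evaluation task (hypothetical helper)."""
    dataset, left_field = TASKS[task]
    return [
        "python", "evals/eval_hf_datasets_v1.py",
        "--model_name", model,
        "--hf-dataset", dataset,
        "--output-dir", f"{out_root}/{task}",
        "--left-key", f"['{left_field}']",
        *COMMON_FLAGS,
    ]

# Model ID taken from the HuggingFace column of the model table.
model_id = "hf-hub:Marqo/marqo-ecommerce-embeddings-B"
for task_name in TASKS:
    command = build_eval_command(task_name, model_id, "MarqoModels/GE/marqo-ecommerce-B")
    print(shlex.join(command))
```

Each printed line can be pasted into a shell, or the list can be passed to `subprocess.run(command, check=True)` once the output directory exists.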

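Detailed Performance

Our benchmarking process was divided into two distinct regimes, each using a different dataset of ecommerce product listings: marqo-ecommerce-hard and marqo-ecommerce-easy. Both datasets contain product images and text and differ only in size. The "easy" dataset is approximately 10-30 times smaller (200k vs 4M products) and was designed to accommodate rate-limited models, specifically Cohere-Embeddings-v3 and GCP-Vertex (with limits of 0.66 rps and 2 rps respectively). The "hard" dataset represents the true challenge, since it contains four million ecommerce product listings and is more representative of real-world ecommerce search scenarios.

Within both of these scenarios, the models were benchmarked on three different tasks:

Google Shopping Text-to-Image
Google Shopping Category-to-Image
Amazon Products Text-to-Image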
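
The result tables that follow report mAP, R@10 or P@10, MRR, and nDCG@10. As a rough reminder of what the rank-based metrics measure, here is a minimal sketch with binary relevance for a single query. It is not GCL's implementation, and the function names are hypothetical; the reported numbers are averages of such per-query values over all queries.

```python
import math

def reciprocal_rank(ranked_ids, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids, relevant, k=10):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)

def precision_at_k(ranked_ids, relevant, k=10):
    """Fraction of the top k slots filled by relevant items."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / k

def ndcg_at_k(ranked_ids, relevant, k=10):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Toy query: four retrieved items, two of which are relevant (made-up IDs).
ranking = ["a", "b", "c", "d"]
relevant_items = {"b", "d"}
print("RR:", reciprocal_rank(ranking, relevant_items))
print("R@2:", recall_at_k(ranking, relevant_items, k=2))
print("P@2:", precision_at_k(ranking, relevant_items, k=2))
print("nDCG@10:", round(ndcg_at_k(ranking, relevant_items, k=10), 4))
```

When a query has exactly one relevant item, average precision equals the reciprocal rank, which may explain why the mAP and MRR columns nearly coincide in the Text2Image tables while they diverge in the Category2Image tables.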

Marqo-Ecommerce-Hard

Marqo-Ecommerce-Hard covers the comprehensive evaluation conducted on the full 4 million product dataset, highlighting the robust performance of our models in a real-world context.

GoogleShopping-Text2Image Retrieval.

Embedding Model	mAP	R@10	MRR	nDCG@10
Marqo-Ecommerce-L	0.682	0.878	0.683	0.726
Marqo-Ecommerce-B	0.623	0.832	0.624	0.668
ViT-SO400M-14-SigLip	0.573	0.763	0.574	0.613
ViT-L-16-SigLip	0.540	0.722	0.540	0.577
ViT-B-16-SigLip	0.476	0.660	0.477	0.513
Amazon-Titan-Multimodal	0.475	0.648	0.475	0.509
Jina-V1-CLIP	0.285	0.402	0.285	0.306

GoogleShopping-Category2Image Retrieval.

Embedding Model	mAP	P@10	MRR	nDCG@10
Marqo-Ecommerce-L	0.463	0.652	0.822	0.666
Marqo-Ecommerce-B	0.423	0.629	0.810	0.644
ViT-SO400M-14-SigLip	0.352	0.516	0.707	0.529
ViT-L-16-SigLip	0.324	0.497	0.687	0.509
ViT-B-16-SigLip	0.277	0.458	0.660	0.473
Amazon-Titan-Multimodal	0.246	0.429	0.642	0.446
Jina-V1-CLIP	0.123	0.275	0.504	0.294

AmazonProducts-Text2Image Retrieval.

Embedding Model	mAP	R@10	MRR	nDCG@10
Marqo-Ecommerce-L	0.658	0.854	0.663	0.703
Marqo-Ecommerce-B	0.592	0.795	0.597	0.637
ViT-SO400M-14-SigLip	0.560	0.742	0.564	0.599
ViT-L-16-SigLip	0.544	0.715	0.548	0.580
ViT-B-16-SigLip	0.480	0.650	0.484	0.515
Amazon-Titan-Multimodal	0.456	0.627	0.457	0.491
Jina-V1-CLIP	0.265	0.378	0.266	0.285
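
The headline improvements quoted in the introduction can be approximately reproduced from the three tables above. The sketch below assumes the figures are the mean of the per-task relative gains (model score divided by baseline score, minus one); the exact averaging procedure used by the authors is not stated, so treat this as a plausibility check rather than the official calculation. The function name is hypothetical.

```python
# (MRR, nDCG@10) per task, copied from the three Marqo-Ecommerce-Hard tables
# in the order: GoogleShopping-Text2Image, GoogleShopping-Category2Image,
# AmazonProducts-Text2Image.
HARD_RESULTS = {
    "Marqo-Ecommerce-L": [(0.683, 0.726), (0.822, 0.666), (0.663, 0.703)],
    "ViT-SO400M-14-SigLip": [(0.574, 0.613), (0.707, 0.529), (0.564, 0.599)],
    "Amazon-Titan-Multimodal": [(0.475, 0.509), (0.642, 0.446), (0.457, 0.491)],
}
MRR, NDCG = 0, 1  # column indices into the tuples above

def mean_relative_gain(model, baseline, metric):
    """Average over tasks of (model score / baseline score - 1)."""
    pairs = zip(HARD_RESULTS[model], HARD_RESULTS[baseline])
    gains = [m[metric] / b[metric] - 1.0 for m, b in pairs]
    return sum(gains) / len(gains)

for baseline in ("ViT-SO400M-14-SigLip", "Amazon-Titan-Multimodal"):
    mrr_gain = mean_relative_gain("Marqo-Ecommerce-L", baseline, MRR)
    ndcg_gain = mean_relative_gain("Marqo-Ecommerce-L", baseline, NDCG)
    print(f"vs {baseline}: MRR +{mrr_gain:.1%}, nDCG@10 +{ndcg_gain:.1%}")
```

Under this assumption the results land within about a tenth of a percentage point of the quoted 17.6% / 20.5% (versus ViT-SO400M-14-SigLip) and 38.9% / 45.1% (versus Amazon-Titan-Multimodal); small differences are expected because the table values are rounded to three decimals.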

Marqo-Ecommerce-Easy

As mentioned, our benchmarking process was divided into two distinct scenarios: marqo-ecommerce-hard and marqo-ecommerce-easy. This section covers the latter, which features a corpus 10-30 times smaller and was designed to accommodate rate-limited models. It presents the comprehensive evaluation conducted on the full 200k products across the two datasets. In addition to the models already benchmarked above, these benchmarks also include Cohere-Embeddings-v3 and GCP-Vertex.

GoogleShopping-Text2Image Retrieval.

Embedding Model	mAP	R@10	MRR	nDCG@10
Marqo-Ecommerce-L	0.879	0.971	0.879	0.901
Marqo-Ecommerce-B	0.842	0.961	0.842	0.871
ViT-SO400M-14-SigLip	0.792	0.935	0.792	0.825
GCP-Vertex	0.740	0.910	0.740	0.779
ViT-L-16-SigLip	0.754	0.907	0.754	0.789
ViT-B-16-SigLip	0.701	0.870	0.701	0.739
Amazon-Titan-Multimodal	0.694	0.868	0.693	0.733
Jina-V1-CLIP	0.480	0.638	0.480	0.511
Cohere-Embeddings-v3	0.358	0.515	0.358	0.389

GoogleShopping-Category2Image Retrieval.

Embedding Model	mAP	P@10	MRR	nDCG@10
Marqo-Ecommerce-L	0.515	0.358	0.764	0.590
Marqo-Ecommerce-B	0.479	0.336	0.744	0.558
ViT-SO400M-14-SigLip	0.423	0.302	0.644	0.487
GCP-Vertex	0.417	0.298	0.636	0.481
ViT-L-16-SigLip	0.392	0.281	0.627	0.458
ViT-B-16-SigLip	0.347	0.252	0.594	0.414
Amazon-Titan-Multimodal	0.308	0.231	0.558	0.377
Jina-V1-CLIP	0.175	0.122	0.369	0.229
Cohere-Embeddings-v3	0.136	0.110	0.315	0.178

AmazonProducts-Text2Image Retrieval.

Embedding Model	mAP	R@10	MRR	nDCG@10
Marqo-Ecommerce-L	0.92	0.978	0.928	0.940
Marqo-Ecommerce-B	0.897	0.967	0.897	0.914
ViT-SO400M-14-SigLip	0.860	0.954	0.860	0.882
ViT-L-16-SigLip	0.842	0.940	0.842	0.865
GCP-Vertex	0.808	0.933	0.808	0.837
ViT-B-16-SigLip	0.797	0.917	0.797	0.825
Amazon-Titan-Multimodal	0.762	0.889	0.763	0.791
Jina-V1-CLIP	0.530	0.699	0.530	0.565
Cohere-Embeddings-v3	0.433	0.597	0.433	0.465

Citation

@software{zhu2024marqoecommembed_2024,
        author = {Tianyu Zhu and Jesse Clark},
        month = oct,
        title = {{Marqo Ecommerce Embeddings - Foundation Model for Product Embeddings}},
        url = {https://github.com/marqo-ai/marqo-ecommerce-embeddings/},
        version = {1.0.0},
        year = {2024}
}
