clip as service下載 - clip as service源代碼下載

clip as service

其他源碼

v0.8.3

下載

剪貼畫徽標：非結構化數據的數據結構

剪輯服務是一種用於嵌入圖像和文本的低延遲高尺度性服務。它可以輕鬆地將其作為微服務集成到神經搜索解決方案中。

⚡快速：將剪貼模型與tensorrt，onnx運行時和pytorch一起使用800qps ^[*]使用pytorch。根據請求和響應的非阻滯雙工流，專為大型數據和長期運行的任務而設計。

？彈性：具有自動負載平衡的單個GPU上的多個剪輯模型的水平擴展。

？易於使用：沒有學習曲線，客戶端和服務器上的簡約設計。圖像和句子嵌入的直觀且一致的API。

？現代：異步客戶支持。輕鬆在帶有TLS和壓縮的GRPC，HTTP，Websocket協議之間切換。

？集成：與包括Jina和Docarray在內的神經搜索生態系統的平穩整合。立即構建跨模式和多模式解決方案。

^{[*]使用GeForce RTX 3090上的默認配置（單副本，Pytorch no jit）。}

文本和圖像嵌入

通過https？

通過grpc？⚡⚡

curl 
-X POST https:// < your-inference-address > -http.wolf.jina.ai/post 
-H ' Content-Type: application/json ' 
-H ' Authorization: <your access token> ' 
-d ' {"data":[{"text": "First do it"}, 
    {"text": "then do it right"}, 
    {"text": "then do it better"}, 
    {"uri": "https://picsum.photos/200"}], 
    "execEndpoint":"/"} '

 # pip install clip-client
from clip_client import Client

c = Client (
    'grpcs://<your-inference-address>-grpc.wolf.jina.ai' ,
    credential = { 'Authorization' : '<your access token>' },
)

r = c . encode (
    [
        'First do it' ,
        'then do it right' ,
        'then do it better' ,
        'https://picsum.photos/200' ,
    ]
)
print ( r )

視覺推理

有四種基本的視覺推理技能：對象識別，對象計數，顏色識別和空間關係理解。讓我們嘗試一下：

您需要安裝jq （JSON處理器）以使結果優化。

圖像	通過https？
	curl -X POST https:// < your-inference-address > -http.wolf.jina.ai/post -H ' Content-Type: application/json ' -H ' Authorization: <your access token> ' -d ' {"data":[{"uri": "https://picsum.photos/id/1/300/300", "matches": [{"text": "there is a woman in the photo"}, {"text": "there is a man in the photo"}]}], "execEndpoint":"/rank"} ' \| jq " .data[].matches[] \| (.text, .scores.clip_score.value) " 給出： `"there is a woman in the photo" 0.626907229423523 "there is a man in the photo" 0.37309277057647705`
	curl -X POST https:// < your-inference-address > -http.wolf.jina.ai/post -H ' Content-Type: application/json ' -H ' Authorization: <your access token> ' -d ' {"data":[{"uri": "https://picsum.photos/id/133/300/300", "matches": [ {"text": "the blue car is on the left, the red car is on the right"}, {"text": "the blue car is on the right, the red car is on the left"}, {"text": "the blue car is on top of the red car"}, {"text": "the blue car is below the red car"}]}], "execEndpoint":"/rank"} ' \| jq " .data[].matches[] \| (.text, .scores.clip_score.value) " 給出： `"the blue car is on the left, the red car is on the right" 0.5232442617416382 "the blue car is on the right, the red car is on the left" 0.32878655195236206 "the blue car is below the red car" 0.11064132302999496 "the blue car is on top of the red car" 0.03732786327600479`
	curl -X POST https:// < your-inference-address > -http.wolf.jina.ai/post -H ' Content-Type: application/json ' -H ' Authorization: <your access token> ' -d ' {"data":[{"uri": "https://picsum.photos/id/102/300/300", "matches": [{"text": "this is a photo of one berry"}, {"text": "this is a photo of two berries"}, {"text": "this is a photo of three berries"}, {"text": "this is a photo of four berries"}, {"text": "this is a photo of five berries"}, {"text": "this is a photo of six berries"}]}], "execEndpoint":"/rank"} ' \| jq " .data[].matches[] \| (.text, .scores.clip_score.value) " 給出： `"this is a photo of three berries" 0.48507222533226013 "this is a photo of four berries" 0.2377079576253891 "this is a photo of one berry" 0.11304923892021179 "this is a photo of five berries" 0.0731358453631401 "this is a photo of two berries" 0.05045759305357933 "this is a photo of six berries" 0.04057715833187103`

文件

安裝

剪貼即服務由兩個可以獨立安裝的Python clip-server和clip-client組成。兩者都需要Python 3.7+。

安裝服務器

Pytorch運行時⚡	ONNX運行時⚡⚡	Tensorrt運行時⚡⚡⚡
pip install clip-server	pip install " clip-server[onnx] "	pip install nvidia-pyindex pip install " clip-server[tensorrt] "

您還可以在Google Colab上託管服務器，利用其免費的GPU/TPU。

安裝客戶端

pip install clip-client

快速檢查

安裝後，您可以運行簡單的連接檢查。

CS	命令	期望輸出
伺服器	python -m clip_server
客戶	from clip_client import Client c = Client ( 'grpc://0.0.0.0:23456' ) c . profile ()

您可以將0.0.0.0更改為Intranet或公共IP地址，以測試私人和公共網絡的連接。

開始

基本用法

啟動服務器： python -m clip_server 。記住其地址和端口。

創建客戶端：

 from clip_client import Client

 c = Client ( 'grpc://0.0.0.0:51000' )

要嵌入句子：

 r = c . encode ([ 'First do it' , 'then do it right' , 'then do it better' ])

print ( r . shape )  # [3, 512]

獲取圖像嵌入：

 r = c . encode ([ 'apple.png' ,  # local image 
              'https://clip-as-service.jina.ai/_static/favicon.png' ,  # remote image
              'data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o/XBs/fNwfjZ0frl3/zy7////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAkAABAALAAAAAAQABAAAAVVICSOZGlCQAosJ6mu7fiyZeKqNKToQGDsM8hBADgUXoGAiqhSvp5QAnQKGIgUhwFUYLCVDFCrKUE1lBavAViFIDlTImbKC5Gm2hB0SlBCBMQiB0UjIQA7' ])  # in image URI

print ( r . shape )  # [3, 512]

可以在文檔中找到更全面的服務器和客戶用戶指南。

文本到圖像跨模式搜索10行

讓我們使用剪貼畫AS服務構建文本對圖像搜索。即，用戶可以輸入句子，並且程序返回匹配的圖像。我們將使用完全看起來像數據集和docarray軟件包。請注意，docArray包含在clip-client中作為上游依賴性，因此您無需單獨安裝它。

加載圖像

首先，我們加載圖像。您可以簡單地將它們從Jina Cloud中拉出來：

 from docarray import DocumentArray

da = DocumentArray . pull ( 'ttl-original' , show_progress = True , local_cache = True )

或下載TTL數據集，解壓縮，手動加載

另外，您可以完全看上去像官方網站，解壓縮和加載圖像：

 from docarray import DocumentArray

da = DocumentArray . from_files ([ 'left/*.jpg' , 'right/*.jpg' ])

該數據集包含12,032張圖像，因此拉動可能需要一段時間。完成後，您可以將其可視化並獲得這些圖像的第一個味道：

 da . plot_image_sprites ()

圖像精靈的可視化完全看起來像數據集

編碼圖像

使用python -m clip_server啟動服務器。假設使用GRPC協議為0.0.0.0:51000 （運行服務器後您將獲得此信息）。

創建一個Python客戶端腳本：

 from clip_client import Client

c = Client ( server = 'grpc://0.0.0.0:51000' )

da = c . encode ( da , show_progress = True )

根據您的GPU和客戶端服務器網絡，可能需要一段時間才能嵌入12K圖像。就我而言，大約花了兩分鐘。

下載預編碼的數據集

如果您不耐煩或沒有GPU，那麼等待可能是地獄。在這種情況下，您可以簡單地拉出我們的預編碼圖像數據集：

 from docarray import DocumentArray

da = DocumentArray . pull ( 'ttl-embedding' , show_progress = True , local_cache = True )

通過句子搜索

讓我們構建一個簡單的提示，允許用戶鍵入句子：

 while True :
    vec = c . encode ([ input ( 'sentence> ' )])
    r = da . find ( query = vec , limit = 9 )
    r [ 0 ]. plot_image_sprites ()

展示櫃

現在，您可以輸入任意英語句子，並查看前9個匹配圖像。搜索是快速而本能的。讓我們玩一些樂趣：

“快樂的馬鈴薯”	“超級邪惡的人工智能”	“一個喜歡他的漢堡的傢伙”

“貓教授非常認真”	“自我工程師與父母一起生活”	“不會有明天，所以讓我們吃不健康”

讓我們保存下一個示例的嵌入結果：

 da . save_binary ( 'ttl-image' )

圖像到文本的跨模式搜索10行

我們還可以切換最後一個程序的輸入和輸出，以實現圖像到文本搜索。確切地說，給定查詢圖像找到最能描述圖像的句子。

讓我們使用“驕傲和偏見”一書中的所有句子。

 from docarray import Document , DocumentArray

d = Document ( uri = 'https://www.gutenberg.org/files/1342/1342-0.txt' ). load_uri_to_text ()
da = DocumentArray (
    Document ( text = s . strip ()) for s in d . text . replace ( ' r n ' , '' ). split ( '.' ) if s . strip ()
)

讓我們看看我們得到的東西：

 da . summary ()

            Documents Summary            
                                         
  Length                 6403            
  Homogenous Documents   True            
  Common Attributes      ('id', 'text')  
                                         
                     Attributes Summary                     
                                                            
  Attribute   Data type   #Unique values   Has empty value  
 ────────────────────────────────────────────────────────── 
  id          ('str',)    6403             False            
  text        ('str',)    6030             False

編碼句子

現在編碼這6,403個句子，根據您的GPU和網絡可能需要10秒或更少的時間：

 from clip_client import Client

c = Client ( 'grpc://0.0.0.0:51000' )

r = c . encode ( da , show_progress = True )

下載預編碼的數據集

同樣，對於那些不耐煩或沒有GPU的人，我們已經準備了一個預編碼的文本數據集：

 from docarray import DocumentArray

da = DocumentArray . pull ( 'ttl-textual' , show_progress = True , local_cache = True )

通過圖像搜索

讓我們加載我們先前存儲的圖像嵌入，隨機採樣10個圖像文檔，然後找到每個圖像文檔的最接近1個鄰居。

 from docarray import DocumentArray

img_da = DocumentArray . load_binary ( 'ttl-image' )

for d in img_da . sample ( 10 ):
    print ( da . find ( d . embedding , limit = 1 )[ 0 ]. text )

展示櫃

愉快的時光！注意，與上一個示例不同，這裡的輸入是圖像，句子是輸出。所有句子均來自“驕傲和偏見”一書。


此外，他的容貌有真相	加德納笑了	他叫什麼名字	但是，到下午茶時間，劑量已經足夠了，先生	你看起來不好


“一個gamester！”她哭了	如果您在鈴鐺上提到我的名字，您將被關注	沒關係Lizzy小姐的頭髮	伊麗莎白很快將成為先生的妻子	我前一天晚上看到了他們

通過剪輯模型等級圖像文本匹配

從0.3.0剪輯中，服務添加了一個新的/rank終點，該端點根據其在剪輯模型中的聯合可能性重新排列了交叉模式匹配。例如，給定一個圖像文檔，其中一些預定義的句子匹配如下：

 from clip_client import Client
from docarray import Document

c = Client ( server = 'grpc://0.0.0.0:51000' )
r = c . rank (
    [
        Document (
            uri = '.github/README-img/rerank.png' ,
            matches = [
                Document ( text = f'a photo of a { p } ' )
                for p in (
                    'control room' ,
                    'lecture room' ,
                    'conference room' ,
                    'podium indoor' ,
                    'television studio' ,
                )
            ],
        )
    ]
)

print ( r [ '@m' , [ 'text' , 'scores__clip_score__value' ]])

 [['a photo of a television studio', 'a photo of a conference room', 'a photo of a lecture room', 'a photo of a control room', 'a photo of a podium indoor'], 
[0.9920725226402283, 0.006038925610482693, 0.0009973491542041302, 0.00078492151806131, 0.00010626466246321797]]

現在可以看到a photo of a television studio排名最高， clip_score得分為0.992 。在實踐中，可以使用此端點來重新將另一個搜索系統的匹配結果重新排列，以提高跨模式搜索質量。

通過剪輯模型等級文本圖像匹配

在DALL·E流項目中，剪輯被要求對DALL·e的生成結果進行排名。它的執行人包裹在夾子clip-client調用.arank() - .rank()的異步版本：

 from clip_client import Client
from jina import Executor , requests , DocumentArray


class ReRank ( Executor ):
    def __init__ ( self , clip_server : str , ** kwargs ):
        super (). __init__ ( ** kwargs )
        self . _client = Client ( server = clip_server )

    @ requests ( on = '/' )
    async def rerank ( self , docs : DocumentArray , ** kwargs ):
        return await self . _client . arank ( docs )

dalle流的夾子服務

感興趣？這只是刮擦剪貼即服務能力的表面。閱讀我們的文檔以了解更多信息。

支持

加入我們的Discord社區，與其他社區成員聊天。
觀看我們的工程所有人，以學習Jina的新功能，並使用最新的AI技術保持最新狀態。
在我們的YouTube頻道上訂閱最新視頻教程

加入我們

夾子服務由Jina AI支持，並在Apache-2.0下獲得許可。我們正在積極僱用AI工程師，解決方案工程師，以開放源代碼構建下一個神經搜索生態系統。

展開

附加信息

版本 v0.8.3
類型其他源碼
更新時間 2025-03-02
大小 11.55MB
來自於 Github

相關應用

Inf CLIP

2024-11-03
full service漢化版

2023-10-20
作為洞

2023-05-29
as hole軟體

2023-05-29
頭AS代碼

2022-07-24
夾鬥

2011-05-24

爲您推薦

chat.petals.dev

其他源碼

1.0.0
GPT Prompt Templates

其他源碼

1.0.0
GPTyped

其他源碼

GPTyped 1.0.5
waymo open dataset

其他源碼

December 2023 Update
chat.petals.dev

其他源碼

1.0.0
Sunamu

其他源碼

Release 2.2.0
waymo open dataset

其他源碼

December 2023 Update
termwind

其他類別

v2.3.0
wp functions

其他類別

1.0.0

相關資訊全部

圖像	通過https？
	curl -X POST https:// < your-inference-address > -http.wolf.jina.ai/post -H ' Content-Type: application/json ' -H ' Authorization: <your access token> ' -d ' {"data":[{"uri": "https://picsum.photos/id/1/300/300", "matches": [{"text": "there is a woman in the photo"}, {"text": "there is a man in the photo"}]}], "execEndpoint":"/rank"} ' \| jq " .data[].matches[] \| (.text, .scores.clip_score.value) " 給出： `"there is a woman in the photo" 0.626907229423523 "there is a man in the photo" 0.37309277057647705`
	curl -X POST https:// < your-inference-address > -http.wolf.jina.ai/post -H ' Content-Type: application/json ' -H ' Authorization: <your access token> ' -d ' {"data":[{"uri": "https://picsum.photos/id/133/300/300", "matches": [ {"text": "the blue car is on the left, the red car is on the right"}, {"text": "the blue car is on the right, the red car is on the left"}, {"text": "the blue car is on top of the red car"}, {"text": "the blue car is below the red car"}]}], "execEndpoint":"/rank"} ' \| jq " .data[].matches[] \| (.text, .scores.clip_score.value) " 給出： `"the blue car is on the left, the red car is on the right" 0.5232442617416382 "the blue car is on the right, the red car is on the left" 0.32878655195236206 "the blue car is below the red car" 0.11064132302999496 "the blue car is on top of the red car" 0.03732786327600479`
	curl -X POST https:// < your-inference-address > -http.wolf.jina.ai/post -H ' Content-Type: application/json ' -H ' Authorization: <your access token> ' -d ' {"data":[{"uri": "https://picsum.photos/id/102/300/300", "matches": [{"text": "this is a photo of one berry"}, {"text": "this is a photo of two berries"}, {"text": "this is a photo of three berries"}, {"text": "this is a photo of four berries"}, {"text": "this is a photo of five berries"}, {"text": "this is a photo of six berries"}]}], "execEndpoint":"/rank"} ' \| jq " .data[].matches[] \| (.text, .scores.clip_score.value) " 給出： `"this is a photo of three berries" 0.48507222533226013 "this is a photo of four berries" 0.2377079576253891 "this is a photo of one berry" 0.11304923892021179 "this is a photo of five berries" 0.0731358453631401 "this is a photo of two berries" 0.05045759305357933 "this is a photo of six berries" 0.04057715833187103`