embodied agents下載 - embodied agents原始碼下載

文檔：文檔

簡單的機器人代理範例：
使用 SimplerEnv 的模擬範例：
？使用 OpenVLA 的汽車代理：
⏺️在機器人上記錄資料集

？支援、討論和操作方法：

更新：

2024 年 8 月 28 日，體現代理 v1.2

新的文檔網站上線了！
新增了在機器人上本地記錄資料集的功能。
新增多個新的感官代理，即深度估計、物件偵測、託管公共 API 端點的影像分割。和一個簡單的mbodied來嘗試它們。
新增了自動代理程式以進行動態代理選擇。

2024 年 6 月 30 日，embodied-agents v1.0 ：

新增了支援 OpenVLA 的 Motor Agent，並託管免費 API 端點。
新增了支援 ie 3D 物件姿勢檢測的 Sensory Agent。
改進了自動資料集記錄。
代理現在可以對 API 伺服器（即 Gradio、vLLM）進行遠端操作呼叫。
已修復錯誤並改進效能。
PyPI 專案已重新命名為mbodied 。

具身代理人

embodied Agents是一個工具包，只需幾行程式碼即可將大型多模態模型整合到現有的機器人堆疊中。它提供一致性、可靠性、可擴展性，並且可配置到任何觀察和操作空間。

範例類

Sample 類別是用於序列化、記錄和操作任意資料的基本模型。它被設計為可擴展、靈活且強類型。透過將觀察或操作物件包裝在 Sample 類別中，您將能夠輕鬆地在下列物件之間進行轉換：

用於創建新健身房環境的健身房空間。
用於插入 ML 模型的扁平列表、陣列或張量。
具有語義搜尋功能的 HuggingFace 資料集。
Pydantic BaseModel 用於可靠且快速的 json 序列化/反序列化。

要了解有關具體代理的所有可能性的更多信息，請查看文檔

你可知道

您可以將Sample或 Dict 清單pack到單一Sample或Dict中並相應地unpack嗎？
只要為 Sample 類別提供有效的 json 模式，您就可以將任何 python 結構unflatten為Sample類別嗎？

API參考

建立樣本

建立 Sample 只需要使用Sample類別包裝一個 Python 字典。此外，它們還可以由 kwargs、Gym Spaces 和 Tensors 等製成。

 from mbodied . types . sample import Sample
# Creating a Sample instance
sample = Sample ( observation = [ 1 , 2 , 3 ], action = [ 4 , 5 , 6 ])

# Flattening the Sample instance
flat_list = sample . flatten ()
print ( flat_list ) # Output: [1, 2, 3, 4, 5, 6]

# Generating a simplified JSON schema
>> > schema = sample . schema ()
{ 'type' : 'object' , 'properties' : { 'observation' : { 'type' : 'array' , 'items' : { 'type' : 'integer' }}, 'action' : { 'type' : 'array' , 'items' : { 'type' : 'integer' }}}}

# Unflattening a list into a Sample instance
Sample . unflatten ( flat_list , schema )
>> > Sample ( observation = [ 1 , 2 , 3 ], action = [ 4 , 5 , 6 ])

使用 Pydantic 進行序列化和反序列化

Sample 類別利用 Pydantic 強大的序列化和反序列化功能，讓您可以輕鬆地在 Sample 實例和 JSON 之間進行轉換。

 # Serialize the Sample instance to JSON
sample = Sample ( observation = [ 1 , 2 , 3 ], action = [ 4 , 5 , 6 ])
json_data = sample . model_dump_json ()
print ( json_data ) # Output: '{"observation": [1, 2, 3], "action": [4, 5, 6]}'

# Deserialize the JSON data back into a Sample instance
json_data = '{"observation": [1, 2, 3], "action": [4, 5, 6]}'
sample = Sample . model_validate ( from_json ( json_data ))
print ( sample ) # Output: Sample(observation=[1, 2, 3], action=[4, 5, 6])

轉換為不同的容器

 # Converting to a dictionary
sample_dict = sample . to ( "dict" )
print ( sample_dict ) # Output: {'observation': [1, 2, 3], 'action': [4, 5, 6]}

# Converting to a NumPy array
sample_np = sample . to ( "np" )
print ( sample_np ) # Output: array([1, 2, 3, 4, 5, 6])

# Converting to a PyTorch tensor
sample_pt = sample . to ( "pt" )
print ( sample_pt ) # Output: tensor([1, 2, 3, 4, 5, 6])

健身房空間整合

 gym_space = sample . space ()
print ( gym_space )
# Output: Dict('action': Box(-inf, inf, (3,), float64), 'observation': Box(-inf, inf, (3,), float64))

有關更多詳細信息，請參閱sample.py。

訊息

Message 類別代表單一完成樣本空間。它可以是文字、圖像、文字/圖像清單、樣本或其他形式。 Message 類別旨在處理各種類型的內容並支援不同的角色，例如使用者、助理或系統。

您可以透過多種方式建立Message 。它們都可以被 mbodi 的後端理解。

 from mbodied . types . message import Message

Message ( role = "user" , content = "example text" )
Message ( role = "user" , content = [ "example text" , Image ( "example.jpg" ), Image ( "example2.jpg" )])
Message ( role = "user" , content = [ Sample ( "Hello" )])

後端

Backend 類別是 Backend 實作的抽象基底類別。它提供了與不同後端服務互動所需的基本結構和方法，例如用於根據給定訊息產生完成的 API 呼叫。有關如何實現各種後端的信息，請參閱後端目錄。

代理人

Agent 是下面列出的各種代理程式的基底類別。它提供了一個用於建立代理的模板，該代理可以與遠端後端/伺服器通訊並可選擇記錄其操作和觀察結果。

語言代理

語言代理可以連接到您選擇的不同後端或轉換器。它包括記錄對話、管理上下文、查找訊息、忘記訊息、儲存上下文以及根據指令和圖像採取行動的方法。

原生支援 API 服務：OpenAI、Anthropic、vLLM、Ollama、HTTPX 或任何 gradio 端點。更多即將推出！

要將 OpenAI 用於您的機器人後端：

 from mbodied . agents . language import LanguageAgent

agent = LanguageAgent ( context = "You are a robot agent." , model_src = "openai" )

執行指令：

 instruction = "pick up the fork"
response = robot_agent . act ( instruction , image )

語言代理也可以連接到 vLLM。例如，假設您正在 1.2.3.4:1234 上執行 vLLM 伺服器 Mistral-7B。您需要做的就是：

 agent = LanguageAgent (
    context = context ,
    model_src = "openai" ,
    model_kwargs = { "api_key" : "EMPTY" , "base_url" : "http://1.2.3.4:1234/v1" },
)
response = agent . act ( "Hello, how are you?" , model = "mistralai/Mistral-7B-Instruct-v0.3" )

使用 Ollama 的範例：

 agent = LanguageAgent (
    context = "You are a robot agent." , model_src = "ollama" ,
    model_kwargs = { "endpoint" : "http://localhost:11434/api/chat" }
)
response = agent . act ( "Hello, how are you?" , model = "llama3.1" )

汽車代理

Motor Agent 與 Language Agent 類似，但它不會傳回字串，而是始終傳回Motion 。 Motor Agent 通常由機器人變壓器模型提供支持，即 OpenVLA、RT1、Octo 等。然而，有些（例如 OpenVLA）在沒有量化的情況下可能難以運作。請參閱 OpenVLA 代理程式和範例 OpenVLA 伺服器

感覺劑

這些代理程式與環境互動以收集感測器數據。它們始終返回SensorReading ，它可以是各種形式的處理後的感官輸入，例如影像、深度資料或音訊訊號。

目前，我們有：

深度估計
物體偵測
影像分割

處理機器人感測器資訊的代理。

自動代理

自動代理根據任務和模型動態選擇並初始化正確的代理。

 from mbodied . agents . auto . auto_agent import AutoAgent

# This makes it a LanguageAgent
agent = AutoAgent ( task = "language" , model_src = "openai" )
response = agent . act ( "What is the capital of France?" )

# This makes it a motor agent: OpenVlaAgent
auto_agent = AutoAgent ( task = "motion-openvla" , model_src = "https://api.mbodi.ai/community-models/" )
action = auto_agent . act ( "move hand forward" , Image ( size = ( 224 , 224 )))

# This makes it a sensory agent: DepthEstimationAgent
auto_agent = AutoAgent ( task = "sense-depth-estimation" , model_src = "https://api.mbodi.ai/sense/" )
depth = auto_agent . act ( image = Image ( size = ( 224 , 224 )))

或者，您也可以使用 auto_agent 中的get_agent方法。

 language_agent = get_agent ( task = "language" , model_src = "openai" )

議案

Motion_controls 模組定義了各種運動來控制機器人作為 Pydantic 模型。它們也是Sample的子類，因此擁有上述Sample的所有功能。這些控制涵蓋了一系列動作，從簡單的關節運動到複雜的姿勢和完整的機器人控制。

機器人

您可以透過對 Robot 進行子類化來非常輕鬆地整合自訂機器人硬體。您只需要實作do()函數即可執行操作（如果您想在機器人上記錄資料集，還需要一些附加方法）。在我們的範例中，我們使用模擬機器人。我們還有一個 XArm 機器人作為範例。

記錄資料集

在機器人上記錄資料集非常簡單！您需要做的就是為您的機器人實作get_observation() 、 get_state()和prepare_action()方法。之後，您可以隨時在機器人上記錄資料集。請參閱 example/5_teach_robot_record_dataset.py 和此 colab：以了解更多詳細資訊。

 from mbodied . robots import SimRobot
from mbodied . types . motion . control import HandControl , Pose

robot = SimRobot ()
robot . init_recorder ( frequency_hz = 5 )
with robot . record ( "pick up the fork" ):
  motion = HandControl ( pose = Pose ( x = 0.1 , y = 0.2 , z = 0.3 , roll = 0.1 , pitch = 0.2 , yaw = 0.3 ))
  robot . do ( motion )

錄音機

資料集記錄器是一個較低層級的記錄器，用於在您與機器人互動/教導機器人時將您的對話和機器人的動作記錄到資料集中。您可以為記錄器定義任何觀察空間和動作空間。有關空間的更多詳細信息，請參閱體育館。

 from mbodied . data . recording import Recorder
from mbodied . types . motion . control import HandControl
from mbodied . types . sense . vision import Image
from gymnasium import spaces

observation_space = spaces . Dict ({
    'image' : Image ( size = ( 224 , 224 )). space (),
    'instruction' : spaces . Text ( 1000 )
})
action_space = HandControl (). space ()
recorder = Recorder ( 'example_recorder' , out_dir = 'saved_datasets' , observation_space = observation_space , action_space = action_space )

# Every time robot makes a conversation or performs an action:
recorder . record ( observation = { 'image' : image , 'instruction' : instruction ,}, action = hand_control )

資料集保存到./saved_datasets 。

重播器

Replayer 類別旨在處理和管理Recorder產生的 HDF5 檔案中儲存的資料。它提供了多種功能，包括讀取樣本、生成統計資料、提取唯一項目以及轉換資料集以與 HuggingFace 一起使用。 Replayer 還支援在處理過程中保存特定影像，並提供用於各種操作的命令列介面。

使用 Replayer 迭代 Recorder 中的資料集的範例：

 from mbodied . data . replaying import Replayer

replayer = Replayer ( path = str ( "path/to/dataset.h5" ))
for observation , action in replayer :
   ...

目錄結構

├─ assets/ ............. Images, icons, and other static assets
├─ examples/ ........... Example scripts and usage demonstrations
├─ resources/ .......... Additional resources for examples
├─ src/
│  └─ mbodied/
│     ├─ agents/ ....... Modules for robot agents
│     │  ├─ backends/ .. Backend implementations for different services for agents
│     │  ├─ language/ .. Language based agents modules
│     │  ├─ motion/ .... Motion based agents modules
│     │  └─ sense/ ..... Sensory, e.g. audio, processing modules
│     ├─ data/ ......... Data handling and processing
│     ├─ hardware/ ..... Hardware modules, i.e. camera
│     ├─ robot/ ........ Robot interface and interaction
│     └─ types/ ........ Common types and definitions
└─ tests/ .............. Unit tests