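Instructor for PHP

Structured data extraction in PHP, powered by LLMs. Designed for simplicity, transparency, and control.

What is Instructor?

Instructor is a library that lets you extract structured, validated data from several types of input: text, images, or OpenAI-style arrays of chat messages. It is powered by large language models (LLMs).

Instructor simplifies LLM integration in PHP projects. It handles the complexity of extracting structured data from LLM outputs, so you can focus on building your application logic and iterate faster.

Instructor for PHP is inspired by the Python Instructor library created by Jason Liu.

A simple CLI demo application shows Instructor extracting structured data from text.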

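Feature highlights

Core features

Get structured responses from LLMs without writing boilerplate code
Validation of returned data
Automated retries when the LLM responds with invalid data
Integrate LLM support into your existing PHP code with minimal friction: no framework required, no extensive code changes

Flexible inputs

Process various types of input data (text, a series of chat messages, or images) using the same simple API
"Structured-to-structured" processing: provide an object or array as input and get an object with the results of inference
Provide examples to improve the quality of inference

Customization

Define the response data model the way you want: type-hinted classes, JSON Schema arrays, or dynamic data shapes with the Structure class
Customize prompts and retry prompts
Use attributes or PHP DocBlocks to provide additional instructions for the LLM
Customize response model processing by providing your own implementations of the schema, deserialization, validation, and transformation interfaces

Sync and streaming support

Supports synchronous or streaming responses
Get partial updates and stream completed sequence items

Observability

Get detailed insight into internal processing via events
Debug mode shows the details of LLM API requests and responses

Support for multiple LLMs / API providers

Easily switch between LLM providers
Support for the most popular LLM APIs (including OpenAI, Gemini, Anthropic, Cohere, Azure, Groq, Mistral, Fireworks AI, Together AI)
OpenRouter support: access to 100+ language models
Use local models with Ollama

Other capabilities

Developer-friendly LLM context caching for reduced costs and faster inference (for Anthropic models)
Developer-friendly data extraction from images (for OpenAI, Anthropic, and Gemini models)

Documentation and examples

Learn more from the growing documentation and 50+ cookbooks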

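Instructor in other languages

Check out the implementations in other languages below:

Python (original)
JavaScript (port)
Elixir (port)

If you want to port Instructor to another language, please reach out to us on Twitter. We would love to help you get started!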

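How Instructor enhances your workflow

Instructor introduces three key enhancements compared to direct API usage.

Response model

You just specify a PHP class to extract data into via the "magic" of LLM chat completion. And that's it.

By relying on structured LLM responses, Instructor reduces the brittleness of code that extracts information from textual data.

Instructor helps you write simpler, easier-to-understand code: you no longer have to define lengthy function call definitions or write code that assigns the returned JSON to target data objects.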
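
To make the contrast concrete, here is a rough sketch of the kind of hand-written mapping code a response model is meant to replace. It is plain PHP with no Instructor calls; the hard-coded JSON string stands in for an LLM reply, and hydratePerson() is a hypothetical helper introduced only for illustration.

```php
<?php
declare(strict_types=1);

class Person {
    public string $name;
    public int $age;
}

/** Manually maps a raw JSON reply onto a Person, failing on bad input. */
function hydratePerson(string $json): Person
{
    // Throws JsonException if the reply is not valid JSON.
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($data) || !is_string($data['name'] ?? null) || !is_int($data['age'] ?? null)) {
        throw new InvalidArgumentException('Unexpected response shape.');
    }
    $person = new Person();
    $person->name = $data['name'];
    $person->age = $data['age'];
    return $person;
}

// The JSON below is a stand-in for what an LLM might return.
$person = hydratePerson('{"name": "Jason", "age": 28}');
```

Every new field or nested object means more checks like these, which is the boilerplate the response model approach aims to remove.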

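Validation

A response model generated by the LLM can be validated automatically against a set of rules. Currently, Instructor supports only Symfony validation.

You can also provide a context object to use enhanced validator capabilities.

Max retries

You can set the number of retry attempts for a request.

If a validation or deserialization error occurs, Instructor repeats the request up to the specified number of times, trying to get a valid response from the LLM.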
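
The retry behavior described above can be pictured as a simple loop. The sketch below is a conceptual illustration in plain PHP, not Instructor's internal code: extractWithRetries() is a hypothetical function, and the closure passed as $callModel is a stub standing in for an LLM call that receives the previous attempt's errors.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative retry loop (assumption: not how Instructor is implemented).
 * $callModel receives the errors from the previous attempt and returns raw JSON text.
 * $validate returns a list of error messages; an empty list means valid.
 */
function extractWithRetries(callable $callModel, callable $validate, int $maxRetries): array
{
    $errors = [];
    // One initial attempt plus up to $maxRetries retries.
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $data = json_decode($callModel($errors), true);
        if (!is_array($data)) {
            $errors = ['Response is not a valid JSON object.'];
            continue; // deserialization error: ask again
        }
        $errors = $validate($data);
        if ($errors === []) {
            return $data; // valid response
        }
    }
    throw new RuntimeException('No valid response: ' . implode(' ', $errors));
}

// Stubbed model: the first answer is invalid, the second one is corrected.
$answers = ['{"name":"Jason","age":-28}', '{"name":"Jason","age":28}'];
$calls = 0;
$model = function (array $errors) use (&$answers, &$calls): string {
    $calls++;
    return array_shift($answers);
};
$validate = fn(array $d): array => ($d['age'] ?? -1) >= 0 ? [] : ['age must be zero or positive.'];

$person = extractWithRetries($model, $validate, maxRetries: 2);
```

Passing the previous errors back to the model on each attempt is what gives it a chance to self-correct.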

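Get started

Installing Instructor is simple. Run the following command in your terminal and you are on your way to a smoother data handling experience!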

composer require cognesy/instructor-php

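Usage

Basic example

This simple example demonstrates how Instructor retrieves structured information from the provided text (or a sequence of chat messages).

The response model class is a plain PHP class with type hints that specify the types of the object's fields.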

use Cognesy\Instructor\Instructor;

// Step 0: Create .env file in your project root:
// OPENAI_API_KEY=your_api_key

// Step 1: Define target data structure(s)
class Person {
    public string $name;
    public int $age;
}

// Step 2: Provide content to process
$text = "His name is Jason and he is 28 years old.";

// Step 3: Use Instructor to run LLM inference
$person = (new Instructor)->respond(
    messages: $text,
    responseModel: Person::class,
);

// Step 4: Work with structured response data
assert($person instanceof Person); // true
assert($person->name === 'Jason'); // true
assert($person->age === 28); // true

echo $person->name; // Jason
echo $person->age; // 28

var_dump($person);
// Person {
//     name: "Jason",
//     age: 28
// }

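NOTE: Instructor supports classes / objects as response models. If you want to extract simple types or enums, you need to wrap them in the Scalar adapter; see the section below: Extracting scalar values.

Connecting to various LLM API providers

Instructor lets you define multiple API connections in the llm.php file. This is useful when you want to use different LLMs or API providers in your application.

The default configuration is located in /config/llm.php in the root directory of the Instructor codebase. It contains a set of predefined connections to all LLM APIs that Instructor supports out of the box.

The config file defines the connections to LLM APIs and their parameters. It also specifies the default connection, which is used when you call Instructor without specifying a connection.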

    // This is a fragment of the /config/llm.php file
    'defaultConnection' => 'openai',
    // ...
    'connections' => [
        'anthropic' => [ ... ],
        'azure' => [ ... ],
        'cohere1' => [ ... ],
        'cohere2' => [ ... ],
        'fireworks' => [ ... ],
        'gemini' => [ ... ],
        'grok' => [ ... ],
        'groq' => [ ... ],
        'mistral' => [ ... ],
        'ollama' => [
            'providerType' => LLMProviderType::Ollama->value,
            'apiUrl' => 'http://localhost:11434/v1',
            'apiKey' => Env::get('OLLAMA_API_KEY', ''),
            'endpoint' => '/chat/completions',
            'defaultModel' => 'qwen2.5:0.5b',
            'defaultMaxTokens' => 1024,
            'httpClient' => 'guzzle-ollama', // use custom HTTP client configuration
        ],
        'openai' => [ ... ],
        'openrouter' => [ ... ],
        'together' => [ ... ],
    // ...

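To customize the available connections, you can either modify existing entries or add your own.

Connecting to an LLM API via a predefined connection is as simple as calling the withConnection() method with the connection name.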

<?php
// ...
$user = (new Instructor)
    ->withConnection('ollama')
    ->respond(
        messages: "His name is Jason and he is 28 years old.",
        responseModel: Person::class,
    );
// ...

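You can change the location of the configuration files that Instructor uses via the INSTRUCTOR_CONFIG_PATH environment variable. You can use copies of the default configuration files as a starting point.

Structured-to-structured processing

Instructor offers a way to use structured data as input. This is useful when you want to use object data as input and get another object with the result of LLM inference.

The input parameter of Instructor's respond() and request() methods can be an object, an array, or just a string.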

<?php
use Cognesy\Instructor\Instructor;

class Email {
    public function __construct(
        public string $address = '',
        public string $subject = '',
        public string $body = '',
    ) {}
}

$email = new Email(
    address: 'joe@gmail',
    subject: 'Status update',
    body: 'Your account has been updated.'
);

$translation = (new Instructor)->respond(
    input: $email,
    responseModel: Email::class,
    prompt: 'Translate the text fields of email to Spanish. Keep other fields unchanged.',
);

assert($translation instanceof Email); // true
dump($translation);
// Email {
//     address: "joe@gmail",
//     subject: "Actualización de estado",
//     body: "Su cuenta ha sido actualizada."
// }
?>

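Validation

Instructor validates the results of the LLM response against the validation rules specified in your data model.

For further details on the available validation rules, check the Symfony Validation constraints.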

use Cognesy\Instructor\Instructor;
use Symfony\Component\Validator\Constraints as Assert;

class Person {
    public string $name;
    #[Assert\PositiveOrZero]
    public int $age;
}

$text = "His name is Jason, he is -28 years old.";
$person = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Person::class,
);

// if the resulting object does not validate, Instructor throws an exception

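Max retries

If the maxRetries parameter is provided and the LLM response does not meet the validation criteria, Instructor makes subsequent inference attempts until the results meet the requirements or maxRetries is reached.

Instructor uses the validation errors to inform the LLM about the problems identified in the response, so that the LLM can try to self-correct in the next attempt.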

use Cognesy\Instructor\Instructor;
use Symfony\Component\Validator\Constraints as Assert;

class Person {
    #[Assert\Length(min: 3)]
    public string $name;
    #[Assert\PositiveOrZero]
    public int $age;
}

$text = "His name is JX, aka Jason, he is -28 years old.";
$person = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Person::class,
    maxRetries: 3,
);

// if all LLM's attempts to self-correct the results fail, Instructor throws an exception

Alternative ways to call Instructor

You can call the request() method to set the parameters of the request and then call get() to get the response.

use Cognesy\Instructor\Instructor;

$instructor = (new Instructor)->request(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
);
$person = $instructor->get();

Streaming support

Instructor supports streaming of partial results, so you can start processing the data as soon as it becomes available.

<?php
use Cognesy\Instructor\Instructor;

$stream = (new Instructor)->request(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
    options: ['stream' => true]
)->stream();

foreach ($stream as $partialPerson) {
    // process partial person data
    echo $partialPerson->name;
    echo $partialPerson->age;
}

// after streaming is done you can get the final, fully processed person object...
$person = $stream->getLastUpdate();
// ...to, for example, save it to the database
$db->save($person);
?>

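Partial results

You can define an onPartialUpdate() callback to receive partial results, which can be used to start updating the UI before the LLM completes the inference.

NOTE: Partial updates are not validated. The response is validated only after it has been fully received.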

use Cognesy\Instructor\Instructor;

function updateUI($person) {
    // Here you get a partially completed Person object; update the UI with the partial result
}

$person = (new Instructor)->request(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
    options: ['stream' => true]
)->onPartialUpdate(
    fn($partial) => updateUI($partial)
)->get();

// Here you get the completed and validated Person object
$this->db->save($person); // ...for example: save to DB

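Shortcuts

String as input

You can provide a string instead of an array of messages. This is useful when you want to extract data from a single block of text and keep your code simple.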

// Usually, you work with sequences of messages:

$value = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => "His name is Jason, he is 28 years old."]],
    responseModel: Person::class,
);

// ...but if you want to keep it simple, you can just pass a string:

$value = (new Instructor)->respond(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
);

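Extracting scalar values

Sometimes we just want to get quick results without defining a class for the response model, especially when we are after a straight, simple answer in the form of a string, integer, boolean, or float. Instructor provides a simplified API for such cases.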

use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\Instructor;

$value = (new Instructor)->respond(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Scalar::integer('age'),
);

var_dump($value);
// int(28)

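In this example, we extract a single integer value from the text. You can also use Scalar::string(), Scalar::boolean(), and Scalar::float() to extract other types of values.

Extracting enum values

Additionally, you can use the Scalar adapter to extract one of the provided options by using Scalar::enum().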

use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\Instructor;

enum ActivityType : string {
    case Work = 'work';
    case Entertainment = 'entertainment';
    case Sport = 'sport';
    case Other = 'other';
}

$value = (new Instructor)->respond(
    messages: "His name is Jason, he currently plays Doom Eternal.",
    responseModel: Scalar::enum(ActivityType::class, 'activityType'),
);

var_dump($value);
// enum(ActivityType:Entertainment)

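Extracting sequences of objects

Sequence is a wrapper class that represents a list of objects to be extracted by Instructor from the provided context.

It is usually more convenient than creating a dedicated class with a single array property just to handle a list of objects of a given class.

An additional, unique feature of sequences is that they can be streamed per completed item in the sequence, rather than on every property update.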

class Person
{
    public string $name;
    public int $age;
}

$text = <<<TEXT
    Jason is 25 years old. Jane is 18 yo. John is 30 years old
    and Anna is 2 years younger than him.
TEXT;

$list = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Sequence::of(Person::class),
    options: ['stream' => true]
);
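
To picture what "streamed per completed item" means, the following plain-PHP sketch simulates a list arriving in partial snapshots and emits an item only once a newer item has started or the stream has ended. It is a conceptual illustration only, not the Sequence class or its API; completedItems() and the snapshot data are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Yields each list item once it is complete.
 * $snapshots simulates successive partial states of a streamed list; the
 * newest item in a snapshot is treated as still in progress.
 */
function completedItems(iterable $snapshots): Generator
{
    $emitted = 0;
    $last = [];
    foreach ($snapshots as $snapshot) {
        $last = $snapshot;
        // Every item except the newest one is considered complete.
        while ($emitted < count($snapshot) - 1) {
            yield $snapshot[$emitted++];
        }
    }
    // When the stream ends, the remaining items are complete too.
    while ($emitted < count($last)) {
        yield $last[$emitted++];
    }
}

$snapshots = [
    [['name' => 'Jason']],
    [['name' => 'Jason', 'age' => 25]],
    [['name' => 'Jason', 'age' => 25], ['name' => 'Jane']],
    [['name' => 'Jason', 'age' => 25], ['name' => 'Jane', 'age' => 18]],
];

$items = iterator_to_array(completedItems($snapshots), false);
```

Four partial updates arrive here, but a consumer of completedItems() sees only two events, one per finished person.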

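See more about sequences in the Sequences section.

Specifying the data model

Type hints

Use PHP type hints to specify the types of the extracted data.

Use nullable types to indicate that a given field is optional.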

    class Person {
        public string $name;
        public ?int $age;
        public Address $address;
    }
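
One way a library could tell optional fields from required ones is PHP's reflection API, which exposes whether a declared type allows null. The sketch below is an assumption about how such detection might work, not Instructor's actual implementation; optionalFields() is a hypothetical helper and the Address class is a minimal placeholder.

```php
<?php
declare(strict_types=1);

class Address {
    public string $city;
}

class Person {
    public string $name;
    public ?int $age;
    public Address $address;
}

/** Returns the names of properties whose declared type allows null. */
function optionalFields(string $class): array
{
    $optional = [];
    foreach ((new ReflectionClass($class))->getProperties() as $property) {
        $type = $property->getType();
        // Untyped properties have no ReflectionType; treat them as optional here.
        if ($type === null || $type->allowsNull()) {
            $optional[] = $property->getName();
        }
    }
    return $optional;
}

$optional = optionalFields(Person::class);
```

In this example only the age property is reported as optional, because it is the only one declared with a nullable type.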

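DocBlock type hints

You can also use PHP DocBlock style comments to specify the types of the extracted data. This is useful when you want to specify property types for the LLM, but cannot or do not want to enforce the types at the code level.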

class Person {
    /** @var string */
    public $name;
    /** @var int */
    public $age;
    /** @var Address $address person's address */
    public $address;
}

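See the PHPDoc documentation on the DocBlock website for more details.

Typed collections / arrays

PHP currently does not support generics or type hints that specify array element types.

Use PHP DocBlock style comments to specify the type of array elements.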

class Person {
    // ...
}

class Event {
    // ...
    /** @var Person[] list of extracted event participants */
    public array $participants;
    // ...
}
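
As an illustration of why the DocBlock matters, the element type in a `@var Person[]` tag can be recovered at runtime with reflection, while the native `array` type hint carries no such information. This is a simplified sketch under the assumption of a single `Type[]` annotation; arrayElementType() is a hypothetical helper, not part of Instructor.

```php
<?php
declare(strict_types=1);

class Person {
    public string $name;
}

class Event {
    /** @var Person[] list of extracted event participants */
    public array $participants;
}

/** Returns the element type named in a "@var Type[]" tag, or null if absent. */
function arrayElementType(string $class, string $property): ?string
{
    $doc = (new ReflectionProperty($class, $property))->getDocComment();
    if ($doc === false) {
        return null; // the property has no DocBlock
    }
    // Match "@var Person[]" and capture "Person" (namespaced names allowed).
    if (preg_match('/@var\s+([\w\\\\]+)\[\]/', $doc, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

$elementType = arrayElementType(Event::class, 'participants');
```

A real parser would also need to handle forms such as `array<Person>` or union types, which this regular expression deliberately ignores.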

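Complex data extraction

Instructor can retrieve complex data structures from text. Your response model can contain nested objects, arrays, and enums.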

use Cognesy\Instructor\Instructor;

// define the data structures to extract data into
class Person {
    public string $name;
    public int $age;
    public string $profession;
    /** @var Skill[] */
    public array $skills;
}

class Skill {
    public string $name;
    public SkillType $type;
}

// backed enum: cases with values require a declared backing type
enum SkillType : string {
    case Technical = 'technical';
    case Other = 'other';
}

$text = "Alex is 25 years old software engineer, who knows PHP, Python and can play the guitar.";

$person = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Person::class,
);

// data is extracted into an object of given class
assert($person instanceof Person); // true

// you can access object's extracted property values
echo $person->name; // Alex
echo $person->age; // 25
echo $person->profession; // software engineer
echo $person->skills[0]->name; // PHP
echo $person->skills[0]->type->value; // technical (SkillType::Technical)
// ...

var_dump($person);
// Person {
//     name: "Alex",
//     age: 25,
//     profession: "software engineer",
//     skills: [
//         Skill {
//              name: "PHP",
//              type: SkillType::Technical,
//         },
//         Skill {
//              name: "Python",
//              type: SkillType::Technical,
//         },
//         Skill {
//              name: "guitar",
//              type: SkillType::Other
//         },
//     ]
// }

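Dynamic data schemas

If you want to define the shape of the data at runtime, you can use the Structure class.

Structures let you define and modify arbitrary shapes of data to be extracted by the LLM. Classes may not be the best fit for this purpose, because declaring or changing them at runtime is not possible.

With structures, you can define custom data shapes dynamically, for example based on user input or the processing context, to specify the information you need the LLM to infer from the provided text or chat messages.

The example below demonstrates how to define a structure and use it as a response model: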

<?php
use Cognesy\Instructor\Extras\Structure\Field;
use Cognesy\Instructor\Extras\Structure\Structure;
use Cognesy\Instructor\Instructor;

enum Role : string {
    case Manager = 'manager';
    case Line = 'line';
}

$structure = Structure::define('person', [
    Field::string('name'),
    Field::int('age'),
    Field::enum('role', Role::class),
]);

$person = (new Instructor)->respond(
    messages: 'Jason is 25 years old and is a manager.',
    responseModel: $structure,
);

// you can access structure data via field API...
assert($person->field('name') === 'Jason');
// ...or as structure object properties
assert($person->age === 25);
?>

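For more information, see the Structures section.

Changing the LLM model and options

You can specify the model and other options that will be passed to the OpenAI / LLM endpoint.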

use Cognesy\Instructor\Features\LLM\Data\LLMConfig;
use Cognesy\Instructor\Features\LLM\Drivers\OpenAIDriver;
use Cognesy\Instructor\Instructor;

// OpenAI auth params
$yourApiKey = Env::get('OPENAI_API_KEY'); // use your own API key

// Create instance of OpenAI driver initialized with custom parameters
$driver = new OpenAIDriver(new LLMConfig(
    apiUrl: 'https://api.openai.com/v1', // you can change base URI
    apiKey: $yourApiKey,
    endpoint: '/chat/completions',
    metadata: ['organization' => ''],
    model: 'gpt-4o-mini',
    maxTokens: 128,
));

// Get Instructor with the default client component overridden with your own
$instructor = (new Instructor)->withDriver($driver);

$user = $instructor->respond(
    messages: "Jason (@jxnlco) is 25 years old and is the admin of this project. He likes playing football and reading books.",
    responseModel: User::class,
    model: 'gpt-3.5-turbo',
    options: ['stream' => true]
);

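Support for language models and API providers

Instructor offers out-of-the-box support for the following API providers:

Anthropic
Azure OpenAI
Cohere
Fireworks AI
Groq
Mistral
Ollama (on localhost)
OpenAI
OpenRouter
Together AI

For usage examples, check the Hub section or the examples directory in the code repository.

Using DocBlocks as additional instructions for the LLM

You can use PHP DocBlocks (/** */) to provide additional instructions for the LLM at the class or field level, for example to clarify what you expect or how the LLM should process your data.

Instructor extracts PHP DocBlock comments from the defined classes and properties and includes them in the specification of the response model sent to the LLM.

Using PHP DocBlock instructions is not required, but sometimes you may want to clarify your intentions to improve the LLM's inference results.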

/**
 * Represents a skill of a person and context in which it was mentioned.
 */
class Skill {
    public string $name;
    /** @var SkillType $type type of the skill, derived from the description and context */
    public SkillType $type;
    /** Directly quoted, full sentence mentioning person's skill */
    public string $context;
}

Custom validation

Validation mixin

You can use the ValidationMixin trait to add simple custom validation logic to your data objects.

use Cognesy\Instructor\Features\Validation\Traits\ValidationMixin;

class User {
    use ValidationMixin;

    public int $age;
    public string $name;

    // returns a list of error messages; an empty array means the object is valid
    public function validate() : array {
        if ($this->age < 18) {
            return ["User has to be adult to sign the contract."];
        }
        return [];
    }
}

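Validation callback

Instructor uses the Symfony validation component to validate extracted data. You can use the #[Assert\Callback] attribute to build fully customized validation logic.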

use Cognesy\Instructor\Instructor;
use Symfony\Component\Validator\Constraints as Assert;
use Symfony
