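Instructor for PHP

Structured data extraction in PHP, powered by LLMs. Designed for simplicity, transparency, and control.

What is Instructor?

Instructor is a library that extracts structured, validated data from multiple types of input: text, images, or OpenAI-style chat sequence arrays. It is powered by Large Language Models (LLMs).

Instructor simplifies LLM integration in PHP projects. It handles the complexity of extracting structured data from LLM outputs, so you can focus on building your application logic and iterate faster.

Instructor for PHP is inspired by the Instructor library for Python created by Jason Liu.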

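Feature highlights

Core features

Get structured responses from LLMs without writing boilerplate code
Validation of returned data
Automated retries when the LLM responds with invalid data
Integrate LLM support into your existing PHP code with minimal friction - no framework, no extensive code changes

Flexible inputs

Process various types of input data (text, series of chat messages, or images) using the same, simple API
"Structured-to-structured" processing - provide an object or array as input and get an object with the results of inference
Demonstrate examples to improve the quality of inference

Customization

Define the response data model the way you want: type-hinted classes, JSON Schema arrays, or dynamic data shapes with the Structure class
Customize prompts and retry prompts
Use attributes or PHP DocBlocks to provide additional instructions for the LLM
Customize response model processing by providing your own implementation of the schema, deserialization, validation and transformation interfaces

Sync and streaming support

Supports both synchronous and streaming responses
Get partial updates and stream completed sequence items

Observability

Get detailed insight into internal processing via events
Debug mode to see the details of LLM API requests and responses

Support for multiple LLMs / API providers

Easily switch between LLM providers
Support for the most popular LLM APIs (incl. OpenAI, Gemini, Anthropic, Cohere, Azure, Groq, Mistral, Fireworks AI, Together AI)
OpenRouter support - access to 100+ language models
Use local models with Ollama

Other capabilities

Developer-friendly LLM context caching for reduced costs and faster inference (for Anthropic models)
Developer-friendly data extraction from images (for OpenAI, Anthropic and Gemini models)

Documentation and examples

Learn more from the growing documentation and 50+ cookbooks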

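Instructor in other languages

Check out the implementations in other languages below:

Python (original)
JavaScript (port)
Elixir (port)

If you want to port Instructor to another language, please reach out to us on Twitter. We would love to help you get started.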

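How Instructor enhances your workflow

Instructor introduces three key enhancements compared to direct API usage.

Response model

You just specify a PHP class to extract data into via the "magic" of LLM chat completion. And that's it.

Instructor reduces the brittleness of code that extracts information from textual data by leveraging structured LLM responses.

Instructor helps you write simpler, easier to understand code. You no longer have to define lengthy function call definitions or write code for assigning returned JSON to target data objects.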
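
To make the point about mapping code concrete, here is a minimal sketch of the kind of hand-written glue that a response model replaces. It does not use Instructor at all: the `hydrate()` helper and the sample JSON string are hypothetical, and the real library does considerably more (schema generation, validation, retries).

```php
<?php
// Hypothetical stand-in for the manual glue code a response model replaces.
// Nothing here is part of Instructor's API.

class Person {
    public string $name;
    public int $age;
}

/**
 * Copy values from a decoded JSON object into a typed class,
 * failing loudly when a field is missing.
 */
function hydrate(string $class, string $json): object {
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    $object = new $class();
    foreach ((new ReflectionClass($class))->getProperties() as $property) {
        $name = $property->getName();
        if (!array_key_exists($name, $data)) {
            throw new InvalidArgumentException("Missing field: $name");
        }
        // Typed properties raise a TypeError if the value cannot be coerced.
        $object->$name = $data[$name];
    }
    return $object;
}

// Pretend this string came back from an LLM.
$person = hydrate(Person::class, '{"name": "Jason", "age": 28}');
echo $person->name . ' is ' . $person->age . PHP_EOL;
```

Every new response shape would need this kind of code reviewed and maintained by hand, which is the burden the response model approach is meant to remove.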

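Validation

A response model generated by the LLM can be automatically validated against a set of rules. Currently, Instructor supports only Symfony validation.

You can also provide a context object to use enhanced validator capabilities.

Max retries

You can set the number of retry attempts for requests.

If a validation or deserialization error occurs, Instructor repeats the request up to the specified number of times, trying to get a valid response from the LLM.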

Get started

Installing Instructor is simple. Run the following command in your terminal:

composer require cognesy/instructor-php

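Usage

Basic example

This simple example demonstrates how Instructor retrieves structured information from provided text (or a chat message sequence).

The response model class is a plain PHP class with type hints specifying the types of the object's fields.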

use Cognesy\Instructor\Instructor;

// Step 0: Create .env file in your project root:
// OPENAI_API_KEY=your_api_key

// Step 1: Define target data structure(s)
class Person {
    public string $name;
    public int $age;
}

// Step 2: Provide content to process
$text = "His name is Jason and he is 28 years old.";

// Step 3: Use Instructor to run LLM inference
$person = (new Instructor)->respond(
    messages: $text,
    responseModel: Person::class,
);

// Step 4: Work with structured response data
assert($person instanceof Person); // true
assert($person->name === 'Jason'); // true
assert($person->age === 28); // true

echo $person->name; // Jason
echo $person->age; // 28

var_dump($person);
// Person {
//     name: "Jason",
//     age: 28
// }

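Note: Instructor supports classes / objects as response models. If you want to extract simple types or enums, you need to wrap them in the Scalar adapter - see the section "Extracting scalar values" below.

Connecting to various LLM API providers

Instructor allows you to define multiple API connections in the llm.php file. This is useful when you want to use different LLMs or API providers in your application.

The default configuration is located in /config/llm.php in the root directory of the Instructor codebase. It contains a set of predefined connections to all LLM APIs that Instructor supports out of the box.

The config file defines connections to LLM APIs and their parameters. It also specifies the default connection to be used when calling Instructor without specifying a client connection.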

    // This is a fragment of the /config/llm.php file
    'defaultConnection' => 'openai',
    // ...
    'connections' => [
        'anthropic' => [ ... ],
        'azure' => [ ... ],
        'cohere1' => [ ... ],
        'cohere2' => [ ... ],
        'fireworks' => [ ... ],
        'gemini' => [ ... ],
        'grok' => [ ... ],
        'groq' => [ ... ],
        'mistral' => [ ... ],
        'ollama' => [
            'providerType' => LLMProviderType::Ollama->value,
            'apiUrl' => 'http://localhost:11434/v1',
            'apiKey' => Env::get('OLLAMA_API_KEY', ''),
            'endpoint' => '/chat/completions',
            'defaultModel' => 'qwen2.5:0.5b',
            'defaultMaxTokens' => 1024,
            'httpClient' => 'guzzle-ollama', // use custom HTTP client configuration
        ],
        'openai' => [ ... ],
        'openrouter' => [ ... ],
        'together' => [ ... ],
    // ...

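To customize the available connections, you can either modify existing entries or add your own.

Connecting to an LLM API via a predefined connection is as simple as calling the withConnection method with the connection name.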

<?php
// ...
$user = (new Instructor)
    ->withConnection('ollama')
    ->respond(
        messages: "His name is Jason and he is 28 years old.",
        responseModel: Person::class,
    );
// ...

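You can change the location of the configuration files used by Instructor via the INSTRUCTOR_CONFIG_PATH environment variable. You can use copies of the default configuration files as a starting point.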
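
The lookup described in this section (a default connection, named overrides, and an environment variable that relocates the configuration) can be sketched in plain PHP. This is an illustration only: `resolveConnection()` is a hypothetical helper, the array merely mirrors the shape of the config fragment shown above, and the `openai` values and the fallback path are illustrative.

```php
<?php
// Sketch only: resolveConnection() and the array below are hypothetical and
// merely mirror the shape of the config fragment shown above.

function resolveConnection(array $config, ?string $name = null): array {
    // Fall back to the default connection when no name is given.
    $name ??= $config['defaultConnection'];
    if (!isset($config['connections'][$name])) {
        throw new InvalidArgumentException("Unknown connection: $name");
    }
    return $config['connections'][$name];
}

$config = [
    'defaultConnection' => 'openai',
    'connections' => [
        'openai' => ['apiUrl' => 'https://api.openai.com/v1', 'defaultModel' => 'gpt-4o-mini'],
        'ollama' => ['apiUrl' => 'http://localhost:11434/v1', 'defaultModel' => 'qwen2.5:0.5b'],
    ],
];

// An environment variable can redirect where configuration is read from.
// The fallback directory used here is illustrative.
$configPath = getenv('INSTRUCTOR_CONFIG_PATH') ?: __DIR__ . '/config';

echo resolveConnection($config)['defaultModel'] . PHP_EOL;      // default connection
echo resolveConnection($config, 'ollama')['apiUrl'] . PHP_EOL;  // named connection
```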

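Structured-to-structured processing

Instructor offers a way to use structured data as input. This is useful when you want to use object data as input and get another object with the result of LLM inference.

The input field of Instructor's respond() and request() methods can be an object, but also an array or just a string.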

<?php
use Cognesy\Instructor\Instructor;

class Email {
    public function __construct(
        public string $address = '',
        public string $subject = '',
        public string $body = '',
    ) {}
}

$email = new Email(
    address: 'joe@gmail',
    subject: 'Status update',
    body: 'Your account has been updated.'
);

$translation = (new Instructor)->respond(
    input: $email,
    responseModel: Email::class,
    prompt: 'Translate the text fields of email to Spanish. Keep other fields unchanged.',
);

assert($translation instanceof Email); // true
dump($translation);
// Email {
//     address: "joe@gmail",
//     subject: "Actualización de estado",
//     body: "Su cuenta ha sido actualizada."
// }
?>

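Validation

Instructor validates the results of the LLM response against validation rules specified in your data model.

For further details on available validation rules, check the Symfony Validation constraints.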

use Symfony\Component\Validator\Constraints as Assert;

class Person {
    public string $name;
    #[Assert\PositiveOrZero]
    public int $age;
}

$text = "His name is Jason, he is -28 years old.";
$person = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Person::class,
);

// if the resulting object does not validate, Instructor throws an exception
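
For readers unfamiliar with attribute-driven validation, the following self-contained sketch shows the general mechanism: a rule is attached to a property as a PHP attribute and discovered at runtime through reflection. The `NonNegative` attribute and `validateObject()` are hypothetical stand-ins written for this illustration. They are not the Symfony Validator API that Instructor relies on.

```php
<?php
// Self-contained sketch of attribute-driven validation. The NonNegative
// attribute and validateObject() are hypothetical stand-ins; they are not the
// Symfony Validator API.

#[Attribute(Attribute::TARGET_PROPERTY)]
final class NonNegative {
    /** Return an error message, or null when the value passes. */
    public function check(mixed $value): ?string {
        return (is_int($value) || is_float($value)) && $value >= 0
            ? null
            : 'This value should be either positive or zero.';
    }
}

class Person {
    public string $name;
    #[NonNegative]
    public int $age;
}

/** Run every rule attached to the object's properties and collect messages. */
function validateObject(object $object): array {
    $errors = [];
    foreach ((new ReflectionObject($object))->getProperties() as $property) {
        foreach ($property->getAttributes() as $attribute) {
            $rule = $attribute->newInstance();
            $message = $rule->check($property->getValue($object));
            if ($message !== null) {
                $errors[$property->getName()] = $message;
            }
        }
    }
    return $errors;
}

$person = new Person();
$person->name = 'Jason';
$person->age = -28;

print_r(validateObject($person));
```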

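Max retries

If the maxRetries parameter is provided and the LLM response does not meet the validation criteria, Instructor makes subsequent inference attempts until the results meet the requirements or maxRetries is reached.

Instructor uses validation errors to inform the LLM about the problems identified in the response, so that the LLM can try self-correcting in the next attempt.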

use Symfony\Component\Validator\Constraints as Assert;

class Person {
    #[Assert\Length(min: 3)]
    public string $name;
    #[Assert\PositiveOrZero]
    public int $age;
}

$text = "His name is JX, aka Jason, he is -28 years old.";
$person = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Person::class,
    maxRetries: 3,
);

// if all LLM's attempts to self-correct the results fail, Instructor throws an exception
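
The retry behavior described above can be pictured as a loop that validates each answer and feeds the errors back into the conversation. The sketch below is an assumption-laden illustration, not Instructor's internal code: `respondWithRetries()`, the fake model callable, and the wording of the feedback message are all hypothetical, and whether the first attempt counts toward the limit is a choice made here for simplicity.

```php
<?php
// Sketch of a validate-and-retry loop. The $llm callable is a hypothetical
// stand-in for a real model call; this is not Instructor's internal code.

/**
 * @param callable $llm      takes a list of messages, returns decoded data
 * @param callable $validate takes decoded data, returns a list of error strings
 */
function respondWithRetries(callable $llm, callable $validate, array $messages, int $maxRetries): array {
    $errors = [];
    // One initial attempt plus up to $maxRetries retries (an assumption).
    for ($attempt = 0; $attempt <= $maxRetries; $attempt++) {
        $data = $llm($messages);
        $errors = $validate($data);
        if ($errors === []) {
            return $data;
        }
        // Feed the errors back so the next attempt can self-correct.
        $messages[] = ['role' => 'assistant', 'content' => json_encode($data)];
        $messages[] = ['role' => 'user', 'content' => 'Fix these errors: ' . implode('; ', $errors)];
    }
    throw new RuntimeException('Validation failed: ' . implode('; ', $errors));
}

// Fake model: answers badly at first, correctly once it has seen feedback.
$fakeLlm = fn(array $messages) => count($messages) > 1
    ? ['name' => 'Jason', 'age' => 28]
    : ['name' => 'JX', 'age' => -28];

$validate = function (array $data): array {
    $errors = [];
    if (strlen($data['name']) < 3) {
        $errors[] = 'name must have at least 3 characters';
    }
    if ($data['age'] < 0) {
        $errors[] = 'age must be positive or zero';
    }
    return $errors;
};

$messages = [['role' => 'user', 'content' => 'His name is JX, aka Jason, he is -28 years old.']];
$result = respondWithRetries($fakeLlm, $validate, $messages, 3);
echo $result['name'] . PHP_EOL;
```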

Alternative ways to call Instructor

You can call the request() method to set the parameters of the request and then call get() to get the response.

use Cognesy\Instructor\Instructor;

$instructor = (new Instructor)->request(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
);
$person = $instructor->get();

Streaming support

Instructor supports streaming of partial results, allowing you to start processing the data as soon as it is available.

<?php
use Cognesy\Instructor\Instructor;

$stream = (new Instructor)->request(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
    options: ['stream' => true]
)->stream();

foreach ($stream as $partialPerson) {
    // process partial person data
    echo $partialPerson->name;
    echo $partialPerson->age;
}

// after streaming is done you can get the final, fully processed person object...
$person = $stream->getLastUpdate();
// ...to, for example, save it to the database
$db->save($person);
?>

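Partial results

You can define an onPartialUpdate() callback to receive partial results that can be used to start updating the UI before the LLM completes the inference.

Note: Partial updates are not validated. The response is only validated after it has been fully received.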

use Cognesy\Instructor\Instructor;

function updateUI($person) {
    // Here you get a partially completed Person object; update the UI with the partial result
}

$person = (new Instructor)->request(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
    options: ['stream' => true]
)->onPartialUpdate(
    fn($partial) => updateUI($partial)
)->get();

// Here you get the completed and validated Person object
$this->db->save($person); // ...for example: save to DB
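
To illustrate what a "partial update" means, the sketch below builds successive snapshots of an object while a JSON response arrives in chunks. It is a deliberately naive illustration: the chunk list is made up, the field scan uses simple regular expressions, and none of it reflects how Instructor actually parses streamed responses.

```php
<?php
// Sketch of partial updates over a streamed JSON response. The chunk list and
// the naive field scan are illustrative; Instructor's own parser is not shown.

class Person {
    public ?string $name = null;
    public ?int $age = null;
}

/**
 * Yield a fresh Person snapshot each time a newly arrived chunk completes
 * another field of the JSON object.
 */
function streamPartials(iterable $chunks): Generator {
    $buffer = '';
    $seen = 0;
    foreach ($chunks as $chunk) {
        $buffer .= $chunk;
        $partial = new Person();
        $found = 0;
        if (preg_match('/"name"\s*:\s*"([^"]*)"/', $buffer, $m)) {
            $partial->name = $m[1];
            $found++;
        }
        // Require a trailing comma or brace so a half-received number is skipped.
        if (preg_match('/"age"\s*:\s*(\d+)\s*[,}]/', $buffer, $m)) {
            $partial->age = (int) $m[1];
            $found++;
        }
        if ($found > $seen) { // only emit when something new is available
            $seen = $found;
            yield $partial;
        }
    }
}

$chunks = ['{"na', 'me": "Jas', 'on", "a', 'ge": 2', '8}'];
foreach (streamPartials($chunks) as $partial) {
    echo json_encode($partial) . PHP_EOL;
}
```

Note that the first snapshot has a name but no age yet, which is why partial objects cannot be validated until the response is complete.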

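Shortcuts

String as input

You can provide a string instead of an array of messages. This is useful when you want to extract data from a single block of text and want to keep your code simple.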

// Usually, you work with sequences of messages:

$value = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => "His name is Jason, he is 28 years old."]],
    responseModel: Person::class,
);

// ...but if you want to keep it simple, you can just pass a string:

$value = (new Instructor)->respond(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Person::class,
);
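
The equivalence between the two calls above can be pictured as a small normalization step. The helper below is hypothetical (the name `normalizeMessages()` is not part of Instructor), and it assumes that a bare string is treated as a single user message, as the paired examples suggest.

```php
<?php
// Hypothetical helper showing the equivalence described above; the name
// normalizeMessages() is not part of Instructor.

function normalizeMessages(string|array $messages): array {
    // Assumption: a bare string is treated as a single user message.
    return is_string($messages)
        ? [['role' => 'user', 'content' => $messages]]
        : $messages;
}

$fromString = normalizeMessages('His name is Jason, he is 28 years old.');
$fromArray = normalizeMessages([
    ['role' => 'user', 'content' => 'His name is Jason, he is 28 years old.'],
]);

var_dump($fromString === $fromArray);
```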

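Extracting scalar values

Sometimes you just want to get results quickly without defining a class for the response model, especially when you need a plain, simple answer in the form of a string, integer, boolean or float. Instructor provides a simplified API for such cases.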

use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\Instructor;

$value = (new Instructor)->respond(
    messages: "His name is Jason, he is 28 years old.",
    responseModel: Scalar::integer('age'),
);

var_dump($value);
// int(28)

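In this example, we extract a single integer value from the text. You can also use Scalar::string(), Scalar::boolean() and Scalar::float() to extract other types of values.

Extracting enum values

Additionally, you can use the Scalar adapter to extract one of the provided options by using Scalar::enum().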

use Cognesy\Instructor\Extras\Scalar\Scalar;
use Cognesy\Instructor\Instructor;

enum ActivityType: string {
    case Work = 'work';
    case Entertainment = 'entertainment';
    case Sport = 'sport';
    case Other = 'other';
}

$value = (new Instructor)->respond(
    messages: "His name is Jason, he currently plays Doom Eternal.",
    responseModel: Scalar::enum(ActivityType::class, 'activityType'),
);

var_dump($value);
// enum(ActivityType::Entertainment)
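
A backed enum works well for "pick one of these options" answers because PHP itself can list the allowed values and reject anything else. The sketch below uses only built-in enum features; the `toActivity()` helper and its trimming and lower-casing policy are assumptions for illustration, not Instructor behavior.

```php
<?php
// Sketch: turning a raw model answer into a backed enum case. The toActivity()
// helper and its normalization policy are assumptions made for illustration.

enum ActivityType: string {
    case Work = 'work';
    case Entertainment = 'entertainment';
    case Sport = 'sport';
    case Other = 'other';
}

// The list of allowed values, e.g. to present as options to a model.
$options = array_column(ActivityType::cases(), 'value');

function toActivity(string $raw): ?ActivityType {
    // tryFrom() returns null for values that are not declared cases; a null
    // here is where a validation error (and possibly a retry) would be raised.
    return ActivityType::tryFrom(strtolower(trim($raw)));
}

var_dump($options);
var_dump(toActivity(' Entertainment '));
var_dump(toActivity('gardening'));
```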

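Extracting sequences of objects

Sequence is a wrapper class that can be used to represent a list of objects to be extracted by Instructor from the provided context.

It is usually more convenient than creating a dedicated class with a single array property just to handle a list of objects of a given class.

An additional, unique feature of sequences is that they can be streamed per completed item in the sequence, rather than on every property update.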

use Cognesy\Instructor\Extras\Sequence\Sequence;
use Cognesy\Instructor\Instructor;

class Person
{
    public string $name;
    public int $age;
}

$text = <<<TEXT
    Jason is 25 years old. Jane is 18 yo. John is 30 years old
    and Anna is 2 years younger than him.
TEXT;

$list = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Sequence::of(Person::class),
    options: ['stream' => true]
);

See the "Sequences" section for more details on sequences.

Specifying the data model

Type hints

Use PHP type hints to specify the type of extracted data.

Use nullable types to indicate that a given field is optional.

    class Person {
        public string $name;
        public ?int $age;
        public Address $address;
    }
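
To show how type hints alone can carry enough information to describe a response shape, here is a minimal sketch that walks a class with reflection and produces a JSON-Schema-like array, treating nullable properties as optional. `describeClass()` and the `Address::$city` field are hypothetical; Instructor's real schema generation is not shown in this document and is certainly richer.

```php
<?php
// Sketch of deriving a JSON-Schema-like description from PHP type hints.
// describeClass() is hypothetical; Instructor's real schema builder is richer.

class Address {
    public string $city;
}

class Person {
    public string $name;
    public ?int $age;
    public Address $address;
}

function describeClass(string $class): array {
    // Map PHP built-in type names to JSON Schema type names.
    $map = ['int' => 'integer', 'float' => 'number', 'bool' => 'boolean'];
    $schema = ['type' => 'object', 'properties' => [], 'required' => []];
    foreach ((new ReflectionClass($class))->getProperties() as $property) {
        $type = $property->getType();
        if (!$type instanceof ReflectionNamedType) {
            continue; // untyped or union types are out of scope for this sketch
        }
        $name = $property->getName();
        $schema['properties'][$name] = $type->isBuiltin()
            ? ['type' => $map[$type->getName()] ?? $type->getName()]
            : describeClass($type->getName()); // nested object
        if (!$type->allowsNull()) {
            $schema['required'][] = $name; // nullable means optional
        }
    }
    return $schema;
}

echo json_encode(describeClass(Person::class), JSON_PRETTY_PRINT) . PHP_EOL;
```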

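DocBlock type hints

You can also use PHP DocBlock style comments to specify the type of extracted data. This is useful when you want to specify property types for the LLM, but cannot or do not want to enforce the type at the code level.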

class Person {
    /** @var string */
    public $name;
    /** @var int */
    public $age;
    /** @var Address $address person's address */
    public $address;
}

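See the PHPDoc documentation on the DocBlock website for more details.

Typed collections / arrays

PHP currently does not support generics or type hints for specifying array element types.

Use PHP DocBlock style comments to specify the type of array elements.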

class Person {
    // ...
}

class Event {
    // ...
    /** @var Person[] list of extracted event participants */
    public array $participants;
    // ...
}
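
Since the element type lives only in a comment, a tool has to read it from the DocBlock at runtime. The sketch below shows one way to do that with reflection. `arrayElementType()` is a hypothetical helper and its regular expression handles only the simple `@var Type[]` form used above.

```php
<?php
// Sketch: recovering the element type of an array property from its DocBlock.
// The regular expression covers only the simple "@var Type[]" form.

class Person {}

class Event {
    /** @var Person[] list of extracted event participants */
    public array $participants;
}

function arrayElementType(string $class, string $property): ?string {
    $doc = (new ReflectionProperty($class, $property))->getDocComment();
    if ($doc === false) {
        return null; // no DocBlock on this property
    }
    // Capture a (possibly namespaced) class name followed by "[]".
    return preg_match('/@var\s+([\w\\\\]+)\[\]/', $doc, $m) ? $m[1] : null;
}

var_dump(arrayElementType(Event::class, 'participants'));
```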

Complex data extraction

Instructor can retrieve complex data structures from text. Your response model can contain nested objects, arrays, and enums.

use Cognesy\Instructor\Instructor;

// define data structures to extract data into
class Person {
    public string $name;
    public int $age;
    public string $profession;
    /** @var Skill[] */
    public array $skills;
}

class Skill {
    public string $name;
    public SkillType $type;
}

// a backed enum is required for cases that carry string values
enum SkillType: string {
    case Technical = 'technical';
    case Other = 'other';
}

$text = "Alex is 25 years old software engineer, who knows PHP, Python and can play the guitar.";

$person = (new Instructor)->respond(
    messages: [['role' => 'user', 'content' => $text]],
    responseModel: Person::class,
);

// data is extracted into an object of given class
assert($person instanceof Person); // true

// you can access object's extracted property values
echo $person->name; // Alex
echo $person->age; // 25
echo $person->profession; // software engineer
echo $person->skills[0]->name; // PHP
echo $person->skills[0]->type->value; // technical (SkillType::Technical)
// ...

var_dump($person);
// Person {
//     name: "Alex",
//     age: 25,
//     profession: "software engineer",
//     skills: [
//         Skill {
//              name: "PHP",
//              type: SkillType::Technical,
//         },
//         Skill {
//              name: "Python",
//              type: SkillType::Technical,
//         },
//         Skill {
//              name: "guitar",
//              type: SkillType::Other
//         },
//     ]
// }

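Dynamic data schemas

If you want to define the shape of data at runtime, you can use the Structure class.

Structures allow you to define and modify an arbitrary shape of data to be extracted by the LLM. Classes may not be the best fit for this purpose, as they cannot be declared or changed during execution.

With structures, you can define custom data shapes dynamically, for example based on user input or the context of processing, to specify the information you need the LLM to infer from the provided text or chat messages.

The example below demonstrates how to define a structure and use it as a response model.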

<?php
use Cognesy\Instructor\Extras\Structure\Field;
use Cognesy\Instructor\Extras\Structure\Structure;

enum Role: string {
    case Manager = 'manager';
    case Line = 'line';
}

$structure = Structure::define('person', [
    Field::string('name'),
    Field::int('age'),
    Field::enum('role', Role::class),
]);

$person = (new Instructor)->respond(
    messages: 'Jason is 25 years old and is a manager.',
    responseModel: $structure,
);

// you can access structure data via field API...
assert($person->field('name') === 'Jason');
// ...or as structure object properties
assert($person->age === 25);
?>

For more information, see the "Structures" section.

Changing the LLM model and options

You can specify the model and other options that will be passed to the OpenAI / LLM endpoint.

use Cognesy\Instructor\Features\LLM\Data\LLMConfig;
use Cognesy\Instructor\Features\LLM\Drivers\OpenAIDriver;
use Cognesy\Instructor\Instructor;

// OpenAI auth params
$yourApiKey = Env::get('OPENAI_API_KEY'); // use your own API key

// Create instance of OpenAI driver initialized with custom parameters
$driver = new OpenAIDriver(new LLMConfig(
    apiUrl: 'https://api.openai.com/v1', // you can change base URI
    apiKey: $yourApiKey,
    endpoint: '/chat/completions',
    metadata: ['organization' => ''],
    model: 'gpt-4o-mini',
    maxTokens: 128,
));

// Get Instructor with the default client component overridden with your own
$instructor = (new Instructor)->withDriver($driver);

$user = $instructor->respond(
    messages: "Jason (@jxnlco) is 25 years old and is the admin of this project. He likes playing football and reading books.",
    responseModel: User::class,
    model: 'gpt-3.5-turbo',
    options: ['stream' => true]
);

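Support for language models and API providers

Instructor offers out-of-the-box support for the following API providers:

Anthropic
Azure OpenAI
Cohere
Fireworks AI
Groq
Mistral
Ollama (on localhost)
OpenAI
OpenRouter
Together AI

For usage examples, check the Hub section or the examples directory in the code repository.

Using DocBlocks as additional instructions for the LLM

You can use PHP DocBlocks (/** */) to provide additional instructions for the LLM at the class or field level, for example to clarify what you expect or how the LLM should process your data.

Instructor extracts PHP DocBlock comments from the classes and properties you define and includes them in the specification of the response model sent to the LLM.

Using PHP DocBlock instructions is not required, but sometimes you may want to clarify your intentions to improve the LLM's inference results.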

/**
 * Represents a skill of a person and context in which it was mentioned.
 */
class Skill {
    public string $name;
    /** @var SkillType $type type of the skill, derived from the description and context */
    public SkillType $type;
    /** Directly quoted, full sentence mentioning person's skill */
    public string $context;
}
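
The extraction step described in this section can be approximated with PHP's reflection API, which exposes the raw DocBlock of a class or property. The sketch below collects those comments as plain-text descriptions. `docText()` and the `_class` array key are hypothetical, and the cleanup is intentionally simplistic (it does not interpret tags such as `@var`).

```php
<?php
// Sketch: collecting DocBlock text as field descriptions. docText() is a
// hypothetical helper, not Instructor's extraction code.

/**
 * Represents a skill of a person and context in which it was mentioned.
 */
class Skill {
    public string $name;
    /** Directly quoted, full sentence mentioning person's skill */
    public string $context;
}

/** Strip comment markers and surrounding whitespace from a DocBlock. */
function docText(string|false $doc): string {
    if ($doc === false) {
        return ''; // the element has no DocBlock
    }
    $lines = preg_split('/\R/', $doc);
    $clean = array_map(fn(string $line) => trim($line, " \t*/"), $lines);
    return trim(implode(' ', array_filter($clean, fn(string $l) => $l !== '')));
}

$class = new ReflectionClass(Skill::class);
$descriptions = ['_class' => docText($class->getDocComment())];
foreach ($class->getProperties() as $property) {
    $descriptions[$property->getName()] = docText($property->getDocComment());
}

print_r($descriptions);
```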

Customizing validation

ValidationMixin

You can use the ValidationMixin trait to add simple, custom validation to your data objects.

use Cognesy\Instructor\Features\Validation\Traits\ValidationMixin;

class User {
    use ValidationMixin;

    public int $age;
    public string $name;

    // return a list of error messages; an empty array means the object is valid
    public function validate() : array {
        if ($this->age < 18) {
            return ["User has to be adult to sign the contract."];
        }
        return [];
    }
}

Validation callback

Instructor uses the Symfony validation component to validate extracted data. You can use the #[Assert\Callback] annotation to build fully customized validation logic.

use Cognesy\Instructor\Instructor;
use Symfony\Component\Validator\Constraints as Assert;
use Symfony
