تنزيل typegpt - تنزيل رمز المصدر typegpt

Typegpt - جعل GPT آمنة للإنتاج

من الصعب بطبيعتها إنتاج مخرجات من LLMs في بنية متسقة. يقوم Typegpt بتبسيط هذه العملية لتكون سهلة مثل تحديد فئة في Python.

تشغيل مشاريعنا الخاصة ، مثل Spexia

تثبيت

pip install typegpt

الاستخدام

حدد مخطط الإخراج الموجه والمطلوب كفئة فرعية في بيثون:

 from typegpt import BaseLLMResponse , PromptTemplate

class ExamplePrompt ( PromptTemplate ):

    def __init__ ( self , sentence : str ):
        self . sentence = sentence

    def system_prompt ( self ) -> str :
        return "Given a sentence, extract sentence parts."

    def user_prompt ( self ) -> str :
        return self . sentence

    class Output ( BaseLLMResponse ):
        num_sentences : int
        adjectives : list [ str ]
        nouns : list [ str ]
        verbs : list [ str ]

إذا كنت تستخدم Openai كمزود LLM الخاص بك ، فما عليك سوى استبدال اسم فئة عميل Openai مع TypeOpenAI الفئة الفرعية (لاستخدام غير متزامن في AsyncTypeOpenAI التزامن ، أو لاستخدام Azure TypeAzureOpenAI / AsyncTypeAzureOpenAI ) لجعله آمنًا. لا يزال بإمكانك استخدامه كما ستفعل من قبل ، ولكن يمكنك الآن الاتصال بوظيفة generate_output لاستكمال الدردشة مثل هذا لإنشاء كائن الإخراج:

 from typegpt . openai import TypeOpenAI

prompt = ExamplePrompt ( "The young athlete demonstrated exceptional skill and agility on the field." )

client = TypeOpenAI ( api_key = "<your api key>" ) # subclass of `OpenAI`

output = client . chat . completions . generate_output ( model = "gpt-3.5-turbo" , prompt = prompt , max_output_tokens = 1000 )

وستحصل على إخراج لطيف مثل هذا:

 Output ( num_sentences = 1 , adjectives = [ 'young' , 'exceptional' ], nouns = [ 'athlete' , 'skill' , 'agility' , 'field' ], verbs = [ 'demonstrated' ])

أنواع الإخراج

يمكن أن يحتوي نوع الإخراج الخاص بك على سلسلة أو عدد صحيح أو تعويم أو منطقية أو قوائم من هذه. من الممكن أيضًا وضع علامة على العناصر على أنها اختيارية. يمكن توفير القيم الافتراضية أيضًا.

مثال 1

 class Output ( BaseLLMResponse ):
    title : str = "My Recipe"
    description : str | None
    num_ingredients : int
    ingredients : list [ int ]
    estimated_time : float
    is_oven_required : bool

هنا ، سيقوم المحللون بتحليل description إذا أعادته LLM ، لكنه لن يتطلب ذلك. None افتراضيًا. وينطبق الشيء نفسه على title ، لأنه يحتوي على قيمة افتراضية.

مثال 2

يمكنك أيضًا تحديد المزيد من القيود أو إعطاء LLM المزيد من المعلومات لبعض العناصر:

 class Output ( BaseLLMResponse ):
    title : str = LLMOutput ( instruction = "The title for the recipe." )
    description : str | None = LLMOutput ( instruction = "An optional description for the recipe." )
    num_ingredients : int
    ingredients : list [ int ] = LLMArrayOutput ( expected_count = ( 1 , 5 ), instruction = lambda pos : f"The id of the { pos . ordinal } ingredient" ) # between 1 and 5 ingredients expected (and required at parse time)
    estimated_time : float = LLMOutput ( instruction = "The estimated time to cook" )
    is_oven_required : bool

مثال 3

بشكل افتراضي ، تتوقع المكتبة دائمًا استجابة سطر واحد فقط لكل عنصر. يمكنك تجاوز هذا عن طريق ضبط multiline=True في LLMOutput :

 class Output ( BaseLLMResponse ):
    description : str  = LLMOutput ( instruction = "A description for the recipe." , multiline = True )
    items : list [ str ] = LLMArrayOutput ( expected_count = 5 , instruction = lambda pos : f"The { pos . ordinal } item in the list" , multiline = True )

مثال 4

يمكنك عش أنواع الاستجابة. لاحظ أنك تحتاج إلى استخدام BaseLLMArrayElement للفصول التي تريد أن تعشش داخل قائمة. لإضافة تعليمات داخل عنصر من BaseLLMArrayElement ، يجب عليك استخدام LLMArrayElementOutput بدلاً من LLMOutput .

 class Output ( BaseLLMResponse ):

    class Item ( BaseLLMArrayElement ):

        class Description ( BaseLLMResponse ):
            short : str | None
            long : str

        title : str
        description : Description
        price : float = LLMArrayElementOutput ( instruction = lambda pos : f"The price of the { pos . ordinal } item" )

    items : list [ Item ]
    count : int

استخدام متقدم

تقليل موجه تلقائي

قد يكون لديك مطالبة تستخدم عددًا كبيرًا من الرموز بسبب التبعيات الكبيرة المحتملة. للتأكد من أن المطالبة تناسب دائمًا ضمن حد الرمز المميز لـ LLM ، يمكنك تنفيذ الوظيفة reduce_if_possible داخل فئة المطالبة الخاصة بك:

 class SummaryPrompt ( PromptTemplate ):

    def __init__ ( self , article : str ):
        self . article = article

    def system_prompt ( self ) -> str :
        return "Summarize the given news article"

    def user_prompt ( self ) -> str :
        return f"ARTICLE: { self . article } "

    def reduce_if_possible ( self ) -> bool :
        if len ( self . article ) > 100 :
            # remove last 100 characters at a time
            self . article = self . article [: - 100 ]
            return True
        return False

    class Output ( BaseLLMResponse ):
        summary : str

داخل وظيفة reduce_if_possible ، يجب أن تقلل من حجم المطالبة في خطوات صغيرة وإرجاع True إذا انخفضت بنجاح. تسمى الوظيفة بشكل متكرر حتى تناسب المطالبة. عند الاتصال بوظيفة Openai generate_output ، يضمن هذا تلقائيًا أن المطالبة مناسبة للنماذج المحددة. بالإضافة إلى ذلك ، يمكنك تحديد حد رمز إدخال مخصص مع نفس التأثير لتوفير التكاليف: client.chat.completions.generate_output(..., max_input_tokens=2000) .

إعادة المحاولة التلقائية

في بعض الحالات ، قد لا تزال GPT إرجاع الإخراج الذي لا يتبع المخطط بشكل صحيح. عندما يحدث هذا ، يلقي عميل Openai LLMParseException . لإعادة إعادة المحاولة تلقائيًا عندما لا يفي الإخراج بالمخطط الخاص بك ، يمكنك تعيين retry_on_parse_error إلى عدد عمليات إعادة المحاولة التي تريد السماح بها:

 out = client . chat . completions . generate_output ( "gpt-3.5-turbo" , prompt = prompt , ..., retry_on_parse_error = 3 )

الآن ، ستحاول المكتبة الاتصال بـ GPT ثلاث مرات قبل إلقاء خطأ. ومع ذلك ، تأكد من استخدام هذا فقط عندما لا تكون درجة الحرارة صفر.

سلامة النوع الثابت الكامل

 prompt = ExamplePrompt (...)
output = client . chat . completions . generate_output ( model = "gpt-4" , prompt = prompt , ...)

نظرًا لنظام النوع المحدود لـ Python ، يكون نوع الإخراج من نوع BaseLLMResponse بدلاً من الفئة الفرعية الصريحة ExamplePrompt.Output . لتحقيق سلامة النوع الكامل في الكود الخاص بك ، ما عليك سوى إضافة المعلمة output_type=ExamplePrompt.Output :

 prompt = ExamplePrompt (...)
output = client . chat . completions . generate_output ( model = "gpt-4" , prompt = prompt , output_type = ExamplePrompt . Output , ...)

هذه المعلمة ليست مجرد نوع من الديكور. يمكن أيضًا استخدامه للكتابة فوق نوع الإخراج الفعلي الذي يحاول GPT التنبؤ به.

القليل من اللقطة

إعطاء أمثلة على النموذج لشرح المهام التي يصعب شرحها:

 class ExamplePrompt ( PromptTemplate ):

    class Output ( BaseLLMResponse ):
        class Ingredient ( BaseLLMResponse ):
            name : str
            quantity : int

        ingredients : list [ Ingredient ]

    def system_prompt ( self ) -> str :
        return "Given a recipe, extract the ingredients."

    def few_shot_examples ( self ) -> list [ FewShotExample [ Output ]]:
        return [
            FewShotExample (
                input = "Let's take two apples, three bananas, and four oranges." ,
                output = self . Output ( ingredients = [
                    self . Output . Ingredient ( name = "apple" , quantity = 2 ),
                    self . Output . Ingredient ( name = "banana" , quantity = 3 ),
                    self . Output . Ingredient ( name = "orange" , quantity = 4 ),
                ])
            ),
            FewShotExample (
                input = "My recipe requires five eggs and two cups of flour." ,
                output = self . Output ( ingredients = [
                    self . Output . Ingredient ( name = "egg" , quantity = 5 ),
                    self . Output . Ingredient ( name = "flour cups" , quantity = 2 ),
                ])
            )
        ]

    def user_prompt ( self ) -> str :
        ...

أزور

تأكد من استخدام AzureChatModel كنموذج عند إنشاء الإخراج ، والذي يتكون من deployment_id ونموذج الأساس المقابل (يتم استخدام هذا لتقليل المطالبات تلقائيًا إذا لزم الأمر).

 from typegpt . openai import AzureChatModel , TypeAzureOpenAI

client = TypeAzureOpenAI (
    azure_endpoint = "<your azure endpoint>" ,
    api_key = "<your api key>" ,
    api_version = "2023-05-15" ,
)

out = client . chat . completions . generate_output ( model = AzureChatModel ( deployment_id = "gpt-35-turbo" , base_model = "gpt-3.5-turbo" ), prompt = prompt , max_output_tokens = 1000 )

دعم غير Openai LLM

يمكن لأي LLM التي لديها فكرة عن النظام ومطالبات المستخدمين استخدام هذه المكتبة. قم بإنشاء رسائل النظام ورسائل المستخدم (بما في ذلك موجه المخطط) مثل هذا:

 messages = prompt . generate_messages (
    token_limit = max_prompt_length , token_counter = lambda messages : num_tokens_from_messages ( messages )
)

عندما يكون max_prompt_length هو الحد الأقصى لعدد الرموز التي يُسمح للمطالبة بالاستخدام ، ويجب أن يكون num_tokens_from_messages وظيفة تعتبر استخدام الرمز المميز المتوقع لقائمة معينة من الرسائل. إرجاع 0 هنا إذا كنت لا ترغب في تقليل حجم المطالبة تلقائيًا.

استخدم الرسائل التي تم إنشاؤها للاتصال بـ LLM. تحليل سلسلة الإكمال التي تتلقاها مرة أخرى في فئة الإخراج المطلوبة مثل هذا:

 out = ExamplePrompt . Output . parse_response ( completion )

كيف تعمل

تقوم هذه المكتبة تلقائيًا بإنشاء مخطط متوافق مع LLM من فئة الإخراج المحددة ويضيف إرشادات إلى نهاية مطالبة النظام بالالتزام بهذا المخطط. على سبيل المثال ، للمطالبة التجريدية التالية:

 class DemoPrompt ( PromptTemplate ):

    def system_prompt ( self ) -> str :
        return "This is a system prompt"

    def user_prompt ( self ) -> str :
        return "This is a user prompt"

    class Output ( BaseLLMResponse ):
        title : str
        description : str = LLMOutput ( "Custom instruction" )
        mice : list [ str ]

سيتم إنشاء موجه النظام التالي:

 This is a system prompt

Always return the answer in the following format:
"""
TITLE: <Put the title here>
DESCRIPTION: <Custom instruction>
MOUSE 1: <Put the first mouse here>
MOUSE 2: <Put the second mouse here>
...
"""

لاحظ كيف يتم تحويل "الفئران" الجمع تلقائيًا إلى "الماوس" المفرد لتجنب الخلط بين نموذج اللغة.

يوسع