v0.4.4

Ollama Python Library

The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.

Prerequisites

- Ollama should be installed and running
- Pull a model to use with the library: ollama pull <model>, e.g. ollama pull llama3.2
  - See Ollama.com for more information on the available models.

Install

pip install ollama

Usage

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
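
A chat usually spans several turns, and the messages list is where the history lives. The following is a minimal sketch, not part of the library, of how earlier turns could be carried along. The helper name ask and the injected chat_fn parameter are hypothetical; the sketch assumes the reply is readable as response['message']['content'], as in the example above, and that model replies are stored under the 'assistant' role used by the Ollama REST API.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def ask(chat_fn: Callable[..., dict], history: List[Message], question: str,
        model: str = 'llama3.2') -> str:
    """Send `question` together with prior turns and record both sides."""
    history.append({'role': 'user', 'content': question})
    response = chat_fn(model=model, messages=history)
    answer = response['message']['content']
    # Store the reply so the next call sees the whole conversation.
    history.append({'role': 'assistant', 'content': answer})
    return answer
```

With a running Ollama server, passing the library's chat function as chat_fn (for example ask(chat, history, 'Why is the sky blue?')) should keep appending to the same history list. Injecting the function also makes the helper easy to exercise with a stub.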

See _types.py for more information on the response types.

Streaming responses

Response streaming can be enabled by setting stream=True.

from ollama import chat

stream = chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
  print(chunk['message']['content'], end='', flush=True)
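
When the full text is needed after streaming finishes, the chunks can be accumulated while they are printed. This is a small sketch under the assumption that each chunk exposes its text as chunk['message']['content'], as in the loop above; collect_stream is a hypothetical helper name, not a library function.

```python
from typing import Iterable


def collect_stream(chunks: Iterable[dict], echo: bool = False) -> str:
    """Join the content of streamed chat chunks into one string."""
    parts = []
    for chunk in chunks:
        text = chunk['message']['content']
        if echo:
            # Show the text as it arrives, like the loop in the README.
            print(text, end='', flush=True)
        parts.append(text)
    return ''.join(parts)
```

The stream returned by chat(..., stream=True) could be passed directly as chunks; any iterable of dictionaries with the same shape works too.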

Custom client

A custom client can be created by instantiating Client or AsyncClient from ollama.

All extra keyword arguments are passed to httpx.Client.

from ollama import Client
client = Client(
  host='http://localhost:11434',
  headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
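
Because extra keyword arguments reach httpx.Client, settings such as a request timeout can be supplied alongside host and headers. The sketch below only assembles those arguments; build_client_kwargs and its token parameter are hypothetical, and the Authorization header assumes a reverse proxy in front of Ollama that checks it.

```python
from typing import Any, Dict, Optional


def build_client_kwargs(host: str = 'http://localhost:11434',
                        token: Optional[str] = None,
                        timeout: float = 30.0) -> Dict[str, Any]:
    """Assemble keyword arguments for ollama.Client (hypothetical helper)."""
    # timeout is a standard httpx.Client option, given in seconds.
    kwargs: Dict[str, Any] = {'host': host, 'timeout': timeout}
    if token is not None:
        # Assumption: a proxy in front of the server validates this header.
        kwargs['headers'] = {'Authorization': f'Bearer {token}'}
    return kwargs
```

A client could then be created with Client(**build_client_kwargs(timeout=10.0)), keeping connection settings in one place.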

Async client

The AsyncClient class is used to make asynchronous requests. It can be configured with the same fields as the Client class.

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  response = await AsyncClient().chat(model='llama3.2', messages=[message])

asyncio.run(chat())

Setting stream=True modifies functions to return a Python asynchronous generator:

import asyncio
from ollama import AsyncClient

async def chat():
  message = {'role': 'user', 'content': 'Why is the sky blue?'}
  async for part in await AsyncClient().chat(model='llama3.2', messages=[message], stream=True):
    print(part['message']['content'], end='', flush=True)

asyncio.run(chat())

API

The Ollama Python library's API is designed around the Ollama REST API.
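
To make the relationship concrete, the table below sketches which REST endpoint each library function corresponds to, as far as the Ollama REST API documents them. Treat it as an illustration and check the REST API documentation for the installed version; REST_ENDPOINTS and endpoint_url are hypothetical names that do not exist in the library.

```python
# Library function -> (HTTP method, REST path); an illustrative summary.
REST_ENDPOINTS = {
    'chat': ('POST', '/api/chat'),
    'generate': ('POST', '/api/generate'),
    'list': ('GET', '/api/tags'),
    'show': ('POST', '/api/show'),
    'create': ('POST', '/api/create'),
    'copy': ('POST', '/api/copy'),
    'delete': ('DELETE', '/api/delete'),
    'pull': ('POST', '/api/pull'),
    'push': ('POST', '/api/push'),
    'embed': ('POST', '/api/embed'),
    'ps': ('GET', '/api/ps'),
}


def endpoint_url(function_name: str, host: str = 'http://localhost:11434') -> str:
    """Return the full URL a library function is assumed to call."""
    _method, path = REST_ENDPOINTS[function_name]
    return host.rstrip('/') + path
```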

Chat

ollama.chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])

Generate

ollama.generate(model='llama3.2', prompt='Why is the sky blue?')

List

ollama.list()

Show

ollama.show('llama3.2')

Create

modelfile='''
FROM llama3.2
SYSTEM You are mario from super mario bros.
'''

ollama.create(model='example', modelfile=modelfile)

Copy

ollama.copy('llama3.2', 'user/llama3.2')

Delete

ollama.delete('llama3.2')

Pull

ollama.pull('llama3.2')

Push

ollama.push('user/llama3.2')

Embed

ollama.embed(model='llama3.2', input='The sky is blue because of rayleigh scattering')

Embed (batch)

ollama.embed(model='llama3.2', input=['The sky is blue because of rayleigh scattering', 'Grass is green because of chlorophyll'])
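
A common next step with batch embeddings is comparing the resulting vectors. The sketch below computes cosine similarity in plain Python. It assumes the embed response exposes one vector per input under response['embeddings']; cosine_similarity is a hypothetical helper, not a library function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; report no similarity.
        return 0.0
    return dot / (norm_a * norm_b)
```

Given a batch response, the two sentences could be compared with cosine_similarity(response['embeddings'][0], response['embeddings'][1]).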

Ps

ollama.ps()

Errors

Errors are raised if requests return an error status or if an error is detected while streaming.

import ollama

model = 'does-not-yet-exist'

try:
  ollama.chat(model)
except ollama.ResponseError as e:
  print('Error:', e.error)
  # A 404 status means the model is not available locally, so pull it
  if e.status_code == 404:
    ollama.pull(model)
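
The pull-on-404 pattern above can be wrapped so that the request is retried once after the model has been pulled. This is a minimal sketch, not library behavior: chat_with_pull is a hypothetical helper, and it inspects a status_code attribute on the exception so that it can be tried with stand-in functions. In real use, catching ollama.ResponseError explicitly would be the narrower choice.

```python
from typing import Any, Callable


def chat_with_pull(chat_fn: Callable[..., Any], pull_fn: Callable[[str], Any],
                   model: str, **kwargs: Any) -> Any:
    """Call chat_fn; on a 404 error pull the model once and retry."""
    try:
        return chat_fn(model=model, **kwargs)
    except Exception as e:
        # Only a 404 (model not found) triggers a pull; anything else is re-raised.
        if getattr(e, 'status_code', None) != 404:
            raise
        pull_fn(model)
        return chat_fn(model=model, **kwargs)
```

With the library installed, chat_with_pull(ollama.chat, ollama.pull, 'llama3.2', messages=[...]) would be the intended call. The retry happens only once, so a second failure still surfaces to the caller.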