The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.
ollama pull <model>
e.g. ollama pull llama3.2
pip install ollama
from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
# or access fields directly from the response object
print(response.message.content)
See _types.py for more information on the response types.
Response streaming can be enabled by setting stream=True.
from ollama import chat

stream = chat(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
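When the full reply is needed after streaming (for logging or for appending to a message history), the chunks can be accumulated while they are printed. The following is a minimal sketch, not part of the library: `collect_stream` and `stream_reply` are hypothetical helper names, and the `ollama` import is deferred so the accumulator can be exercised without a running server.

```python
def collect_stream(chunks):
    """Print each streamed piece as it arrives and return the full text."""
    pieces = []
    for chunk in chunks:
        piece = chunk['message']['content']
        print(piece, end='', flush=True)
        pieces.append(piece)
    print()  # finish the line once the stream ends
    return ''.join(pieces)


def stream_reply(prompt, model='llama3.2'):
    """Hypothetical wrapper: stream a single-turn chat and return the whole reply."""
    # Imported lazily so collect_stream works without the package installed.
    from ollama import chat

    stream = chat(
        model=model,
        messages=[{'role': 'user', 'content': prompt}],
        stream=True,
    )
    return collect_stream(stream)
```

Calling `stream_reply('Why is the sky blue?')` prints the reply incrementally and returns it as one string.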
A custom client can be created by instantiating Client or AsyncClient from ollama.
All extra keyword arguments are passed into the httpx.Client.
from ollama import Client

client = Client(
    host='http://localhost:11434',
    headers={'x-some-header': 'some-value'}
)
response = client.chat(model='llama3.2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
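Because extra keyword arguments reach the underlying httpx.Client, settings such as a request timeout can be supplied alongside `host`. The sketch below builds those options from environment variables; the variable names `MY_APP_OLLAMA_HOST` and `MY_APP_OLLAMA_TIMEOUT` and both helper functions are assumptions made for illustration, not names defined by the library.

```python
import os


def client_options(env=os.environ):
    """Build keyword arguments for Client from (hypothetical) environment variables."""
    return {
        'host': env.get('MY_APP_OLLAMA_HOST', 'http://localhost:11434'),
        # timeout is not an Ollama-specific option; it is forwarded to httpx.Client
        'timeout': float(env.get('MY_APP_OLLAMA_TIMEOUT', '30')),
    }


def make_client(env=os.environ):
    """Hypothetical factory: create a Client configured from the environment."""
    # Imported lazily so client_options can be used without the package installed.
    from ollama import Client

    return Client(**client_options(env))
```

Keeping the option-building step separate from the `Client` construction makes the configuration easy to check without contacting a server.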
The AsyncClient class is used to make asynchronous requests. It can be configured with the same fields as the Client class.
import asyncio
from ollama import AsyncClient

async def chat():
    message = {'role': 'user', 'content': 'Why is the sky blue?'}
    response = await AsyncClient().chat(model='llama3.2', messages=[message])
    # show the reply so the example produces visible output
    print(response.message.content)

asyncio.run(chat())
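One practical reason to use asynchronous requests is to send several prompts concurrently. The following is a rough sketch under stated assumptions: `ask_all` is a hypothetical helper, and `client` is expected to be an `AsyncClient` instance (or any object with a compatible awaitable `chat` method).

```python
import asyncio


async def ask_all(client, prompts, model='llama3.2'):
    """Send every prompt concurrently and return the replies in prompt order."""

    async def ask(prompt):
        response = await client.chat(
            model=model,
            messages=[{'role': 'user', 'content': prompt}],
        )
        return response['message']['content']

    # gather preserves the order of the awaitables it is given
    return await asyncio.gather(*(ask(prompt) for prompt in prompts))
```

With the library installed, this could be invoked as `asyncio.run(ask_all(AsyncClient(), ['Why is the sky blue?', 'Why is grass green?']))`. Whether the requests are actually processed in parallel depends on the server's configuration.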
Setting stream=True modifies the function to return a Python asynchronous generator:
import asyncio
from ollama import AsyncClient

async def chat():
    message = {'role': 'user', 'content': 'Why is the sky blue?'}
    async for part in await AsyncClient().chat(model='llama3.2', messages=[message], stream=True):
        print(part['message']['content'], end='', flush=True)

asyncio.run(chat())
The Ollama Python library's API is designed around the Ollama REST API.
import ollama

ollama.chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}])
ollama.generate(model='llama3.2', prompt='Why is the sky blue?')
ollama.list()
ollama.show('llama3.2')
modelfile = '''
FROM llama3.2
SYSTEM You are mario from super mario bros.
'''
ollama.create(model='example', modelfile=modelfile)
ollama.copy('llama3.2', 'user/llama3.2')
ollama.delete('llama3.2')
ollama.pull('llama3.2')
ollama.push('user/llama3.2')
ollama.embed(model='llama3.2', input='The sky is blue because of rayleigh scattering')
ollama.embed(model='llama3.2', input=['The sky is blue because of rayleigh scattering', 'Grass is green because of chlorophyll'])
ollama.ps()
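A common use of the batch form of `embed` is comparing two texts by the cosine similarity of their vectors. The sketch below is illustrative only: `cosine_similarity` and `compare_texts` are hypothetical helpers, and it assumes the response exposes the vectors under an `embeddings` key, one per input in order.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def compare_texts(first, second, model='llama3.2'):
    """Hypothetical helper: embed two texts in one request and compare them."""
    # Imported lazily so cosine_similarity works without the package installed.
    import ollama

    response = ollama.embed(model=model, input=[first, second])
    # Assumption: 'embeddings' holds one vector per input, in input order.
    vec_a, vec_b = response['embeddings']
    return cosine_similarity(vec_a, vec_b)
```

Values near 1 indicate that the model places the two texts close together, while values near 0 indicate little similarity.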
Errors are raised if requests return an error status or if an error is detected while streaming.
import ollama

model = 'does-not-yet-exist'

try:
    ollama.chat(model)
except ollama.ResponseError as e:
    print('Error:', e.error)
    # a 404 status means the model is not available locally, so pull it
    if e.status_code == 404:
        ollama.pull(model)
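The pattern above can be wrapped in a function that pulls a missing model and then retries the request once. This is a sketch rather than library functionality: `chat_with_auto_pull` is a hypothetical name, and `api` is assumed to be the `ollama` module or any object that exposes `chat`, `pull`, and `ResponseError` in the same way.

```python
def chat_with_auto_pull(api, model, messages):
    """Call api.chat; on a 404, pull the model and retry exactly once."""
    try:
        return api.chat(model=model, messages=messages)
    except api.ResponseError as e:
        # Only a missing model is handled here; other errors propagate.
        if e.status_code != 404:
            raise
        api.pull(model)
        return api.chat(model=model, messages=messages)
```

With the library installed, a call might look like `chat_with_auto_pull(ollama, 'llama3.2', [{'role': 'user', 'content': 'Why is the sky blue?'}])`. Passing the API object in explicitly keeps the retry logic independent of a running server.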