
Pixeltable is a Python library providing a declarative interface for multimodal data (text, images, audio, video). It features built-in versioning, lineage tracking, and incremental updates, enabling users to store, transform, index, and iterate on data for their ML workflows.

Data transformations, model inference, and custom logic are embedded as computed columns.

Load/Query all data types: Interact with video data at the frame level and documents at the chunk level
Incremental updates for data transformation: Maintain an embedding index colocated with your data
Lazy evaluation and cache management: Eliminates the need for manual frame extraction
Integrates with any Python libraries: Use built-in and custom functions (UDFs) without complex pipelines
Data-format agnostic and extensible: Access tables as Parquet files, PyTorch datasets, or COCO annotations
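
To make the idea of computed columns with incremental updates concrete, here is a minimal pure-Python sketch. `ToyTable`, its methods, and the `calls` counter are hypothetical names invented for illustration; this is not Pixeltable's implementation or API, only a toy model of the behavior described above: a column is defined once as a function of other columns, existing rows are backfilled, and later inserts evaluate the function for the new rows only.

```python
from typing import Any, Callable

Row = dict[str, Any]


class ToyTable:
    """Toy stand-in for a table with computed columns (hypothetical, not Pixeltable)."""

    def __init__(self) -> None:
        self.rows: list[Row] = []
        self.computed: dict[str, Callable[[Row], Any]] = {}
        self.calls = 0  # how many times a column function actually ran

    def _run(self, fn: Callable[[Row], Any], row: Row) -> Any:
        self.calls += 1
        return fn(row)

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        # Register the column, then backfill it for rows that already exist.
        self.computed[name] = fn
        for row in self.rows:
            row[name] = self._run(fn, row)

    def insert(self, new_rows: list[Row]) -> None:
        # Only the newly inserted rows are evaluated; existing rows are untouched.
        for new_row in new_rows:
            row = dict(new_row)
            for name, fn in self.computed.items():
                row[name] = self._run(fn, row)
            self.rows.append(row)


table = ToyTable()
table.insert([{'text': 'a cat'}, {'text': 'a small dog'}])
table.add_computed_column('n_words', lambda r: len(r['text'].split()))
table.insert([{'text': 'one more bird here'}])

print([r['n_words'] for r in table.rows])
print(table.calls)
```

In this toy model the column function runs twice for the backfill and once for the later insert, rather than being re-run over the whole table after every change. A real system would also need persistence, versioning, and dependency tracking between columns, none of which is modeled here.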

Installation

pip install pixeltable

Pixeltable is persistent. Unlike in-memory Python libraries such as Pandas, Pixeltable is a database.

Getting Started

Learn how to create tables, populate them with data, and enhance them with built-in or user-defined transformations.

Tutorial notebooks are available for the following topics:

10-Minute Tour of Pixeltable
Tables and Data Operations
User-Defined Functions (UDFs)
Object Detection Models
Incremental Prompt Engineering
Working with External Files
Integrating with Label Studio
Audio/Video Transcript Indexing
Multimodal Application
Document Indexing and RAG
Context-Aware Discord Bot
Image/Text Similarity Search

Code Samples

Import media data into Pixeltable (videos, images, audio...)

import pixeltable as pxt

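# The table path 'external_data.videos' requires the directory to exist first
pxt.create_dir('external_data')
v = pxt.create_table('external_data.videos', {'video': pxt.Video})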

prefix = 's3://multimedia-commons/'
paths = [
    'data/videos/mp4/ffe/ffb/ffeffbef41bbc269810b2a1a888de.mp4',
    'data/videos/mp4/ffe/feb/ffefebb41485539f964760e6115fbc44.mp4',
    'data/videos/mp4/ffe/f73/ffef7384d698b5f70d411c696247169.mp4'
]
v.insert({'video': prefix + p} for p in paths)

Learn how to work with data in Pixeltable.

Object detection in images using DETR model

import pixeltable as pxt
from pixeltable.functions import huggingface

# Create a table to store data persistently
t = pxt.create_table('image', {'image': pxt.Image})

# Insert some images
prefix = 'https://upload.wikimedia.org/wikipedia/commons'
paths = [
    '/1/15/Cat_August_2010-4.jpg',
    '/e/e1/Example_of_a_Dog.jpg',
    '/thumb/b/bf/Bird_Diversity_2013.png/300px-Bird_Diversity_2013.png'
]
t.insert({'image': prefix + p} for p in paths)

# Add a computed column that runs object detection on each image
t.add_computed_column(classification=huggingface.detr_for_object_detection(
    t.image,
    model_id='facebook/detr-resnet-50'
))

# Retrieve the rows where the first detected label is 'cat'
(
    t.select(animal=t.image, classification=t.classification.label_text[0])
    .where(t.classification.label_text[0] == 'cat')
    .head()
)

Learn about computed columns and object detection: Comparing object detection models.

Extend Pixeltable's capabilities with user-defined functions

import PIL.Image
import PIL.ImageDraw
import pixeltable as pxt

@pxt.udf
def draw_boxes(img: PIL.Image.Image, boxes: list[list[float]]) -> PIL.Image.Image:
    result = img.copy()  # Create a copy of `img`
    d = PIL.ImageDraw.Draw(result)
    for box in boxes:
        d.rectangle(box, width=3)  # Draw bounding box rectangles on the copied image
    return result

Learn more about user-defined functions: UDFs in Pixeltable.

Automate data operations with views, e.g., split documents into chunks

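import pixeltable as pxt
from pixeltable.iterators import DocumentSplitter

# In this example, the view is defined by iteration over the chunks of a DocumentSplitter.
# `documents_table` is assumed to be an existing table with a `document` column.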
chunks_table = pxt.create_view(
    'rag_demo.chunks',
    documents_table,
    iterator=DocumentSplitter.create(
        document=documents_table.document,
        separators='token_limit', limit=300)
)

Learn how to leverage views to build your RAG workflow.
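
As a rough illustration of what splitting by a token limit means, the sketch below packs tokens greedily into chunks of at most `limit` tokens. It is a simplification under stated assumptions: `split_by_token_limit` is a hypothetical helper, tokens are approximated by whitespace-separated words, and nothing here reproduces how `DocumentSplitter` actually tokenizes or splits documents.

```python
def split_by_token_limit(text: str, limit: int) -> list[str]:
    """Greedily pack whitespace-separated tokens into chunks of at most `limit` tokens."""
    if limit <= 0:
        raise ValueError('limit must be positive')
    tokens = text.split()
    return [' '.join(tokens[i:i + limit]) for i in range(0, len(tokens), limit)]


document = 'Pixeltable stores documents and exposes their chunks as rows of a view'
chunks = split_by_token_limit(document, limit=5)

# Each chunk would correspond to one row of the chunk view.
for pos, chunk in enumerate(chunks):
    print(pos, chunk)
```

The point of defining chunking as a view rather than a script is that each chunk becomes a row derived from its source document, so chunks for newly inserted documents appear without re-running a separate pipeline.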

Evaluate model performance

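# Assumption: mean_ap is importable from pixeltable.functions.vision in your installed version
from pixeltable.functions.vision import mean_ap

# The computation of the mAP metric can become a query over the evaluation output
frames_view.select(mean_ap(frames_view.eval_yolox_tiny), mean_ap(frames_view.eval_yolox_m)).show()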

Learn how to leverage Pixeltable for Model analytics.

Working with inference services

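import pixeltable as pxt
from pixeltable.functions.together import chat_completions

# The table path 'together_demo.chat' requires the directory to exist first
pxt.create_dir('together_demo')
chat_table = pxt.create_table('together_demo.chat', {'input': pxt.String})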

# The chat-completions API expects JSON-formatted input:
messages = [{'role': 'user', 'content': chat_table.input}]

# This example shows how additional parameters from the Together API can be used in Pixeltable
chat_table.add_computed_column(
    output=chat_completions(
        messages=messages,
        model='mistralai/Mixtral-8x7B-Instruct-v0.1',
        max_tokens=300,
        stop=['\n'],
        temperature=0.7,
        top_p=0.9,
        top_k=40,
        repetition_penalty=1.1,
        logprobs=1,
        echo=True
    )
)
chat_table.add_computed_column(
    response=chat_table.output.choices[0].message.content
)

# Start a conversation
chat_table.insert([
    {'input': 'How many species of felids have been classified?'},
    {'input': 'Can you make me a coffee?'}
])
chat_table.select(chat_table.input, chat_table.response).head()

Learn how to interact with inference services such as Together AI in Pixeltable.

Text and image similarity search on video frames with embedding indexes

import pixeltable as pxt
from pixeltable.functions.huggingface import clip_image, clip_text
from pixeltable.iterators import FrameIterator
import PIL.Image

video_table = pxt.create_table('videos', {'video': pxt.Video})

video_table.insert([{'video': '/video.mp4'}])

frames_view = pxt.create_view(
    'frames', video_table, iterator=FrameIterator.create(video=video_table.video))

@pxt.expr_udf
def embed_image(img: PIL.Image.Image):
    return clip_image(img, model_id='openai/clip-vit-base-patch32')

@pxt.expr_udf
def str_embed(s: str):
    return clip_text(s, model_id='openai/clip-vit-base-patch32')

# Create an index on the 'frame' column that allows text and image search
frames_view.add_embedding_index('frame', string_embed=str_embed, image_embed=embed_image)

# Now we will retrieve images based on a sample image
sample_image = '/image.jpeg'
sim = frames_view.frame.similarity(sample_image)
frames_view.order_by(sim, asc=False).limit(5).select(frames_view.frame, sim=sim).collect()

# Now we will retrieve images based on a string
sample_text = 'red truck'
sim = frames_view.frame.similarity(sample_text)
frames_view.order_by(sim, asc=False).limit(5).select(frames_view.frame, sim=sim).collect()

Learn how to work with Embedding and Vector Indexes.
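
Conceptually, a query such as `order_by(sim, asc=False).limit(5)` ranks rows by how close their embeddings are to the query embedding and keeps the best matches. The sketch below shows that ranking step in plain Python. The three-dimensional vectors, the frame names, and the `top_k` helper are made up for illustration, and cosine similarity is assumed as the metric; a real index uses model-generated embeddings and an approximate nearest-neighbor structure rather than a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], items: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Score every item against the query and return the k most similar, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical embeddings standing in for CLIP vectors of video frames
frame_embeddings = {
    'frame_0': [1.0, 0.0, 0.0],
    'frame_1': [0.9, 0.1, 0.0],
    'frame_2': [0.0, 1.0, 0.0],
}
query_embedding = [1.0, 0.0, 0.0]  # stands in for the embedding of a query image or string

results = top_k(query_embedding, frame_embeddings, k=2)
for name, score in results:
    print(name, round(score, 3))
```

Because text and images are embedded into the same vector space by the same model family, the same ranking step serves both the image query and the string query shown above.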

AI Stack Comparison

Computer Vision Workflows

Requirement	Traditional	Pixeltable
Frame Extraction	ffmpeg + custom code	Automatic via FrameIterator
Object Detection	Multiple scripts + caching	Single computed column
Video Indexing	Custom pipelines + Vector DB	Native similarity search
Annotation Management	Separate tools + custom code	Label Studio integration
Model Evaluation	Custom metrics pipeline	Built-in mAP computation

LLM Workflows

Requirement	Traditional	Pixeltable
Document Chunking	Tool + custom code	Native DocumentSplitter
Embedding Generation	Separate pipeline + caching	Computed columns
Vector Search	External vector DB	Built-in vector indexing
Prompt Management	Custom tracking solution	Version-controlled columns
Chain Management	Tool + custom code	Computed column DAGs

Multimodal Workflows

Requirement	Traditional	Pixeltable
Data Types	Multiple storage systems	Unified table interface
Cross-Modal Search	Complex integration	Native similarity support
Pipeline Orchestration	Multiple tools (Airflow, etc.)	Single declarative interface
Asset Management	Custom tracking system	Automatic lineage
Quality Control	Multiple validation tools	Computed validation columns

❓ FAQ

What is Pixeltable?

Pixeltable unifies data storage, versioning, and indexing with orchestration and model versioning under a declarative table interface, with transformations, model inference, and custom logic represented as computed columns.

What problems does Pixeltable solve?

Today's solutions for AI app development require extensive custom coding and infrastructure plumbing. Tracking lineage and versions between and across data transformations, models, and deployments is cumbersome. Pixeltable lets ML Engineers and Data Scientists focus on exploration, modeling, and app development without dealing with the customary data plumbing.

What does Pixeltable provide me with?

Pixeltable provides:

Data storage and versioning
Combined Data and Model Lineage
Indexing (e.g. embedding vectors) and Data Retrieval
Orchestration of multimodal workloads
Incremental updates
Code that is automatically production-ready

Why should you use Pixeltable?

It gives you transparency and reproducibility
- All generated data is automatically recorded and versioned
- You will never need to re-run a workload because you lost track of the input data
It saves you money
- All data changes are automatically incremental
- You never need to re-run pipelines from scratch because you’re adding data
It integrates with any existing Python code or libraries
- Bring your ever-changing code and workloads
- You choose the models, tools, and AI practices (e.g., your embedding model for a vector index); Pixeltable orchestrates the data

What does Pixeltable not provide?

Pixeltable is not a low-code, prescriptive AI solution. We empower you to use the best frameworks and techniques for your specific needs.
We do not aim to replace your existing AI toolkit, but rather enhance it by streamlining the underlying data infrastructure and orchestration.

Tip

Check out the Integrations section, and feel free to submit a request for additional ones.