fastembed Download - fastembed Source code download

fastembed

0.4.1

What is FastEmbed?

FastEmbed is a lightweight, fast Python library built for embedding generation. We support popular text models. Please open a GitHub issue if you want us to add a new model.

The default text embedding model (TextEmbedding) is Flag Embedding, listed on the MTEB leaderboard. It supports "query" and "passage" prefixes for the input text. Here is an example for Retrieval Embedding Generation and how to use FastEmbed with Qdrant.
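The "query" and "passage" prefixes mentioned above can be pictured as plain strings prepended to each input before it is embedded. The sketch below is only an illustration: the helper `with_prefix` is hypothetical, and the literal `"query: "` and `"passage: "` strings are an assumption, so check what your model and FastEmbed version actually expect.

```python
from typing import Iterable, List

# Assumed prefix strings, chosen for illustration only.
PREFIXES = {"query": "query: ", "passage": "passage: "}


def with_prefix(texts: Iterable[str], kind: str) -> List[str]:
    """Prepend the prefix for `kind` ("query" or "passage") to every text."""
    if kind not in PREFIXES:
        raise ValueError(f"unknown kind: {kind!r}")
    return [PREFIXES[kind] + text for text in texts]


queries = with_prefix(["Who maintains fastembed?"], "query")
passages = with_prefix(["fastembed is maintained by Qdrant."], "passage")
```

Marking the two sides differently lets a retrieval model treat short questions and longer stored texts asymmetrically.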

Why FastEmbed?

Light: FastEmbed is a lightweight library with few external dependencies. We don't require a GPU and don't download GBs of PyTorch dependencies, and instead use the ONNX Runtime. This makes it a great candidate for serverless runtimes like AWS Lambda.
Fast: FastEmbed is designed for speed. We use the ONNX Runtime, which is faster than PyTorch. We also use data parallelism for encoding large datasets.
Accurate: FastEmbed's default model scores higher than OpenAI Ada-002 on the MTEB leaderboard. We also support an ever-expanding set of models, including a few multilingual models.
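Data parallelism, as mentioned in the list above, means splitting a large dataset into batches that can be encoded independently of each other. The sketch below shows only the batching step in plain Python. `iter_batches` is a hypothetical helper and not part of FastEmbed.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


corpus = [f"document {i}" for i in range(10)]
batches = list(iter_batches(corpus, batch_size=4))
# Each batch could now be handed to a separate worker for encoding.
```

FastEmbed's `embed()` is understood to accept `batch_size` and `parallel` arguments that take care of this internally. Verify the signature against the version you have installed.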

Installation

To install the FastEmbed library, pip works best. You can install it with or without GPU support:

pip install fastembed

# or with GPU support
pip install fastembed-gpu
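After installing, it can be useful to confirm which of the two packages is present. The sketch below uses only the standard library. `installed_version` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report which of the two distributions, if any, is present.
for name in ("fastembed", "fastembed-gpu"):
    print(name, installed_version(name) or "not installed")
```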

Quickstart

from fastembed import TextEmbedding
from typing import List

# Example list of documents
documents: List[str] = [
    "This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.",
    "fastembed is supported by and maintained by Qdrant.",
]

# This will trigger the model download and initialization
embedding_model = TextEmbedding()
print("The model BAAI/bge-small-en-v1.5 is ready to use.")

embeddings_generator = embedding_model.embed(documents)  # reminder this is a generator
embeddings_list = list(embedding_model.embed(documents))
# you can also convert the generator to a list, and that to a numpy array
len(embeddings_list[0])  # Vector of 384 dimensions
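The reminder that `embed()` returns a generator matters in practice: a generator can be consumed only once. The sketch below demonstrates this with `fake_embed`, a hypothetical stand-in that yields toy vectors, so it runs without downloading a model.

```python
from typing import Iterator, List


def fake_embed(texts: List[str]) -> Iterator[List[float]]:
    """Stand-in for embed(): lazily yields one toy vector per text."""
    for text in texts:
        yield [float(len(text)), 0.0]


generator = fake_embed(["a", "bb"])
first_pass = list(generator)   # consumes the generator
second_pass = list(generator)  # nothing is left to yield
```

Converting to a list once, as the quickstart does, avoids silently iterating over an exhausted generator later.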

FastEmbed supports a variety of models for different tasks and modalities. The list of all available models is at https://qdrant.github.io/fastembed/examples/Supported_Models.

Dense text embeddings

from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
embeddings = list(model.embed(documents))

# [
#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),
#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)
# ]
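Dense vectors like these are typically compared with cosine similarity. The following is a minimal sketch in plain Python that ranks documents against a query vector. The helper names and the toy 3-dimensional vectors are assumptions for illustration, and real embeddings would have 384 dimensions for the model above.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: Sequence[float], docs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs, best match first."""
    scores = [(i, cosine_similarity(query, d)) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings.
toy_docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
ranking = rank([0.0, 2.0, 0.0], toy_docs)
```

For large collections a vector database such as Qdrant performs this search far more efficiently than a Python loop.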

Sparse text embeddings

SPLADE++

from fastembed import SparseTextEmbedding

model = SparseTextEmbedding(model_name="prithivida/Splade_PP_en_v1")
embeddings = list(model.embed(documents))

# [
#   SparseEmbedding(indices=[ 17, 123, 919, ... ], values=[0.71, 0.22, 0.39, ...]),
#   SparseEmbedding(indices=[ 38,  12,  91, ... ], values=[0.11, 0.22, 0.39, ...])
# ]
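A sparse embedding stores only the non-zero positions (`indices`) and their weights (`values`), as the sample output above shows. Two such embeddings can be scored with a dot product over the indices they share. The sketch below is plain Python; `sparse_dot` is a hypothetical helper and the numbers are made up.

```python
from typing import Dict, Sequence


def sparse_dot(
    indices_a: Sequence[int],
    values_a: Sequence[float],
    indices_b: Sequence[int],
    values_b: Sequence[float],
) -> float:
    """Dot product of two sparse vectors given as parallel index/value lists."""
    weights_a: Dict[int, float] = {int(i): float(v) for i, v in zip(indices_a, values_a)}
    # Indices missing from the first vector contribute zero.
    return sum(weights_a.get(int(i), 0.0) * float(v) for i, v in zip(indices_b, values_b))


score = sparse_dot([17, 123, 919], [0.5, 0.25, 1.0], [123, 919, 2000], [2.0, 0.5, 3.0])
```

Only indices 123 and 919 overlap here, so the score is 0.25 * 2.0 + 1.0 * 0.5.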

Late interaction models (aka ColBERT)

from fastembed import LateInteractionTextEmbedding

model = LateInteractionTextEmbedding(model_name="colbert-ir/colbertv2.0")
embeddings = list(model.embed(documents))

# [
#   array([
#       [-0.1115,  0.0097,  0.0052,  0.0195, ...],
#       [-0.1019,  0.0635, -0.0332,  0.0522, ...],
#   ]),
#   array([
#       [-0.9019,  0.0335, -0.0032,  0.0991, ...],
#       [-0.2115,  0.8097,  0.1052,  0.0195, ...],
#   ]),
# ]
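Late interaction models emit one vector per token rather than one per document, which is why each entry above is a 2-D array. ColBERT-style scoring then sums, for every query token, its best match among the document tokens (often called MaxSim). The sketch below shows that scoring rule in plain Python with toy 2-dimensional token vectors; `max_sim` is a hypothetical helper, not a FastEmbed function.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def max_sim(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0]]
score = max_sim(query, doc)
```

The first query token matches the second document token best (1.0) and the second query token matches the first document token best (0.5).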

Image embeddings

from fastembed import ImageEmbedding

images = [
    "./path/to/image1.jpg",
    "./path/to/image2.jpg",
]

model = ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision")
embeddings = list(model.embed(images))

# [
#   array([-0.1115,  0.0097,  0.0052,  0.0195, ...], dtype=float32),
#   array([-0.1019,  0.0635, -0.0332,  0.0522, ...], dtype=float32)
# ]
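Since the image example takes a list of file paths, a small helper can build that list from a folder. This sketch uses only the standard library. `list_images` and the set of accepted extensions are assumptions for illustration.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of extensions


def list_images(folder: str) -> List[str]:
    """Return sorted paths of image files directly inside `folder`."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

The returned list can be passed where the hard-coded `images` list appears above.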

FastEmbed on a GPU

FastEmbed supports running on GPU devices. It requires installation of the fastembed-gpu package.

pip install fastembed-gpu

See our example for detailed instructions, CUDA 12.x support, and troubleshooting of common issues.

from fastembed import TextEmbedding

embedding_model = TextEmbedding(
    model_name="BAAI/bge-small-en-v1.5",
    providers=["CUDAExecutionProvider"],
)
print("The model BAAI/bge-small-en-v1.5 is ready to use on a GPU.")
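Before requesting `CUDAExecutionProvider`, it can help to check whether ONNX Runtime actually reports it on the current machine. The sketch below separates the decision (`pick_providers`, a hypothetical helper) from the lookup, which relies on `onnxruntime.get_available_providers()`. Falling back to the CPU provider is a design choice made here, not something the source prescribes.

```python
from typing import List, Sequence


def pick_providers(available: Sequence[str]) -> List[str]:
    """Prefer CUDA when ONNX Runtime reports it, otherwise fall back to CPU."""
    if "CUDAExecutionProvider" in available:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def detect_providers() -> List[str]:
    """Ask ONNX Runtime which providers exist; requires onnxruntime installed."""
    import onnxruntime  # imported lazily so pick_providers works without it

    return pick_providers(onnxruntime.get_available_providers())
```

The result of `detect_providers()` could then be passed as the `providers` argument shown above.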

Usage with Qdrant

Installation with Qdrant Client in Python:

pip install qdrant-client[fastembed]

or

pip install qdrant-client[fastembed-gpu]

On zsh, you might have to quote the package name: pip install 'qdrant-client[fastembed]'.

from qdrant_client import QdrantClient

# Initialize the client
client = QdrantClient("localhost", port=6333)  # For production
# client = QdrantClient(":memory:")  # For small experiments

# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [
    {"source": "Langchain-docs"},
    {"source": "Llama-index-docs"},
]
ids = [42, 2]

# If you want to change the model:
# client.set_model("sentence-transformers/all-MiniLM-L6-v2")
# List of supported models: https://qdrant.github.io/fastembed/examples/Supported_Models

# Use the new add() instead of upsert()
# This internally calls embed() of the configured embedding model
client.add(
    collection_name="demo_collection",
    documents=docs,
    metadata=metadata,
    ids=ids
)

search_result = client.query(
    collection_name="demo_collection",
    query_text="This is a query document"
)
print(search_result)
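The `add()` call above receives `documents`, `metadata`, and `ids` as three parallel lists, so a mismatch in length is an easy mistake to make. The sketch below is an optional pre-flight check in plain Python. `check_batch` is a hypothetical helper, and the uniqueness rule for IDs is an assumption made here rather than a documented requirement.

```python
from typing import Any, Dict, Sequence


def check_batch(
    docs: Sequence[str],
    metadata: Sequence[Dict[str, Any]],
    ids: Sequence[int],
) -> None:
    """Raise ValueError if the three parallel lists cannot be paired up."""
    if not (len(docs) == len(metadata) == len(ids)):
        raise ValueError(
            f"length mismatch: {len(docs)} docs, {len(metadata)} metadata, {len(ids)} ids"
        )
    if len(set(ids)) != len(ids):
        raise ValueError("ids must be unique")


check_batch(
    ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"],
    [{"source": "Langchain-docs"}, {"source": "Llama-index-docs"}],
    [42, 2],
)
```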

Additional Information

Version 0.4.1
Type Other source code
Update Time 2024-11-01
size 50MB
From Github

Related Applications

waymo open dataset

2024-11-18
SmartTube

2024-12-14
Sunamu

2024-12-14
MySchedule.py

2024-12-15
viptools for eslam

2024-12-15
VITAident

2024-12-15

Recommended for You

chat.petals.dev

Other source code

1.0.0
GPT Prompt Templates

Other source code

1.0.0
GPTyped

Other source code

GPTyped 1.0.5
waymo open dataset

Other source code

December 2023 Update
SmartTube

Other source code

24.71 Stable
Sunamu

Other source code

Release 2.2.0
waymo open dataset

Other source code

December 2023 Update
termwind

Other categories

v2.3.0
wp functions

Other categories

1.0.0

Related Information All