This repository focuses on experimenting with the LangChain library for building powerful applications with large language models (LLMs). By leveraging state-of-the-art language models like OpenAI's GPT-3.5 Turbo (and soon GPT-4), this project showcases how to create a searchable database from a YouTube video transcript, perform similarity search queries using the FAISS library, and respond to user questions with relevant and precise information.
LangChain is a comprehensive framework designed for developing applications powered by language models. It goes beyond merely calling an LLM via an API, as the most advanced and differentiated applications are also data-aware and agentic, enabling language models to connect with other data sources and interact with their environment. The LangChain framework is specifically built to address these principles.
The Python-specific portion of LangChain's documentation is organized into several main modules, each with examples, how-to guides, reference docs, and conceptual guides.
With LangChain, developers can create various applications, such as customer support chatbots, automated content generators, data analysis tools, and intelligent search engines. These applications can help businesses streamline their workflows, reduce manual labor, and improve customer experiences.
By selling LangChain-based applications as a service to businesses, you can provide tailored solutions to meet their specific needs. For instance, companies can benefit from customizable chatbots that handle customer inquiries, personalized content creation tools for marketing, or internal data analysis systems that harness the power of LLMs to extract valuable insights. The possibilities are vast, and LangChain's flexible framework makes it the ideal choice for developing and deploying advanced language model applications in diverse industries.
The OpenAI API is powered by a diverse set of models with different capabilities and price points. You can also make limited customizations to OpenAI's base models for your specific use case with fine-tuning.
git clone https://github.com/daveebbelaar/langchain-experiments.git
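Python 3.6 or higher, using venv or conda. Note that LangChain itself requires Python 3.8.1 or newer, so a 3.8+ interpreter is the safer choice.
Using venv: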
cd langchain-experiments
python3 -m venv env
source env/bin/activate
Using conda:
cd langchain-experiments
conda create -n langchain-env python=3.8
conda activate langchain-env
pip install -r requirements.txt
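After the install finishes, it can help to confirm that the packages are importable from the interpreter you just activated. The snippet below is a minimal sketch, not part of the repository: `missing_packages` is a hypothetical helper, and the import names it checks (`langchain`, `openai`, `faiss`, `dotenv`) are assumptions about what requirements.txt installs, based on the libraries this README mentions.

```python
import importlib.util
import sys


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust to match the actual requirements.txt.
    expected = ["langchain", "openai", "faiss", "dotenv"]
    print("Interpreter:", sys.executable)
    missing = missing_packages(expected)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected packages are importable.")
```

If the printed interpreter path does not point inside your env or conda environment, the environment is probably not activated in the current shell.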
First, create a .env file in the root directory of the project. Inside the file, add your OpenAI API key:
OPENAI_API_KEY="your_api_key_here"
Save the file and close it. In your Python script or Jupyter notebook, load the .env file using the following code:
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())
Because the variable uses the standard name OPENAI_API_KEY, you don't have to store the key in a separate variable and pass it to each function. The library or package that requires the API key will automatically read the OPENAI_API_KEY environment variable and use its value.
When needed, you can access the OPENAI_API_KEY as an environment variable:
import os
api_key = os.environ['OPENAI_API_KEY']
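Indexing `os.environ` directly raises a bare `KeyError` when the variable is missing, which can be confusing if the .env file was not loaded. One possible alternative is a small wrapper that fails with a clearer message. This is a sketch only; `require_env` is a hypothetical helper name, not something provided by the repository or by python-dotenv.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a readable error."""
    value = os.environ.get(name)
    if not value:
        # Covers both a missing variable and one that is set to an empty string.
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call "
            "load_dotenv(find_dotenv()) before reading it."
        )
    return value


# Usage, after load_dotenv(find_dotenv()) has run:
# api_key = require_env("OPENAI_API_KEY")
```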
Now your Python environment is set up, and you can proceed with running the experiments.
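To give a feel for the transcript search this project is built around, here is a dependency-free sketch of the idea: split a transcript into overlapping chunks, turn each chunk into a vector, and return the chunks most similar to a question. It deliberately swaps in a toy word-count vector and a linear cosine-similarity scan in place of OpenAI embeddings and a FAISS index, so it runs without an API key. All function names and the sample transcript are invented for illustration and are not taken from the repository.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=12, overlap=3):
    """Split text into overlapping windows of chunk_size words."""
    words = text.split()
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(chunks, query, k=2):
    """Return the k chunks most similar to the query (linear scan, no index)."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(embed(chunk), query_vec),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Invented sample text standing in for a real video transcript.
    transcript = (
        "Today we load a video transcript and split it into small pieces. "
        "Each piece is converted into a vector and stored in an index. "
        "When a user asks a question we look up the closest pieces "
        "and pass them to the language model as context."
    )
    chunks = split_into_chunks(transcript)
    for hit in similarity_search(chunks, "how is a question answered", k=2):
        print("-", hit)
```

In the real pipeline, the retrieved chunks would be passed to the LLM as context for answering the question. Word counts only match exact words, whereas learned embeddings also capture meaning, which is the main reason to use an embedding model and a vector index instead of this sketch.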
This document is provided to you by Datalumina. We help data analysts, engineers, and scientists launch and scale a successful freelance business — $100k+ /year, fun projects, happy clients. If you want to learn more about what we do, you can visit our website and subscribe to our newsletter. Feel free to share this document with your data friends and colleagues.
For video tutorials on how to use the LangChain library and run experiments, visit the YouTube channel: youtube.com/@daveebbelaar