document answer langchain pinecone openai
1.0.0
LangChain provides an easy-to-use integration for processing and querying documents with Pinecone and OpenAI embeddings. With this repository, you can load a PDF, split its contents, generate embeddings, and build a question-answering system using these tools.
embbeding_doc.py: The primary script for loading a PDF, splitting its content, generating embeddings using OpenAI, and saving them to Pinecone.
constants.py: Holds the constants used across the repository.
app.py: A Streamlit application that allows you to query the embedded documents using a question-answering chain.

Set Up Configuration:
You must create a config.py file that defines the following:
OPENAI_API_KEY = 'YOUR_OPENAI_API_KEY'
PINECONE_API_KEY = 'YOUR_PINECONE_API_KEY'
PINECONE_API_ENVIRONMENT = 'YOUR_PINECONE_ENVIRONMENT'
Run embbeding_doc.py:
This will load the provided PDF, split its content, generate embeddings, and save them to Pinecone.
$ python embbeding_doc.py
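For reference, here is a minimal sketch of what this step looks like with the classic LangChain and pinecone-client APIs. The PDF path, index name, and chunking parameters are assumptions for illustration; the actual embbeding_doc.py may differ.

# Sketch of the embedding step (assumed file path, index name, and chunk sizes).
import pinecone
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

from config import OPENAI_API_KEY, PINECONE_API_KEY, PINECONE_API_ENVIRONMENT

# Connect to Pinecone using the keys defined in config.py.
pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_API_ENVIRONMENT)

# Load the PDF and split it into overlapping chunks.
pages = PyPDFLoader("document.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(pages)

# Embed the chunks with OpenAI and save them to a Pinecone index.
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
Pinecone.from_documents(chunks, embeddings, index_name="document-answer")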
Start the Streamlit Application:
Use Streamlit to run the app.py script.
$ streamlit run app.py
Once the application is running, you can enter questions related to the PDF content, and it will provide relevant answers using the created embeddings and the question-answering chain.
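A minimal sketch of that query flow is shown below, again using the classic LangChain and pinecone-client APIs. The index name "document-answer" and the "stuff" chain type are assumptions; the real app.py may differ.

# Sketch of the Streamlit question-answering flow (assumed index name and chain type).
import streamlit as st
import pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain

from config import OPENAI_API_KEY, PINECONE_API_KEY, PINECONE_API_ENVIRONMENT

pinecone.init(api_key=PINECONE_API_KEY, environment=PINECONE_API_ENVIRONMENT)

# Reconnect to the index populated by embbeding_doc.py.
embeddings = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)
store = Pinecone.from_existing_index("document-answer", embeddings)

# Question-answering chain that stuffs the retrieved chunks into the prompt.
chain = load_qa_chain(OpenAI(openai_api_key=OPENAI_API_KEY), chain_type="stuff")

question = st.text_input("Ask a question about the document")
if question:
    # Retrieve the most similar chunks and answer the question from them.
    docs = store.similarity_search(question)
    st.write(chain.run(input_documents=docs, question=question))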