Thank you for your interest in my application. Please be aware that this is only a Proof of Concept system and may contain bugs or unfinished features. If you like this app, you can ❤️ follow me on Twitter for news and updates.
The primary use case for this app is to help users answer questions about board game rules based on the instruction manual. While the app can be used for other tasks, helping users with board game rules is particularly meaningful to me since I'm an avid fan of board games myself. Additionally, this use case is relatively harmless, even in cases where the model hallucinates.
The app can be accessed on the Streamlit Community Cloud at https://ask-my-pdf.streamlit.app/. However, to use the app, you will need your own OpenAI API key.
The app implements the following academic papers:
In-Context Retrieval-Augmented Language Models aka RALM
Precise Zero-Shot Dense Retrieval without Relevance Labels aka HyDE (Hypothetical Document Embeddings)
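To make the two ideas concrete, here is a minimal sketch of how they can fit together. It is an illustration under stated assumptions, not the code used in this app: `embed` is a toy bag-of-words stand-in for a real embedding model, `fake_llm` is a hypothetical stub that returns a canned passage instead of calling a language model, and all function names are invented for the example. HyDE embeds a hypothetical answer rather than the raw question, and in-context RALM places the retrieved text in front of the question without changing the model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fake_llm(prompt: str) -> str:
    """Hypothetical stub for a language model call; always returns the same guess."""
    return "On your turn you draw two cards from the deck."


def hyde_retrieve(question, chunks, generate=fake_llm, top_k=2):
    """HyDE-style retrieval: rank chunks by similarity to a hypothetical answer."""
    hypothetical = generate(f"Write a passage that answers: {question}")
    query_vec = embed(hypothetical)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """In-context RALM: prepend retrieved text to the question."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "Setup: shuffle the deck and deal five cards to each player.",
    "On your turn, draw two cards, then play one card from your hand.",
    "The game ends when the deck runs out of cards.",
]
question = "How many cards do I pick up each turn?"
top = hyde_retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The hypothetical answer does not need to be correct. It only needs to resemble the wording of the relevant passage more closely than the question does, so that the nearest chunk is the one that actually contains the rule.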
Clone the repo:
git clone https://github.com/mobarski/ask-my-pdf
Install dependencies:
pip install -r ask-my-pdf/requirements.txt
Run the app:
cd ask-my-pdf/src
run.sh
or run.bat
STORAGE_SALT - cryptographic salt used when deriving the user/folder name and encryption key from the API key; hexadecimal notation, 2-16 characters
STORAGE_MODE - index storage mode: S3, LOCAL, DICT (default)
STATS_MODE - usage stats storage mode: REDIS, DICT (default)
FEEDBACK_MODE - user feedback storage mode: REDIS, NONE (default)
CACHE_MODE - embeddings cache mode: S3, DISK, NONE (default)
STORAGE_PATH - directory path for index storage
CACHE_PATH - directory path for embeddings cache
S3_REGION - region code
S3_BUCKET - bucket name (storage)
S3_SECRET - secret key
S3_KEY - access key
S3_URL - URL
S3_PREFIX - object name prefix
S3_CACHE_BUCKET - bucket name (cache)
S3_CACHE_PREFIX - object name prefix (cache)
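As a rough illustration of how these variables could be consumed, the sketch below reads the four mode settings with the defaults listed above, validates STORAGE_SALT, and derives a folder name and an encryption key from an API key. The helper names (`load_modes`, `parse_salt`, `derive_names`) are hypothetical, the choice of PBKDF2-HMAC-SHA256 and the even-length requirement on the salt are assumptions made for the example, and the app's actual derivation may differ.

```python
import hashlib
import os
import re

# Allowed values and defaults, taken from the list of variables above.
MODES = {
    "STORAGE_MODE": ({"S3", "LOCAL", "DICT"}, "DICT"),
    "STATS_MODE": ({"REDIS", "DICT"}, "DICT"),
    "FEEDBACK_MODE": ({"REDIS", "NONE"}, "NONE"),
    "CACHE_MODE": ({"S3", "DISK", "NONE"}, "NONE"),
}


def load_modes(env=os.environ):
    """Return the selected mode for each setting, falling back to its default."""
    modes = {}
    for name, (allowed, default) in MODES.items():
        value = env.get(name, default)
        if value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
        modes[name] = value
    return modes


def parse_salt(hex_salt: str) -> bytes:
    """Validate STORAGE_SALT (hexadecimal, 2-16 characters) and return raw bytes.

    Assumption: an even number of digits, so the salt maps to whole bytes.
    """
    if not re.fullmatch(r"[0-9a-fA-F]{2,16}", hex_salt) or len(hex_salt) % 2:
        raise ValueError("STORAGE_SALT must be 2-16 hexadecimal characters")
    return bytes.fromhex(hex_salt)


def derive_names(api_key: str, salt: bytes):
    """Derive a folder name and a 32-byte key from the API key and the salt.

    Assumption: PBKDF2-HMAC-SHA256 with 100,000 iterations, chosen for illustration.
    """
    material = hashlib.pbkdf2_hmac(
        "sha256", api_key.encode("utf-8"), salt, 100_000, dklen=48
    )
    folder = material[:16].hex()  # first 16 bytes as a hex folder name
    enc_key = material[16:]       # remaining 32 bytes as key material
    return folder, enc_key


# Example with a plain dict instead of the real environment.
modes = load_modes({"STORAGE_MODE": "LOCAL"})
folder, key = derive_names("sk-example-not-a-real-key", parse_salt("a1b2c3d4"))
print(modes, folder)
```

Deriving the folder name from the API key means the key itself never has to be stored, and the salt keeps the derived names specific to a given deployment.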