ConversAI is a conversational AI framework for querying documents and web content. It uses natural language processing (NLP) techniques to extract text and answer questions about it, which makes it useful for researchers, students, professionals, and anyone who regularly works with text-based information.
In an era of information overload, efficient data processing is crucial. ConversAI addresses this by turning unstructured data into content you can query: it extracts text from PDFs, fetches transcripts from YouTube videos, and gathers data from multiple web pages, all through a single user-friendly interface.
With its modular design, ConversAI is not just a tool but a platform that can be extended and customized to fit diverse user requirements.
Before running ConversAI, ensure you have the following dependencies installed:
apt-get update && apt-get upgrade -y
apt-get install poppler-utils -y
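To confirm that the Poppler command-line tools are reachable after installation, a small check like the following can help. This is a minimal sketch, not part of ConversAI: the function name missing_tools is hypothetical, and it assumes that pdftoppm and pdftotext (two binaries shipped with poppler-utils) are representative of what the PDF loader needs.

```python
import shutil


def missing_tools(tools=("pdftoppm", "pdftotext")):
    """Return the names from `tools` that are not found on the PATH."""
    # shutil.which returns None when the executable cannot be located.
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found.")
```

An empty list means both tools were found; any names returned point to a missing or incomplete poppler-utils install.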
Additionally, set the GROQ_API_KEY environment variable so the application can authenticate with the Groq API.
Clone the repository:
git clone https://github.com/rauhanahmed/ConversAI.git
cd ConversAI
Install the required packages:
pip install -r requirements.txt
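Before launching, it can be worth verifying that GROQ_API_KEY is actually visible to Python. The snippet below is a minimal sketch under that assumption; the helper name groq_key_status is hypothetical and is not part of the ConversAI codebase.

```python
import os


def groq_key_status(env=None):
    """Report whether GROQ_API_KEY is present and non-empty in `env`."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        return "GROQ_API_KEY is missing or empty"
    return "GROQ_API_KEY is set"


if __name__ == "__main__":
    print(groq_key_status())
```

The check only confirms that the variable exists; it does not validate the key against the Groq API.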
To launch the application, run the following command:
python app.py
The Gradio interface will open in your default web browser.
If a GPU is unavailable, modify the config.ini file as follows:
Under the [EMBEDDINGS] section, change:
device = cuda
to:
device = cpu
Under the [EASYOCR] section, change:
gpu = true
to:
gpu = false
These adjustments will ensure that the application runs smoothly on CPU resources.
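The same two changes can also be applied programmatically. The following is a minimal sketch using Python's standard configparser module; the function name switch_to_cpu is hypothetical, and it assumes config.ini already contains the [EMBEDDINGS] and [EASYOCR] sections with the keys described above.

```python
import configparser


def switch_to_cpu(path="config.ini"):
    """Set the embeddings device to cpu and disable the EasyOCR GPU flag."""
    # interpolation=None leaves any '%' characters in other values untouched.
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(path):
        raise FileNotFoundError(f"Could not read {path}")
    # Section and key names are taken from the instructions above.
    config["EMBEDDINGS"]["device"] = "cpu"
    config["EASYOCR"]["gpu"] = "false"
    with open(path, "w") as handle:
        config.write(handle)


if __name__ == "__main__":
    switch_to_cpu()
```

Note that configparser rewrites the whole file and drops any comments it contained, so editing the two lines by hand may be preferable if config.ini is annotated.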
After using the interface, be sure to click the "Clear" button to reset the fields. This is crucial because session management has not been implemented in this version, and failing to clear inputs may lead to unintended data persistence during subsequent interactions.
Here's a comprehensive view of the project's directory tree:
ConversAI/
├── app.py                      # Main application file
├── config.ini                  # Configuration file
├── params.yaml                 # Prompts for the application
├── requirements.txt            # Required Python packages
├── src/                        # Source code directory
│   ├── components/             # Component modules
│   │   ├── loaders/            # Data loaders
│   │   │   ├── pdfLoader.py
│   │   │   ├── websiteCrawler.py
│   │   │   └── youtubeLoader.py
│   │   ├── rag/                # Retrieval-Augmented Generation components
│   │   │   └── RAG.py
│   │   └── vectors/            # Vector storage and processing
│   │       └── vectorstore.py
│   ├── utils/                  # Utility functions and classes
│   │   ├── exceptions.py
│   │   ├── functions.py
│   │   └── logging.py
│   └── pipelines/              # Pipeline logic for data processing
│       └── completePipeline.py
└── README.md                   # Project documentation
ConversAI is more than just a tool; it’s a comprehensive solution for managing and extracting insights from a multitude of document formats and web sources. With its powerful capabilities and user-friendly interface, ConversAI is poised to make information retrieval and processing easier and more efficient than ever before.
This project was developed while I was working as an AI Engineer at Tech Consulting Partners. I built ConversAI from scratch, implementing advanced document retrieval methods, reranking techniques, hybrid search methodologies, multiple integrations with large language models (LLMs), and other complex functionality.
The backend includes user management features, sophisticated data storage solutions (including S3 storage management), database management, and vector databases. The deployment strategy leverages robust APIs, Docker containers, CI/CD practices, model monitoring, and cloud platform deployment.
This open-source prototype serves as a stepping stone towards a more comprehensive project aimed at public good, showcasing the immense potential of advanced AI technologies in everyday applications. I extend my heartfelt gratitude to Tech Consulting Partners for entrusting me with this initiative and for their invaluable support throughout the development process.
This project is licensed under the MIT License - see the LICENSE file for details.
We hope you enjoy using ConversAI! For any questions or feedback, please reach out via the project repository or email.