Fast Framework to build Enterprise RAG (Retrieval-Augmented Generation) Pipelines at Scale - powered by watsonx
Welcome to the SuperKnowa GitHub repository! The SuperKnowa framework accelerates your Enterprise Generative AI applications so you can get production-ready solutions running quickly on your private data. Here you will find a collection of pluggable components designed to tackle various Generative AI use cases using Large Language Models (LLMs). Think of these components as building blocks, much like Lego pieces, that you can assemble to address a wide range of challenges in AI-driven text generation. They have been battle-tested on private knowledge bases ranging from 1M to 200M documents and scaled to billions of retriever tokens.
The overall pipeline of the SuperKnowa RAG framework & key building blocks:
Configurable components for the SuperKnowa RAG pipeline using a single file:
SuperKnowa is a powerful framework developed using watsonx (watch the video on watsonx.ai here) that harnesses the capabilities of Large Language Models (LLMs) to offer a range of advanced Generative AI use cases. This repository introduces you to the various use cases covered by SuperKnowa.
Learn more about SuperKnowa in our insightful blog post:
Cover Blog - SuperKnowa: Building Enterprise RAG Solutions at Scale https://medium.com/towards-generative-ai/superknowa-simplest-framework-yet-to-swiftly-build-enterprise-rag-solutions-at-scale-ca90b49be28a
Try the SuperKnowa framework with a live application built on the private knowledge base of 1M diverse docs:
https://superknowa.tsglwatson.buildlab.cloud/
(If you don't have an IBM ID, you can get one here: https://www.ibm.com/account/reg/us-en/signup?formid=urx-19776)
You can get started by updating the config.yaml file and running the LLMQnA.py script to quickly configure your RAG pipeline:
retriever:
  indexName: superknowa
  query: What is IBM Cloud?
  ....
reranker:
  query: What is IBM Data and Analytics Reference Architecture?
  ...
LLMQnA:
  question: What is IBM Data and Analytics Reference Architecture?
  ...
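The three sections of the config map onto three pipeline stages: retrieve, rerank, and answer. The sketch below illustrates that flow with toy stand-ins and is not the code in LLMQnA.py. Every function name, the in-memory index, and the token-overlap scoring are assumptions made for illustration, and a single query is used for all three stages to keep the example small.

```python
# Hypothetical sketch: none of these names come from the SuperKnowa code base.
import re

# Mirrors the shape of config.yaml. A real run would read the file instead
# (for example with PyYAML's yaml.safe_load).
CONFIG = {
    "retriever": {"indexName": "superknowa", "query": "What is IBM Cloud?"},
    "reranker": {"query": "What is IBM Cloud?"},
    "LLMQnA": {"question": "What is IBM Cloud?"},
}

# Toy in-memory index standing in for Elasticsearch, Solr, or Watson Discovery.
INDEXES = {
    "superknowa": [
        "IBM Cloud is a cloud computing platform offered by IBM.",
        "Solr is an open-source search platform.",
        "watsonx is an AI and data platform from IBM.",
    ]
}


def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(query, passage):
    """Count the distinct tokens shared by the query and the passage."""
    return len(tokens(query) & tokens(passage))


def retrieve(cfg, top_k=2):
    """Stage 1: return the top_k passages by raw token overlap."""
    docs = INDEXES[cfg["indexName"]]
    ranked = sorted(docs, key=lambda d: overlap(cfg["query"], d), reverse=True)
    return ranked[:top_k]


def rerank(cfg, passages):
    """Stage 2: reorder passages by overlap normalized by passage length."""
    def score(passage):
        return overlap(cfg["query"], passage) / len(tokens(passage))
    return sorted(passages, key=score, reverse=True)


def build_prompt(cfg, passages):
    """Stage 3: assemble the in-context prompt that would be sent to the LLM."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {cfg['question']}\nAnswer:"


def run_pipeline(config):
    """Run the three stages in order and return the final prompt."""
    passages = retrieve(config["retriever"])
    passages = rerank(config["reranker"], passages)
    return build_prompt(config["LLMQnA"], passages)


if __name__ == "__main__":
    print(run_pipeline(CONFIG))
```

In a real deployment each stage would call the configured search backend, re-ranker model, and watsonx-hosted LLM. The point here is only that each config section feeds exactly one stage.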
To explore SuperKnowa's features and capabilities, refer to the blog series, code examples, and resources provided in this repository.
For detailed instructions and examples, navigate to each component's directory. Unleash the potential of Large Language Models in your projects using SuperKnowa's Generative AI Lego Components!
Let's unlock the potential of Generative AI with SuperKnowa and shape the future of AI-powered knowledge processing!
Indexing Documents
  Elastic Search
  Solr
  Watson Discovery
Neural Retriever
  Elastic Search
  Solr
Re-Ranker
In-context learning using LLM
LLM Evaluations
  LLM Model Evaluation
  MLFLOW Integration
Fine-Tuning
  Instruct DB
  Fine Tuning Falcon 7B using QLORA
  Fine Tuning LLAMA2 7B using QLORA
RLHF Model
Deploy & Infer
  Backend
  Deployment
AI Alignment Tool
Enterprise LLM Use Cases
Measure the alignment of AI models on the metrics of helpfulness, harmfulness and accuracy by capturing human inputs.
Build online and offline evaluation experiments and compare the AI alignment results using an interactive dashboard.
The Eval_Package is a tool designed to evaluate the performance of an LLM (Large Language Model) on a dataset containing questions, context, and ideal answers. It allows you to run evaluations on various datasets and assess how well the model generates answers, using dozens of statistical metrics such as BLEU and ROUGE.
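To make the idea of a statistical metric concrete, here is a minimal sketch of ROUGE-1 (unigram overlap) computed between a generated answer and an ideal answer. It is not the Eval_Package implementation. The function names, the simple regex tokenizer, and the dataset keys `question`, `context`, and `ideal_answer` are assumptions chosen to match the description above.

```python
# Hypothetical sketch: a simplified ROUGE-1, not the Eval_Package code.
import re
from collections import Counter


def _tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1(candidate, reference):
    """Return unigram precision, recall, and F1 of candidate against reference."""
    cand = Counter(_tokens(candidate))
    ref = Counter(_tokens(reference))
    # Counter intersection keeps the minimum count of each shared token.
    shared = sum((cand & ref).values())
    if shared == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = shared / sum(cand.values())
    recall = shared / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


def evaluate(rows, generate):
    """Average ROUGE-1 F1 over rows holding question, context, ideal_answer.

    `generate` is any callable (question, context) -> answer string,
    standing in for the model under evaluation.
    """
    if not rows:
        return 0.0
    scores = [
        rouge1(generate(r["question"], r["context"]), r["ideal_answer"])["f1"]
        for r in rows
    ]
    return sum(scores) / len(scores)
```

A production evaluation would normally rely on an established metrics library with proper tokenization and stemming. This version only shows what the score measures.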
The MLflow_Package is a comprehensive toolkit designed to integrate the results from the Eval_Package and efficiently track and manage experiments. It also enables you to create a leaderboard for evaluation comparisons and visualize metrics through a dashboard.
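A leaderboard of this kind boils down to ranking evaluation runs by a chosen metric. The sketch below shows that step in plain Python and does not use the MLflow API or the MLflow_Package code. The run structure, the function name, and the model names are hypothetical, and the scores are placeholder values rather than real results.

```python
# Hypothetical sketch: plain-Python ranking, not the MLflow_Package code.


def build_leaderboard(runs, metric, higher_is_better=True):
    """Rank runs by one metric and return (rank, model, score) tuples.

    Runs that did not log the metric are left out of the ranking.
    """
    scored = [r for r in runs if metric in r["metrics"]]
    ordered = sorted(
        scored, key=lambda r: r["metrics"][metric], reverse=higher_is_better
    )
    return [
        (rank, r["model"], r["metrics"][metric])
        for rank, r in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    # Placeholder scores for illustration only.
    runs = [
        {"model": "model-a", "metrics": {"rouge1_f1": 0.41, "bleu": 0.18}},
        {"model": "model-b", "metrics": {"rouge1_f1": 0.47}},
        {"model": "model-c", "metrics": {"bleu": 0.22}},
    ]
    for rank, model, score in build_leaderboard(runs, "rouge1_f1"):
        print(rank, model, score)
```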
Below is a list of Generative AI use cases built using the SuperKnowa framework.
Engage in natural language conversations with SuperKnowa's conversational Question & Answer (Q&A) system. Ask questions based on the private enterprise knowledge base, and receive detailed, context-aware responses.
Leverage SuperKnowa's "Ask your documents" feature to unlock the potential of your PDFs and text documents. SuperKnowa can help you extract relevant information, answer specific questions, and assist in information retrieval.
Effortlessly generate coherent and informative summaries across large text corpora with SuperKnowa's summarization feature, which uses FlanT5 and UL2. Extract the main points and essential details from articles, reports, and other texts, allowing for efficient content comprehension.
SuperKnowa's abstractive summarization feature, built on FlanUL2 and LLAMA2, goes beyond simple extraction. It can analyze lengthy PDF documents and generate concise abstractive summaries, capturing the essence of the content. Additionally, SuperKnowa identifies key points, making it easier to comprehend and communicate complex information.
Experience the power of SuperKnowa's Text-to-SQL capability, which transforms natural language queries into structured SQL queries. Interact with databases using plain language, eliminating the need for expertise in SQL.
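A Text-to-SQL flow typically has three steps: show the model the database schema, ask it for a query, and execute the result under a guard. The sketch below walks through those steps using SQLite from the Python standard library. It is an illustration under stated assumptions, not SuperKnowa's implementation: the function names, the prompt wording, the `employees` table, and the `fake_llm` stub (which returns a fixed query in place of a real model call) are all hypothetical.

```python
# Hypothetical sketch: illustrates the flow, not SuperKnowa's Text-to-SQL code.
import sqlite3


def schema_of(conn):
    """Return the CREATE TABLE statements of all tables in the database."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_sql_prompt(question, schema):
    """Combine the schema and the user's question into an LLM prompt."""
    return (
        f"Given the schema:\n{schema}\n\n"
        f"Write one SQLite SELECT statement that answers: {question}\nSQL:"
    )


def run_generated_sql(conn, sql):
    """Execute model-generated SQL, allowing only a single SELECT statement."""
    statement = sql.strip().rstrip(";")
    if not statement.lower().startswith("select") or ";" in statement:
        raise ValueError("only a single SELECT statement is allowed")
    return conn.execute(statement).fetchall()


def fake_llm(prompt):
    """Stand-in for a real model call; always returns the same query."""
    return "SELECT name FROM employees WHERE salary > 100 ORDER BY name;"


# Toy database with made-up rows, used only to exercise the flow.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 120), ("Bob", 90), ("Cy", 150)],
)

prompt = build_sql_prompt("Which employees earn more than 100?", schema_of(conn))
rows = run_generated_sql(conn, fake_llm(prompt))
```

The SELECT-only guard matters because generated SQL is untrusted input. A real deployment would also want a read-only database connection and a row limit.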
Created & Architected By
Builders
This framework is developed by Build Lab, IBM Ecosystem. Please note that this content is made available to foster Embeddable AI technology adoption and serve ecosystem partners. The content may include systems & methods pending patent with the USPTO and protected under US Patent Laws. SuperKnowa is not a product but a framework built on top of IBM watsonx along with other products such as LLAMA models from Meta and MLflow from Databricks. Using SuperKnowa implicitly requires agreeing to the terms and conditions of those products. This framework is made available on an as-is basis to accelerate Enterprise GenAI application development. If you have any questions, please reach out to [email protected].
Copyright © 2023 IBM Corporation.