2.2 Clone the Repository
git clone https://github.com/YiVal/YiVal.git
cd YiVal
Set up with Poetry: Initialize the Python virtual environment and install dependencies using Poetry. Run the following command from the YiVal directory:
poetry install --sync
After setting up, you can quickly get started with YiVal by generating datasets of random tech startup business names.
Navigate to the yival directory (relative to the repository root you just cloned):
cd src/yival
Set OpenAI API Key: Replace $YOUR_OPENAI_API_KEY with your actual OpenAI API key.
On macOS or Linux systems:
export OPENAI_API_KEY=$YOUR_OPENAI_API_KEY
On Windows systems (setx takes effect in newly opened terminals, not the current one):
setx OPENAI_API_KEY $YOUR_OPENAI_API_KEY
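Before running YiVal, it can help to confirm that the key is visible to a Python process started from the same terminal. The following is a minimal sketch using only the standard library; the helper name `has_openai_key` is hypothetical and not part of YiVal.

```python
import os


def has_openai_key(env=None):
    """Return True if OPENAI_API_KEY is set to a non-empty value.

    Hypothetical helper for illustration only; not part of YiVal.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if __name__ == "__main__":
    if has_openai_key():
        print("OPENAI_API_KEY is set.")
    else:
        print("OPENAI_API_KEY is missing; set it before running YiVal.")
```

The check only confirms that the variable is present and non-empty; it does not verify that the key is valid.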
Define YiVal Configuration:
Create a configuration file named config_data_generation.yml for automated test dataset generation with the following content:
description: Generate test data
dataset:
  data_generators:
    openai_prompt_data_generator:
      chunk_size: 100000
      diversify: true
      model_name: gpt-4
      input_function:
        description: # Description of the function
          Given a tech startup business, generate a corresponding landing
          page headline
        name: headline_generation_for_business
        parameters:
          tech_startup_business: str # Parameter name and type
      number_of_examples: 3
      output_csv_path: generated_examples.csv
  source_type: machine_generated
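If you prefer to create the file from a script, the sketch below writes the same configuration with the standard library. The nesting of the keys is an assumption inferred from the key names, and `write_config` is a hypothetical helper, not part of YiVal.

```python
import textwrap
from pathlib import Path

# Assumed nesting: generator settings sit under
# dataset.data_generators.openai_prompt_data_generator.
CONFIG_TEXT = textwrap.dedent("""\
    description: Generate test data
    dataset:
      data_generators:
        openai_prompt_data_generator:
          chunk_size: 100000
          diversify: true
          model_name: gpt-4
          input_function:
            description: # Description of the function
              Given a tech startup business, generate a corresponding landing
              page headline
            name: headline_generation_for_business
            parameters:
              tech_startup_business: str # Parameter name and type
          number_of_examples: 3
          output_csv_path: generated_examples.csv
      source_type: machine_generated
    """)


def write_config(directory="."):
    """Write config_data_generation.yml into `directory` and return its path.

    Hypothetical helper for illustration only; not part of YiVal.
    """
    path = Path(directory) / "config_data_generation.yml"
    path.write_text(CONFIG_TEXT, encoding="utf-8")
    return path


if __name__ == "__main__":
    print(f"Wrote {write_config()}")
```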
Execute YiVal:
Run the following command from within the YiVal/src/yival directory:
yival run config_data_generation.yml
Check the Generated Dataset:
The generated test dataset will be stored in generated_examples.csv.
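To take a quick look at the result, a short script such as the following can print the first few rows. This is a sketch that assumes the first row of the CSV is a header; the column names are not specified here, so the code does not depend on them. `preview_csv` is a hypothetical helper, not part of YiVal.

```python
import csv
from pathlib import Path


def preview_csv(path, limit=5):
    """Return the header row and up to `limit` data rows of a CSV file."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # assumed to be the header row
        rows = [row for _, row in zip(range(limit), reader)]
    return header, rows


if __name__ == "__main__":
    csv_path = Path("generated_examples.csv")
    if csv_path.exists():
        header, rows = preview_csv(csv_path)
        print("columns:", header)
        for row in rows:
            print(row)
    else:
        print(f"{csv_path} not found; run YiVal first.")
```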
Please refer to the YiVal Docs page for more details about YiVal!
Use Case Demo | Supported Features | GitHub Link | Video Demo Link |
---|---|---|---|
Craft your AI story with ChatGPT and MidJourney | Multi-modal support: Design an AI-powered narrative using YiVal's multi-modal support for simultaneous text and images. It natively supports Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF). Please watch the video above for this use case. | ||
Evaluate performance of multiple LLMs with your own Q&A test dataset | Conveniently evaluate and compare the performance of your model of choice against 100+ models, thanks to LiteLLM. Analyze model performance benchmarks tailored to your customized test data or use case. | ||
Startup Company Headline Generation Bot | Streamline generation of headlines for your startup with automated test data creation, prompt crafting, results evaluation, and performance enhancement via GPT-4. | ||
Build a Customized Travel Guide Bot | Leverage automated prompts inspired by the travel community's most popular suggestions, such as those from awesome-chatgpt-prompts. | ||
Build a Cheaper Translator: Use GPT-3.5 to teach Llama2 to create a translator with lower inference cost | Using Replicate and GPT-3.5's test data, you can fine-tune Llama2's translation bot. Benefit from 18x savings while experiencing only a 6% performance decrease. | ||
Chat with Your Favorite Characters - Dantan Ji from Till the End of the Moon | Bring your favorite characters to life through automated prompt creation and character script retrieval. | ||
Evaluate Guardrails' performance in generating Python (.py) outputs | Guardrails: where are my guardrails? <br> YiVal: I am here. <br><br> The integrated evaluation experiment is carried out on 80 LeetCode problems in CSV format, comparing GPT-4 with Guardrails against GPT-4 alone. With Guardrails, accuracy drops from 0.625 to 0.55, latency increases by 44%, and cost increases by 140%. Guardrails still has a long way to go from demo to production. | ||
Visualize different foods around the world! | Just give the place where the food belongs and the best season to taste it, and you can get a video of the season-specific food! | ||
News article summary with CoD | By integrating the "Chain of Density" method, evaluate the enhancer's ability in text summarization. Using 3 article data points generated by GPT-4 for evaluation, the coherent score increased by 20.03%, the attributive score increased by 25.18%, and the average token usage dropped from 2054.6 to 1473.4 (-28.3%). | ||
Automated TikTok Title Generation Bot | With only two input lines, you can easily create concise and polished TikTok video titles based on your desired target audience and video content summaries. This is powered by our auto-prompt feature: the process is automated, so you can input your requirements and enjoy the results hassle-free! | ||
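The relative changes quoted in the table can be reproduced with a few lines of arithmetic. The sketch below recomputes the token-usage reduction from the Chain of Density row and, as a derived figure that the table itself does not state, the relative accuracy drop from the Guardrails row. `percent_change` is a hypothetical helper name.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Average token usage in the Chain of Density row: 2054.6 -> 1473.4
print(f"token usage: {percent_change(2054.6, 1473.4):+.1f}%")  # -28.3%

# Accuracy in the Guardrails row: 0.625 -> 0.55 (relative drop, derived here)
print(f"accuracy: {percent_change(0.625, 0.55):+.1f}%")  # -12.0%
```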
If you want to contribute to YiVal, be sure to review the contribution guidelines. We use GitHub issues for tracking requests and bugs. Please join YiVal's Discord channel for general questions and discussion. Join our collaborative community, where your expertise as researchers and software engineers is highly valued. Contribute to our project and be part of an innovative space where every line of code and research insight fuels advancements in technology, fostering a future that is intelligently connected and universally accessible.
YiVal welcomes your contributions!
Thanks so much to all of our amazing contributors!
Paper | Author | Topics | YiVal Contributor | Data Generator | Variation Generator | Evaluator | Selector | Enhancer | Config |
---|---|---|---|---|---|---|---|---|---|
Large Language Models Are Human-Level Prompt Engineers | Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han | YiVal Evolver, Auto-Prompting | - | OpenAIPromptDataGenerator | OpenAIPromptVariationGenerator | OpenAIPromptEvaluator, OpenAIEloEvaluator | AHPSelector | OpenAIPromptBasedCombinationEnhancer | config |
BERTScore: Evaluating Text Generation with BERT | Tianyi Zhang, Varsha Kishore, Felix Wu | YiVal Evaluator, bertscore, rouge | @crazycth | - | - | BertScoreEvaluator | - | - | - |
AlpacaEval | Xuechen Li, Tianyi Zhang, Yann Dubois, et al. | YiVal Evaluator | - | - | - | AlpacaEvalEvaluator | - | - | config |
Chain of Density | Griffin Adams, Alexander R. Fabbri, et al. | Prompt Engineering | - | - | ChainOfDensityGenerator | - | - | - | config |
Large Language Models as Optimizers | Chengrun Yang, Xuezhi Wang, et al. | Prompt Engineering | @crazycth | - | - | - | - | optimize_by_prompt_enhancer | config |
LoRA: Low-Rank Adaptation of Large Language Models | Edward J. Hu, Yelong Shen, et al. | LLM Finetune | @crazycth | - | - | - | - | sft_trainer | config |