Empowering Truly Autonomous AI Agents Through Adversarial Evolutionary Reinforcement Learning
Website | Whitepaper | Techpaper | Docs | Telegram | Twitter/X
Overview
Features
Getting Started
Installation
Components
Evolutionary Loop
Detailed Walkthrough
License
Contributing
Citation
evolveRL is a groundbreaking framework that enables AI agents to self-improve through evolutionary and adversarial mechanisms. Unlike traditional approaches that rely heavily on manual prompt engineering, evolveRL allows agents to systematically generate, test, and refine their own prompts and configurations, bridging the gap between theoretical autonomy and actual self-reliance.
In the emerging AI agent economy, many envision a future where agents run autonomously with minimal human oversight. However, if humans must constantly update AI prompts to handle new tasks or edge cases, the agents aren't truly sovereign. evolveRL solves this by enabling continuous self-improvement through:
Autonomous Evolution: Agents detect gaps and update their own prompts
Adversarial Testing: Robust validation against challenging scenarios
Performance-Based Selection: Natural emergence of optimal configurations
Continuous Adaptation: Real-time response to changing conditions
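The loop these points describe can be pictured in a few lines of Python. This is a conceptual illustration only; every name and the scoring below are stand-ins, not the actual evolveRL API (see the quick-start example under Getting Started for real calls).

```python
import random

def mutate(prompt: str) -> str:
    """Stand-in mutation: real evolveRL rewrites prompts and configurations."""
    tweaks = ["Be concise.", "Show your work.", "Check edge cases."]
    return prompt + " " + random.choice(tweaks)

def judge(prompt: str) -> float:
    """Stand-in judge: real evolveRL scores agents against adversarial tests."""
    return random.random()

def evolve(seed_prompt: str, generations: int = 10, population_size: int = 5) -> str:
    best = seed_prompt
    for _ in range(generations):
        population = [mutate(best) for _ in range(population_size)]  # vary the current best
        scores = [judge(p) for p in population]                      # adversarial evaluation
        best = population[scores.index(max(scores))]                 # performance-based selection
    return best

print(evolve("You are a math tutor."))
```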
Evolutionary Optimization: Evolve prompts and behaviors using genetic algorithms
Domain Agnostic: Specialization for any domain
Robust Evaluation: Comprehensive judging and evaluation
Adversarial Testing: Generate challenging scenarios to ensure robustness
State Management: Save and load evolved models and their states
Multiple Model Support: Use OpenAI's GPT or Anthropic's Claude, or run LLaMA locally (coming soon)
Self-Improvement Loop: Continuous evolution without human intervention
```bash
# Basic installation
pip install evolverl

# Install with all dependencies
pip install "evolverl[all]"
```
```python
from evolverl.evolution import Evolution, EvolutionConfig
from evolverl.llm import LLMConfig
from evolverl.agent import Agent, AgentConfig

# Configure LLM backend
llm_config = LLMConfig(
    model_name="gpt-4",
    model_type="openai",  # or "anthropic"
    openai_api_key="your-api-key"  # or anthropic_api_key for Claude
)

# Create agent with system prompt
agent_config = AgentConfig(llm_config=llm_config)
agent = Agent(agent_config)
agent.set_default_prompt("""You are an expert AI agent specialized in mathematics.
You break down complex problems step by step and show your work clearly.""")

# Configure evolution process
config = EvolutionConfig(
    population_size=5,
    generations=10,
    mutation_rate=0.1,
    crossover_rate=0.8,
    output_dir="agents"
)

# Create evolution instance
evolution = Evolution(config, experiment_id="math_solver")

# Run evolution process
await evolution.evolve(
    domain="mathematics",
    description="Solve complex math problems with detailed explanations"
)
```
You can also use agents directly without evolution:
```python
# Create and configure agent
agent = Agent(AgentConfig(llm_config=llm_config))
agent.set_default_prompt("You are a helpful AI assistant...")

# Send messages
response = await agent.send_message("What is 2+2?")
print(response)
```
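Note that `await` only works inside a coroutine (or an async REPL/notebook). In a plain script you can wrap the calls with `asyncio.run`, for example:

```python
import asyncio

async def main():
    agent = Agent(AgentConfig(llm_config=llm_config))
    agent.set_default_prompt("You are a helpful AI assistant...")
    print(await agent.send_message("What is 2+2?"))

asyncio.run(main())
```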
`train_agent.py` is a single-file CLI that runs the evolution process. Before running it, update the config file `default_config.json` and make your OpenAI or Anthropic API key available as an environment variable or in the `.env` file.
```bash
# Basic usage with OpenAI
python train_agent.py --domain math --description "Solve math problems" -v

# Use Anthropic's Claude
python train_agent.py --provider anthropic --domain math --description "Solve math problems"

# Load domain from file
python train_agent.py --domain-file domains/math_solver.json

# Custom output directory
python train_agent.py --domain math --description "..." --output-dir ./my_agents

# Increase verbosity (up to -vvvvv)
python train_agent.py --domain math --description "..." -vvv
```
The current domain examples are written in natural language. You can add more detail when building your own use cases, including any examples you believe are important for the agent to know; a hypothetical domain file is sketched below.
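For illustration, a domain file might look like the following. The field names here are assumptions (only the `--domain-file` flag and the natural-language style are documented above), so check the bundled files under `domains/` for the actual schema.

```json
{
    "domain": "math",
    "description": "Solve math problems with step-by-step reasoning",
    "examples": [
        "Solve for x: 2x + 6 = 14. Show each algebraic step.",
        "A train travels 120 km in 1.5 hours. What is its average speed?"
    ]
}
```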
Evolved agents are written to the output directory with the following layout:

```
agents/
├── {experiment_id}_gen0.json       # Best agent from generation 0
├── {experiment_id}_gen0_full.json  # All variants and scores from generation 0
├── {experiment_id}_gen1.json       # Best agent from generation 1
├── {experiment_id}_gen1_full.json  # All variants and scores from generation 1
└── {experiment_id}_best.json       # Best agent overall
```
The individual `.json` files (not the `*_full.json` ones) contain the `AgentConfig` for the best agent of each generation or overall. You can instantiate an agent directly from an `AgentConfig` file by calling `agent.load_config(PATH_TO_CONFIG_FILE)`. Be sure to set the API key again, as it is not stored in the `AgentConfig` file; a minimal loading sketch follows.
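A minimal sketch, reusing the constructors from the quick-start example (the file path and the question are illustrative):

```python
import os
from evolverl.agent import Agent, AgentConfig
from evolverl.llm import LLMConfig

# Recreate the LLM backend; the API key is never stored on disk
llm_config = LLMConfig(
    model_name="gpt-4o-mini",
    model_type="openai",
    openai_api_key=os.environ["OPENAI_API_KEY"],
)

agent = Agent(AgentConfig(llm_config=llm_config))
agent.load_config("agents/math_solver_best.json")  # illustrative path

response = await agent.send_message("What is the derivative of x^3?")
print(response)
```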
{"population_size": 5,"generations": 10,"mutation_rate": 0.1,"crossover_rate": 0.8,"min_score_threshold": 0.7,"tournament_size": 2,"max_interaction_attempts": 5,"output_dir": "agents","llm_config": {"model_name": "gpt-4o-mini","model_type": "openai","max_tokens": 500,"temperature": 0.7} }
The evolution process shows real-time progress with nested progress bars:
```
Generation 2/10: 100%|██████████| 6/6 [00:15<00:00, best_score=0.875, avg_score=0.834]
Overall Progress:  15%|██        | 12/80 [00:30<02:45, generation=2/10, best_overall=0.875]
```
This may take a while depending on the number of generations and population size per generation.
MIT License - see LICENSE file for details
Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request
```bibtex
@software{evolverl2024,
  title  = {evolveRL: Evolutionary Reinforcement Learning for LLMs},
  author = {TheHandsomeDev},
  year   = {2025},
  url    = {https://www.evolverl.com/}
}
```