Run
One-shot lifecycle command that chains init → baseline → spawn → eval → merge in a single invocation
What Is Run?
Run is a Claude Code skill for AgentHub users who need to execute a complete agent lifecycle in a single step. The /hub:run command chains five phases into one workflow: initializing the environment, capturing a baseline, spawning multiple agent variants, evaluating their outputs, and merging the best result.
Instead of invoking each phase separately (which can be repetitive and error-prone), Run enables developers and machine learning practitioners to orchestrate the entire lifecycle with a single, well-structured command. This capability is particularly beneficial for teams focusing on rapid iteration, agent-based code optimization, or automated software improvement pipelines.
Why Use Run?
The Run skill addresses several common pain points in agent-based development and experimentation workflows:
- Efficiency: By chaining all lifecycle steps (init, baseline, spawn, eval, and merge), users save time and reduce manual intervention.
- Consistency: Automating the lifecycle ensures that each run is reproducible and that all necessary steps are executed in the correct order.
- Parallelism: With built-in support for spawning multiple agents, Run facilitates parallel experimentation, comparison, and selection of the best outcome.
- Flexibility: The command supports a range of use cases, from code optimization and refactoring to test writing and creative generation, all customizable via parameters.
- Scalability: Whether you are running a single agent or orchestrating dozens, Run abstracts the complexity into a simple, parameterized interface.
In short, Run is designed for those who want to streamline agent-based development workflows, automate evaluations, and ensure optimal results with minimal manual overhead.
How to Get Started
Getting started with the Run skill is straightforward. After installing and configuring AgentHub and its skills, you can invoke /hub:run directly from your preferred terminal or chat interface.
Example 1: Code Optimization
/hub:run --task "Reduce p50 latency" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower \
  --template optimizer
This command will:
- Describe the task to all agents ("Reduce p50 latency"),
- Spawn three agent variants using the optimizer template,
- Evaluate each agent's solution using a benchmark test,
- Extract the p50_ms metric,
- Treat lower values as better,
- Automatically merge the best-performing agent's changes.
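The extract-and-compare steps above can be illustrated with a small Python sketch. This is not AgentHub's implementation: the function name pick_best, the agent names, and the sample numbers are hypothetical, and it assumes each agent's eval command prints a JSON object that contains the metric key.

```python
import json


def pick_best(outputs, metric, direction):
    """Return (agent_name, value) for the best-scoring agent.

    outputs:   dict mapping a (hypothetical) agent name to the JSON text
               printed by its eval command.
    metric:    key to read from that JSON, mirroring --metric.
    direction: "lower" or "higher", mirroring --direction.
    """
    if direction not in ("lower", "higher"):
        raise ValueError("direction must be 'lower' or 'higher'")
    # Parse each agent's output and pull out the metric as a float.
    scores = {name: float(json.loads(text)[metric]) for name, text in outputs.items()}
    # Lower-is-better uses min, higher-is-better uses max.
    chooser = min if direction == "lower" else max
    best = chooser(scores, key=scores.get)
    return best, scores[best]


# Made-up sample outputs for three agents.
outputs = {
    "agent-1": '{"p50_ms": 42.0}',
    "agent-2": '{"p50_ms": 37.5}',
    "agent-3": '{"p50_ms": 40.1}',
}
print(pick_best(outputs, "p50_ms", "lower"))  # ('agent-2', 37.5)
```

With --direction higher the same data would select agent-1, which is why the direction has to be stated explicitly alongside the metric.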
Example 2: Automated Refactoring
/hub:run --task "Refactor auth module" --agents 2 --template refactorer
Here, two refactorer agents independently refactor the authentication module. No external eval command is specified, so only agent template logic applies.
Example 3: Test Coverage Improvement
/hub:run --task "Cover untested utils" --agents 3 \
  --eval "pytest --cov=utils --cov-report=json" --metric coverage_pct --direction higher \
  --template test-writer
Three agents attempt to maximize code coverage in the utils module, and the agent producing the highest coverage percentage is selected.
Example 4: Creative Generation
/hub:run --task "Write 3 email subject lines for spring sale campaign" --agents 3 --judge
For tasks requiring subjective evaluation, such as creative writing, you can use the --judge flag to invoke LLM-based judging.
Key Features
- One-Shot Lifecycle: Executes initialization, baseline capture, agent spawning, evaluation, and merging in a single invocation.
- Flexible Agent Templates: Supports various agent roles, including optimizer, refactorer, test-writer, and bug-fixer.
- Evaluation Integration: Accepts any shell command for automated evaluation, with metrics extracted from command output.
- Metric Directionality: Specify whether higher or lower metric values are preferred via the --direction flag.
- Parallel Agent Support: Easily set the number of agents to run in parallel for comparative experimentation.
- Judge Mode: For tasks without quantitative evaluation, enable LLM-based judging with the --judge flag.
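To make the evaluation integration concrete, here is a rough Python sketch of running a shell command and reading one metric from its output. It is an illustration under stated assumptions, not the skill's actual mechanism: run_eval is a hypothetical helper, the command is assumed to print a single JSON object on stdout, and a stand-in command is used in place of a real benchmark.

```python
import json
import shlex
import subprocess
import sys


def run_eval(command, metric):
    """Run a shell command and return the named metric from its JSON stdout.

    Assumes (hypothetically) that the command prints one JSON object.
    check=True raises CalledProcessError if the command exits non-zero.
    """
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return float(json.loads(completed.stdout)[metric])


# Stand-in for a real eval command: a one-line Python script that prints a
# made-up coverage figure as JSON.
fake_eval = shlex.join(
    [sys.executable, "-c", "import json; print(json.dumps({'coverage_pct': 81.5}))"]
)
print(run_eval(fake_eval, "coverage_pct"))  # 81.5
```

The point of the sketch is the contract: whatever the eval command does internally, the metric named by --metric has to be recoverable from what it prints.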
Best Practices
- Define Clear Tasks: Always provide a concise, unambiguous task description with the --task parameter. This ensures agents understand the objective.
- Choose Appropriate Templates: Select the agent template (--template) that best matches your problem (e.g., use optimizer for performance tuning).
- Set Meaningful Evaluation: When using --eval, ensure the command outputs the metric you intend to optimize, and specify both --metric and --direction.
- Leverage Parallelism Thoughtfully: More agents can increase the chance of finding an optimal solution, but also consume more resources.
- Monitor Output Carefully: Review merged results, especially after creative or refactoring tasks, before deploying to production.
Important Notes
- Parameter Requirements: --task is always required. If you provide --eval, you must also supply both --metric and --direction. The --agents parameter defaults to 3 if omitted.
- Evaluation Output: The evaluation command must output the metric in a machine-readable format so the skill can extract and compare results.
- Judge Mode vs. Eval Mode: Use --judge for subjective tasks; otherwise, always provide an evaluation command and metric for objective comparison.
- Template Extensibility: You can create custom templates to extend agent behaviors, though built-in templates cover most common scenarios.
- Manual Review: While Run automates merging, always review critical or production code changes post-merge for safety.
- Skill Source and Updates: The latest version and documentation are available at AgentHub Run Skill GitHub.
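The parameter rules in the first note can be expressed as a short validation sketch in Python. /hub:run is a Claude Code skill rather than an argparse program, so this only models the documented rules; the function name parse_run_args and the program name hub-run-sketch are hypothetical.

```python
import argparse


def parse_run_args(argv):
    """Parse and validate flags following the rules stated in the notes above."""
    parser = argparse.ArgumentParser(prog="hub-run-sketch")
    parser.add_argument("--task", required=True)          # always required
    parser.add_argument("--agents", type=int, default=3)  # defaults to 3
    parser.add_argument("--eval", dest="eval_cmd")
    parser.add_argument("--metric")
    parser.add_argument("--direction", choices=["lower", "higher"])
    parser.add_argument("--template")
    parser.add_argument("--judge", action="store_true")
    args = parser.parse_args(argv)
    # --eval is only meaningful together with --metric and --direction.
    if args.eval_cmd and not (args.metric and args.direction):
        parser.error("--eval requires both --metric and --direction")
    return args


args = parse_run_args(["--task", "Refactor auth module", "--template", "refactorer"])
print(args.agents)  # 3
```

Passing --eval without --metric and --direction makes the sketch exit with a usage error, which mirrors the requirement that the three flags travel together.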
By integrating the Run skill into your workflow, you can automate, accelerate, and standardize agent-driven development processes with minimal configuration and maximum flexibility.