Run
Run a single experiment iteration: edit the target file, evaluate the outcome, then keep or discard the change.
Category: development | Source: alirezarezvani/claude-skills

What Is Run?
"Run" is a Claude Code skill designed to facilitate controlled, iterative experimentation within a development workflow. It enables developers and researchers to refine code by running a single experiment iteration—editing a target file, evaluating the outcome, and then deciding whether to keep or discard the change. This skill enforces a disciplined, measurable approach to code improvements by tightly integrating version control, experiment history, and strategy escalation.
The primary command, /ar:run, executes one iteration of the experiment pipeline. It ensures that each change is deliberate, traceable, and informed by past results. By automating the mundane yet crucial steps of experiment management, "Run" streamlines iterative development and research efforts.
Why Use Run?
Iterative experimentation is foundational to effective software development, especially in engineering and research domains where hypotheses must be tested and validated methodically. Uncontrolled changes can quickly lead to confusion, regressions, or duplication of effort. The "Run" skill addresses these challenges by:
- Enforcing Discipline: Each iteration is limited to a single, purposeful change, reducing the risk of introducing multiple variables at once.
- Leveraging History: By systematically reviewing experiment results, developers avoid repeating failed approaches and build upon what works.
- Strategic Escalation: The skill guides users through increasingly sophisticated strategies as the number of iterations grows, ensuring that simple fixes are attempted before more radical modifications.
- Seamless Version Control: Automatic branch checkouts and commit management ensure that every experiment is isolated and reproducible.
This approach is particularly valuable in contexts such as algorithm optimization, parameter tuning, and exploratory code refactoring, where clear experimental boundaries are essential.
How to Get Started
To begin using the "Run" skill, ensure you have the Autoresearch Agent and the relevant skill scripts installed from the official repository.
Running an Experiment Iteration
Run an iteration for a specific experiment:
/ar:run engineering/api-speed
If you omit the experiment name, the skill will prompt you to select one:
/ar:run
This will internally execute:
python {skill_path}/scripts/setup_experiment.py --list
and present available experiments for selection.
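For intuition, experiment discovery amounts to scanning the .autoresearch/ tree for configured experiments. The sketch below is a hypothetical reimplementation of that listing step, not the repository's actual setup_experiment.py:

```python
from pathlib import Path

def list_experiments(root=".autoresearch"):
    """Yield (domain, name) pairs for every directory containing a config.cfg."""
    for cfg in Path(root).glob("*/*/config.cfg"):
        domain, name = cfg.parent.parts[-2], cfg.parent.parts[-1]
        yield domain, name

if __name__ == "__main__":
    for domain, name in list_experiments():
        print(f"{domain}/{name}")
```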
Workflow Integration
Ensure your project follows the expected directory structure:
- .autoresearch/{domain}/{name}/config.cfg: experiment configuration
- .autoresearch/{domain}/{name}/program.md: strategy and constraints
- .autoresearch/{domain}/{name}/results.tsv: experiment history
Each experiment is managed in its own Git branch: autoresearch/{domain}/{name}.
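The exact keys in config.cfg depend on the skill's scripts; the configuration below is an illustrative assumption, not the skill's documented schema:

```ini
; .autoresearch/engineering/api-speed/config.cfg (illustrative)
[experiment]
target_file = api_handler.py     ; the only file an iteration may edit
eval_command = python eval.py    ; script whose verdict is recorded in results.tsv
```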
Key Features
1. Automated Context Loading
Upon invocation, "Run" automatically retrieves the experiment's configuration, prior results, and strategic documentation. It checks out the corresponding Git branch to ensure all changes are tracked in isolation.
cat .autoresearch/engineering/api-speed/config.cfg
cat .autoresearch/engineering/api-speed/program.md
cat .autoresearch/engineering/api-speed/results.tsv
git checkout autoresearch/engineering/api-speed
2. Results-Driven Decision Making
The skill examines the results.tsv file to inform the next change. It identifies successful patterns, avoids repeated failures, and escalates strategies based on iteration count.
Example: Interpreting History
Suppose results.tsv contains:
```
run_id  change_description   outcome
1       increased timeout    improved
2       added retry loop     no effect
3       switched parser      crash
4       tuned caching        improved
```
The next iteration might avoid parser changes (due to crash) and focus on further caching or timeout adjustments.
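One way to automate that reading of history is to tally past changes by outcome. A minimal sketch, assuming the three-column TSV layout shown above:

```python
import csv
from collections import defaultdict

def summarize(path=".autoresearch/engineering/api-speed/results.tsv"):
    """Group past runs by outcome so failed approaches are easy to spot."""
    by_outcome = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            by_outcome[row["outcome"]].append(row["change_description"])
    return by_outcome

history = summarize()
print("Avoid:", history.get("crash", []))        # e.g. ['switched parser']
print("Build on:", history.get("improved", []))  # e.g. ['increased timeout', 'tuned caching']
```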
3. Strategy Escalation
As experiment runs accumulate, "Run" recommends shifting tactics:
- Runs 1–5: Try obvious, low-risk improvements.
- Runs 6–15: Systematically explore parameters.
- Runs 16–30: Attempt more significant structural changes.
- Runs 31+: Pursue radical or unconventional solutions.
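In code form, this escalation policy is just a lookup on the run count. A sketch of the tiers listed above:

```python
def strategy_for(run_count: int) -> str:
    """Map the number of completed runs to the recommended tactic tier."""
    if run_count <= 5:
        return "obvious, low-risk improvements"
    if run_count <= 15:
        return "systematic parameter exploration"
    if run_count <= 30:
        return "significant structural changes"
    return "radical or unconventional solutions"

assert strategy_for(3) == "obvious, low-risk improvements"
assert strategy_for(20) == "significant structural changes"
```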
4. Targeted File Editing
Each iteration is restricted to editing only the target file specified by the experiment configuration. This focused approach prevents uncontrolled sprawl and makes outcomes attributable to a single change.
Example: Editing the Target File
If the experiment targets api_handler.py, you might adjust a timeout parameter:
```python
# Before
timeout = 5

# After
timeout = 10
```
After editing, the skill commits the change, runs evaluation scripts, and records the result.
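Conceptually, the record step appends one row to results.tsv. A hedged sketch of what it could look like; the eval.py command and the column layout are assumptions, and the real skill scripts may record richer fields:

```python
import subprocess

def evaluate_and_record(run_id, description,
                        eval_cmd=("python", "eval.py"),
                        results="results.tsv"):
    """Run the experiment's evaluation command and log its verdict."""
    proc = subprocess.run(eval_cmd, capture_output=True, text=True)
    outcome = proc.stdout.strip() or "no effect"  # e.g. 'improved', 'crash'
    with open(results, "a") as f:
        f.write(f"{run_id}\t{description}\t{outcome}\n")
    return outcome
```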
5. Keep or Discard Changes
Based on evaluation results, the skill either keeps the change (committing to history) or discards it, ensuring that only beneficial changes persist.
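In Git terms, keep-or-discard reduces to committing the edit or reverting the working tree. A minimal sketch of that decision, assuming the verdict produced by the evaluation step:

```python
import subprocess

def keep_or_discard(outcome, target="api_handler.py", description="tuned timeout"):
    """Commit beneficial changes; revert everything else."""
    if outcome == "improved":
        subprocess.run(["git", "add", target], check=True)
        subprocess.run(["git", "commit", "-m", f"experiment: {description}"], check=True)
    else:
        # Restore the target file to its last committed state.
        subprocess.run(["git", "checkout", "--", target], check=True)
```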
Best Practices
- Review Experiment History: Always study results.tsv to avoid duplicating failed experiments and to recognize successful strategies.
- Make Atomic Changes: Limit each iteration to a single, well-defined modification for clear attribution.
- Document Rationale: Update program.md with the reasoning behind each change to provide context for future runs.
- Escalate Thoughtfully: Follow the recommended escalation path; do not jump to radical changes before exhausting simpler options.
- Clean Up Regularly: Remove stale experiment branches and obsolete configurations to maintain a manageable workspace.
Important Notes
- Single Change Policy: "Run" enforces a strict one-change-per-iteration rule. Attempting to modify multiple files or introduce sweeping changes will violate the skill’s constraints.
- Branch Isolation: All changes occur in the experiment’s dedicated Git branch. Merge carefully to prevent conflicts with the main codebase.
- Result Evaluation: The quality of experiment outcomes depends on the robustness of your evaluation scripts. Ensure these scripts reliably indicate improvement, regression, or neutrality (a minimal sketch follows this list).
- Manual Intervention: While much is automated, human judgment remains essential in interpreting results and deciding subsequent strategies.
- Skill Extensibility: The "Run" skill is designed to be extended or integrated into larger automation pipelines, supporting scalable and reproducible research workflows.
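As a concrete illustration of the evaluation contract mentioned above, a minimal script might measure the target behavior and print one of the three verdicts. The benchmark workload, baseline value, and thresholds below are stand-in assumptions:

```python
# eval.py (illustrative): print 'improved', 'regression', or 'no effect'
import time

def benchmark() -> float:
    """Stand-in workload; replace with the experiment's real measurement."""
    start = time.perf_counter()
    sum(i * i for i in range(1_000_000))
    return time.perf_counter() - start

BASELINE = 0.12  # seconds; assumed prior measurement

elapsed = benchmark()
if elapsed < BASELINE * 0.95:
    print("improved")
elif elapsed > BASELINE * 1.05:
    print("regression")
else:
    print("no effect")
```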