Semgrep Rule Variant Creator

Semgrep Rule Variant Creator automation and integration

Semgrep Rule Variant Creator is a community skill for generating variations of existing Semgrep rules, covering pattern adaptation across languages, severity adjustment, context expansion, framework-specific variants, and rule evolution for comprehensive coverage.

What Is This?

Overview

Semgrep Rule Variant Creator provides methods for deriving new rules from existing ones to expand detection coverage. It covers pattern adaptation that translates rules between programming languages while preserving detection intent, severity adjustment that creates rule tiers with different strictness levels for development and production, context expansion that broadens patterns to catch additional code constructs expressing the same vulnerability, framework-specific variants that customize rules for particular web frameworks or libraries, and rule evolution that updates rules as APIs and coding patterns change. The skill helps teams maximize detection coverage from existing rules.

Who Should Use This

This skill serves security engineers scaling rule coverage across polyglot codebases, platform teams adapting community rules for internal frameworks, and developers maintaining evolving rule sets.

Why Use It?

Problems It Solves

A Semgrep rule targets only the languages and patterns its author wrote it for, so one rule rarely covers every language or API through which a vulnerability can appear. Manually rewriting rules for each language and framework combination is repetitive and error-prone. Teams using multiple web frameworks need separate rules that detect the same vulnerability across different API surfaces. The maintenance burden grows with every variant added.

Core Highlights

A language adapter translates rules across programming languages. A severity tuner creates separate strictness levels for development and production. A pattern expander broadens detection to cover equivalent constructs. A framework specializer customizes rules for specific library APIs.
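As a rough illustration of the pattern expander idea, the sketch below merges several equivalent patterns into one rule using Semgrep's `pattern-either` operator. The function name `expand_patterns` and the `-expanded` ID suffix are hypothetical, and the sample patterns are illustrative only; they would still need testing with the Semgrep CLI.

```python
import copy


def expand_patterns(base_rule: dict, extra_patterns: list) -> dict:
    """Return a copy of base_rule that matches the original or any extra pattern.

    Assumes the base rule has a top-level `patterns` list made up only of
    simple {'pattern': ...} entries; anything else is rejected.
    """
    variant = copy.deepcopy(base_rule)
    entries = variant.pop('patterns')
    if any(set(entry) != {'pattern'} for entry in entries):
        raise ValueError('only simple pattern entries are supported')

    # Collect original and extra patterns in order, skipping exact duplicates.
    unique = []
    for pat in [e['pattern'] for e in entries] + list(extra_patterns):
        if pat not in unique:
            unique.append(pat)

    variant['pattern-either'] = [{'pattern': p} for p in unique]
    variant['id'] = f"{base_rule['id']}-expanded"  # hypothetical naming choice
    return variant


base = {
    'id': 'sqli-raw-python',
    'patterns': [{'pattern': 'cursor.execute(f"...")'}],
    'message': 'SQL injection via string formatting in cursor.execute.',
    'languages': ['python'],
    'severity': 'ERROR',
}
expanded = expand_patterns(base, ['cursor.execute("...".format(...))'])
print(expanded['id'], len(expanded['pattern-either']))
```

Keeping the alternatives in one rule, rather than one rule per construct, reports a single rule ID per finding, at the cost of a less specific message.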

How to Use It?

Basic Usage

rules:
  - id: sqli-raw-python
    patterns:
      - pattern: cursor.execute(f"...")
    message: SQL injection via f-string in raw cursor.execute.
    languages: [python]
    severity: ERROR

  - id: sqli-django-raw
    patterns:
      - pattern: $MODEL.objects.raw(f"...")
    message: SQL injection via f-string in Django raw query.
    languages: [python]
    severity: ERROR

  - id: sqli-flask-text
    patterns:
      - pattern: db.engine.execute(text(f"..."))
    message: SQL injection in Flask SQLAlchemy text() call.
    languages: [python]
    severity: ERROR

Real-World Examples

import copy
from pathlib import Path
from typing import Optional

import yaml  # PyYAML


class VariantGenerator:
    """Derive Semgrep rule variants from a base rule dictionary."""

    def __init__(self, base_rule: dict):
        self.base = base_rule

    def for_language(self, lang: str, pattern: str,
                     rule_id: Optional[str] = None) -> dict:
        # Copy the base rule, then swap in the target language and its pattern.
        variant = copy.deepcopy(self.base)
        variant['id'] = rule_id or f"{self.base['id']}-{lang}"
        variant['languages'] = [lang]
        variant['patterns'] = [{'pattern': pattern}]
        return variant

    def with_severity(self, severity: str,
                      rule_id: Optional[str] = None) -> dict:
        # Copy the base rule and change only its severity.
        variant = copy.deepcopy(self.base)
        variant['id'] = rule_id or f"{self.base['id']}-{severity.lower()}"
        variant['severity'] = severity
        return variant

    def save_variants(self, variants: list, output: str) -> None:
        # sort_keys=False keeps each rule's keys in insertion order.
        rules = {'rules': variants}
        Path(output).write_text(yaml.safe_dump(rules, sort_keys=False))


base = {
    'id': 'no-eval',
    'patterns': [{'pattern': 'eval(...)'}],
    'message': 'Avoid eval',
    'languages': ['python'],
    'severity': 'ERROR',
}
gen = VariantGenerator(base)
js_rule = gen.for_language('javascript', 'eval(...)')
warn_rule = gen.with_severity('WARNING')

# The output file holds the base rule plus its two variants.
gen.save_variants([base, js_rule, warn_rule], 'variants.yaml')
print('Wrote 3 rules (base + 2 variants) to variants.yaml')

Advanced Tips

Group rule variants into separate configuration files by severity to enable different enforcement levels in CI versus local development. Use Semgrep registry tags to organize variants and enable selective scanning by category. Automate variant generation from a base rule template when supporting new frameworks.
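The severity grouping described above can be sketched as follows. This is a minimal example under stated assumptions: the function name `group_by_severity` and the `rules-<severity>.yaml` file naming scheme are hypothetical, and each rule is assumed to be a dictionary with a `severity` key.

```python
from collections import defaultdict


def group_by_severity(rules: list) -> dict:
    """Map a target file name to the list of rules sharing one severity."""
    grouped = defaultdict(list)
    for rule in rules:
        # Hypothetical naming scheme: one configuration file per severity level.
        file_name = f"rules-{rule['severity'].lower()}.yaml"
        grouped[file_name].append(rule)
    return dict(grouped)


rules = [
    {'id': 'no-eval', 'severity': 'ERROR'},
    {'id': 'no-eval-warning', 'severity': 'WARNING'},
    {'id': 'sqli-raw-python', 'severity': 'ERROR'},
]
for file_name, group in group_by_severity(rules).items():
    print(file_name, [rule['id'] for rule in group])
```

Each group could then be written to its own file, for example with `yaml.safe_dump({'rules': group})`, so that CI and local runs pass different files to Semgrep's `--config` option.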

When to Use It?

Use Cases

Adapt a Python SQL injection rule to detect equivalent patterns in Django, Flask, and SQLAlchemy code. Create warning-level variants of error rules for gradual adoption in existing codebases. Generate JavaScript equivalents of Python security rules for full-stack project scanning.

Related Topics

Semgrep, rule authoring, static analysis, multi-language scanning, security rules, code patterns, and CI configuration.

Important Notes

Requirements

Existing Semgrep rules to use as base templates for variant creation. Understanding of target language syntax for accurate pattern translation. Semgrep CLI for testing generated variants against sample code.

Usage Recommendations

Do: test each variant independently against language-specific code samples, since patterns do not transfer directly between languages. Maintain a mapping between base rules and their variants so that updates stay coordinated. Use a consistent naming convention that encodes the language and framework in each variant's rule ID.
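One possible way to keep the base-to-variant mapping and a consistent naming convention is sketched below. It stores the base rule ID in the rule's free-form `metadata` field. The `derived-from` key and the helper names `variant_id`, `tag_variant`, and `variants_by_base` are assumptions made for illustration, not Semgrep conventions.

```python
import copy
from collections import defaultdict


def variant_id(base_id: str, language: str, framework: str = '') -> str:
    """Build an ID of the form <base>-<language>[-<framework>]."""
    parts = [base_id, language] + ([framework] if framework else [])
    return '-'.join(part.lower() for part in parts)


def tag_variant(variant: dict, base_id: str) -> dict:
    """Return a copy of the variant that records which base rule it came from."""
    tagged = copy.deepcopy(variant)
    # 'derived-from' is a hypothetical metadata key chosen for this sketch.
    tagged.setdefault('metadata', {})['derived-from'] = base_id
    return tagged


def variants_by_base(rules: list) -> dict:
    """Index variant IDs by the base rule ID stored in their metadata."""
    index = defaultdict(list)
    for rule in rules:
        base_id = rule.get('metadata', {}).get('derived-from')
        if base_id:
            index[base_id].append(rule['id'])
    return dict(index)


django_rule = tag_variant({'id': variant_id('sqli', 'python', 'django')}, 'sqli')
print(variants_by_base([django_rule]))
```

With the mapping stored inside each rule, a change to a base rule can be followed by a lookup of every variant that needs the same update.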

Don't: assume that a pattern valid in one language produces correct matches in another without testing. Don't create variants that overlap in detection coverage, since they produce duplicate findings. Don't modify a base rule without updating every variant derived from it.
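A simple textual check can flag the most obvious overlaps before they produce duplicate findings. The sketch below assumes rules use a top-level `patterns` list of simple `pattern` entries, as in this document's examples, and the name `find_overlaps` is hypothetical. It compares pattern text only, so it cannot detect patterns that are written differently but match the same code.

```python
from collections import defaultdict


def find_overlaps(rules: list) -> dict:
    """Return {(language, pattern): [rule ids]} for patterns used by 2+ rules."""
    seen = defaultdict(list)
    for rule in rules:
        patterns = [p['pattern'] for p in rule.get('patterns', []) if 'pattern' in p]
        for lang in rule.get('languages', []):
            for pat in patterns:
                # Collapse runs of whitespace so line wrapping does not hide a match.
                key = (lang, ' '.join(pat.split()))
                seen[key].append(rule['id'])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


rules = [
    {'id': 'no-eval', 'languages': ['python'],
     'patterns': [{'pattern': 'eval(...)'}]},
    {'id': 'no-eval-javascript', 'languages': ['javascript'],
     'patterns': [{'pattern': 'eval(...)'}]},
    {'id': 'no-eval-warning', 'languages': ['python'],
     'patterns': [{'pattern': 'eval(...)'}]},
]
print(find_overlaps(rules))
```

Here the `no-eval` base rule and its warning-level variant share a language and pattern, so scanning with both in one configuration would report each `eval` call twice.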

Limitations

Automated translation between languages cannot handle every idiom difference, so generated variants require manual review. Every added variant increases total scan time, since more rules must be evaluated. Framework-specific patterns may break when libraries change their APIs in new major versions.