Variant Analysis
Automate and integrate Variant Analysis for scalable detection and triage of vulnerability variants across codebases
Variant Analysis is a community skill for analyzing code variants and vulnerability patterns across codebases, covering CodeQL query writing, pattern matching, taint analysis, security bug detection, and systematic variant discovery for software security auditing.
What Is This?
Overview
Variant Analysis provides guidance on finding variations of known vulnerabilities across large codebases using query-based analysis tools. It covers five areas. CodeQL query writing defines structural code patterns in a declarative query language to search for vulnerable code. Pattern matching identifies similar code structures across files and repositories based on AST shapes and data flow characteristics. Taint analysis tracks untrusted input from sources, through transformations, to sensitive sinks where it could cause harm. Security bug detection identifies common vulnerability classes such as injection, buffer overflow, and authentication bypass. Systematic discovery finds all instances of a vulnerability pattern after an initial bug is identified. Together, these help security auditors find related vulnerabilities systematically.
Who Should Use This
This skill serves security researchers hunting for vulnerability variants, code auditors reviewing large codebases for pattern violations, and development teams scanning for known-bad coding patterns.
Why Use It?
Problems It Solves
Finding all instances of a vulnerability pattern manually in large codebases is impractical and error-prone. Grep-based searches miss semantic variations that use different syntax for the same vulnerable pattern. After fixing one bug, related variants often remain undiscovered in other parts of the code. Taint tracking through complex call chains requires automated data flow analysis.
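To make the gap between text search and structural search concrete, here is a minimal sketch using only Python's standard `ast` module. It is not CodeQL and not part of this skill. The names `SAMPLE`, `grep_hits`, and `structural_hits` are hypothetical and chosen for illustration. The sketch assumes the seed pattern is a call to `os.system` and shows that a regular expression finds only the literal spelling, while an AST-based match that resolves import aliases also finds the variants.

```python
import ast
import re
from typing import List, Set

# Three spellings of the same call: direct, module alias, function alias.
SAMPLE = '''
import os
import os as shell
from os import system as run

def a(cmd):
    os.system(cmd)

def b(cmd):
    shell.system(cmd)

def c(cmd):
    run(cmd)
'''


def grep_hits(source: str) -> List[int]:
    """Line numbers where the literal text 'os.system(' appears."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if re.search(r"\bos\.system\(", line)
    ]


def structural_hits(source: str) -> List[int]:
    """Line numbers of calls that resolve to os.system through import aliases."""
    tree = ast.parse(source)
    module_aliases: Set[str] = set()    # local names bound to the os module
    function_aliases: Set[str] = set()  # local names bound to os.system

    # Pass 1: collect how 'os' and 'os.system' are imported in this file.
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "os":
                    module_aliases.add(alias.asname or alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module == "os":
            for alias in node.names:
                if alias.name == "system":
                    function_aliases.add(alias.asname or alias.name)

    # Pass 2: match calls by structure rather than by spelling.
    hits: List[int] = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "system"
            and isinstance(func.value, ast.Name)
            and func.value.id in module_aliases
        ):
            hits.append(node.lineno)
        elif isinstance(func, ast.Name) and func.id in function_aliases:
            hits.append(node.lineno)
    return sorted(hits)


print("grep:", grep_hits(SAMPLE))
print("structural:", structural_hits(SAMPLE))
```

On this sample the regular expression reports one call and the structural match reports all three. A query engine such as CodeQL goes much further (cross-file resolution, data flow), so treat this only as an illustration of why syntax-level search under-reports variants.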
Core Highlights
The query engine searches code using structural pattern matching. The taint tracker follows untrusted data from source to sink. The variant finder discovers related bugs from initial examples. The pattern library provides queries for common vulnerability classes.
How to Use It?
Basic Usage
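// CodeQL: find SQL injection variants in Python code
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.TaintTracking
import semmle.python.dataflow.new.RemoteFlowSources
import semmle.python.Concepts

// Taint-tracking configuration: remote user input flowing into SQL execution
module SqlInjection implements DataFlow::ConfigSig {
  // Sources: any remote (user-controlled) input
  predicate isSource(DataFlow::Node src) { src instanceof RemoteFlowSource }

  // Sinks: the SQL text passed to a modeled SQL execution
  predicate isSink(DataFlow::Node sink) {
    exists(SqlExecution exec | sink = exec.getSql())
  }
}

// Instantiate global taint tracking for the configuration above
module SqlInjectionFlow = TaintTracking::Global<SqlInjection>;

from DataFlow::Node source, DataFlow::Node sink
where SqlInjectionFlow::flow(source, sink)
select sink, "SQL injection from $@ to $@.", source, "user input", sink, "query"
Real-World Examples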
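// CodeQL: find potential command injection via calls such as os.system(...)
import python
import semmle.python.dataflow.new.DataFlow
import semmle.python.dataflow.new.RemoteFlowSources

from Call call, DataFlow::Node arg
where
  // Match any attribute call named "system", e.g. os.system(...)
  call.getFunc().(Attribute).getName() = "system" and
  arg.asExpr() = call.getAnArg() and
  // Local flow only: the source and the call must be in the same function
  exists(RemoteFlowSource rfs | DataFlow::localFlow(rfs, arg))
select call, "Potential command injection via $@.", arg, "user input"
Advanced Tips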
Start variant analysis from a known vulnerability and generalize the pattern by abstracting specific variable names and call targets. Use CodeQL path queries to visualize data flow paths from source to sink for easier triage. Build a library of reusable query modules for common source and sink patterns across projects.
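The generalization step can be sketched in plain Python with the standard `ast` module. This is an illustration of the idea, not a CodeQL query. It assumes a hypothetical seed bug of the form `cursor.execute("..." % user_id)` and abstracts two things: the receiver name (any object) and the call target (a hypothetical set `QUERY_METHODS`). It also widens the string-building step from `%` to concatenation, f-strings, and `.format`. All names in the block are assumptions made for the example.

```python
import ast
from typing import List

# Hypothetical sink method names, generalized from the single seed call 'execute'.
QUERY_METHODS = {"execute", "executemany", "raw"}


def builds_string_dynamically(expr: ast.expr) -> bool:
    """True for "..." + x, "..." % x, f"...{x}" and "...".format(x)."""
    if isinstance(expr, ast.JoinedStr):
        # f-string with at least one interpolated value
        return any(isinstance(v, ast.FormattedValue) for v in expr.values)
    if isinstance(expr, ast.BinOp) and isinstance(expr.op, (ast.Add, ast.Mod)):
        return True
    if (
        isinstance(expr, ast.Call)
        and isinstance(expr.func, ast.Attribute)
        and expr.func.attr == "format"
    ):
        return True
    return False


def find_variants(source: str) -> List[int]:
    """Line numbers of query calls whose first argument is built dynamically."""
    hits: List[int] = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)  # any receiver: cursor, db, conn, ...
            and node.func.attr in QUERY_METHODS
            and node.args
            and builds_string_dynamically(node.args[0])
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Line 2 is the seed bug; lines 3-5 are variants; line 6 is the safe, parameterized form.
SAMPLE = '''
cursor.execute("SELECT * FROM t WHERE id = %s" % user_id)
db.execute("SELECT * FROM t WHERE id = " + user_id)
conn.execute(f"SELECT * FROM t WHERE id = {user_id}")
Model.objects.raw("SELECT * FROM t WHERE id = {}".format(user_id))
cursor.execute("SELECT * FROM t WHERE id = %s", (user_id,))
'''

print(find_variants(SAMPLE))
```

The sketch deliberately over-approximates: it would also flag the concatenation of two constants, which is the kind of false positive that data flow analysis and manual triage are meant to remove.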
When to Use It?
Use Cases
Search for SQL injection variants across a large Python web application after finding one instance. Audit a codebase for command injection patterns using taint tracking from request handlers. Build custom CodeQL queries for project-specific security patterns and API misuse.
Related Topics
CodeQL, static analysis, security auditing, taint tracking, vulnerability research, SAST, and code review.
Important Notes
Requirements
The CodeQL CLI is required to build queryable databases from source repositories. You need a working understanding of the CodeQL query language, including predicates, classes, and the data flow library. You also need source code access to the target project in order to create its CodeQL database.
Usage Recommendations
Do: start with existing CodeQL queries from the public library and customize them for your target patterns. Test queries against known vulnerable code samples to verify detection accuracy before running on production codebases. Use path queries to understand complete data flow paths.
Don't: rely solely on automated results without manual review, since false positives are common in taint analysis. Don't write overly specific queries that match only exact syntax, since variants often use different coding styles. Don't skip sanitizer modeling, since ignoring input validation produces excessive false positives.
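The effect of sanitizer modeling on false positives can be shown with a toy taint tracker written against Python's standard `ast` module. This is a sketch under strong assumptions, not how CodeQL works internally: it handles only top-level, straight-line assignments and calls, ignores keyword arguments, and uses hypothetical source, sink, and sanitizer sets (`input()`, any `.execute(...)` call, and `int()` respectively).

```python
import ast
from typing import List, Optional, Set

SOURCES = {"input"}     # hypothetical source: the built-in input()
SINKS = {"execute"}     # hypothetical sink: any call named execute
SANITIZERS = {"int"}    # hypothetical sanitizer: conversion to int

SAMPLE = '''
name = input()
age = int(input())
q1 = "SELECT * FROM users WHERE name = '" + name + "'"
q2 = "SELECT * FROM users WHERE age = " + str(age)
cursor.execute(q1)
cursor.execute(q2)
'''


def call_name(call: ast.Call) -> Optional[str]:
    """Return 'f' for f(...) and 'm' for obj.m(...); None for other callees."""
    if isinstance(call.func, ast.Name):
        return call.func.id
    if isinstance(call.func, ast.Attribute):
        return call.func.attr
    return None


def is_tainted(expr: ast.AST, tainted: Set[str], sanitizers: Set[str]) -> bool:
    """Decide whether an expression carries taint, given tainted variable names."""
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    if isinstance(expr, ast.Call):
        name = call_name(expr)
        if name in SOURCES:
            return True
        if name in sanitizers:
            return False  # a modeled sanitizer stops taint here
        # Unmodeled calls propagate taint from positional arguments.
        return any(is_tainted(arg, tainted, sanitizers) for arg in expr.args)
    # Any other expression is tainted if one of its sub-expressions is.
    return any(
        is_tainted(child, tainted, sanitizers)
        for child in ast.iter_child_nodes(expr)
        if isinstance(child, ast.expr)
    )


def find_tainted_sinks(source: str, sanitizers: Set[str]) -> List[int]:
    """Line numbers of sink calls that receive a tainted positional argument."""
    tainted: Set[str] = set()
    findings: List[int] = []
    for stmt in ast.parse(source).body:  # top-level statements, in order
        if isinstance(stmt, ast.Assign):
            value_tainted = is_tainted(stmt.value, tainted, sanitizers)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)
        elif isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call):
            call = stmt.value
            if call_name(call) in SINKS and any(
                is_tainted(arg, tainted, sanitizers) for arg in call.args
            ):
                findings.append(stmt.lineno)
    return findings


print("with sanitizers:", find_tainted_sinks(SAMPLE, SANITIZERS))
print("without sanitizers:", find_tainted_sinks(SAMPLE, set()))
```

With the sanitizer set supplied, only the query built from `name` is reported. With an empty set, the query built from the integer-converted `age` is reported as well, which is the extra noise that skipping sanitizer modeling introduces.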
Limitations
Static analysis cannot detect vulnerabilities that depend on runtime state or configuration values. CodeQL database creation requires successful project compilation for compiled languages. Complex taint propagation through serialization boundaries or inter-process communication may not be tracked.