Codeql

CodeQL static analysis automation and integration for finding code vulnerabilities

Codeql is a community skill for performing semantic code analysis using CodeQL, covering query writing, database creation, vulnerability detection, custom rule development, and CI integration for automated security scanning of codebases.

What Is This?

Overview

Codeql provides patterns for using GitHub CodeQL to analyze code as data through declarative queries. It covers query writing that defines patterns to search for in code using the CodeQL query language, database creation that extracts a queryable representation from source code, vulnerability detection that identifies security issues like SQL injection, XSS, and path traversal, custom rule development that builds project-specific queries beyond the default suite, and CI integration that runs CodeQL scans on every pull request. The skill enables deep semantic analysis that finds vulnerabilities pattern-based tools miss.

Who Should Use This

This skill serves security engineers writing custom CodeQL queries for vulnerability detection, development teams integrating CodeQL into their CI security scanning, and security researchers analyzing open source projects for vulnerability patterns.

Why Use It?

Problems It Solves

Pattern-based static analysis misses vulnerabilities that require understanding data flow across functions. Writing custom detection rules in traditional SAST tools is complex and limited. Security scanning that only checks syntax patterns produces high false positive rates. Analyzing how user input flows through an application requires semantic understanding of the codebase. Tracking vulnerability fixes across branches needs integrated scanning that reports status per branch.

Core Highlights

Database builder extracts a full semantic model from source code. Query engine runs declarative searches over code structure and data flow. Vulnerability finder applies taint tracking from sources to sinks. CI scanner runs analysis on every commit and reports findings in PRs.

How to Use It?

Basic Usage

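/**
 * @name SQL injection from user input
 * @kind path-problem
 * @problem.severity error
 * @id js/example/sql-injection
 */

// Path query for SQL injection. The metadata above is illustrative and the
// @id is a placeholder. The security import follows the legacy library layout
// that exposes SqlInjection::Configuration; newer CodeQL releases organize
// this module differently, so adjust the import to your library version.
import javascript
import semmle.javascript.security.dataflow.SqlInjection
import DataFlow::PathGraph

from SqlInjection::Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "SQL injection from $@ reaches this query.",
  source.getNode(), "user input"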

Real-World Examples

name: CodeQL Analysis
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write  # upload results to code scanning
      contents: read          # let actions/checkout read the repository
      actions: read           # needed by the CodeQL action in private repositories
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript
          queries: +security-extended
      - uses: github/codeql-action/analyze@v3

Advanced Tips

Write custom taint tracking configurations to trace data flow from application-specific input sources to dangerous sinks. Use CodeQL query suites to group related queries for targeted analysis like authentication or cryptography. Add CodeQL database creation to your build pipeline so databases stay current with the latest code. Use path queries with DataFlow::PathNode to generate detailed trace visualizations showing exactly how tainted data flows from source to sink.
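
The tip about adding database creation to the build pipeline can be sketched as a small Python helper. This is a minimal sketch, not an official integration: the function name codeql_commands, the paths, and the default query pack codeql/javascript-queries are assumptions chosen for illustration. It only builds the argument lists for codeql database create and codeql database analyze; running them is left to the pipeline.

```python
def codeql_commands(source_root, language, db_dir, sarif_path,
                    queries="codeql/javascript-queries", build_command=None):
    """Build argv lists for creating and then analyzing a CodeQL database.

    All names and defaults here are illustrative assumptions.
    """
    create = [
        "codeql", "database", "create", str(db_dir),
        f"--language={language}",
        f"--source-root={source_root}",
        "--overwrite",  # replace the database left over from a previous run
    ]
    if build_command:
        # Compiled languages need the build to run during extraction.
        create.append(f"--command={build_command}")

    analyze = [
        "codeql", "database", "analyze", str(db_dir), queries,
        "--format=sarif-latest",
        f"--output={sarif_path}",
    ]
    return [create, analyze]


# Print the two commands a pipeline step would run for a JavaScript project.
for argv in codeql_commands("src", "javascript", "codeql-db", "results.sarif"):
    print(" ".join(argv))
```

In a pipeline step, each list can be passed to subprocess.run(argv, check=True), assuming the codeql executable is on the PATH. Rebuilding with --overwrite on every run keeps the database in step with the latest code, at the cost of a full extraction each time.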

When to Use It?

Use Cases

Scan a web application for OWASP Top 10 vulnerabilities using the default CodeQL query suite. Write custom queries to detect project-specific anti-patterns like hardcoded credentials. Integrate CodeQL analysis into GitHub Actions for automated PR security reviews.

Important Notes

Requirements

CodeQL CLI for local analysis, or GitHub code scanning for CI integration (free for public repositories; private repositories generally need GitHub Advanced Security). Source code in a supported language, including JavaScript, Python, Java, C/C++, C#, Go, and Ruby. Sufficient disk space for CodeQL database creation, which can reach several gigabytes for large projects. Build system access for compiled languages that require code compilation during database extraction.

Usage Recommendations

Do: start with the default security query suite before writing custom queries. Review and triage findings rather than automatically blocking PRs on all alerts. Update CodeQL CLI and query packs regularly for new vulnerability detection rules.

Don't: treat a clean CodeQL scan as proof of security, since no tool catches every vulnerability. Dismiss false positives without recording a suppression, because the same alerts will resurface in future scans. Run CodeQL on minified or generated code, which produces noisy results that are hard to act on.
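
The advice to review and triage findings can be supported by a small script over the SARIF file that a CodeQL analysis produces. What follows is a minimal Python sketch, assuming the SARIF 2.1.0 layout (runs[].results[] with ruleId, level, and locations). The function name summarize_sarif and the sample data are made up for illustration; a real file would be loaded with json.load().

```python
from collections import Counter


def summarize_sarif(sarif):
    """Return (counts per rule ID, list of (rule ID, level, file, line))."""
    per_rule = Counter()
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rule_id = result.get("ruleId", "<unknown>")
            # Assumption: fall back to "warning" when a result carries no level;
            # a fuller version would look up the rule's default configuration.
            level = result.get("level", "warning")
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "<no file>")
            line = physical.get("region", {}).get("startLine")
            per_rule[rule_id] += 1
            findings.append((rule_id, level, uri, line))
    return per_rule, findings


def _location(uri, line):
    """Build one SARIF-style location entry for the hypothetical sample."""
    return {"physicalLocation": {"artifactLocation": {"uri": uri},
                                 "region": {"startLine": line}}}


# Hypothetical sample shaped like a SARIF log (file names are invented).
sample = {"version": "2.1.0", "runs": [{"results": [
    {"ruleId": "js/sql-injection", "level": "error",
     "locations": [_location("src/db.js", 12)]},
    {"ruleId": "js/sql-injection", "level": "error",
     "locations": [_location("src/report.js", 40)]},
    {"ruleId": "js/xss", "locations": [_location("src/view.js", 7)]},
]}]}

per_rule, findings = summarize_sarif(sample)
for rule_id, count in per_rule.most_common():
    print(f"{rule_id}: {count}")
```

Grouping by rule first makes it easier to decide which alert classes deserve a blocking check and which need a suppression or a query tweak.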

Limitations

CodeQL analysis requires building a database, which adds time to CI pipelines. Writing custom queries requires learning the CodeQL language, which has a steep learning curve. Dynamic code patterns like eval and reflection reduce taint tracking accuracy. Query results require human review to distinguish true positives from false positives before filing security bugs. Large codebases produce databases that consume significant disk space and memory during analysis.
