Query

Executes SQL or natural language queries against DuckDB databases and ad-hoc files

What Is This?

Overview

The Query skill enables developers and data professionals to execute SQL queries directly against DuckDB databases or ad-hoc against raw files such as CSV, Parquet, and JSON. It accepts both standard SQL syntax and natural language questions, translating them into precise DuckDB-compatible statements using DuckDB Friendly SQL idioms. This makes data exploration fast and accessible without requiring a separate database server or complex setup.

Built on top of DuckDB, one of the fastest in-process analytical databases available, this skill integrates seamlessly into development workflows. Whether you are inspecting a local data file, running aggregations on a connected database, or asking plain-English questions about your data, the Query skill handles the translation and execution in a single step.

The skill operates through the Bash tool, resolving state from a local configuration file to determine the correct execution mode. It supports both persistent database connections and stateless file-based queries, giving developers flexibility depending on their project requirements.

Who Should Use This

Data engineers who need to quickly inspect and validate pipeline outputs stored in Parquet or CSV files
Backend developers integrating DuckDB into applications who want to test queries interactively during development
Analysts working with large local datasets who prefer SQL over Python or R for data exploration
DevOps engineers auditing log files or structured exports without importing data into a full database system
Data scientists prototyping transformations before moving them into production pipelines
Technical writers and documentation authors who need to verify data examples used in guides and tutorials

Why Use It?

Problems It Solves

Eliminates the need to spin up a database server just to run a quick query against a local file
Removes the friction of writing boilerplate connection code every time you want to inspect data
Bridges the gap between natural language questions and executable SQL for users less comfortable with query syntax
Reduces context switching by keeping data queries inside the same development environment and workflow
Avoids data export and import cycles when working with files that DuckDB can query directly

Core Highlights

Accepts raw SQL or natural language input for maximum flexibility
Supports querying CSV, Parquet, JSON, and other file formats without importing them first
Uses DuckDB Friendly SQL idioms for cleaner, more readable query syntax
Resolves database state automatically from a local state file
Operates entirely in-process with no server dependency
Integrates with Bash for scripting and automation
Handles both persistent database connections and one-off file queries
Provides fast analytical performance suitable for large datasets

How to Use It?

Basic Usage

To run a SQL query against an attached DuckDB database, pass the query directly as an argument:

query "SELECT * FROM orders LIMIT 10"

To query a file directly without a persistent database, use the --file flag:

query "SELECT COUNT(*) FROM read_csv_auto('data.csv')" --file data.csv

You can also ask a natural language question and the skill will translate it into SQL:

query "How many orders were placed in the last 30 days?"

Specific Scenarios

Scenario 1: Inspecting a Parquet export from a data pipeline

After a pipeline run produces a Parquet file, you can validate its contents immediately:

query "SELECT schema_name, COUNT(*) FROM read_parquet('output.parquet') GROUP BY schema_name"

Scenario 2: Aggregating log data stored as JSON

query "SELECT level, COUNT(*) as count FROM read_json_auto('app.log.json') GROUP BY level ORDER BY count DESC"

Real-World Examples

A backend developer testing a new reporting feature can run aggregation queries against a local DuckDB database to verify results before writing application code. A data engineer can spot-check a freshly generated CSV export by querying it directly without loading it into any system. An analyst can ask plain-English questions about sales data and receive accurate SQL-backed results in seconds.

When to Use It?

Use Cases

Validating data pipeline outputs before promoting to production
Exploring unfamiliar datasets quickly during onboarding or audits
Prototyping SQL transformations for ETL processes
Running ad-hoc reports against local exports without a BI tool
Debugging data quality issues in CSV or JSON files
Testing query performance and optimization strategies locally
Answering one-off business questions from structured file exports

Important Notes

Requirements

DuckDB must be installed and accessible in the system PATH
The Bash tool must be available in the execution environment
A valid state file must exist for persistent database connection mode
Input files must be accessible at the specified path with appropriate read permissions

More Skills You Might Like

Explore similar skills to enhance your workflow