How do I run a DuckDB transformation?
Create, configure, and run a DuckDB SQL transformation in Keboola from start to finish — create the configuration, map input, write the SQL, map output, run it, and confirm the result landed in Storage.
You have a table in Keboola Storage and you want to transform it with DuckDB SQL and write the result back to Storage. This page takes you from nothing to a successful run using a small worked example. For all settings and syntax rules, see the reference; for when to choose DuckDB, see the explanation.
Time: ~10 minutes · You will need: a Keboola project where you can create configurations, and one table in Storage to read from.
Before you start
Section titled “Before you start”Get a table into Storage to use as the input. If you do not have one handy, upload the sample CSV file as a new table — the example SQL below expects a sample table with order_date and order_amount columns.
Step 1 — Create the transformation
Section titled “Step 1 — Create the transformation”- Open Components → Transformations and click New Transformation.
- Select DuckDB Transformation.
- Name it, optionally add a description and folder, and click Create Transformation.
Step 2 — Add the input mapping
Section titled “Step 2 — Add the input mapping”- In Input Mapping, add your Storage table.
- Set its Destination (staging table name) to
sample. - Save the mapping.
Step 3 — Write the SQL script
Section titled “Step 3 — Write the SQL script”In the code editor, paste:
CREATE TABLE "output" ASSELECT "order_date", SUM("order_amount") AS "sum_orders_amount"FROM "sample"GROUP BY "order_date";End every statement with a semicolon (;). Quote identifiers that need exact case ("sample"). You can split longer scripts into blocks, which DuckDB runs with automatic dependency analysis (see block-based orchestration).
If
SUM()fails with a type error, your input is loading asVARCHAR. Either cast explicitly, or enable Infer input table data types — see Step: typed inputs.
Step 4 — Add the output mapping
Section titled “Step 4 — Add the output mapping”- In Output Mapping, set Source to
output(the table the script creates). - Set Destination to a new Storage table, for example
out.c-main.orders. - Save the mapping.
Step 5 — Run it and confirm the result
Section titled “Step 5 — Run it and confirm the result”- Click Run on the transformation.
- Wait for the job to finish with a success status.
- Open Storage, find your destination table, and confirm it has one row per
order_datewith the summed amount.
Optional: work with typed inputs
Section titled “Optional: work with typed inputs”By default, input columns load as VARCHAR, so numeric and date functions need explicit casts. To use real types directly, enable Infer input table data types in the configuration settings — DuckDB then detects types like INTEGER, FLOAT, and DATE. See Infer input table data types.
Make it faster (backend size)
Section titled “Make it faster (backend size)”If the job is slow or runs out of memory, raise the Backend size (XSmall → Small → Medium → Large). The sizes and their memory are listed in the reference. For datasets over 10 GB, also see memory management.
Check before you run (sync actions)
Section titled “Check before you run (sync actions)”You can validate without a full run using sync actions — for example Syntax check to catch SQL errors, or Expected input tables to confirm the inputs your script references.
Troubleshooting
Section titled “Troubleshooting”| Symptom | Likely cause | Fix |
|---|---|---|
| Syntax error between statements | Missing semicolon | End every statement with ; (reference). |
SUM()/aggregation fails on a column | Input loaded as VARCHAR | Cast explicitly, or enable Infer input table data types. |
table not found for a mixed-case name | Unquoted name folded to lowercase | Quote the identifier ("MyTable"); see case sensitivity. |
| Run succeeds but nothing in Storage | Missing/incorrect output mapping | Add an output mapping whose Source matches the table the script created (output). |
Related
Section titled “Related”- DuckDB transformation reference — settings, backends, SQL extensions.
- When should I use DuckDB? — DuckDB vs. Snowflake.
- Snowflake to DuckDB migration guide.