⌘+k ctrl+k
1.0 (stable)
Search Shortcut cmd + k | ctrl + k
Partitioning

DuckLake tables can be partitioned by a user-defined set of partition keys. When a DuckLake table has partitioning keys defined, any new data is split up into separate data files along the partitioning keys. During query planning, DuckLake uses file-level zone maps — per-column min/max statistics stored in ducklake_file_column_stats — to decide which files to scan. For order-preserving partition transforms (identity, year, month, day, hour), partitioned writes naturally produce tight zone maps on the partition column, so file-level pruning effectively narrows scans to the matching partition files.

The partitioning keys defined on a table only affect new data written to the table. Previously written data will be kept partitioned by the keys the table had when that data was written. This allows the partition layout of a table to evolve over-time as needed.

The partitioning keys for a file are stored in DuckLake. These keys do not need to be necessarily stored within the files, or in the paths to the files.

Examples

By default, DuckLake uses Hive partitioning. If you want to avoid this style of partitions, you can opt out globally via CALL my_ducklake.set_option('hive_file_pattern', false), or per schema/table by passing the schema or table_name argument.

Set the partitioning keys of a table, such that new data added to the table is partitioned by these keys.

To partition on a column, use:

ALTER TABLE tbl SET PARTITIONED BY (part_key);

You can also partition using functions. For example, to partition based on the year/month of a timestamp, use:

ALTER TABLE tbl SET PARTITIONED BY (year(ts), month(ts));

To distribute rows into a fixed number of buckets using an Iceberg-compatible hash (Murmur3), use the bucket transform:

ALTER TABLE tbl SET PARTITIONED BY (bucket(8, user_id));

Bucket partitioning can be combined with other transforms:

ALTER TABLE tbl SET PARTITIONED BY (bucket(8, user_id), month(ts));

Remove the partitioning keys of a table, such that new data added to the table is no longer partitioned.

ALTER TABLE tbl RESET PARTITIONED BY;

DuckLake supports the following partition clauses:

Transform Expression
identity col_name
bucket bucket(N, col_name)
year year(ts)
month month(ts)
day day(ts)
hour hour(ts)
© 2026 DuckDB Foundation, Amsterdam NL
Manifesto Blog