This page contains the specification for the DuckLake format, version 0.1.
Building Blocks
DuckLake requires two main components:
- Catalog database: DuckLake requires a database that supports transactions and primary key constraints as defined by the SQL-92 standard.
- Data storage: The DuckLake specification requires storing the data in Parquet format on blob storage (also known as object storage).
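As a rough illustration of the catalog database requirement, the following sketch checks that a database rolls back transactions and enforces primary key constraints. It uses SQLite only because it ships with Python's standard library; that choice is not a statement about which databases DuckLake supports. The table `demo_catalog` and the function `check_catalog_requirements` are hypothetical names made up for this example and are not part of the DuckLake schema.

```python
import sqlite3


def check_catalog_requirements(conn):
    """Return (rollback_works, primary_key_enforced) for a sqlite3 connection.

    The table used here is a hypothetical stand-in, NOT a DuckLake catalog table.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE demo_catalog (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    cur.execute("INSERT INTO demo_catalog VALUES (1, 'first')")
    conn.commit()

    # Transactions: an insert that is rolled back must leave no trace.
    cur.execute("INSERT INTO demo_catalog VALUES (2, 'second')")
    conn.rollback()
    cur.execute("SELECT COUNT(*) FROM demo_catalog")
    rollback_works = cur.fetchone()[0] == 1

    # Primary key constraints: a duplicate key must be rejected.
    try:
        cur.execute("INSERT INTO demo_catalog VALUES (1, 'duplicate')")
        primary_key_enforced = False
    except sqlite3.IntegrityError:
        primary_key_enforced = True
    conn.rollback()

    return rollback_works, primary_key_enforced


conn = sqlite3.connect(":memory:")
result = check_catalog_requirements(conn)
print(result)
conn.close()
```

A database that fails either check would not meet the two requirements listed above. Passing both is necessary but says nothing about the rest of the specification.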
Catalog Database
DuckLake uses SQL tables and queries to define the catalog information (metadata, statistics, etc.). This specification explains the schema and semantics of these tables and queries.
If you are reading this specification for the first time, we recommend starting with the “Queries” page, which introduces the queries used by DuckLake.