This table contains column-level statistics for a single data file.
| Column name | Column type | |
|---|---|---|
| `data_file_id` | `BIGINT` | |
| `table_id` | `BIGINT` | |
| `column_id` | `BIGINT` | |
| `column_size_bytes` | `BIGINT` | |
| `value_count` | `BIGINT` | |
| `null_count` | `BIGINT` | |
| `min_value` | `VARCHAR` | |
| `max_value` | `VARCHAR` | |
| `contains_nan` | `BOOLEAN` | |
| `extra_stats` | `VARCHAR` | |

- `data_file_id` refers to a `data_file_id` from the `ducklake_data_file` table.
- `table_id` refers to a `table_id` from the `ducklake_table` table.
- `column_id` refers to a `column_id` from the `ducklake_column` table.
- `column_size_bytes` is the size of the column in bytes.
- `value_count` is the number of values in the column. For nested types, this does not have to match the number of records in the file.
- `null_count` is the number of values in the column that are `NULL`.
- `min_value` contains the minimum value for the column, encoded as a string. It does not have to be exact, but it has to be a lower bound. The value has to be cast to the column's actual type for accurate comparison, e.g., on integer types.
- `max_value` contains the maximum value for the column, encoded as a string. It does not have to be exact, but it has to be an upper bound. The value has to be cast to the column's actual type for accurate comparison, e.g., on integer types.
- `contains_nan` is a flag indicating whether the column contains any `NaN` values. This is only relevant for floating-point types.
- `extra_stats` contains additional type-specific statistics, for example for geometry types.
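
The following is a minimal sketch of why `min_value` and `max_value` have to be cast before comparison, and how a reader might use them to skip data files. It is an illustration and not part of the specification: the `ColumnStats` class, the `may_contain` helper, and the sample rows are hypothetical, the column is assumed to be an integer column, and treating a missing bound as "cannot skip" is an assumption made here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    """Hypothetical in-memory copy of a few columns of one stats row."""
    data_file_id: int
    min_value: Optional[str]  # lower bound, encoded as a string
    max_value: Optional[str]  # upper bound, encoded as a string


def may_contain(stats: ColumnStats, target: int) -> bool:
    """Return False only when the bounds prove that target is absent."""
    if stats.min_value is None or stats.max_value is None:
        # Assumption: without both bounds the file cannot be skipped.
        return True
    # Cast the string-encoded bounds to the column's actual type (here: int).
    return int(stats.min_value) <= target <= int(stats.max_value)


# Hypothetical statistics for the same integer column in two data files.
files = [
    ColumnStats(data_file_id=1, min_value="9", max_value="10"),
    ColumnStats(data_file_id=2, min_value="11", max_value="250"),
]

# Comparing the raw strings orders "9" after "10", which is wrong for integers.
string_order_ok = "9" <= "10"

# With the cast applied, only file 1 can contain the value 10.
candidates = [s.data_file_id for s in files if may_contain(s, 10)]
```

Because the bounds are allowed to be inexact, a `True` result only means the file has to be read; it does not guarantee that the value is present.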