feat(evolution): support v3 column default values in UpdateSchema (3/4)#793

Open
huan233usc wants to merge 1 commit into
apache:mainfrom
huan233usc:feat/default-values-evolution

@huan233usc huan233usc commented Jun 29, 2026


What

Part 3 of 4 of Iceberg v3 column default-value support (POC #731), built on the
schema-layer support merged in #746. This PR is independent of the read-path
PRs (#792 and the Avro follow-up).

Adds default-value handling to UpdateSchema (schema evolution).

Changes

  • AddColumn / AddRequiredColumn take an optional default_value. When
    provided, it is set as both the column's initial-default and write-default.
    A non-null default also lets a required column be added without
    AllowIncompatibleChanges(), because rows written before the change read
    the default instead of null.
  • UpdateColumnDefault(name, default) (new) sets, or clears with
    std::nullopt, a column's write-default; the initial-default is fixed when
    the column is added.
  • Defaults are cast to the column type (rejecting uncastable or out-of-range
    values) and preserved across rename / doc / type-promotion updates and
    nested field-id reassignment.
  • RequireColumn may now mark as required a column that was added with a
    default.
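
To make the initial-default / write-default split concrete, here is a minimal, self-contained sketch of the rules listed above. It is not the iceberg-cpp API: FieldSketch, AddColumnSketch, UpdateColumnDefaultSketch and ReadMissingSketch are hypothetical stand-ins, and defaults are modeled as plain ints rather than literals.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for a schema field; not the iceberg-cpp SchemaField.
struct FieldSketch {
  std::string name;
  bool required = false;
  std::optional<int> initial_default;  // read for rows written before the column existed
  std::optional<int> write_default;    // used by new writes that omit the column
};

// Adding a column: a provided default becomes both initial- and write-default.
// Returns std::nullopt when the change would be rejected as incompatible, i.e.
// a required column without a default and without the incompatible-changes opt-in.
std::optional<FieldSketch> AddColumnSketch(const std::string& name, bool required,
                                           std::optional<int> default_value,
                                           bool allow_incompatible_changes) {
  if (required && !default_value.has_value() && !allow_incompatible_changes) {
    return std::nullopt;
  }
  return FieldSketch{name, required, default_value, default_value};
}

// Updating a default touches only the write-default; std::nullopt clears it.
void UpdateColumnDefaultSketch(FieldSketch& field, std::optional<int> new_default) {
  field.write_default = new_default;
}

// Value seen when reading a row that predates the column.
std::optional<int> ReadMissingSketch(const FieldSketch& field) {
  return field.initial_default;
}
```

In this model, adding a required column "score" with default 7 succeeds without the opt-in flag, and clearing its write-default afterwards leaves ReadMissingSketch returning 7, since the initial-default is fixed at add time.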

The SchemaField constructor stores defaults verbatim (it does not coerce
them), so the cast/promotion is performed explicitly at each evolution site —
the same effect as Java, where NestedField's constructor runs castDefault.
Same-scale decimal precision widening is handled directly (the unscaled value is
unchanged), since Literal::CastTo does not cast between decimal types.
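
The same-scale widening argument can be illustrated with a small sketch. A decimal is stored as an unscaled integer plus a scale, so moving from decimal(9, 2) to decimal(18, 2) only changes the declared precision. The types and the function below are hypothetical and use a 64-bit unscaled value for brevity; they are not the code in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins; not the iceberg-cpp decimal type or literal.
struct DecimalTypeSketch {
  int precision;
  int scale;
};

struct DecimalDefaultSketch {
  DecimalTypeSketch type;
  int64_t unscaled;  // e.g. 12.34 at scale 2 is stored as 1234
};

// Promote a decimal default to a wider precision at the same scale.
// Returns std::nullopt if the scale differs or the precision would shrink.
std::optional<DecimalDefaultSketch> WidenDecimalDefaultSketch(
    const DecimalDefaultSketch& value, DecimalTypeSketch target) {
  if (target.scale != value.type.scale) return std::nullopt;
  if (target.precision < value.type.precision) return std::nullopt;
  // Same scale: the unscaled value is carried over unchanged.
  return DecimalDefaultSketch{target, value.unscaled};
}
```

For a default of 12.34 declared as decimal(9, 2), the unscaled value is 1234 both before and after widening to decimal(18, 2); only the type attached to the default changes.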

Tests

13 cases in update_schema_test.cc: add optional/required/nested column with a
default, mismatched/narrowing rejection, UpdateColumnDefault
(set / clear / cast-to-type / pre-existing column), require-after-default, and
preservation across doc updates and type promotion — including same-scale
decimal precision promotion.
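
The narrowing-rejection cases boil down to a range check before the cast. A minimal sketch, assuming a 64-bit default being assigned to a 32-bit int column; CastDefaultToInt32Sketch is a hypothetical helper, not a function from this PR or from Literal::CastTo.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: cast a 64-bit default to a 32-bit column type,
// returning std::nullopt instead of silently truncating out-of-range values.
std::optional<int32_t> CastDefaultToInt32Sketch(int64_t value) {
  if (value < std::numeric_limits<int32_t>::min() ||
      value > std::numeric_limits<int32_t>::max()) {
    return std::nullopt;  // out of range: the default is rejected
  }
  return static_cast<int32_t>(value);
}
```

A default of 42 casts cleanly, while a value such as 2^40 yields std::nullopt, which corresponds to the schema update being rejected rather than storing a truncated default.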

Stack

  1. feat(schema): represent, serialize and validate v3 column default values (1/4) #746 — schema: represent / serialize / validate (merged)
  2. feat(parquet): apply column default values when reading missing fields (2/4) #792 — read path: Parquet
  3. read path: Avro (follows)
  4. this PR — schema evolution: addColumn / updateColumnDefault

AddColumn / AddRequiredColumn now accept an optional default value, used as
both the initial-default and write-default of the new column; a non-null
default also lets a required column be added (or an added column be made
required) without AllowIncompatibleChanges(). UpdateColumnDefault sets or
clears the write-default. Defaults are cast to the column type (rejecting
uncastable or out-of-range values) and preserved across rename / doc / type
updates and nested field-id reassignment.

Part 4 of the v3 column-default-values work (POC apache#731), built on apache#746.
@huan233usc huan233usc changed the title feat(evolution): support v3 column default values in UpdateSchema (4/4) feat(evolution): support v3 column default values in UpdateSchema (3/4) Jun 29, 2026