Spark: Throw unsupported for ADD COLUMN with default value #13464
Conversation
      SparkCatalogConfig.SPARK.implementation(),
      SparkCatalogConfig.SPARK.properties()
    },
    {
I'm not sure yet what's going on, but when the parameters are executed in this order, the REST catalog ends up as the underlying catalog for the SparkSessionCatalog instead of the regular SparkCatalog I'd expect given this definition.
I'm not sure if there's some odd classloader caching happening in Spark between the test executions that leads to this behavior, but it surfaces in the new test because in Spark 3.4/3.5 we skip the Spark session catalog case: it already fails, just with a different expected message. Reordering the parameterization makes everything work as expected; see the sketch below.
Ultimately we'll need to figure out what's really going on here, but I wanted to provide context to reviewers for why this change was made.
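For reference, a minimal sketch of the kind of catalog parameterization being reordered. The enum entries, the catalogName()/implementation()/properties() accessors, and the @Parameters style are assumed from the existing Iceberg Spark test base classes; the specific entries shown here are illustrative rather than the PR's exact list or ordering.

    // Sketch only: a parameterized test that runs once per catalog configuration.
    // The PR simply reorders entries like these so that the SparkSessionCatalog case
    // no longer ends up with the REST catalog as its underlying catalog.
    @Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
    protected static Object[][] parameters() {
      return new Object[][] {
        {
          SparkCatalogConfig.HADOOP.catalogName(),      // illustrative first entry
          SparkCatalogConfig.HADOOP.implementation(),
          SparkCatalogConfig.HADOOP.properties()
        },
        {
          SparkCatalogConfig.SPARK.catalogName(),       // entry from the diff above
          SparkCatalogConfig.SPARK.implementation(),
          SparkCatalogConfig.SPARK.properties()
        }
      };
    }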
Force-pushed 65e5b7c to 4778f60.
    if (add.defaultValue() != null) {
      throw new UnsupportedOperationException(
          String.format(
              "Cannot add column %s since default values are currently unsupported",
              leafName(add.fieldNames())));
    }
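A quick, hedged sketch of how this check could be exercised from a test; the sql(...) helper and tableName field are assumed from Iceberg's Spark test base classes, AssertJ provides the assertion, and the column name is illustrative.

    // assumes: import static org.assertj.core.api.Assertions.assertThatThrownBy;
    assertThatThrownBy(
            () -> sql("ALTER TABLE %s ADD COLUMN data STRING DEFAULT 'x'", tableName))
        .isInstanceOf(UnsupportedOperationException.class)
        .hasMessageContaining("default values are currently unsupported");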
[doubt][not a blocker] A while back we had a thread about aligning with Spark's standardized error handling (doc: https://docs.google.com/document/d/11qHUiCcKMJ-xAyfL__Yv7B1b5N-80-GIwE8AV96A2Ac/edit?tab=t.0). If we're OK with that, can we throw this documented error class instead? https://github.com/apache/spark/blob/master/sql/api/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala#L731
Thanks for the doc, I'll double-check if there's some standard error code. I don't really agree with the linked error class, though, since that one is specific to query-parsing failures (e.g. some tests in the original Spark PR demonstrate that intention), whereas the intention of the implemented error is to surface that the Iceberg-Spark implementation just doesn't support this yet.
Thank you for taking a look! Agreed, this error is specific to the Iceberg connector for Spark not supporting this feature; I brought it up to get your thoughts on it.
Regarding "if there's some standard error code": presently, for REPLACE COLUMNS, Spark throws
Expecting actual throwable to be an instance of:
java.lang.UnsupportedOperationException
but was:
org.apache.spark.sql.catalyst.parser.ParseException:
[UNSUPPORTED_DEFAULT_VALUE.WITHOUT_SUGGESTION] DEFAULT column values is not supported. SQLSTATE: 0A000
== SQL (line 1, position 1) ==
ALTER TABLE t1 REPLACE COLUMNS (x STRING DEFAULT 42)
We throw this in the Iceberg connector well past the parsing stage (I think it's thrown when the schema change is applied/committed), so we don't really have a handle on the parser here.
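To make the distinction concrete, a hedged sketch of the two failure points (table name illustrative, spark an active SparkSession); each statement fails on its own.

    // REPLACE COLUMNS with a default is rejected by Spark's parser itself
    // (ParseException, error class UNSUPPORTED_DEFAULT_VALUE) before any connector code runs:
    spark.sql("ALTER TABLE t1 REPLACE COLUMNS (x STRING DEFAULT 42)");

    // ADD COLUMN with a default parses fine and reaches the Iceberg connector as a
    // TableChange.AddColumn with a non-null defaultValue(); the check added in this PR
    // turns that into an UnsupportedOperationException instead of silently dropping the default:
    spark.sql("ALTER TABLE t1 ADD COLUMN y STRING DEFAULT 'v'");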
    if (add.defaultValue() != null) {
      throw new UnsupportedOperationException(
          String.format(
              "Cannot add column %s since default values are currently unsupported",
"since setting default values in Spark is currently unsupported?"
I thought the columns can be read, but the code for altering the schema is missing?
Yeah, they can be read. You're right that this error message leaves the impression that it's generally not supported, but that's not true; it's specifically that setting default values from Spark is unsupported.
RussellSpitzer left a comment:
Looks good to me. I have a minor nit on the wording of the error message: I think it slightly gives the impression that a table with default values will be broken in Spark, when we actually mean that a table with default values cannot be created by Spark.
Force-pushed 4778f60 to 1a20bcb.
Thanks for the reviews @singhpk234 @nastra @RussellSpitzer!
Currently, ALTER TABLE ADD COLUMN with a default value is unsupported; however, the DDL succeeds and silently ignores the default value, so it is never set in Iceberg metadata. There is an ongoing PR to support this, but more work remains there.
In the interim, it would be ideal to explicitly surface an unsupported-operation exception to users when a default value is specified; see the sketch below.
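To illustrate the user-visible difference, a hedged before/after sketch (table and column names illustrative, spark an active SparkSession).

    // Before this change: the statement succeeds, but the DEFAULT clause is silently dropped,
    // so no default ever appears in the Iceberg table metadata.
    // After this change: the same statement fails with
    //   UnsupportedOperationException: Cannot add column data since default values are currently unsupported
    spark.sql("ALTER TABLE db.tbl ADD COLUMN data STRING DEFAULT 'x'");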
Note: