
Conversation

@amogh-jahagirdar (Contributor)

Currently, ALTER TABLE ADD COLUMN with default values is unsupported; however, the DDL succeeds and silently ignores the default value in Iceberg metadata. There is an ongoing PR to support this, but more work remains on it.

In the interim, it would be ideal to explicitly surface an unsupported-operation exception to users when a default value is specified (see the sketch after the notes below).

Notes:

  1. CREATE TABLE with default values already fails in Spark, since SparkCatalog/SparkSessionCatalog do not surface default values as a supported capability.
  2. SparkSessionCatalog with ALTER TABLE ADD COLUMN also already fails in Spark 3.4/3.5, because the analyzer tries to set it as the delegating catalog during analysis of the default values and we fail in https://github.com/apache/iceberg/blob/main/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java#L364. The error message there is not as explicit, but that is more of an issue in Spark itself, so working around it for a clearer message doesn't make sense, especially since we are working on supporting the feature anyway.
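For illustration, a minimal sketch of the DDL in question, assuming a SparkSession named spark and a table registered through an Iceberg SparkCatalog (the catalog and table names here are hypothetical):

// Sketch: before this change, the ADD COLUMN below succeeded but silently
// dropped the DEFAULT from the Iceberg metadata; with this change, it
// should fail with an UnsupportedOperationException instead.
spark.sql("CREATE TABLE cat.db.t1 (id BIGINT) USING iceberg");
spark.sql("ALTER TABLE cat.db.t1 ADD COLUMN data STRING DEFAULT 'x'");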

@github-actions bot added the spark label on Jul 4, 2025
SparkCatalogConfig.SPARK.implementation(),
SparkCatalogConfig.SPARK.properties()
},
{
@amogh-jahagirdar (Contributor Author), Jul 4, 2025

I'm not sure yet what's going on, but when the parameters are executed in this order, the REST catalog ends up set as the underlying catalog for the SparkSessionCatalog instead of the regular SparkCatalog I'd expect given this definition.

I'm not sure if some odd classloader caching in Spark between test executions is causing that behavior. The reason it surfaces in the new test is that in Spark 3.4/3.5 we skip the SparkSessionCatalog case, since it already fails but with a different expected message. Reordering the parameterization makes everything work as expected.

Ultimately I'll need to figure out what's really going on here, but I wanted to give reviewers context for why this change was made.
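For reference, an illustrative sketch of the reordering, assuming the usual Iceberg parameterized-test shape (only the ordering matters; the remaining entries are elided):

// Listing the plain SparkCatalog config before the SparkSessionCatalog-backed
// ones avoided the unexpected REST-catalog delegation described above.
@Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}")
protected static Object[][] parameters() {
  return new Object[][] {
    {
      SparkCatalogConfig.SPARK.catalogName(),
      SparkCatalogConfig.SPARK.implementation(),
      SparkCatalogConfig.SPARK.properties()
    },
    // SparkSessionCatalog-backed configs follow here ...
  };
}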

@amogh-jahagirdar force-pushed the block-spark-default-value-ddl branch from 65e5b7c to 4778f60 on July 4, 2025 22:53
Comment on lines 237 to 242
if (add.defaultValue() != null) {
  throw new UnsupportedOperationException(
      String.format(
          "Cannot add column %s since default values are currently unsupported",
          leafName(add.fieldNames())));
}

@amogh-jahagirdar (Contributor Author)

Thanks for the doc. I'll double-check whether there's a standard error code, but I don't really agree with the linked error class, since that one is specific to query-parsing failures (e.g., some tests in the original Spark PR demonstrate that intention), whereas the error implemented here is meant to surface that the Iceberg-Spark integration just doesn't support this yet.

Contributor

Thank you for taking a look! Agreed, this error is specific to the Iceberg connector for Spark not supporting the feature; I brought it up to get your thoughts on it.

Regarding "if there's some standard error code": presently, for REPLACE COLUMNS, Spark throws:

Expecting actual throwable to be an instance of:
  java.lang.UnsupportedOperationException
but was:
  org.apache.spark.sql.catalyst.parser.ParseException: 
[UNSUPPORTED_DEFAULT_VALUE.WITHOUT_SUGGESTION] DEFAULT column values is not supported.  SQLSTATE: 0A000
== SQL (line 1, position 1) ==
ALTER TABLE t1 REPLACE COLUMNS (x STRING DEFAULT 42)

When we throw this in the Iceberg connector we are well past the parsing stage (I think it is thrown around commit time), so we don't really have a handle on the parser here.
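To make the contrast concrete, a hedged test sketch (AssertJ style, which the Iceberg test suites already use; sql(...) stands in for the usual test helper and the table t1 is hypothetical). REPLACE COLUMNS with a default dies inside Spark's parser, while ADD COLUMN with a default parses fine and only reaches the connector-level check:

import static org.assertj.core.api.Assertions.assertThatThrownBy;

// REPLACE COLUMNS is rejected by Spark itself at parse time:
assertThatThrownBy(() -> sql("ALTER TABLE t1 REPLACE COLUMNS (x STRING DEFAULT 42)"))
    .isInstanceOf(org.apache.spark.sql.catalyst.parser.ParseException.class);

// ADD COLUMN reaches the connector, where the new check throws:
assertThatThrownBy(() -> sql("ALTER TABLE t1 ADD COLUMN y STRING DEFAULT 'a'"))
    .isInstanceOf(UnsupportedOperationException.class)
    .hasMessageContaining("default values");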

if (add.defaultValue() != null) {
  throw new UnsupportedOperationException(
      String.format(
          "Cannot add column %s since default values are currently unsupported",
          leafName(add.fieldNames())));
}
Member

"since setting default values in Spark is currently unsupported?"

I thought the columns can be read, but the code for altering the schema is missing?

@amogh-jahagirdar (Contributor Author)

Yeah, they can be read. You're right, this error message leaves the impression that default values are generally unsupported, and that's not true; it's specifically that setting default values from Spark is unsupported.
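In other words (a sketch; the table is hypothetical, and its default is assumed to have been attached by an engine or API that already supports defaults):

// Reading a column that already carries a default works fine in Spark:
spark.sql("SELECT id, data FROM cat.db.t1").show();

// Only attaching a default through Spark DDL is blocked, e.g.:
// ALTER TABLE cat.db.t1 ADD COLUMN extra STRING DEFAULT 'x' -- throws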

@RussellSpitzer (Member) left a comment

Looks good to me. I have a minor nit on the wording of the error message, since I think it slightly gives the impression that a table with default values will be broken in Spark, when we actually mean that a table with default values cannot be created by Spark.

@amogh-jahagirdar force-pushed the block-spark-default-value-ddl branch from 4778f60 to 1a20bcb on July 7, 2025 20:28
@amogh-jahagirdar (Contributor Author)

Thanks for the reviews @singhpk234 @nastra @RussellSpitzer !

@amogh-jahagirdar merged commit 401ab27 into apache:main on Jul 7, 2025
27 checks passed
@stevenzwu added this to the Iceberg 1.10.0 milestone on Jul 7, 2025