Allow table read/write options to be configured and/or enforced at catalog level using catalog properties #5343

@szehon-ho

Description

Background: #4011 allowed table properties to be set on newly-created tables via catalog properties.
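For reference, the mechanism from #4011 exposes catalog properties prefixed with `table-default.` that seed properties onto newly-created tables. A minimal sketch of what that looks like in a Spark session, assuming the `table-default.` prefix and an illustrative catalog named `my_catalog`:

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the existing #4011 behavior: "table-default." catalog properties
// become table properties on tables created through this catalog.
// Catalog name, warehouse path, and property values are illustrative.
val spark = SparkSession.builder()
  .appName("catalog-defaults-sketch")
  .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.my_catalog.type", "hadoop")
  .config("spark.sql.catalog.my_catalog.warehouse", "/tmp/warehouse")
  // Every table created through my_catalog starts with this property set:
  .config("spark.sql.catalog.my_catalog.table-default.write.distribution-mode", "none")
  .getOrCreate()

// New tables pick up the default; it does not affect reads/writes on existing tables.
spark.sql("CREATE TABLE my_catalog.db.t (id BIGINT) USING iceberg")
```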

Proposal: It would be nice to have these propagate to runtime table properties as well, so they can take effect at read/write time.

Problem Solved: There are many table properties that users need to override at runtime; see https://iceberg.apache.org/docs/latest/configuration/. For example, the delete write-distribution-mode now defaults to hash, which causes unexpected shuffles (see #5224) that are not always desirable. Users may also want to disable vectorization to work around compatibility bugs (#2740), or change the read/write split size dynamically.

However, in Spark, spark.sql() cannot take any options, so users are stuck setting table properties for this. That is not user-friendly: when concurrent jobs are running, setting a table property just for one job may break another job that picks it up.
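To illustrate the gap: the DataFrame API can already pass per-job options (`vectorization-enabled` and `split-size` are documented Iceberg Spark read options), but an equivalent spark.sql() query has no option hook, so the only workaround is a table property that every concurrent job sees. A rough sketch, with an illustrative table name:

```scala
// DataFrame API: per-job overrides are possible via read options.
val df = spark.read
  .option("vectorization-enabled", "false") // work around a vectorization bug
  .option("split-size", "268435456")        // 256 MB splits, for this job only
  .table("my_catalog.db.t")

// SQL API: spark.sql() takes only the query string; there is no options parameter.
// The workaround is a table property, which is visible to all concurrent jobs:
spark.sql("ALTER TABLE my_catalog.db.t SET TBLPROPERTIES ('read.split.target-size' = '268435456')")
spark.sql("SELECT count(*) FROM my_catalog.db.t")
```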

This proposal could be one way, without changing Spark, for Iceberg to override table properties at runtime in the cases where there is no other way to set them per job.
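As a rough sketch only, one possible shape: extend the catalog-property mechanism so that prefixed entries override table properties at read/write planning time rather than only at table creation. The `table-runtime-override.` prefix below is hypothetical, invented here for illustration:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch of the proposal; "table-runtime-override." is NOT an
// existing Iceberg property prefix. The idea: Iceberg would apply these
// catalog-level entries when computing effective read/write configuration,
// without writing anything into table metadata, so jobs using other
// sessions or catalogs are unaffected.
val session = SparkSession.builder()
  .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.my_catalog.type", "hadoop")
  .config("spark.sql.catalog.my_catalog.warehouse", "/tmp/warehouse")
  // Hypothetical: override the delete write distribution mode for this session only.
  .config("spark.sql.catalog.my_catalog.table-runtime-override.write.delete.distribution-mode", "none")
  .getOrCreate()
```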
