REST Catalog 503 errors should not be cleanable failures

### Apache Iceberg version

1.10.1 (latest release)

### Query engine

Spark

### Please describe the bug 🐞

Currently, HTTP 503 responses are not retried, yet they are still classified as cleanable failures for CreateTable transactions (stage create + updateTable request). This can lead to table corruption in scenarios where the commit is successfully persisted by the catalog, but an intermediate component returns a 503 to the client.

https://github.com/apache/iceberg/blob/7bac8650f65279c470d7d2c005c40a858933134a/core/src/main/java/org/apache/iceberg/rest/RESTTableOperations.java#L172

In our setup, Spark communicates with the catalog through Envoy (acting as a reverse proxy). When Envoy returns a 503 due to a transient downstream issue, the client assumes the commit failed and proceeds with cleanup. However, the catalog may have already committed the transaction successfully. As a result, valid manifest files that the committed metadata still references can be deleted, leaving the table in a corrupted state.

This behavior makes 503 responses unsafe to treat as cleanable failures, especially in deployments with proxies between the client and the catalog.

Should the CREATE updateType also use a `tableCommitErrorHandler` instead of a `tableErrorHandler`, as REPLACE and SIMPLE already do?
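
To make the distinction concrete, here is a minimal standalone sketch of the classification I have in mind. It is not Iceberg code: `CommitOutcomeSketch`, `Outcome`, `classify`, and `safeToDeleteNewFiles` are hypothetical names, and the status mapping (4xx is a definite rejection, anything else that is not 2xx is ambiguous) is an assumption made for illustration only.

```java
// Hypothetical sketch, not part of Iceberg's API.
final class CommitOutcomeSketch {

  enum Outcome {
    COMMITTED,        // the catalog confirmed the commit
    FAILED_CLEANABLE, // the catalog definitely rejected the commit
    STATE_UNKNOWN     // the commit may or may not have been persisted
  }

  // Assumption: a 4xx response means the request was rejected before being applied,
  // while a 5xx (for example a 503 from a proxy) says nothing about whether the
  // catalog already persisted the commit.
  static Outcome classify(int httpStatus) {
    if (httpStatus >= 200 && httpStatus < 300) {
      return Outcome.COMMITTED;
    }
    if (httpStatus >= 400 && httpStatus < 500) {
      return Outcome.FAILED_CLEANABLE;
    }
    return Outcome.STATE_UNKNOWN;
  }

  // New files may only be deleted when the commit is known to have failed.
  static boolean safeToDeleteNewFiles(Outcome outcome) {
    return outcome == Outcome.FAILED_CLEANABLE;
  }

  public static void main(String[] args) {
    for (int status : new int[] {200, 409, 503}) {
      Outcome outcome = classify(status);
      System.out.println(status + " -> " + outcome
          + ", cleanup allowed: " + safeToDeleteNewFiles(outcome));
    }
  }
}
```

The point of the third state is that cleanup is skipped whenever the client cannot prove the commit failed, which is the behavior I would expect for a 503 on CREATE as well.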

Previous related work:
https://github.com/apache/iceberg/pull/13619 and [thread](https://lists.apache.org/thread/oqonscy1b4qlmovnjtbcohz38kgprgmq)

### Willingness to contribute

- [x] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time

REST Catalog 503 errors should not be cleanable failures #15050
